)] ≥ 2/3. We want to test periodicity in the family of functions defined on N. To make the problem finite, we fix an upper bound on the period. Then, a function f : {0, ..., T − 1} → S is q-periodic, for 1 ≤ q < T, if f(x + aq) = f(x) for every x, a ∈ N such that x + aq < T. The problem we now want to test is whether there exists a period less than some given number t. More precisely, we define, for integers 2 ≤ t ≤ T,

INT-PERIOD(T, t) = {f : {0, ..., T − 1} → S | ∃q : 1 ≤ q < t, f is q-periodic}.

Here we do not require that q divides T since we do not have any finite group structure.
Quantum Testers for Hidden Group Properties
427
Test Integer period_f(T, t, δ)
1. N ← Ω((log T)²/δ).
2. For i = 1, ..., N do y_i ← Fourier sampling_f(Z_T), and use the continued fractions method to round y_i/T to the nearest fraction a_i/b_i with b_i < t.
3. p ← lcm{b_i : 1 ≤ i ≤ N}.
4. If p ≥ t, reject.
5. T_p ← ⌊T/p⌋p.
6. M ← Ω(1/δ).
7. For i = 1, ..., M let a_i, x_i ∈_R Z_{T_p}.
8. Accept iff (1/M)·|{i : f(x_i + a_i p mod T_p) ≠ f(x_i)}| < δ/2.

Theorem 3. For 0 < δ < 1, and integers 2 ≤ t ≤ T such that T/(log T)⁴ = Ω((t log t/δ)²), Test Integer period(T, t, δ) is a δ-tester with two-sided error for INT-PERIOD(T, t) on the family of functions from {0, ..., T − 1} to S, with O((log T)²/δ) query complexity and (log T/δ)^O(1) time complexity.
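Step 2 is the same classical continued-fraction step used in Shor-style period finding: round the sample y_i/T to a nearby fraction with denominator below t. A minimal Python sketch, under the assumption that convergents of y/T suffice (best approximations of the second kind, as in the standard period-recovery analysis); `best_approx` is an illustrative name, not from the paper:

```python
def best_approx(y, T, t):
    """Convergent a/b of y/T with the largest denominator b < t.

    Computed from the continued-fraction expansion of y/T; convergents
    are the best rational approximations of the second kind, which is
    what the period-recovery analysis needs (no semiconvergents).
    """
    num, den = y, T
    h_pp, k_pp = 0, 1          # convergent h_{-2}/k_{-2}
    h_p, k_p = 1, 0            # convergent h_{-1}/k_{-1}
    best = (0, 1)
    while den != 0:
        a = num // den         # next partial quotient
        h, k = a * h_p + h_pp, a * k_p + k_pp
        if k >= t:             # denominator bound reached: stop
            break
        best = (h, k)
        h_pp, k_pp, h_p, k_p = h_p, k_p, h, k
        num, den = den, num - a * den
    return best
```

For instance, with T = 1000, t = 10 the sample y = 333 is rounded to 1/3, and y = 250 to 1/4.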
5 Common Coset Range
In this section, G denotes a finite group and S a finite set. Let f0, f1 be functions from G to S. For a normal subgroup H ⊴ G, we say that f0 and f1 are H-similar if on all cosets of H the ranges of f0 and f1 are the same, that is, the multiset equality f0(xH) = f1(xH) holds for every x ∈ G. Consider the function f : G × Z2 → S, where by definition f(x, b) = f_b(x). We will use f for (f0, f1) when it is convenient in the coming discussion. We denote by Range(H) the set of functions f such that f0 and f1 are H-similar. We say that H is (k, t)-generated, for some positive integers k, t, if |H| ≤ k and it is the normal closure of a subgroup generated by at most t elements. The aim of this section is to establish that for any positive integers k and t, the family COMMON-COSET-RANGE(k, t) (for short CCR(k, t)), defined as the set

{f : G × Z2 → S | ∃H ⊴ G : H is (k, t)-generated, f0 and f1 are H-similar},

can be tested by the following quantum test. Note that a subgroup of size k is always generated by at most log k elements, therefore we always assume that t ≤ log k. In the testing algorithm, we assume that we have a quantum oracle for the function f : G × Z2 → S.

Test Common coset range_f(G, k, t, δ)
1. N ← 2kt log(|G|)/δ.
2. For i = 1, ..., N do (ρ_i, b_i) ← Fourier sampling_f(G × Z2).
3. Accept iff ∃H ⊴ G : H is (k, t)-generated and ∀i (b_i = 1 ⇒ ρ_i ∉ H⊥).

We first prove the robustness of the property that when Fourier sampling_f(G × Z2) outputs (ρ, 1), where G is any finite group, H ⊴ G and f ∈ Range(H), then ρ is not in H⊥.
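The notion of H-similarity is easy to check classically on small instances. A Python sketch for G = Z_n (abelian, so every subgroup is normal and coincides with its normal closure); the function names are ours, not from the paper:

```python
from collections import Counter

def subgroup(n, gens):
    """Subgroup of Z_n generated by gens.  In the abelian group Z_n
    every subgroup is normal and equals its own normal closure."""
    H, frontier = {0}, [0]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = (x + g) % n
            if y not in H:
                H.add(y)
                frontier.append(y)
    return H

def h_similar(n, f0, f1, H):
    """f0 and f1 (lists indexed by Z_n) are H-similar iff the multisets
    f0(x + H) and f1(x + H) agree on every coset x + H."""
    return all(Counter(f0[(x + h) % n] for h in H) ==
               Counter(f1[(x + h) % n] for h in H)
               for x in range(n))
```

For example, with n = 6 and H = {0, 2, 4}, two functions that permute each other's values inside each coset are H-similar, while they need not be equal pointwise (i.e. {1}-similar).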
428
Katalin Friedl et al.
Lemma 5. Let S be a finite set and G a finite group. Let f : G × Z2 → S and H ⊴ G. Then dist(f, Range(H)) ≤ |H| · Pr[Fourier sampling_f(G × Z2) outputs (ρ, 1) such that ρ ∈ H⊥]. Our next theorem implies that CCR(k, t) is query efficiently testable when k is polynomial in log|G|.

Theorem 4. For any finite set S, finite group G, integers k ≥ 1, 1 ≤ t ≤ log k, and 0 < δ < 1, Test Common coset range(G, k, t, δ) is a δ-tester for CCR(k, t) on the family of all functions from G × Z2 to S, with O(kt log(|G|)/δ) query complexity.

The proof technique of Theorem 4.2 of [2] yields:

Theorem 5. Let G be a finite Abelian group and let k be the exponent of G. For testing CCR(k, 1) on G, any classical randomized bounded error query algorithm on G requires Ω(√|G|) queries.
References
1. C. H. Bennett and G. Brassard. Quantum cryptography: Public key distribution and coin tossing. In Proc. IEEE International Conference on Computers, Systems, and Signal Processing, pages 175–179, 1984.
2. H. Buhrman, L. Fortnow, I. Newman, and H. Röhrig. Quantum property testing. In Proc. ACM-SIAM Symposium on Discrete Algorithms, 2003.
3. M. Blum, M. Luby, and R. Rubinfeld. Self-testing/correcting with applications to numerical problems. J. Comput. System Sci., 47(3):549–595, 1993.
4. W. van Dam, F. Magniez, M. Mosca, and M. Santha. Self-testing of universal and fault-tolerant sets of quantum gates. In Proc. 32nd ACM STOC, pages 688–696, 2000.
5. M. Ettinger and P. Høyer. On quantum algorithms for noncommutative hidden subgroups. Adv. in Appl. Math., 25(3):239–251, 2000.
6. E. Fischer. The art of uninformed decisions: A primer to property testing. The Computational Complexity Column, Bulletin of the EATCS, 75:97–126, 2001.
7. K. Friedl, G. Ivanyos, F. Magniez, M. Santha, and P. Sen. Hidden translation and orbit coset in quantum computing. In Proc. 35th ACM STOC, 2003.
8. O. Goldreich, S. Goldwasser, and D. Ron. Property testing and its connection to learning and approximation. J. ACM, 45(4):653–750, 1998.
9. L. Hales. The Quantum Fourier Transform and Extensions of the Abelian Hidden Subgroup Problem. PhD thesis, University of California, Berkeley, 2002.
10. L. Hales and S. Hallgren. An improved quantum Fourier transform algorithm and applications. In Proc. 41st IEEE FOCS, pages 515–525, 2000.
11. A. Kitaev. Quantum measurements and the Abelian Stabilizer Problem. Technical report no. 9511026, Quantum Physics e-Print archive, 1995.
12. D. Mayers and A. Yao. Quantum cryptography with imperfect apparatus. In Proc. 39th IEEE FOCS, pages 503–509, 1998.
13. M. Nielsen and I. Chuang. Quantum Computation and Quantum Information. Cambridge University Press, 2000.
14. R. Rubinfeld and M. Sudan. Robust characterizations of polynomials with applications to program testing. SIAM J. Comp., 25(2):23–32, 1996.
15. P. Shor. Algorithms for quantum computation: Discrete logarithm and factoring. SIAM J. Comp., 26(5):1484–1509, 1997.
Local LTL with Past Constants Is Expressively Complete for Mazurkiewicz Traces

Paul Gastin¹, Madhavan Mukund², and K. Narayan Kumar²

¹ LIAFA, Université Paris 7, 2, place Jussieu, F-75251 Paris Cedex 05, France
[email protected]
² Chennai Mathematical Institute, 92 G N Chetty Road, Chennai 600 017, India
{madhavan,kumar}@cmi.ac.in
Abstract. To obtain an expressively complete linear-time temporal logic (LTL) over Mazurkiewicz traces that is computationally tractable, we need to interpret formulas locally, at individual events in a trace, rather than globally, at configurations. Such local logics necessarily require past modalities, in contrast to the classical setting of LTL over sequences. Earlier attempts at defining expressively complete local logics have used very general past modalities as well as filters (side-conditions) that "look sideways" and talk of concurrent events. In this paper, we show that it is possible to use unfiltered future modalities in conjunction with past constants and still obtain a logic that is expressively complete over traces.
Keywords: Temporal logics, Mazurkiewicz traces, concurrency
1 Introduction
Linear-time temporal logic (LTL) [17] has established itself as a useful formalism for specifying the interleaved behaviour of reactive systems. To combat the combinatorial blow-up involved in describing computations of concurrent systems in terms of interleavings, there has been a lot of interest in using temporal logic more directly on labelled partial orders. Mazurkiewicz traces [13] are labelled partial orders generated by dependence alphabets of the form (Σ, D), where D is a dependence relation over Σ. If (a, b) ∉ D, a and b are deemed to be independent actions that may occur concurrently. Traces are a natural formalism for describing the behaviour of static networks of communicating finite-state agents [24]. LTL over Σ-labelled sequences is equivalent to FO_Σ(<), the first-order logic over Σ-labelled linear orders [12], and thus defines the class of aperiodic languages over Σ. Though FO_Σ(<) permits assertions about both the past and the future, future modalities suffice for establishing the expressive completeness of LTL with respect to FO_Σ(<) [8]. From a practical point of view, a finite-state program may be checked against an LTL specification relatively efficiently.
Partial support of CEFIPRA-IFCPAR Project 2102-1 (ACSMV) is gratefully acknowledged.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 429–438, 2003.
© Springer-Verlag Berlin Heidelberg 2003
The first expressively complete temporal logic over traces was described in [6] for finite traces and in [19] for infinite traces. The result was refined in [4] to show expressive completeness without past modalities, using an extension of the proof technique developed for LTL in [23]. Formulas in both these logics are defined at global configurations (maximal antichains). Unfortunately, reasoning at the level of global configurations makes the complexity of deciding satisfiability non-elementary [21]. Computational tractability seems to require interpreting formulas at local states—effectively at individual events. Recently, in [10], a local temporal logic has been defined over traces and shown to be expressively complete and tractable (the satisfiability problem is in Pspace). This logic uses both future and past modalities (similar to the until and since operators of LTL) which are further equipped with filters (side-conditions). It was also shown that for finite traces, a restricted form of past modalities suffices, but only in conjunction with filtered future modalities. Another proposal is presented in [1] and this logic also uses the since operator. LTL without any past operators is expressively complete over words but this cannot be the case for traces: there exist two first-order inequivalent traces that cannot be distinguished using only future modalities [22]. In this paper, we show that a very limited ability to talk about the past is sufficient to obtain expressive completeness over traces. Our logic uses unfiltered future modalities and a finite number of past constants. (In particular, there is no nesting of past operators and for that matter even future formulas cannot be nested into past formulas.) As in [3,4,10], we show expressive completeness using an extension to traces of the proof technique introduced in [23] for LTL over sequences. 
From the recent general result proved in [9], it follows that the satisfiability problem for this new logic is also in Pspace. The paper is organized as follows. We begin with some preliminaries about traces. In Section 3 we define our new temporal logic. Section 4 describes a syntactic partition of traces that is used in Section 5 to establish expressive completeness. Many proofs have had to be omitted in this extended abstract. A full version of the paper is available in [11].
2 Preliminaries
We briefly recall some notions about Mazurkiewicz traces (see [5] for background). A dependence alphabet is a pair (Σ, D) where the alphabet Σ is a finite set of actions and the dependence relation D ⊆ Σ × Σ is reflexive and symmetric. The independence relation I is the complement of D. For A ⊆ Σ, the set of letters independent of A is denoted by I(A) = {b ∈ Σ | (a, b) ∈ I for all a ∈ A} and the set of letters depending on (some action in) A is denoted by D(A) = Σ \ I(A). A Mazurkiewicz trace is a labelled partial order t = [V, ≤, λ] where V is a set of vertices labelled by λ : V → Σ and ≤ is a partial order over V satisfying the following conditions: for all x ∈ V, the downward set ↓x = {y ∈ V | y ≤ x} is finite; (λ(x), λ(y)) ∈ D implies x ≤ y or y ≤ x; and x ⋖ y implies (λ(x), λ(y)) ∈ D, where ⋖ = < \ <² is the immediate successor relation in t.
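Concretely, a finite trace can be built from any of its linearizations: positions i < j are directly ordered whenever their letters depend, and ≤ is the transitive closure of these direct orderings. A small Python sketch (illustrative, not from the paper):

```python
def trace_order(word, D):
    """Order matrix of the trace of `word` over the dependence relation D
    (a set of unordered letter pairs; equal letters always depend, since
    D is reflexive).  le[i][j] is True iff position i is below j."""
    n = len(word)
    le = [[i == j for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            a, b = word[i], word[j]
            if a == b or (a, b) in D or (b, a) in D:
                le[i][j] = True        # dependent letters keep word order
    for k in range(n):                 # transitive closure
        for i in range(n):
            for j in range(n):
                le[i][j] = le[i][j] or (le[i][k] and le[k][j])
    return le
```

Over the path dependence a − b − c − d used later in the paper, the word "adb" yields a trace whose a and d vertices are unordered minimal events, with the b above the a only.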
The alphabet of a trace t is the set alph(t) = λ(V) ⊆ Σ and its alphabet at infinity, alphinf(t), is the set of letters occurring infinitely often in t. The set of all traces is denoted by R(Σ, D) or simply by R. A trace t is called finite if V is finite. For t = [V, ≤, λ] ∈ R, we define min(t) ⊆ V as the set of all minimal vertices of t. We can also read min(t) ⊆ Σ as the set of labels of the minimal vertices of t. It will be clear from the context what we actually mean. Let t1 = [V1, ≤1, λ1] and t2 = [V2, ≤2, λ2] be a pair of traces such that alphinf(t1) × alph(t2) ⊆ I. We then define the concatenation of t1 and t2 to be t1·t2 = [V, ≤, λ] where V = V1 ∪ V2 (assuming wlog that V1 ∩ V2 = ∅), λ = λ1 ∪ λ2 and ≤ is the transitive closure of the relation ≤1 ∪ ≤2 ∪ ((V1 × V2) ∩ λ⁻¹(D)). The set of finite traces is then a monoid, denoted M(Σ, D) or simply M, with the empty trace 1 = (∅, ∅, ∅) as unit. Here is some useful notation for subclasses of traces. For C ⊆ Σ, let R_C = {t ∈ R | alph(t) ⊆ C} and M_C = M ∩ R_C. Also, (alph = C) = {t ∈ R | alph(t) = C}, (alphinf = C) = {t ∈ R | alphinf(t) = C} and (min = C) = {t ∈ R | min(t) = C}. For A, C ⊆ Σ, we set R^A_C = R_C ∩ (alphinf = A). Observe that M_C = R^∅_C. The first-order theory of traces FO_Σ(<) is given by the syntax: ϕ ::= P_a(x) | x < y | ¬ϕ | ϕ ∨ ϕ | ∃x ϕ, where a ∈ Σ and x, y ∈ Var are first-order variables. Given a trace t = [V, ≤, λ] and a valuation σ : Var → V, t, σ |= ϕ denotes that t satisfies ϕ under σ. We interpret each predicate P_a by the set {x ∈ V | λ(x) = a} and the relation < as the strict partial order relation of t. The semantics then lifts to all formulas as usual. Since the meaning of a closed formula (sentence) ϕ is independent of the valuation σ, we can associate with each sentence ϕ the language L(ϕ) = {t ∈ R | t |= ϕ}. We say that a trace language L ⊆ R is expressible in FO_Σ(<) if there exists a sentence ϕ ∈ FO_Σ(<) such that L = L(ϕ).
We denote by FO_{(Σ,D)}(<) the set of trace languages L ⊆ R(Σ, D) that are expressible in FO_Σ(<). For n > 0, FO^n_Σ(<) denotes the set of formulas with at most n distinct variables (note that each variable may be bound and reused several times). We use the algebraic notion of recognizability. Let h : M → S be a morphism to a finite monoid S. For t, u ∈ R, we say that t and u are h-similar, denoted t ∼_h u, if either t, u ∈ M and h(t) = h(u), or t and u have infinite factorizations into non-empty finite traces t = t1t2···, u = u1u2··· with h(t_i) = h(u_i) for all i. The transitive closure ≈_h of ∼_h is an equivalence relation. Since S is finite, this equivalence relation is of finite index with at most |S|² + |S| equivalence classes. A trace language L ⊆ R is recognized by h if it is saturated by ≈_h (or equivalently by ∼_h), i.e., t ∈ L implies [t]_{≈h} ⊆ L for all t ∈ R. Let L ⊆ R be recognized by a morphism h : M → S. For B ⊆ Σ, L ∩ M_B and L ∩ R_B are recognized by h|_{M_B}, the restriction of h to M_B. A finite monoid S is aperiodic if there is an n ≥ 0 such that sⁿ = s^{n+1} for all s ∈ S. A trace language L ⊆ R is aperiodic if it is recognized by some morphism to a finite and aperiodic monoid. First-order definability coincides with aperiodicity for traces.
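The aperiodicity condition sⁿ = s^{n+1} can be checked directly on a finite monoid given by its multiplication: for each element s, follow the powers of s until they repeat and test whether the eventual cycle has length 1. A Python sketch (our naming, assuming the monoid is given as a multiplication function and an element list):

```python
def is_aperiodic(mult, elements):
    """A finite monoid is aperiodic iff for every s some power satisfies
    s^n = s^(n+1), i.e. the eventual cycle among the powers of s is a
    single fixed point."""
    for s in elements:
        seen, x = [], s
        while x not in seen:           # walk s, s^2, s^3, ... until a repeat
            seen.append(x)
            x = mult(x, s)
        if mult(x, s) != x:            # repeated power is not a fixed point:
            return False               # the cycle has length > 1
    return True
```

For instance, ({0, 1}, · mod 2) is aperiodic, while the group (Z_3, + mod 3) is not, since the powers of 1 cycle with period 3.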
Theorem 1 ([6,7]). A language L ⊆ R(Σ, D) is expressible in FO_Σ(<) if and only if it is aperiodic.
3 Local Temporal Logic
We denote by LocTL^i_Σ the set of (internal) formulas over the alphabet Σ. They are given by the following syntax:

ϕ ::= a ∈ Σ | ¬ϕ | ϕ ∨ ϕ | EX ϕ | ϕ U ϕ | ¬a S b,  with a, b ∈ Σ

Let t = [V, ≤, λ] ∈ R be a finite or infinite trace and let x ∈ V be some vertex of t. We write t, x |= ϕ to denote that trace t at node x satisfies the formula ϕ ∈ LocTL^i_Σ. This is defined inductively as follows:

t, x |= a        if λ(x) = a
t, x |= ¬ϕ       if t, x |= ϕ does not hold
t, x |= ϕ ∨ ψ    if t, x |= ϕ or t, x |= ψ
t, x |= EX ϕ     if ∃y. x ⋖ y and t, y |= ϕ
t, x |= ϕ U ψ    if ∃z ≥ x. [t, z |= ψ and ∀y. (x ≤ y < z) ⇒ t, y |= ϕ]
t, x |= ¬a S b   if ∃z ≤ x. [λ(z) = b and ∀y. (z < y ≤ x) ⇒ λ(y) ≠ a]

(A schematic figure illustrating the semantics of ϕ U ψ and ¬a S b is omitted here.)
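On a finite trace, the semantics above can be evaluated by direct recursion on the formula. A naive Python sketch, with the trace given by its order matrix and labelling, and formulas as nested tuples (all names are ours; this is an illustration, not an efficient model checker):

```python
def sat(le, lab, x, phi):
    """Evaluate a LocTL^i formula at vertex x of a finite trace.

    The trace is its order matrix le (le[i][j] iff i <= j) and labelling
    lab.  Formulas: ('lab', a), ('not', f), ('or', f, g), ('EX', f),
    ('U', f, g), and the past constant ('S', a, b) for "(not a) S b".
    """
    n, op = len(lab), phi[0]
    if op == 'lab':
        return lab[x] == phi[1]
    if op == 'not':
        return not sat(le, lab, x, phi[1])
    if op == 'or':
        return sat(le, lab, x, phi[1]) or sat(le, lab, x, phi[2])
    if op == 'EX':
        # y is an immediate successor of x: x < y, nothing strictly between
        return any(le[x][y] and x != y
                   and not any(z not in (x, y) and le[x][z] and le[z][y]
                               for z in range(n))
                   and sat(le, lab, y, phi[1])
                   for y in range(n))
    if op == 'U':
        # exists z >= x with psi, and phi at every y with x <= y < z
        return any(le[x][z] and sat(le, lab, z, phi[2])
                   and all(sat(le, lab, y, phi[1])
                           for y in range(n)
                           if le[x][y] and le[y][z] and y != z)
                   for z in range(n))
    if op == 'S':
        # exists z <= x labelled b, with no a strictly between z and x
        a, b = phi[1], phi[2]
        return any(le[z][x] and lab[z] == b
                   and all(lab[y] != a
                           for y in range(n)
                           if le[z][y] and le[y][x] and y != z)
                   for z in range(n))
    raise ValueError(op)
```

On the two-vertex chain a ⋖ b, for instance, EX b and a U b hold at the a-vertex, and the past constant ¬a S b holds at the b-vertex but not at the a-vertex.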
The modality U is the "universal" until operator defined in [3]. The modality S is the corresponding since operator. Note that we only use the operator S in the very restricted form of a fixed number of past constants. Past modalities are essential, as indicated by the following example from [22], where the dependence relation is a − b − c − d. These two traces are not first-order equivalent but are bisimilar at the level of events and thus cannot be distinguished by purely future modalities.

a → b → c → b → c ···        d → c → b → c → b ···
    ↑                            ↑
a → b                        d → c

As usual, we can derive useful operators such as universal next AX ϕ = ¬ EX ¬ϕ, eventually in the future F ϕ = ⊤ U ϕ and always in the future G ϕ = ¬ F ¬ϕ. The modality F^∞ a = F a ∧ G(a ⇒ EX F a) expresses the existence of infinitely many vertices labelled with a above the current vertex.

Traces as Models of Formulas: We now turn our attention to defining when a trace satisfies a formula. For LTL over sequences, lifting satisfaction at positions to satisfaction by a word is quite simple: a word models a formula if its initial position models the formula. Since a trace, in general, does not have a unique
initial position, we need to use initial formulas as introduced in [3]. These are boolean combinations of formulas EM ϕ, each of which asserts the existence of a minimal vertex in a trace satisfying the internal formula ϕ. More precisely, the set LocTLΣ of initial formulas over the alphabet Σ is defined as follows:

α ::= ⊥ | EM ϕ with ϕ ∈ LocTL^i_Σ | ¬α | α ∨ α

The semantics of EM is given by: t |= EM ϕ if ∃x. (x ∈ min(t) and t, x |= ϕ). An initial formula α ∈ LocTLΣ defines the trace language L(α) = {t ∈ R | t |= α}. We can then express various alphabetic properties using initial formulas: L(EM a) = {t ∈ R | a ∈ min(t)}, L(EM F a) = {t ∈ R | a ∈ alph(t)}, and L(EM F^∞ a) = {t ∈ R | a ∈ alphinf(t)}. Therefore, for C ⊆ Σ, trace languages such as (alph = C), (alphinf = C) and (min = C) are expressible in LocTLΣ. The following result is immediate from the definition of LocTLΣ.

Proposition 2. If a trace language is expressible in LocTLΣ, then it is expressible in FO³_Σ(<).

We now show that the "filtered" modalities EX_b and F_b from [10], with the following semantics, are both expressible in LocTL^i_Σ:

t, x |= EX_b ϕ  if ∃y. [x ⋖ y and t, y |= ϕ and ∀z. (z ≤ y ∧ λ(z) = b) ⇒ z ≤ x]
t, x |= F_b ϕ   if ∃y. [x ≤ y and t, y |= ϕ and ∀z. (z ≤ y ∧ λ(z) = b) ⇒ z ≤ x]

Proposition 3. For any trace t over some alphabet Σ, any position x in t and any formula ϕ of LocTL^i_Σ,

t, x |= EX_b ϕ ⟺ t, x |= (b ∧ EX(ϕ ∧ ¬b)) ∨ ⋁_{a≠b} (a ∧ EX(ϕ ∧ ¬(¬a S b)))

Let the formula Safe_b = (b ∧ AX ¬b) ∨ ⋁_{a≠b} (a ∧ AX ¬(¬a S b)). Further, let F⁰_b ϕ = Safe_b U ϕ and F^{k+1}_b ϕ = Safe_b U EX_b(F^k_b ϕ).

Proposition 4. For any trace t ∈ R(Σ, D), any position x in t and any formula ϕ of LocTL^i_Σ,

t, x |= F_b ϕ ⟺ t, x |= ⋁_{k≤|Σ|} F^k_b ϕ

Now we establish some important lemmas that are critical in proving the expressive completeness of LocTLΣ.

Lemma 5. Let A ⊆ Σ and b ∈ Σ with b ∉ A. For all ϕ ∈ LocTL^i_A, there is a formula ϕ̂ ∈ LocTL^i_{A∪{b}} such that for all t = t1 b t2 t3 ∈ R with t1 ∈ R, t2 ∈ R_A, min(t2) ⊆ D(b) and min(t3) ⊆ {b}, and for all x ∈ bt2, we have bt2, x |= ϕ iff t, x |= ϕ̂.
(Figure: the factorization t = t1 · b · t2 · t3 of Lemma 5, drawn as consecutive blocks t1, b, t2, t3, where t3 begins with b.)
Proof Sketch. We define ϕ̂ by induction: â = a, (¬ϕ)̂ = ¬ϕ̂, (ϕ ∨ ψ)̂ = ϕ̂ ∨ ψ̂, (EX ϕ)̂ = EX_b ϕ̂, (ϕ U ψ)̂ = ⋁_{d∈A∪{b}} ((ϕ̂ U (d ∧ ψ̂)) ∧ F_b(d ∧ ψ̂)), and (¬c S d)̂ = (¬c S d) ∧ ¬(¬d S b).

Lemma 6. Let A ⊆ Σ and b ∈ Σ with b ∉ A. For all α ∈ LocTL_A, there exists a formula α̂ ∈ LocTL^i_{A∪{b}} such that for all t = t1 b t2 t3 ∈ R with t1 ∈ R, t2 ∈ R_A, min(t2) ⊆ D(b) and min(t3) ⊆ {b}, we have t2 |= α if and only if t, min(bt2t3) |= α̂.

Proof Sketch. We have (¬α)̂ = ¬α̂, (α ∨ β)̂ = α̂ ∨ β̂ and (EM ϕ)̂ = EX(ϕ̂ ∧ ¬b), where ϕ̂ is the formula given by Lemma 5.

Lemma 7. Let A ⊆ Σ and b ∈ Σ with b ∉ A. For all α ∈ LocTL_A, there exists a formula α̃ ∈ LocTL_{A∪{b}} such that for all t = t1 t2 with t1 ∈ R_A and min(t2) ⊆ {b}, we have t1 |= α if and only if t |= α̃.

Proof Sketch. Let (ϕ ∨ ψ)̃ = ϕ̃ ∨ ψ̃, (¬ϕ)̃ = ¬ϕ̃, (EX ϕ)̃ = EX(ϕ̃ ∧ ¬(¬b S b)), (ϕ U ψ)̃ = ϕ̃ U (ψ̃ ∧ ¬(¬b S b)) and (¬c S d)̃ = ¬c S d. Then, for all t = t1 t2 with t1 ∈ R_A, min(t2) ⊆ {b} and for all x ∈ t1, we have t1, x |= ϕ if and only if t, x |= ϕ̃. Finally, let (EM ϕ)̃ = EM(ϕ̃ ∧ ¬b).
4 Decomposition of Traces
The proof of our main result is a case analysis based on partitioning the set of traces according to the structure of the trace. Fix a letter b ∈ Σ and set B = Σ \ {b}. Using the notation introduced in Section 2, let Γ^A = {t ∈ R^A_B | min(t) ⊆ D(b)}, Γ = Γ^∅, and Ω_A = {t ∈ R_{I(A)} | min(t) ⊆ {b}}. Each trace t ∈ R has a unique finite or infinite factorization t = t0 b t1 b t2 ··· with t0 ∈ R_B and t_i ∈ R_B ∩ (min ⊆ D(b)) for all i > 0. In particular, we have

(min = {b}) = (bΓ)⁺ ∪ (bΓ)^ω ∪ ⋃_{∅≠A⊆B} (bΓ)* bΓ^A Ω_A

The following two results will allow us to use this decomposition effectively in proving the expressive completeness of our logic. For this, we use F^∞_b a = F_b a ∧ ¬ F_b(a ∧ ¬ EX_b F_b a).

Lemma 8. Let t = t0 t' with t0, t' ∈ R and min(t') = {b}. Then,

1. t' ∈ (bΓ)^∞ \ {1} if and only if t, min(t') |= β with

β = ⋁_C ( ⋀_{c∈C} F^∞ c ∧ ⋀_{c∉C} ¬ F^∞ c )

where C ranges over connected subsets of Σ such that b ∈ C if C ≠ ∅.
2. t' ∈ (bΓ)* bΓ^A Ω_A if and only if t, min(t') |= γ with

γ = ⋁_{C⊆Σ} ( ⋀_{c∈C} F^∞ c ∧ ⋀_{c∉C} ¬ F^∞ c ∧ F( b ∧ ⋀_{a∈A} F^∞_b a ∧ ⋀_{a∉A} ¬ F^∞_b a ) )

Note that "the" b in bΓ^A Ω_A is characterized by the formula b ∧ F^∞_b a, where a is any letter in A.

Lemma 9. Let A ⊆ Σ and let L ⊆ R be a trace language recognized by a morphism h from M into a finite monoid S. Then,

L ∩ (bΓ)* bΓ^A Ω_A = ⋃_{finite} (L1 ∩ (bΓ)*) b (L2 ∩ Γ^A) (L3 ∩ Ω_A)

where the union is finite and the trace languages L_i ⊆ R are recognized by h.
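Specialized to words (linear traces), the factorization t = t0 b t1 b t2 ··· at the start of this section is just the split at occurrences of b. A Python sketch (illustrative, not from the paper):

```python
def b_factorize(word, b):
    """Factorization w = t0 (b t1) (b t2) ... of a word at the letter b:
    t0 is the (possibly empty) b-free prefix, and every later factor is
    one b followed by a maximal b-free segment."""
    i = word.find(b)
    if i < 0:
        return [word]          # no b at all: the whole word is t0
    blocks, start = [word[:i]], i
    for j in range(i + 1, len(word)):
        if word[j] == b:
            blocks.append(word[start:j])
            start = j
    blocks.append(word[start:])
    return blocks
```

For instance, "aabcabbc" splits at b into t0 = "aa" followed by the blocks "bca", "b", "bc"; concatenating the blocks recovers the word, mirroring the uniqueness of the factorization.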
5 Expressive Completeness
If T is a finite alphabet, we define the linear temporal logic LTL_T(XU) by the syntax: f ::= u ∈ T | f XU f | ¬f | f ∨ f. The length of a finite or infinite word w = w1w2··· ∈ T^∞ is |w| ∈ N ∪ {ω}. For a word w = w1w2··· ∈ T^∞ the semantics of LTL_T(XU) is given by

w |= u        if |w| > 0 and w1 = u
w |= f XU g   if ∃j ∈ N with 1 < j ≤ |w| + 1 such that wjwj+1··· |= g and wkwk+1··· |= f for all 1 < k < j.

Note that if w |= f XU g then w is nonempty. A formula f ∈ LTL_T(XU) defines the word language L(f) = {w ∈ T^∞ | w |= f}. We use the following proposition, which is a consequence of several results on the equivalence between aperiodic word languages, star-free word languages, first-order definable word languages and word languages expressible in LTL_T(XU) [18,12,14,20,8,15,16,2].

Proposition 10. Every aperiodic word language K ⊆ T^∞ is expressible in LTL_T(XU).

We fix T = h(bΓ) and we define the mapping σ : (bΓ)^∞ → T^∞ by σ(t) = h(bt1)h(bt2)··· if t = bt1bt2··· with ti ∈ Γ for i ≥ 1. Note that the mapping σ is well-defined since each trace t ∈ (bΓ)^∞ has a unique factorization t = bt1bt2··· with ti ∈ Γ for i ≥ 1.

Lemma 11. Let L ⊆ R be recognized by h. Then,
1. L ∩ (bΓ)^ω = σ⁻¹(K) for some K expressible in LTL_T(XU).
2. L ∩ (bΓ)⁺ = σ⁻¹(K) for some K expressible in LTL_T(XU).

Next we show how to lift an LTL_T(XU) formula for K ⊆ T^∞ to a LocTL^i formula for σ⁻¹(K) ⊆ (bΓ)^∞.

Lemma 12. Suppose that any aperiodic trace language over B is expressible in LocTL_B. Then, for all f ∈ LTL_T(XU) there exists f̂ ∈ LocTL^i_Σ such that for all t = t1 t' with t1 ∈ R and t' ∈ (bΓ)^∞ \ {1}, we have σ(t') |= f iff t, min(t') |= f̂.
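The semantics of XU above transcribes directly into an evaluator on finite words, with the 1-based positions of the definition becoming 0-based suffix indices. A Python sketch (our naming; finite words only, although the definition also covers infinite ones):

```python
def xu_sat(w, f):
    """Semantics of LTL_T(XU) on a finite word w (a string over T).

    Formulas: ('lab', u), ('not', f), ('or', f, g), ('XU', f, g).
    The witness position j = |w| + 1 of the definition corresponds to
    the empty suffix w[len(w):].
    """
    op = f[0]
    if op == 'lab':
        return len(w) > 0 and w[0] == f[1]
    if op == 'not':
        return not xu_sat(w, f[1])
    if op == 'or':
        return xu_sat(w, f[1]) or xu_sat(w, f[2])
    if op == 'XU':
        along, target = f[1], f[2]
        # strict until: the witness suffix starts at index j >= 1, and
        # `along` holds at every strictly earlier future suffix
        return any(xu_sat(w[j:], target)
                   and all(xu_sat(w[k:], along) for k in range(1, j))
                   for j in range(1, len(w) + 1))
    raise ValueError(op)
```

For example, a XU b holds on "ab" and on "aab" but not on "acb" (the c breaks the along-condition) and not on the single-letter word "a", matching the remark that a satisfied XU needs a strict future witness.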
Proof Sketch. The formula f̂ is defined by structural induction. We let (f ∨ g)̂ = f̂ ∨ ĝ, (¬f)̂ = ¬f̂, and (f XU g)̂ = EX((¬b ∨ f̂) U (b ∧ ĝ)). The difficult case is when f = s ∈ T. For all r ∈ S, the trace language h⁻¹(r) ∩ M_B is aperiodic and therefore expressible in LocTL_B by the hypothesis of the lemma: we find α_r ∈ LocTL_B such that for all t' ∈ M_B, h(t') = r if and only if t' |= α_r. Let α̂_r ∈ LocTL^i_Σ be the formula obtained using Lemma 6. We let ŝ = ⋁_{h(b)·r=s} α̂_r.

Lemma 13. Suppose that any aperiodic trace language over B is expressible in LocTL_B. Let A ⊆ Σ be non-empty and let f ∈ LTL_T(XU). There exists f̂ ∈ LocTL^i_Σ such that for all t = t1 t2 t3 with t1 ∈ R, t2 ∈ (bΓ)*, t3 ∈ bΓ^A Ω_A, we have σ(t2) |= f iff t, min(t2t3) |= f̂.

Lemma 14. Suppose that for any proper subset A of Σ, any aperiodic trace language over A is expressible in LocTL_A. Let L ⊆ R be an aperiodic trace language over Σ. Then, for all b ∈ Σ, there exists ϕ ∈ LocTL^i_Σ such that for all t = t0 t' with t0, t' ∈ R and min(t') = {b}, t' ∈ L iff t, min(t') |= ϕ.

Proof. We prove this lemma by induction on the size of the alphabet Σ. If Σ = ∅ then there is nothing to prove. Now, suppose that Σ ≠ ∅ and let b ∈ Σ. We assume that L is recognized by the aperiodic morphism h : M → S. Now, L ∩ (min = {b}) can be written as

(L ∩ ((bΓ)^∞ \ {1})) ∪ ⋃_{∅≠A⊆B} (L ∩ (bΓ)* bΓ^A Ω_A).

By Lemma 11 we get L ∩ ((bΓ)^∞ \ {1}) = σ⁻¹(L(f)) for some f ∈ LTL_T(XU). From the hypothesis, aperiodic languages over B are expressible in LocTL_B. Hence, we can apply Lemma 12 and we get f̂ such that for all t = t0 t' with t0 ∈ R and t' ∈ (bΓ)^∞ \ {1}, we have σ(t') |= f iff t, min(t') |= f̂. We conclude this case taking ϕ = β ∧ f̂ where β is defined in Lemma 8.

Now, we consider L ∩ (bΓ)* bΓ^A Ω_A where ∅ ≠ A ⊆ B. By Lemma 9,

L ∩ (bΓ)* bΓ^A Ω_A = ⋃_{finite} (L1 ∩ (bΓ)*) b (L2 ∩ Γ^A) (L3 ∩ Ω_A)

where each L_i is an aperiodic language recognized by h. Thus, it suffices to show that for aperiodic languages L1, L2 and L3 recognized by h, there is a formula ϕ such that for all t = t0 t' with t0, t' ∈ R and min(t') = {b}, we have t, min(t') |= ϕ if and only if t' ∈ (L1 ∩ (bΓ)*) b (L2 ∩ Γ^A)(L3 ∩ Ω_A). By Lemma 11 we get L1 ∩ (bΓ)* = σ⁻¹(L(f1)) for some f1 ∈ LTL_T(XU). From the hypothesis, aperiodic languages over B are expressible in LocTL_B. Hence, we can apply Lemma 13 and we get f̂1 such that for all t = t0 t1 t' with t0 ∈ R, t1 ∈ (bΓ)*, and t' ∈ bΓ^A Ω_A, we have t1 ∈ L1 iff t, min(t1t') |= f̂1. Using again the hypothesis of the lemma, we get some formula α2 ∈ LocTL_B such that L2 ∩ R_B = L(α2). By Lemma 6 we find α̂2 ∈ LocTL^i_Σ such that for all t = t0 t1 b t2 t3 with t0 ∈ R, t1 ∈ (bΓ)*, t2 ∈ Γ^A and t3 ∈ Ω_A, we have t2 ∈ L2 iff t, min(bt2t3) |= α̂2.
Finally, L3 is an aperiodic trace language over a smaller alphabet (since A ≠ ∅, I(A) is a proper subset of Σ) and hence by the induction hypothesis there is a formula ϕ3 such that for all t = t0 t1 b t2 t3 with t0 ∈ R, t1 ∈ (bΓ)*, t2 ∈ Γ^A and t3 ∈ Ω_A with t3 ≠ 1, we have t3 ∈ L3 iff t, min(t3) |= ϕ3. Putting these three pieces together, we let

ψ = f̂1 ∧ F(b ∧ F^∞_b a ∧ α̂2 ∧ (ϕ4 ∨ F_b EX(b ∧ ϕ3)))

with ϕ4 = ⊥ if 1 ∉ L3 and ϕ4 = ¬ EX F b otherwise. Then, for all t = t0 t1 b t2 t3 with t0 ∈ R, t1 ∈ (bΓ)*, t2 ∈ Γ^A and t3 ∈ Ω_A, we get from the above discussion that t1bt2t3 ∈ L1bL2L3 if and only if t, min(t1bt2t3) |= ψ. We complete the proof with ϕ = γ ∧ ψ where γ is the formula defined in Lemma 8.

Theorem 15. Any aperiodic real trace language over R(Σ, D) is expressible in LocTLΣ.

Proof. The proof proceeds by induction on the size of Σ. When Σ = {a} is a singleton, L is either a finite set or the union of a finite set and a set of the form aⁿa* for some n ≥ 0. In both cases, it is easy to check that L is expressible in LocTLΣ. For the inductive step, assume that the theorem holds for any aperiodic language over any proper subset of Σ. Let L be recognized by an aperiodic morphism h : M → S. Let b ∈ Σ and B = Σ \ {b} as usual. We can show as in Lemma 9 that L can be written as follows:

L = ⋃_{finite} (L1 ∩ R_B)(L2 ∩ (min ⊆ {b}))
where L1 and L2 are languages recognized by the same aperiodic morphism h. Since the decomposition of any trace t ∈ R as t1t2 with t1 ∈ R_B and t2 ∈ (min ⊆ {b}) is unique, the above decomposition can be rewritten as

L = ⋃_{finite} ((L1 ∩ R_B)(min ⊆ {b})) ∩ (R_B(L2 ∩ (min ⊆ {b})))

Now, by the induction hypothesis, there is a formula α1 in LocTL_B such that for t1 ∈ R_B, t1 |= α1 if and only if t1 ∈ L1. Thus, by Lemma 7, there is a formula α̃1 in LocTLΣ such that t |= α̃1 if and only if t1 |= α1 whenever t = t1t2 with t1 ∈ R_B and min(t2) ⊆ {b}. Thus, (L1 ∩ R_B)(min ⊆ {b}) = L(α̃1).

Since we have assumed expressive completeness for every proper subset of Σ, by Lemma 14 there is a formula ϕ2 such that for any t = t1t2 with min(t2) = {b}, t2 ∈ L2 if and only if t, min(t2) |= ϕ2. Consider the formula

α = α' ∨ EM((b ∧ ϕ2) ∨ (¬b ∧ F_b EX(b ∧ ϕ2)))

where α' = ⊥ if 1 ∉ L2 and α' = ¬ EM F b otherwise. Then, t |= α if and only if either t ∈ R_B and 1 ∈ L2, or there is a minimal b-event x in the trace t and t, x |= ϕ2, that is, t = t1t2 with t1 ∈ R_B, min(t2) = {b} and t2 ∈ L2. Thus R_B(L2 ∩ (min ⊆ {b})) = L(α) is also expressible in LocTLΣ.
References
1. B. Adsul and M. Sohoni. Complete and tractable local linear time temporal logics over traces. In Proc. of ICALP'02, LNCS 2380, 926–937. Springer Verlag, 2002.
2. J. Cohen, D. Perrin, and J.-E. Pin. On the expressive power of temporal logic. Journal of Computer and System Sciences, 46:271–295, 1993.
3. V. Diekert and P. Gastin. Local temporal logic is expressively complete for cograph dependence alphabets. In Proc. of LPAR'01, LNAI 2250, 55–69. Springer Verlag, 2001.
4. V. Diekert and P. Gastin. LTL is expressively complete for Mazurkiewicz traces. Journal of Computer and System Sciences, 64:396–418, 2002.
5. V. Diekert and G. Rozenberg, editors. The Book of Traces. World Scientific, Singapore, 1995.
6. W. Ebinger. Charakterisierung von Sprachklassen unendlicher Spuren durch Logiken. Dissertation, Institut für Informatik, Universität Stuttgart, 1994.
7. W. Ebinger and A. Muscholl. Logical definability on infinite traces. Theoretical Computer Science, 154:67–84, 1996.
8. D. Gabbay, A. Pnueli, S. Shelah, and J. Stavi. On the temporal analysis of fairness. In Proc. of PoPL'80, 163–173, Las Vegas, Nev., 1980.
9. P. Gastin and D. Kuske. Satisfiability and model checking for MSO-definable temporal logics are in PSPACE. To appear in Proc. of CONCUR'03.
10. P. Gastin and M. Mukund. An elementary expressively complete temporal logic for Mazurkiewicz traces. In Proc. of ICALP'02, LNCS 2380, 938–949. Springer Verlag, 2002.
11. P. Gastin, M. Mukund, and K. Narayan Kumar. Local LTL with past constants is expressively complete for Mazurkiewicz traces. Tech. Rep. 2003-008, LIAFA, Université Paris 7 (France), 2003.
12. J.A.W. Kamp. Tense Logic and the Theory of Linear Order. PhD thesis, University of California, Los Angeles, California, 1968.
13. A. Mazurkiewicz. Concurrent program schemes and their interpretations. DAIMI Rep. PB 78, Aarhus University, Aarhus, 1977.
14. R. McNaughton and S. Papert. Counter-Free Automata. MIT Press, 1971.
15. D. Perrin. Recent results on automata and infinite words. In Proc. of MFCS'84, LNCS 176, 134–148. Springer Verlag, 1984.
16. D. Perrin and J.-E. Pin. First order logic and star-free sets. Journal of Computer and System Sciences, 32:393–406, 1986.
17. A. Pnueli. The temporal logic of programs. In FOCS'77, 46–57, 1977.
18. M.-P. Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8:190–194, 1965.
19. P.S. Thiagarajan and I. Walukiewicz. An expressively complete linear time temporal logic for Mazurkiewicz traces. In Proc. of LICS'97, 183–194, 1997.
20. W. Thomas. Star-free regular sets of ω-sequences. Information and Control, 42:148–156, 1979.
21. I. Walukiewicz. Difficult configurations – on the complexity of LTrL. In Proc. of ICALP'98, LNCS 1443, 140–151. Springer Verlag, 1998.
22. I. Walukiewicz. Local logics for traces. Journal of Automata, Languages and Combinatorics, 7(2):259–290, 2002.
23. Th. Wilke. Classifying discrete temporal properties. In Proc. of STACS'99, LNCS 1563, 32–46. Springer Verlag, 1999.
24. W. Zielonka. Notes on finite asynchronous automata. R.A.I.R.O. — Informatique Théorique et Applications, 21:99–135, 1987.
LTL with Past and Two-Way Very-Weak Alternating Automata

Paul Gastin and Denis Oddoux

LIAFA, Université Paris 7, 2, place Jussieu, F-75251 Paris Cedex 05, France
{Paul.Gastin,Denis.Oddoux}@liafa.jussieu.fr

Abstract. In this paper, we propose a translation procedure of PLTL (LTL with past modalities) formulas to Büchi automata using two-way very-weak alternating automata (2VWAA) as an intermediary step. Our main result is an efficient translation of 2VWAA to generalized Büchi automata (GBA).
1 Introduction
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 439–448, 2003. © Springer-Verlag Berlin Heidelberg 2003

Nowadays, computer systems (both hardware and software) play a central role in most human activities, including safety-critical areas. It is therefore essential to improve their reliability. For this, we need to formally specify their expected behaviors. Choosing the specification language is a crucial factor for the overall validation process. We need a language that makes it easy to express all desired properties. Moreover, the specification language must be easily amenable to validation techniques such as model checking. Temporal logics [13,1,12] are among the most widely used specification languages, the most popular ones being branching time temporal logics (CTL, CTL*) and linear time temporal logic (LTL). These logics are based on pure future modalities, i.e., modalities that do not depend on what happened before the current time. Adding past modalities to LTL does not increase its expressive power [6,2,10], but PLTL (LTL with past modalities) is exponentially more succinct than LTL [9]. In this paper, we focus our attention on PLTL. The drawback of using past modalities is that the satisfiability and the model checking problems are harder to solve. Not harder from a theoretical point of view, since both problems are PSPACE-complete [15] regardless of whether we use past modalities or not, but harder from a practical point of view. Model checking algorithms for PLTL have already been proposed [7,14,18], but few have been implemented and used in an actual model checker. This is surprising, since past modalities make specifications more succinct and, more importantly, much easier and more natural [11]. We believe that the reason is the difficulty of efficiently building a Büchi automaton associated with a PLTL formula, which is a crucial step for a model checker using PLTL as a specification language. This is precisely the problem addressed in the present paper. The easiest construction is the so-called tableau construction (see e.g. [7,11,15]), but a straightforward implementation, called declarative tableau, is highly inefficient. Implementations are based on incremental tableau. They construct
only reachable states and apply some simplification techniques. An incremental tableau construction for PLTL [7] requires backtracking and is much harder to implement than for LTL [5]. Following the automata-theoretic approach of [17], another technique to efficiently generate a generalized Büchi automaton (GBA) from an LTL formula is to use very-weak alternating automata (VWAA) as an intermediary step. As demonstrated in [3], this yields an implementation which is dramatically faster than those based on the tableau construction. In this paper, we develop the same technique for PLTL. Since we have to deal with past modalities, we use two-way very-weak alternating automata (2VWAA). Two-way alternating automata (2AA) were already proposed for specification languages that are much more expressive than PLTL [16,8], and a translation procedure from general 2AA to GBA is given in [8]. Since PLTL formulas are sufficient to specify most interesting properties and can be easily translated to 2VWAA, it is very important to develop an efficient translation procedure restricted to 2VWAA. The main result of this paper is an efficient translation procedure of progressing 2VWAA to GBA. Starting from a progressing 2VWAA with n states, we construct a GBA with at most 2^n states. Since the translation from a PLTL formula ϕ gives a progressing 2VWAA with at most |ϕ| + 1 states, we get for this formula a GBA with at most 2^(|ϕ|+1) states. As a comparison, the algorithm of [8] gives an automaton with 2^O(|ϕ|²) states (recall that the specification language considered in [8] is much more expressive than PLTL). Due to space constraints, some proofs have had to be omitted in this extended abstract. A full version of the paper is available in [4].
2 Two-Way Very-Weak Alternating Automata (2VWAA)
A two-way alternating automaton (2AA) is a six-tuple A = (Q, Σ, δ, I, F, R) where Q is the set of states, Σ is the alphabet, δ : Q → 2^(2^Q × 2^Σ × 2^Q) is the transition function, I ⊆ 2^Q gives the initial condition, F ⊆ Q is the set of final states, and R ⊆ Q is the set of repeated states. A more classical way of defining 2AAs would use I ∈ B⁺(Q) and δ : Q × Σ → B⁺(Q × {−1, 1}), where −1 and 1 indicate whether the head moves left or right. While the two approaches are equivalent, our definition allows for more compact representations of the automata and for faster algorithms. Notice in particular that in the transition function we use 2^Σ instead of Σ, so that transitions that differ only by actions can be gathered. However, the automaton still reads finite and infinite words in Σ^∞ = Σ⁺ ∪ Σ^ω. Runs of 2AAs are defined using Q-forests. A Q-forest over a word u ∈ Σ^∞ is a labeled forest (V, E, σ, ν) where V is the set of vertices, E ⊆ V × V is the set of edges, σ : V → Q is the state labeling function, and ν : V → N is the position labeling function: it indicates the letter the automaton A is reading. For all x ∈ V, 0 ≤ ν(x) ≤ |u| + 1 (where |u| ∈ N ∪ {ω} denotes the length of u), and for all (x, y) ∈ E, either ν(y) = ν(x) + 1 if A goes forward on u, or ν(y) = ν(x) − 1 if A goes backward on u.
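The six-tuple A = (Q, Σ, δ, I, F, R) can be sketched as a small data structure; this is a hedged illustration, and all Python names here are my own, not the paper's:

```python
from dataclasses import dataclass

# delta maps a state to a set of triples (X_left, alpha, X_right) of
# frozensets, mirroring delta : Q -> 2^(2^Q x 2^Sigma x 2^Q).

@dataclass
class TwoWayAA:
    states: frozenset      # Q
    alphabet: frozenset    # Sigma
    delta: dict            # q -> {(frozenset, frozenset, frozenset), ...}
    initial: frozenset     # I: a set of frozensets of states
    final: frozenset       # F
    repeated: frozenset    # R

# Tiny example: one state p that moves forward on letter 'a'.
p = 'p'
aut = TwoWayAA(
    states=frozenset({p}),
    alphabet=frozenset({'a'}),
    delta={p: {(frozenset(), frozenset({'a'}), frozenset({p}))}},
    initial=frozenset({frozenset({p})}),
    final=frozenset(),
    repeated=frozenset({p}),
)
```

Using frozensets keeps transition triples hashable, so sets of transitions can themselves be stored in Python sets.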
We shall use the following notations: λ = (σ, ν) : V → Q × N is the labelling function, E(x) = {y ∈ V | (x, y) ∈ E} is the set of sons of x, E⃖(x) = {y ∈ E(x) | ν(y) = ν(x) − 1} is the set of left sons of x, and E⃗(x) = E(x) \ E⃖(x) is the set of right sons of x; left(x) = E*(E⃖(x)) ∪ {x}, ]x, y] = {z ∈ E⁺(x) | y ∈ E*(z)}, and similarly for [x, y], ]x, y[ and [x, y[. A run ρ of a 2AA A on a word u = u1u2 · · · ∈ Σ^∞ is a Q-forest (V, E, σ, ν) over u such that the roots satisfy the initial condition: if Γ ⊆ V is the set of roots of the forest, then σ(Γ) ∈ I and ν(Γ) ⊆ {1}; and the sons of any node satisfy the transition function: for all x ∈ V, if ν(x) = 0 or ν(x) = |u| + 1 then E(x) = ∅, otherwise there exists α ∈ 2^Σ such that uν(x) ∈ α and (σ(E⃖(x)), α, σ(E⃗(x))) ∈ δ(σ(x)). A run ρ is accepting if for all x ∈ V, ν(x) = 0 or ν(x) = |u| + 1 implies σ(x) ∈ F, and if every infinite branch of ρ goes infinitely often through R (note that ρ may have infinite branches even if u is finite). Finally, L(A) is the set of words on which there exists an accepting run of A. A two-way very-weak alternating automaton (2VWAA) is a 2AA for which there exists a partial order ⪯ on Q such that for all p ∈ Q, all (X⃖, α, X⃗) ∈ δ(p), and all q ∈ X⃖ ∪ X⃗, we have q ⪯ p. Actually, we use progressing (or loop-free) 2VWAA, for which we describe in Section 4 an efficient translation to Büchi automata. A run ρ is progressing if every infinite branch x0 . . . xn . . . in ρ satisfies the following property: ∀N ≥ 0, ∃i ≥ 0, ν(xi) ≥ N. A run ρ has a loop if two nodes on the same branch have the same label: ∃x ∈ V, ∃y ∈ E⁺(x), λ(x) = λ(y). Otherwise ρ is loop-free. A 2AA is progressing (respectively loop-free) if all runs (accepting or not) of this automaton are progressing (respectively loop-free). Note that a loop-free run on a finite word has no infinite branches and is therefore progressing. More generally, any loop-free run is progressing. The converse is trivially false. Still, we have Proposition 1.
A 2AA is progressing iff it is loop-free. Also, given a 2VWAA with n states we can effectively construct a progressing 2VWAA with at most 2n states, accepting the same language. When combined with the translation of progressing 2VWAA to GBA presented in Section 4, this yields an efficient translation from general 2VWAA to GBA.
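The loop test behind Proposition 1 can be sketched as follows; a hedged illustration, with my own encoding of a run as a child map plus a label map, where a label is a pair (state, position):

```python
# A run has a loop iff some branch repeats a label lambda(x) = (state, position).
# children: node -> list of child nodes; label: node -> (state, position).
def has_loop(children, label, roots):
    def dfs(x, seen):
        if label[x] in seen:               # same label twice on one branch
            return True
        seen = seen | {label[x]}
        return any(dfs(y, seen) for y in children.get(x, []))
    return any(dfs(r, frozenset()) for r in roots)
```

On finite runs this terminates directly; for an automaton-level check one would have to reason about all runs, which is what the effective construction mentioned in Proposition 1 addresses.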
3 PLTL to Progressing 2VWAA
The syntax and the semantics of PLTL are classical [12]. Here we use the following syntax: ϕ ::= ⊥ | p | ¬ϕ | ϕ ∨ ϕ | X ϕ | Y ϕ | ϕ U ϕ | ϕ S ϕ, where ⊥ stands for false and p ranges over the set AP of atomic propositions. An LTL formula is a PLTL formula that uses neither Y nor S. Let Σ = 2^AP. We write L(ϕ) ⊆ Σ^∞ for the set of words satisfying a PLTL formula ϕ. We also use the dual operators ⊤, ∧, X̄, Ȳ, Ū, S̄. For instance, X̄ϕ = ¬X¬ϕ and ϕ1 S̄ ϕ2 = ¬(¬ϕ1 S ¬ϕ2). Note that X̄ϕ is not equivalent to Xϕ on finite words: we have X̄ϕ = ¬X⊤ ∨ Xϕ. A PLTL formula can be written in negative normal form, where negations are applied to predicates only. A formula in negative normal form uses ⊥, ⊤, predicates in
AP, their negations, the operators ∨, X, U, Y, S, and their duals. A PLTL formula in negative normal form that is neither a disjunction (∨) nor a conjunction (∧) is called a temporal formula. Notice that transforming a formula into negative normal form does not change the number of temporal operators of the formula. From now on, we suppose that every PLTL formula is in negative normal form.

This section is devoted to the first step of our algorithm: from a PLTL formula with n temporal operators we can compute a progressing 2VWAA with at most n + 1 states. We use the following notations. For J1, J2 ∈ 2^(2^Q × 2^Σ × 2^Q) we let J1 ⊗ J2 = {(X⃖1 ∪ X⃖2, α1 ∩ α2, X⃗1 ∪ X⃗2) | (X⃖1, α1, X⃗1) ∈ J1, (X⃖2, α2, X⃗2) ∈ J2}. For a PLTL formula ψ in negative normal form we define ψ̄ inductively by: ψ̄ = {{ψ}} if ψ is a temporal formula, ψ̄ = ψ̄1 ∪ ψ̄2 if ψ = ψ1 ∨ ψ2, and ψ̄ = {X1 ∪ X2 | X1 ∈ ψ̄1 and X2 ∈ ψ̄2} if ψ = ψ1 ∧ ψ2. It is actually not completely immediate to deal smoothly with the special cases raised when checking a past modality at the beginning of a word, or a future modality at the end of a finite word. We solve this problem using a special state END, which is reached in an accepting run when the current position ν is outside the word u (σ(x) = END implies ν(x) = 0 or ν(x) = |u| + 1).

Definition 2 (ϕ to Aϕ). For any PLTL formula ϕ in negative normal form over the set AP, let Aϕ = (Q, Σ, δ, I, F, R) be the 2AA defined by Q = sub(ϕ) ∪ {END}, where sub(ϕ) is the set of temporal subformulae of ϕ, Σ = 2^AP, I = ϕ̄, F = {END}, R is the set of subformulae in sub(ϕ) that are not of the form ϕ1 U ϕ2, and δ is defined below (∆ extends δ to B⁺(Q)):

δ(⊥) = ∅
δ(⊤) = {(∅, Σ, ∅)}
δ(p) = {(∅, Σp, ∅)} where Σp = {a ∈ Σ | p ∈ a}
δ(¬p) = {(∅, Σ¬p, ∅)} where Σ¬p = Σ \ Σp
δ(X ψ) = {(∅, Σ, e) | e ∈ ψ̄}
δ(X̄ ψ) = {(∅, Σ, e) | e ∈ ψ̄} ∪ {(∅, Σ, {END})}
δ(Y ψ) = {(e, Σ, ∅) | e ∈ ψ̄}
δ(Ȳ ψ) = {(e, Σ, ∅) | e ∈ ψ̄} ∪ {({END}, Σ, ∅)}
δ(ψ1 U ψ2) = ∆(ψ2) ∪ (∆(ψ1) ⊗ {(∅, Σ, {ψ1 U ψ2})})
δ(ψ1 Ū ψ2) = ∆(ψ2) ⊗ (∆(ψ1) ∪ {(∅, Σ, {ψ1 Ū ψ2}), (∅, Σ, {END})})
δ(ψ1 S ψ2) = ∆(ψ2) ∪ (∆(ψ1) ⊗ {({ψ1 S ψ2}, Σ, ∅)})
δ(ψ1 S̄ ψ2) = ∆(ψ2) ⊗ (∆(ψ1) ∪ {({ψ1 S̄ ψ2}, Σ, ∅), ({END}, Σ, ∅)})
δ(END) = ∅
∆(ψ) = δ(ψ) if ψ is a temporal formula
∆(ψ1 ∨ ψ2) = ∆(ψ1) ∪ ∆(ψ2)
∆(ψ1 ∧ ψ2) = ∆(ψ1) ⊗ ∆(ψ2)

Theorem 3 (PLTL to Progressing 2VWAA). For any PLTL formula ϕ in negative normal form with n temporal subformulae, the automaton Aϕ is a progressing 2VWAA with at most n + 1 states and L(Aϕ) = L(ϕ).

Proof. Let us define ψ1 ⪯ ψ2 if ψ1 = END or if ψ1 is a subformula of ψ2. It is easy to see with this partial order that Aϕ is very-weak.
Now, we show that Aϕ is progressing. Suppose that a node x and its son y have the same state-label ψ in a run of Aϕ. Then, either y ∈ E⃗(x) and (ψ = ψ1 U ψ2 or ψ = ψ1 Ū ψ2), or y ∈ E⃖(x) and (ψ = ψ1 S ψ2 or ψ = ψ1 S̄ ψ2). Now let x0, x1, . . . be an infinite branch of a run ρ of Aϕ. Since Aϕ is very-weak, the sequence σ(x0), σ(x1), . . . is ultimately constant. From the previous statement, the limit of this sequence is necessarily a formula ψ1 U ψ2 or ψ1 Ū ψ2, and hence ν is strictly increasing from a certain node on the branch: Aϕ is progressing. The proof that L(Aϕ) = L(ϕ) is omitted. It is similar to the classical proof used for the analogous translation of LTL formulas to VWAA.
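Two ingredients of this section lend themselves to a small sketch; a hedged illustration in which the tuple encoding of formulas and the '~' tags for the dual operators are my own, not the paper's: pushing negations inward to negative normal form, and the decomposition ψ̄ used in Definition 2.

```python
# Formulas are nested tuples: ('not', f), ('or', f, g), ('and', f, g),
# ('X', f), ('Y', f), ('U', f, g), ('S', f, g); dual operators get a '~'.
# Atoms and the constants 'true'/'false' are plain strings.

def nnf(f, neg=False):
    """Negative normal form: negations end up on predicates only."""
    dual = {'X': 'X~', 'X~': 'X', 'Y': 'Y~', 'Y~': 'Y',
            'U': 'U~', 'U~': 'U', 'S': 'S~', 'S~': 'S',
            'or': 'and', 'and': 'or'}
    if isinstance(f, str):
        if f in ('true', 'false'):
            return {'true': 'false', 'false': 'true'}[f] if neg else f
        return ('not', f) if neg else f          # atomic proposition
    if f[0] == 'not':
        return nnf(f[1], not neg)                # double negation cancels
    op = dual[f[0]] if neg else f[0]
    return (op,) + tuple(nnf(g, neg) for g in f[1:])

def bar(f):
    """The decomposition psi-bar: a set of sets of temporal formulas."""
    if isinstance(f, tuple) and f[0] == 'or':
        return bar(f[1]) | bar(f[2])
    if isinstance(f, tuple) and f[0] == 'and':
        return {x | y for x in bar(f[1]) for y in bar(f[2])}
    return {frozenset({f})}                      # temporal formula or literal
```

As in the text, `nnf` leaves the number of temporal operators unchanged, and `bar` behaves like a disjunctive-normal-form split at the propositional level.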
4 Progressing 2VWAA to GBA
The general translation from a 2AA to a BA is rather involved. Here we take full advantage of the fact that we start from a progressing 2VWAA. The basic idea, starting from an accepting run ρ of a 2AA A over a word u = u1u2 . . . ∈ Σ^∞, is to consider the sets Xi = σ(ν−1(i)), and to build a kind of automaton B such that the sequence X0, X1, X2, . . . is an accepting run of B on u. Transitions of B should reflect the fact that each node x of ρ satisfies the transition function of A. If ν(x) = i, this involves the sets Xi−1, Xi, Xi+1 and the letter ui. Hence it is natural to consider quadruples (Xi−1, Xi, Xi+1, ui) as transitions of B. This is why we introduce here generalized BA having such quadruples as transitions. We also use several acceptance conditions on transitions instead of states, since this makes the construction simpler. A GBA is a six-tuple G = (Q, Σ, T, I, F, 𝒯) where Q is the set of states, Σ is the alphabet, T ⊆ Q × Q × Q × 2^Σ is the transition function, I ⊆ Q × Q is the set of initial states, F ⊆ Q × Q is the set of final states, and 𝒯 = {T1, . . . , Tr}, where each Tj ⊆ T is an acceptance table. A run σ of a GBA G on a word u = u1u2 . . . ∈ Σ^ω is an infinite sequence q0, q1, α1, . . . , qn, αn, . . . such that (q0, q1) ∈ I and, for all i ≥ 1, ui ∈ αi and (qi−1, qi, qi+1, αi) ∈ T. A run on a finite word u ∈ Σ⁺ is defined similarly. The run σ is accepting if either u ∈ Σⁿ and (qn, qn+1) ∈ F, or u ∈ Σ^ω and σ uses infinitely many transitions from each Tj, 1 ≤ j ≤ r. L(G) is the set of words on which there exists an accepting run of G.

Definition 4 (A to GA¹). For any progressing 2VWAA A = (Q, Σ, δ, I, F, R), we define the GBA GA¹ = (Q′, Σ, T′, I′, F′, 𝒯′) by Q′ = 2^Q,
T′ = {(X⃖, X, X⃗, α) | for every q ∈ X there exists (Y⃖, β, Y⃗) ∈ δ(q) with Y⃖ ⊆ X⃖, Y⃗ ⊆ X⃗ and α ⊆ β},
I′ = {(X, Y) ∈ Q′ × Q′ | X ⊆ F and ∃Z ∈ I with Z ⊆ Y}, F′ = {(X, Y) ∈ Q′ × Q′ | Y ⊆ F}, and 𝒯′ = {T′q | q ∈ Q \ R} where
T′q = {(X⃖, X, X⃗, α) ∈ T′ | q ∉ X or ∃(Y⃖, β, Y⃗) ∈ δ(q) with Y⃖ ⊆ X⃖, Y⃗ ⊆ X⃗, β ⊇ α and q ∉ Y⃗}.
Indeed, the hard part in a translation from a 2AA A to a GBA G is to make sure that each accepting run of G, which is a flat sequence of nodes, can be lifted
to a Q-forest which defines an accepting run of A. It is remarkable that if A is a progressing 2VWAA, we can simply use sequences X0, X1, X2, . . . as runs of G and still be able to lift them to accepting runs of A.

Proposition 5. For any progressing 2VWAA A, L(GA¹) ⊆ L(A).

Actually L(GA¹) = L(A), but the other inclusion follows directly from results that will be proved afterwards, hence we will not prove it here; see Theorem 9 for the final result.

Proof. Let X0, X1, α1, X2, α2, . . . be an accepting run of GA¹ on a word u = u1u2 . . . ∈ Σ^∞. We are going to define a labelled graph ρ = (V, E, σ, ν). Let V = {(q, i) | i ≥ 0, q ∈ Xi}. By definition of T′, for all 1 ≤ i ≤ |u| and all q ∈ Xi, there exist Y⃖ ⊆ Xi−1, Y⃗ ⊆ Xi+1 and α ∈ 2^Σ with αi ⊆ α and (Y⃖, α, Y⃗) ∈ δ(q), and such that q ∉ Y⃗ whenever q ∉ R and (Xi−1, Xi, Xi+1, αi) ∈ T′q. Let E((q, i)) = Y⃖ × {i − 1} ∪ Y⃗ × {i + 1}. If i ∈ {0, |u| + 1} then let E((q, i)) = ∅. For all (q, i) ∈ V, let σ((q, i)) = q and ν((q, i)) = i. Since (X0, X1) ∈ I′, there exists Γ ⊆ X1 × {1} such that σ(Γ) ∈ I. Now if we "unfold" the labelled graph ρ = (V, E, σ, ν) from the set of roots Γ, we obtain a Q-forest ρ′ = (V′, E′, σ′, ν′). Let x ∈ V′. If ν′(x) ∈ {0, |u| + 1} then E′(x) = ∅. Otherwise, let q = σ′(x) and i = ν′(x), and let Y⃖ = σ′(E⃖′(x)) = σ(E(q, i) ∩ Q × {i − 1}) and Y⃗ = σ′(E⃗′(x)) = σ(E(q, i) ∩ Q × {i + 1}). By construction of ρ there exists α ∈ 2^Σ with αi ⊆ α and (Y⃖, α, Y⃗) ∈ δ(q). Since ui ∈ αi, we have proved that the sons of x satisfy the transition function. Hence ρ′ is a run of A on u. We shall now check that ρ′ is accepting:
– σ′(ν′−1(0)) ⊆ σ(ν−1(0)) = X0 ⊆ F, since (X0, X1) ∈ I′;
– if u is finite of length n, then σ′(ν′−1(n + 1)) ⊆ σ(ν−1(n + 1)) = Xn+1 ⊆ F, since (Xn, Xn+1) ∈ F′;
– assume that ρ′ contains an infinite branch x0, x1, . . . ultimately state-labelled in Q \ R: there exist q ∈ Q \ R and N > 0 such that for all i ≥ N, σ(xi) = q.
Since A is progressing, from Proposition 1 we know that ρ′ is loop-free, and u is infinite. Since X0, X1, . . . is accepting, there exists k ≥ ν(xN) such that (Xk−1, Xk, Xk+1, αk) ∈ T′q. As ρ′ is loop-free, ν is necessarily strictly increasing on xN, xN+1, . . ., and there exists j ≥ N such that λ(xj) = (q, k) and λ(xj+1) = (q, k + 1). But since σ(xj) = q and (Xk−1, Xk, Xk+1, αk) ∈ T′q, we chose E((q, k)) in ρ such that (q, k + 1) ∉ E((q, k)). This is a contradiction, and hence ρ′ cannot contain such a branch.

It is quite easy to see that GA¹ is still too big to be used in an efficient implementation. It contains many useless states, and keeping only the accessible states would not be enough, since the initial states are already too numerous. We introduced GA¹ merely to prove more easily that our construction is correct. The intuition for how to get a smaller GBA from a 2VWAA is as follows: by removing useless parts of runs of a 2VWAA we obtain minimal runs, such that removing any subtree of the forest makes the run invalid. Ideally, we would like
to construct a GBA GA² which is the restriction of GA¹ to transitions of the form (σ(ν−1(i − 1)), σ(ν−1(i)), σ(ν−1(i + 1)), ui) obtained from minimal runs. For this, we start from a small set of initial states, and we compute the states and transitions accessible from these states. We need to store both the current set of states Y and the previous one X. Since A is two-way, it may happen that the set X is not big enough to fulfil all requirements imposed by Y. In this case, we have to backtrack and enlarge X.

Definition 6 (A to GA²). For any progressing 2VWAA A = (Q, Σ, δ, I, F, R), let GA² = (Q′, Σ, T′, I′, F′, 𝒯′) be the GBA computed as follows.
Initialization: I′ = {F} × I, ∇ = {F} × I, T′ = ∅. Then, we apply the following saturation procedure for each state (X, Y) ∈ ∇ until we reach a fixed point:
for each (X′, α, Z) ∈ ⊗q∈Y δ(q):
  if X′ ⊆ X then
    (a) if (Y, Z) ∉ ∇ then add (Y, Z) to ∇;
        if (X, Y, Z, α) ∉ T′ then add (X, Y, Z, α) to T′
  else
    (b) for each (F, X, Y, β) ∈ T′ with (F, X) ∈ I′:
          if (F, X ∪ X′) ∉ ∇ then add (F, X ∪ X′) to ∇;
          add (F, X ∪ X′) to I′
    (c) for each (V, W, X, γ), (W, X, Y, β) ∈ T′:
          if (W, X ∪ X′) ∉ ∇ then add (W, X ∪ X′) to ∇;
          if (V, W, X ∪ X′, γ) ∉ T′ then add (V, W, X ∪ X′, γ) to T′
Finally, we set Q′ = {X ∈ 2^Q | ∃Y ∈ 2^Q, (X, Y) ∈ ∇ or (Y, X) ∈ ∇}, F′ = (2^Q × 2^F) ∩ (Q′ × Q′), and 𝒯′ = {T′q | q ∈ Q \ R} where
T′q = {(X⃖, X, X⃗, α) ∈ T′ | q ∉ X or ∃(Y⃖, β, Y⃗) ∈ δ(q) with Y⃖ ⊆ X⃖, Y⃗ ⊆ X⃗, β ⊇ α and q ∉ Y⃗}.

Proposition 7. For any progressing 2VWAA A, L(GA²) ⊆ L(GA¹).

Proof. It is easy to notice that every state and every transition of GA² also occurs in GA¹. Hence any accepting run of GA² on a word u is also an accepting run of GA¹ on the same word.

Proposition 8. For any progressing 2VWAA A, L(A) ⊆ L(GA²).
Proof. Let ρ = (V, E, σ, ν) be an accepting run of A on a word u. We build by induction on k a sequence ρk = X0 , (Xi , αi )0
Otherwise, for all q ∈ Xn, choose xq ∈ V such that λ(xq) = (q, n), and such that if q ∉ R then q ∉ σ(E⃗(xq)) whenever this is possible. Let X = {xq | q ∈ Xn}, X′n−1 = σ(E⃖(X)) and X′n+1 = σ(E⃗(X)). Since ρ is a run of A on u and n ≤ |u|, there exists α′n with un ∈ α′n such that (X′n−1, α′n, X′n+1) ∈ ⊗q∈Xn δ(q). Since ρk satisfies the inductive hypothesis, we have (Xn−1, Xn) ∈ ∇, and three cases can occur:
(a) If X′n−1 ⊆ Xn−1, then from the construction of GA², (Xn, X′n+1) ∈ ∇ and (Xn−1, Xn, X′n+1, α′n) ∈ T′. Moreover, for each q ∈ Xn \ R, if there exists x ∈ ν−1(n) with σ(x) = q and q ∉ σ(E⃗(x)), then there exist β ⊇ α′n, X′ ⊆ X′n−1 and Z′ ⊆ X′n+1 \ {q} with (X′, β, Z′) ∈ δ(q), hence (Xn−1, Xn, X′n+1, α′n) ∈ T′q. Therefore, ρk+1 = ρk, α′n, X′n+1 satisfies the inductive hypothesis.
We now suppose that X′n−1 ⊄ Xn−1. Note that this implies n ≥ 2, because if n = 1 then, since ρ is accepting, X′0 ⊆ F = X0.
(b) If n = 2, then (X0, X1, X2, α1) ∈ T′ and (X0, X1) ∈ I′, so from the construction of GA², (X0, X1 ∪ X′1) ∈ I′ ⊆ ∇. Hence ρk+1 = X0, X1 ∪ X′1 satisfies the inductive hypothesis.
(c) Assume now that n ≥ 3. Then we have (Xn−3, Xn−2, Xn−1, αn−2) ∈ T′ and (Xn−2, Xn−1, Xn, αn−1) ∈ T′. From the construction of GA², (Xn−3, Xn−2, Xn−1 ∪ X′n−1, αn−2) ∈ T′ and (Xn−2, Xn−1 ∪ X′n−1) ∈ ∇. Moreover, if (Xn−3, Xn−2, Xn−1, αn−2) ∈ T′q then (Xn−3, Xn−2, Xn−1 ∪ X′n−1, αn−2) ∈ T′q. Therefore, ρk+1 = X0, . . . , Xn−2, αn−2, Xn−1 ∪ X′n−1 satisfies the inductive hypothesis.
Now, we show that the sequence (ρk)k≥0 converges. For this, consider the alphabet A = 2^Q × 2^Σ, partially ordered by (X, α) ⪯ (Y, β) if X ⊆ Y. Let us write ρk = (X0, ∅), (Xi, αi)1≤i≤n, (Xn+1, ∅) ∈ A*. The sequence of words (ρk)k≥0 is strictly increasing for the lexicographic order induced by ⪯. Hence, we can easily show that this sequence is either finite (if the construction terminates) or infinite and converging to an infinite word ρ̂ ∈ A^ω.
If u is finite, then the words ρk are of length at most |u| + 2. Therefore the construction terminates at some step k: let ρ̂ = ρk. Since ρ is accepting, σ(ν−1(|u| + 1)) ⊆ F, so (X|u|, X|u|+1) ∈ F′. Hence ρ̂ is an accepting run of GA² on u.
If u is infinite, then the sequence (ρk)k≥0 must be infinite and converges to an infinite word ρ̂ ∈ A^ω. Since any prefix of ρ̂ is also a prefix of ρk for some k, ρ̂ is a run of GA² on u. Suppose that ρ̂ is not accepting: there exists q ∈ Q \ R such that after a given rank N, all transitions used in ρ̂ are not in T′q. Hence for all x ∈ V, if λ(x) = (q, i) with i ≥ N, then q ∈ σ(E⃗(x)). Moreover, since (XN−1, XN, XN+1, αN) ∉ T′q, we have q ∈ XN. So we can find in ρ an infinite branch ultimately labelled by q, which is a contradiction with ρ being accepting.
As a consequence of Propositions 5, 7 and 8 we obtain:

Theorem 9 (Progressing 2VWAA to GBA). For any progressing 2VWAA A, L(GA¹) = L(GA²) = L(A). Hence from any progressing 2VWAA with n states and r repeated states we can construct a GBA accepting the same language with at most 2^n states and n − r acceptance tables.
We conclude this section by explaining several simplifications that are used in the implementation of the algorithm described in Definition 6.

Saturation procedure: In order to generate all needed transitions, we have to repeat the saturation procedure until no more changes occur. This has to be carefully implemented to avoid redundant computations. The idea here is to timestamp states and transitions with integers. When they are first created, new elements of ∇ are timestamped with 0, whereas new transitions in T′ get the current timestamp. Each time a pair (X, Y) is considered for saturation, the current timestamp is incremented, and when this pair has been treated, it gets the current timestamp. Now, step (a) is executed only when the timestamp of (X, Y) is 0. In step (b) we consider only transitions (F, X, Y, β) whose timestamps are greater than or equal to that of (X, Y). Similarly, in step (c), we only consider pairs of transitions for which at least one timestamp is greater than or equal to that of (X, Y).

Transition simplification: The idea here is to remove redundant transitions. When there exist two transitions (X, Y, Z, α) and (X, Y, Z, β) with α ⊆ β, the first transition is more restrictive than the second one and can thus be deleted. This simplification can be done on the fly, that is, when the algorithm adds a new transition to T′.

State simplification: The idea here is to merge equivalent pairs of states. When two pairs of states (X, Y) and (X′, Y′) have the same outgoing transitions ((X, Y, Z, α) ∈ T′ ⟺ (X′, Y′, Z, α) ∈ T′, and for all q ∈ Q \ R, (X, Y, Z, α) ∈ T′q ⟺ (X′, Y′, Z, α) ∈ T′q), they can be merged, and any transition pointing to one of the two states can then point to the merged state. The overall simplification alternates transition simplification and state simplification until a fixed point is reached.

Experimental Results: We have implemented two algorithms, building respectively GA¹ and GA² from a PLTL formula ϕ. Here are the results of some computations on the formula ϕn = ¬G(p1 → F⁻¹(p2 ∧ F⁻¹(p3 ∧ . . . F⁻¹ pn) . . .)), stating that each p1 is preceded by p2, itself preceded by p3, and so on until pn. This demonstrates the striking improvement of the on-the-fly algorithm.
        Aϕ      GA¹                                       GA²
        states  before       after    time     space      before  after   time  space
ϕ2      3       28, 82       6, 11    0.03     <380       7, 10   4, 6    0.01  <380
ϕ3      4       100, 544     12, 36   0.83     <380       10, 19  6, 13   0.01  <380
ϕ4      5       364, 3630    27, 102  230      1,700      13, 31  8, 23   0.08  635
ϕ5      6       1348, 24830  58, 264  130,000  39,000     16, 46  10, 36  9.40  32,000

In order of appearance: tested formula; number of states of the VWAA Aϕ; and for each GAⁱ, i ∈ {1, 2}: number of states and transitions before and after simplification, computation time in seconds, and memory used in KB.
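The transition simplification described in this section, dropping a more restrictive transition (X, Y, Z, α) whenever some (X, Y, Z, β) with α ⊆ β exists, can be sketched as follows; the encoding of transitions as 4-tuples with frozenset letter sets is my own:

```python
def simplify(transitions):
    """Drop transitions strictly subsumed by another with the same (X, Y, Z)."""
    keep = set()
    for t in transitions:
        x, y, z, a = t
        # a < b is Python's strict-subset test on frozensets.
        if any((x, y, z) == (x2, y2, z2) and a < b
               for (x2, y2, z2, b) in transitions if (x2, y2, z2, b) != t):
            continue
        keep.add(t)
    return keep
```

This quadratic scan is only an illustration; an on-the-fly implementation, as the text suggests, would perform the check when each transition is first added.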
Model-Checking: At this point, from any PLTL formula ϕ we are able to compute a GBA accepting the models of ϕ. In order to apply the usual model checking techniques, we can transform our GBA G = (Q, Σ, T, I, F, 𝒯) into a more conventional automaton G′ = (Q′, Σ, T′, I, F, 𝒯′) where T′ = {((q−, q), α, (q, q+)) |
(q−, q, q+, α) ∈ T}, similarly for the acceptance tables, and Q′ is the set of all pairs (q−, q) or (q, q+) appearing in T′. We can actually view G as a convenient encoding of G′. Classical techniques also allow us to obtain a Büchi automaton (BA) by replacing the acceptance tables with a single set of repeated states.
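The re-encoding of quadruple transitions into pair states can be illustrated directly; a hedged sketch whose function name is mine:

```python
def to_pair_automaton(T):
    """Turn quadruples (q-, q, q+, alpha) into ((q-, q), alpha, (q, q+))."""
    T2 = {((qm, q), alpha, (q, qp)) for (qm, q, qp, alpha) in T}
    Q2 = {end for (src, _, tgt) in T2 for end in (src, tgt)}   # all pairs occurring
    return Q2, T2
```

The GBA is thus a compact encoding of the conventional automaton: it never materializes the pair states until this final step.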
References
1. E.A. Emerson. Temporal and modal logic. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, volume B, chapter 16, pages 995–1072. Elsevier Science, 1990.
2. D. Gabbay, A. Pnueli, S. Shelah, and J. Stavi. On the temporal analysis of fairness. In Proc. of PoPL'80, pages 163–173, Las Vegas, Nev., 1980.
3. P. Gastin and D. Oddoux. Fast LTL to Büchi automata translation. In Proc. of CAV'01, number 2102 in LNCS, pages 53–65. Springer Verlag, 2001.
4. P. Gastin and D. Oddoux. LTL with past and two-way very-weak alternating automata. Tech. Rep. LIAFA 2003–010, Université Paris 7 (France).
5. R. Gerth, D. Peled, M.Y. Vardi, and P. Wolper. Simple on-the-fly automatic verification of linear temporal logic. In Protocol Specification Testing and Verification, pages 3–18, Warsaw, Poland, 1995. Chapman & Hall.
6. J.A.W. Kamp. Tense Logic and the Theory of Linear Order. PhD thesis, University of California, Los Angeles, California, 1968.
7. Y. Kesten, Z. Manna, H. McGuire, and A. Pnueli. A decision algorithm for full propositional temporal logic. In Proc. of CAV'93, number 697 in LNCS, pages 97–109. Springer Verlag, 1993.
8. O. Kupferman, N. Piterman, and M.Y. Vardi. Extended temporal logic revisited. In Proc. of CONCUR'01, number 2154 in LNCS, pages 519–535. Springer Verlag, 2001.
9. F. Laroussinie, N. Markey, and Ph. Schnoebelen. Temporal logic with forgettable past. In Proc. of LICS'02, pages 383–392, 2002.
10. F. Laroussinie and Ph. Schnoebelen. A hierarchy of temporal logics with past. Theoretical Computer Science, 148:303–324, 1995.
11. O. Lichtenstein, A. Pnueli, and L.D. Zuck. The glory of the past. In Proc. of the 3rd Workshop on Logics of Programs, number 193 in LNCS, pages 196–218. Springer Verlag, 1985.
12. Z. Manna and A. Pnueli. The Temporal Logic of Reactive and Concurrent Systems: Specification. Springer Verlag, Berlin-Heidelberg-New York, 1992.
13. A. Pnueli. The temporal logic of programs. In Proc. of FOCS'77, pages 46–57, 1977.
14. Y.S. Ramakrishna, L.E. Moser, L.K. Dillon, P.M. Melliar-Smith, and G. Kutty. An automata-theoretic decision procedure for propositional temporal logic with Since and Until. Fundamenta Informaticae, 17:271–282, 1992.
15. A.P. Sistla and E.M. Clarke. The complexity of propositional linear time logic. Journal of the Association for Computing Machinery, 32:733–749, 1985.
16. M.Y. Vardi. Reasoning about the past with two-way automata. In Proc. of ICALP'98, number 1443 in LNCS, pages 628–641. Springer Verlag, 1998.
17. M.Y. Vardi and P. Wolper. An automata-theoretic approach to automatic program verification. In Proc. of LICS'86, pages 332–344, 1986.
18. M.Y. Vardi and P. Wolper. Reasoning about infinite computations. Information and Computation, 115:1–37, 1994.
Match-Bounded String Rewriting Systems

Alfons Geser¹, Dieter Hofbauer², and Johannes Waldmann³

¹ National Institute of Aerospace, 144 Research Drive, Hampton, Virginia 23666, USA, [email protected]
² Fachbereich Mathematik/Informatik, Universität Kassel, D-34109 Kassel, Germany, [email protected]
³ Fakultät für Mathematik und Informatik, Universität Leipzig, D-04109 Leipzig, Germany, [email protected]
Abstract. We investigate rewriting systems on strings by annotating letters with natural numbers, so-called match heights. A position in a reduct will get height h + 1 if the minimal height of all positions in the redex is h. In a match-bounded system, match heights are globally bounded. Exploiting recent results on deleting systems, we prove that it is decidable whether a given rewriting system has a given match bound. Further, we show that match-bounded systems preserve regularity of languages. Our main focus, however, is on termination of rewriting. Match-bounded systems are shown to be linearly terminating and, more interestingly, for inverses of match-bounded systems, termination is decidable. These results provide new techniques for automated proofs of termination.
1 Introduction
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 449–459, 2003. © Springer-Verlag Berlin Heidelberg 2003

Rewriting is a model of computation. It allows us to handle questions like termination (there is no infinite computation), normalization (a final configuration is reachable), and correctness (no erroneous configuration is reachable). These questions can be formulated in terms of sets of descendants: if R is a rewriting system and L is a language, then R*(L) = {y | x ∈ L, x →*R y}. Now R is correct for L iff R*(L) ∩ Err = ∅, and R is normalizing for L iff L ⊆ R−*(Final), with Err (resp. Final) denoting the set of erroneous (resp. final) configurations. Starting from classical program analysis, recent applications include verification of XML transformations and cryptographic protocols [6]. From the point of view of these applications, it is highly desirable that the reachability relation R* effectively respects language classes with good decidability and closure properties. In the present paper, we achieve this by restricting the flow of information in rewriting systems, using match bounds. We can then apply recent results on deleting systems to obtain closure and termination results. All constructions are effective (and we have indeed implemented them), so they can be used in automated proofs of termination. For instance, we can
automatically verify termination of Zantema's System {a²b² → b³a³}, a task that is notoriously difficult even for a human. To obtain automated termination proofs, we transform rewriting systems as follows: we annotate letters with numbers, so-called match heights. A position in a reduct will get height h + 1 if the minimal height of all positions in the redex is h. A rewriting system is match-bounded if the match heights in derivations are globally bounded. We give the definition and examples in Sections 3 and 4, while in Section 5 we discuss how to verify or refute a match bound. Results follow from our recent research [10] on deleting systems. A string rewriting system R is called deleting if there exists a partial ordering on its alphabet such that each letter in the right-hand side of a rule is less than some letter in the corresponding left-hand side. Deleting systems can be regarded as the inverses of context-limited grammars as defined and investigated by Hibbard [9]. Deleting rewriting systems terminate, and they have linearly bounded derivational complexity. In Section 6, we show that inverses of deleting string rewriting systems have decidable termination and uniform termination problems. This carries over to inverse match-bounded systems immediately: match-bounded systems are terminating, and inverse match-bounded systems have decidable termination and uniform termination problems. An application is given in Section 6. We conclude by discussing ramifications for further research in Section 7.
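A minimal illustration of the rewrite relation and of the match-height annotation just described; the encodings are my own, not the authors' implementation:

```python
# One-step rewriting x ->_R y for a finite SRS R given as (l, r) pairs.
def step(x, R):
    out = set()
    for l, r in R:
        i = x.find(l)
        while i != -1:
            out.add(x[:i] + r + x[i + len(l):])
            i = x.find(l, i + 1)
    return out

# One match-height rewrite step: strings are lists of (letter, height) pairs;
# every letter of the reduct r gets height h+1, where h is the minimal height
# occurring in the replaced redex.
def match_step(s, l, r, i):
    assert ''.join(c for c, _ in s[i:i + len(l)]) == l
    h = min(height for _, height in s[i:i + len(l)])
    return s[:i] + [(c, h + 1) for c in r] + s[i + len(l):]

s0 = [(c, 0) for c in 'aabb']                # a^2 b^2, all heights 0
s1 = match_step(s0, 'aabb', 'bbbaaa', 0)     # Zantema's rule a^2 b^2 -> b^3 a^3
```

Iterating `match_step` and tracking the maximal height that ever appears is the idea behind checking a match bound, though deciding boundedness requires the constructions of the later sections.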
2 Preliminaries
We mostly stick to standard notations for strings and string rewriting, as e.g. in [2]. A string rewriting system (SRS) over an alphabet Σ is a relation R ⊆ Σ* × Σ*, inducing the rewrite relation →_R = {(xℓy, xry) | x, y ∈ Σ*, (ℓ, r) ∈ R} on Σ*. Unless indicated otherwise, all rewriting systems are finite. Pairs (ℓ, r) from R are frequently referred to as rules ℓ → r. By lhs(R) and rhs(R) we denote the sets of left (resp. right) hand sides of R. The reflexive and transitive closure of →_R is →*_R, often abbreviated as R*, and →⁺_R or R⁺ denotes the transitive closure. A rewriting rule ℓ → r is context-free if |ℓ| ≤ 1, and an SRS is context-free if all its rules are. We use ε for the empty string, and |x| is the length of a string x. Further, for a language L ⊆ Σ*, let factor(L) = {y ∈ Σ* | ∃x, z ∈ Σ*: xyz ∈ L}. For standard results on rational transductions we refer to [1]. For a relation ρ ⊆ A × B let ρ(a) = {b ∈ B | (a, b) ∈ ρ} for a ∈ A and ρ(A′) = ⋃_{a∈A′} ρ(a) for A′ ⊆ A, so the set of descendants of a language L ⊆ Σ* modulo R is R*(L). The inverse of ρ is ρ⁻ = {(b, a) | (a, b) ∈ ρ} ⊆ B × A, and we say that ρ satisfies the property inverse P if ρ⁻ satisfies P. Define Inf(ρ) = {a ∈ A | ρ(a) is infinite}; the relation ρ is finitely branching if Inf(ρ) = ∅. For a relation ρ ⊆ Σ* × Σ* and a set ∆ ⊆ Σ, let ρ|_∆ denote ρ ∩ (∆* × ∆*). Note the difference between R*|_∆ and (R|_∆)* for an SRS R. E.g., for R = {a → b, b → c} over Σ = {a, b, c} and ∆ = {a, c} we have (a, c) ∈ R*|_∆, but (a, c) ∉ (R|_∆)*. A relation s ⊆ Σ* × Γ* is a substitution if s(ε) = {ε} and s(xy) = s(x)s(y) for x, y ∈ Σ*, so s is uniquely determined by the languages s(a) for a ∈ Σ. For a family of languages L over Γ, the substitution s is an L-substitution if s(a) ∈ L
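To make the distinction between R*|_∆ and (R|_∆)* concrete, here is a small illustrative sketch (plain Python, not part of the paper) that computes descendant sets by exhaustive closure; the helper names are ours, and the closure only terminates because this example system has finitely many descendants:

```python
def one_step(rules, s):
    # all strings obtained by rewriting one occurrence of a left hand side in s
    out = set()
    for l, r in rules:
        i = s.find(l)
        while i != -1:
            out.add(s[:i] + r + s[i + len(l):])
            i = s.find(l, i + 1)
    return out

def descendants(rules, language):
    # R*(L), computed by closure; terminates because this example is acyclic
    todo, seen = set(language), set(language)
    while todo:
        s = todo.pop()
        for t in one_step(rules, s):
            if t not in seen:
                seen.add(t)
                todo.add(t)
    return seen

R = [("a", "b"), ("b", "c")]
delta = set("ac")
R_delta = [(l, r) for l, r in R if set(l + r) <= delta]   # R|∆ is empty here
print("c" in descendants(R, {"a"}))        # True: (a, c) ∈ R*, hence in R*|∆
print("c" in descendants(R_delta, {"a"}))  # False: (a, c) ∉ (R|∆)*
```

Both rules of R mention the letter b ∉ ∆, so R|_∆ is empty and c is unreachable from a without leaving ∆*.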
for a ∈ Σ. For instance, if L is the family of finite (context-free) languages, then s is a finite (resp. context-free) substitution. If ε ∉ s(a) for every a ∈ Σ, then s is epsilon-free. Note that a finite substitution is finitely branching, and the same holds for the inverse of a finite and epsilon-free substitution. Now we recall definitions and results regarding deleting string rewriting systems. This topic originates with Hibbard [9]. A string rewriting system R over an alphabet Σ is >-deleting for a precedence > on Σ if ε ∉ lhs(R), and if for each rule ℓ → r in R and for each letter a in r, there is some letter b in ℓ with b > a. The system R is deleting if it is >-deleting for some precedence >. If R is deleting, then R is terminating, and R has linear derivational complexity. Theorem 1 ([10]). Let R be a deleting string rewriting system over Σ. Then there are an extended alphabet Γ ⊇ Σ, a finite substitution s ⊆ Σ* × Γ*, and a context-free string rewriting system C over Γ such that R* = (s ◦ C⁻*)|_Σ. This decomposition result implies that deleting systems effectively preserve regularity, and that inverse deleting systems effectively preserve context-freeness, the latter being already shown in [9].
3 Match-Bounded String Rewriting Systems
We will now apply the theory of deleting systems to obtain results for match-bounded rewriting. A derivation is match-bounded if dependencies between rule applications are limited. To make this precise, we annotate positions in strings with natural numbers that indicate their match height. Positions in a reduct get height h + 1 if the minimal height of all positions in the corresponding redex was h. Given an alphabet Σ, define the morphisms lift_c: Σ* → (Σ × ℕ)* for c ∈ ℕ by lift_c: a ↦ (a, c), base: (Σ × ℕ)* → Σ* by base: (a, c) ↦ a, and height: (Σ × ℕ)* → ℕ* by height: (a, c) ↦ c. For a string rewriting system R over Σ we define the rewriting system match(R) = {ℓ′ → lift_c(r) | (ℓ → r) ∈ R, base(ℓ′) = ℓ, c = 1 + min(height(ℓ′))} over the alphabet Σ × ℕ. For instance, the system match({ab → bc}) contains the rules {a0 b0 → b1 c1, a0 b1 → b1 c1, a1 b0 → b1 c1, a1 b1 → b2 c2, a0 b2 → b1 c1, ...}, writing x_h as an abbreviation for (x, h). Note that this is an infinite system. Every derivation modulo match(R) corresponds to a derivation modulo R (for x, y ∈ (Σ × ℕ)*, if x →_match(R) y then base(x) →_R base(y)), and vice versa (for v, w ∈ Σ* and x ∈ (Σ × ℕ)* with base(x) = v, if v →_R w, then there is y ∈ (Σ × ℕ)* such that base(y) = w and x →_match(R) y). Definition 1. A string rewriting system R over Σ is called match-bounded for L ⊆ Σ* by c ∈ ℕ if ε ∉ lhs(R) and max(height(x)) ≤ c for every x ∈ match(R)*(lift_0(L)). If we omit L, then it is understood that L = Σ*. Note that max(height(x)) (resp. min(height(ℓ′)) in the definition of match(R)) denotes the maximum (resp. minimum) of the corresponding set of heights. We set max(∅) = 0.
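As an illustration (not part of the paper's formalism), the finite fragments of match(R) up to a height bound can be enumerated directly; the sketch below reproduces some of the rules of match({ab → bc}) listed above, representing an annotated letter (x, h) as a Python pair:

```python
from itertools import product

def match_rules(R, c):
    # the fragment of match(R) whose rules use heights <= c only:
    # annotate the redex letters with heights 0..c and lift the
    # right hand side to 1 + the minimal redex height
    rules = []
    for l, r in R:
        for hs in product(range(c + 1), repeat=len(l)):
            h = 1 + min(hs)
            if h <= c:
                lhs = tuple(zip(l, hs))
                rhs = tuple((a, h) for a in r)
                rules.append((lhs, rhs))
    return rules

rules = match_rules([("ab", "bc")], 2)
print(len(rules))   # 8 rules with heights up to 2 (only a2 b2 is excluded)
```

For example, the rules a0 b0 → b1 c1 and a1 b1 → b2 c2 from the text both appear in the enumerated fragment.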
For those systems R that are indeed match-bounded, it is enough to consider finite restrictions of the infinite system match(R). Denote by match_c(R) the restriction of match(R) to the alphabet Σ × {0, 1, ..., c}. Proposition 1. If R is match-bounded by c, then R* = lift_0 ◦ match_c(R)* ◦ base. Proposition 2. For all R and all c ∈ ℕ, the system match_c(R) is deleting. Proof. Use the ordering > on Σ × {0, ..., c} where (a, m) > (b, n) iff m < n. (Letters of minimal match height are maximal in the deletion ordering.) Corollary 1. If R is match-bounded for L, then R is terminating on L. Proof. An infinite R-derivation can be transformed into an infinite match(R)-derivation which, given that R is match-bounded by c, is a match_c(R)-derivation. However, match_c(R) is terminating, since it is deleting. We conclude this section with a few examples of match-bounded rewriting systems; for non-examples, see Section 5. A large class of examples comes from the following observation, dual to Proposition 2: Proposition 3. If R is deleting, then R is match-bounded. Proof. Assume R over Σ is deleting for the ordering > on Σ. Then R is match-bounded by the maximal depth (i.e., length of a descending chain) in (Σ, >). Example 1. The system {ba → cb, bd → d, cd → de} is match-bounded by 2, since it is deleting for the ordering a > b > d, a > c > e, c > d. Example 2. The system {ab → bac} is match-bounded by 1, {ab → ac, ca → bc} is match-bounded by 2, and {ab → ac, ca → b} is match-bounded by 3. (None of these systems is deleting.) See Section 5 for verification of the bounds.
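The bound claimed for {ab → bac} in Example 2 can be spot-checked by brute force on short start strings: this system is finitely branching and terminating, so the match(R)-closure of each lifted start string is finite. A small sketch (helper names are ours):

```python
from itertools import product

def match_step(R, s):
    # one rewrite step of match(R) on an annotated string s = ((letter, h), ...)
    out = set()
    for l, r in R:
        for i in range(len(s) - len(l) + 1):
            if "".join(a for a, _ in s[i:i + len(l)]) == l:
                h = 1 + min(h for _, h in s[i:i + len(l)])
                out.add(s[:i] + tuple((a, h) for a in r) + s[i + len(l):])
    return out

def max_height(R, start):
    # explore the full match(R)-closure of lift0(start); finite for this example
    todo = {tuple((a, 0) for a in start)}
    seen, hi = set(todo), 0
    while todo:
        s = todo.pop()
        hi = max([hi] + [h for _, h in s])
        for t in match_step(R, s):
            if t not in seen:
                seen.add(t)
                todo.add(t)
    return hi

R = [("ab", "bac")]
print(max(max_height(R, "".join(w)) for n in range(6)
          for w in product("ab", repeat=n)))   # 1, as claimed in Example 2
```

This of course only checks start strings up to length 5; Section 5 gives the actual decision procedure for a given bound.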
4 Match-Bounded Systems Preserve Regularity
Theorem 2. If R is match-bounded, then R preserves regularity. Proof. By Theorem 1, deleting systems preserve regularity. By Proposition 1, all we need is two more rational transducers to do the encoding and decoding. Example 3. The system R = {ab → ba} on Σ = {a, b} is not regularity preserving, since R*((ab)*) ∩ a*b* = {aⁿbⁿ | n ≥ 0} is not regular. So Theorem 2 implies that R is not match-bounded. (See Example 5 for a more direct proof.) Example 4. Peg solitaire is a one-person game. The objective is to remove pegs from a board. A move consists of one peg X hopping over an adjacent peg Y, landing on the empty space on the other side of Y. After the hop, Y is removed. Peg solitaire on a one-dimensional board corresponds to the rewriting system P = {●●○ → ○○●, ○●● → ●○○},
where ● stands for "peg" and ○ for "empty". One is interested in the language of all positions that can be reduced to one single peg, which is P⁻*(○*●○*). Regularity of P⁻*(○*●○*) is a "folklore theorem"; see [16] for its history. The system P⁻ is match-bounded by 2, so we obtain yet another proof of that result. The automata A_c constructed for match_c(P⁻)*(○*●○*) have sizes 2, 14, and 30 for A_0, A_1, and A_2, respectively. Remark 1. Ravikumar [17] proved that P⁻ preserves regularity by considering the system's change-bound (which is 4). Change-boundedness is a concept strongly related to match-boundedness. Given a length-preserving string rewriting system R (viz. |ℓ| = |r| for every rule ℓ → r), define the system change(R) = {ℓ → r | (base(ℓ) → base(r)) ∈ R, height(succ(ℓ)) = height(r)} over the alphabet Σ × ℕ, where succ is the morphism succ: (Σ × ℕ)* → (Σ × ℕ)* induced by succ: (a, h) ↦ (a, h + 1). Ravikumar proves that if change(R) has bounded height, then R preserves regularity. Our results both generalize and strengthen this, the main improvement being that the definition of match also applies to systems that are not length-preserving. For length-preserving systems, match(R) always gives lower or equal heights, so our result implies Ravikumar's.
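The hop moves of one-dimensional peg solitaire can be explored exhaustively, since they preserve string length. The sketch below is illustrative only; it writes a peg as 'x' and an empty hole as 'o', and assumes the two hop rules peg-peg-empty → empty-empty-peg and its mirror image, which is our reading of Example 4:

```python
# Pegs are 'x', empty holes 'o'; the two hop rules of the game (our reading).
P = [("xxo", "oox"), ("oxx", "xoo")]

def solvable(s):
    # search: can s be rewritten modulo P to a position with a single peg?
    todo, seen = {s}, {s}
    while todo:
        t = todo.pop()
        if t.count("x") == 1:
            return True
        for l, r in P:
            i = t.find(l)
            while i != -1:
                u = t[:i] + r + t[i + 3:]
                if u not in seen:
                    seen.add(u)
                    todo.add(u)
                i = t.find(l, i + 1)
    return False

print(solvable("xxo"))    # True: a single hop wins, xxo -> oox
print(solvable("xxxo"))   # False: the only move leaves two isolated pegs
```

Since moves preserve length, the search space for a fixed board is finite; the point of the match-bound argument above is that the set of all solvable positions, over all lengths, is regular.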
5 Verification and Refutation of Match-Bounds
Theorem 3. The following problem is decidable: Given: a string rewriting system R, a regular language L, and c ∈ ℕ. Question: Is R match-bounded by c for L? Proof. Construct (a finite automaton for) L_c = (lift_0 ◦ match_c(R)*)(L), using Proposition 1. Then decide whether L_c contains a string x that has a factor lift_c(ℓ), for some rule ℓ → r in R. If this is not the case, then L_c = L_{c+1} = ··· and R is match-bounded by c for L. Otherwise, we have found a "high redex" in x, thus there is a string y with x →_match(R) y and max(height(y)) = c + 1. For an implementation, the enormous growth of |match_c(R)| as a function of c is problematic. If we are computing match_c(R)*(lift_0(L)), then we should restrict attention to those rules of match_c(R) that are accessible in derivations starting from lift_0(L). For a language L ⊆ Σ*, a system R over Σ, and a system S ⊆ match(R), define accessible(L, R, S) = match(R) ∩ (factor(S*(lift_0(L))) × (Σ × ℕ)*). Note that this construction is effective if a finite system S and a regular language L are effectively given. We construct a sequence of rewriting systems R_i by R_0 = ∅ and R_{i+1} = accessible(L, R, R_i). Induction on i shows R_i ⊆ match_i(R) for i ≥ 0. In particular, every system R_i is finite. By induction on i, using
monotonicity of S ↦ accessible(L, R, S), one also proves that R_i ⊆ R_{i+1}. Define R_∞ = ⋃_{i∈ℕ} R_i. Clearly, R_∞*(lift_0(L)) = match(R)*(lift_0(L)). If R is match-bounded by c, then R_∞ is a subset of match_c(R); so R_∞ is finite, and there is an index N such that R_N = R_{N+1} = ···. If R is not match-bounded, then R_∞ contains a rule of height c for each c, and is thus infinite. The enumeration of R_i up to i = |match_c(R)| + 1 can be used as an alternative decision procedure for Theorem 3. In some cases we can also verify automatically that a given rewriting system R is not match-bounded for a language L. For this purpose, we try to find a self-embedding set of witnesses, as follows. The set raised_c(R, L) consists of all strings that occur as the base of a "high factor" (with all positions of height > 0) of a string that is reachable by a match_c(R)-derivation starting from lift_0(L): raised_c(R, L) = base(factor(match_c(R)*(lift_0(L))) ∩ (Σ × {1, 2, ...})*). First we observe that a match(R)-derivation can be raised to larger heights. For u′, u ∈ (Σ × ℕ)* we write u′ ≥ u if base(u′) = base(u) and height(u′) ≥_n height(u), where ≥_n denotes the pointwise greater-or-equal ordering on ℕⁿ. Lemma 1. If u′ ≥ u →_match(R) v, then u′ →_match(R) v′ ≥ v for some string v′. Proposition 4. Let R be a string rewriting system and let L be a language, both over Σ. If there are c ∈ ℕ and a language W ⊆ L ∩ raised_c(R, W) with W ⊈ {ε}, then R is not match-bounded for L. Proof. We call u ∈ Σ* a witness for height h if there is a match(R)-derivation from lift_0(u) to some string in (Σ × ℕ)* that contains at least one position of height ≥ h. We will show that for each h ∈ ℕ, there is some witness u ∈ W for height h. For h = 0, there is nothing to prove. By induction, assume u ∈ W is a witness for height h. Since W ⊆ raised_c(R, W), there is some v ∈ W such that u ∈ raised_c(R, {v}). We claim that v is a witness for height h + 1.
By the definition of raised_c, there is a match(R)-derivation D from lift_0(v) to some string xu′y with base(u′) = u and min(height(u′)) ≥ 1. Since u is a witness for h, there is a match(R)-derivation E from lift_0(u) to some word w with maximum height ≥ h. This derivation can be relabelled to a derivation from succ(lift_0(u)) = lift_1(u) to succ(w), where succ is the morphism defined in Section 4 that increases the height of each position by 1. By Lemma 1 and u′ ≥ lift_1(u), this derivation can be raised to a derivation E′: u′ →* w′ for some string w′ ≥ succ(w). Now, D and E′ can be combined to lift_0(v) →* xu′y →* xw′y, such that max(height(w′)) ≥ h + 1. Note that the condition in Proposition 4 can be effectively checked if a finite SRS R, a number c ∈ ℕ, and regular languages W and L are effectively given. Example 5. The system R = {ab → ba} (cf. Example 3) is not match-bounded for Σ*. Take W = (ab)⁺. Then raised_1(R, W) = factor((ba)⁺) ⊇ W. Example 6. Neither is R = {aabb → ba} match-bounded, as witnessed by W = {a, b}* = raised_1(R, W). See Example 8 for a similar system with different behaviour.
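The sets raised_c can be explored mechanically on bounded instances. The sketch below (illustrative only; all names are ours) computes, for R = {ab → ba}, the bases of high factors reachable from a lifted start string, confirming that ab ∈ raised_1(R, {abab}) as in Example 5:

```python
def matchc_step(R, s, c):
    # one match(R) step, keeping only rules of height <= c, i.e. match_c(R)
    out = set()
    for l, r in R:
        for i in range(len(s) - len(l) + 1):
            if "".join(a for a, _ in s[i:i + len(l)]) == l:
                h = 1 + min(h for _, h in s[i:i + len(l)])
                if h <= c:
                    out.add(s[:i] + tuple((a, h) for a in r) + s[i + len(l):])
    return out

def raised_bases(R, w, c):
    # bases of non-empty factors with all heights >= 1, over every string
    # reachable by a match_c(R)-derivation from lift0(w)
    todo = {tuple((a, 0) for a in w)}
    seen, bases = set(todo), set()
    while todo:
        s = todo.pop()
        for i in range(len(s)):
            for j in range(i + 1, len(s) + 1):
                if min(h for _, h in s[i:j]) >= 1:
                    bases.add("".join(a for a, _ in s[i:j]))
        for t in matchc_step(R, s, c):
            if t not in seen:
                seen.add(t)
                todo.add(t)
    return bases

R = [("ab", "ba")]
print("ab" in raised_bases(R, "abab", 1))   # True: W = (ab)+ re-embeds
print("ab" in raised_bases(R, "ab", 1))     # False: one redex is not enough
```

Capping the heights at c keeps the annotated alphabet, and hence the closure, finite; Proposition 4 then turns this re-embedding of W into a refutation of match-boundedness.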
We have implemented the algorithms according to Theorem 3 and Proposition 4; see http://theo1.informatik.uni-leipzig.de/~joe/bounded/.
6 Deciding Termination for Inverse Deleting and Inverse Match-Bounded Systems
In this section, we will prove that termination is decidable for inverse deleting string rewriting systems, and conclude that the same holds for inverse match-bounded systems. Lemma 2. Let s ⊆ Σ* × Γ* be a substitution, and let K be a regular language over Γ. Then Inf(s ∩ (Σ* × K)) is regular. Proof. Consider a finite automaton A with state set Q that accepts K. Denote by L(A, p, q) the set of strings x for which there is a path from p to q in A labelled x. We define an automaton B over the alphabet Σ × {F, I} as follows. The sets of states, initial states, and final states of B and A coincide. For p, q ∈ Q and a ∈ Σ, B contains the transition
– p –(a,I)→ q iff the language s(a) ∩ L(A, p, q) is infinite,
– p –(a,F)→ q iff the language s(a) ∩ L(A, p, q) is finite and non-empty.
We claim that a_1...a_n ∈ Inf(s ∩ (Σ* × K)) for a_i ∈ Σ if and only if there is an accepting path in B that is labelled by (a_1, b_1)...(a_n, b_n) where at least one b_i equals I. Therefore, Inf(s ∩ (Σ* × K)) = π(L(B) \ (Σ × {F})*), where π: (Σ × {I, F})* → Σ* is the morphism induced by π: (a, b) ↦ a. Lemma 3. Let Σ, Σ_0, Γ, Γ_0 be alphabets, let s ⊆ Σ* × Γ* be a substitution, and let T_1 ⊆ Σ_0* × Σ* and T_2 ⊆ Γ* × Γ_0* be finitely branching rational transductions. Then Inf(T_1 ◦ s ◦ T_2) is regular. Proof. By Lemma 2, since Inf(T_1 ◦ s ◦ T_2) = T_1⁻(Inf(s ∩ (Σ* × T_2⁻(Γ_0*)))).
Remark 2. The regularity results in Lemma 2 and Lemma 3 are effective if s is an L-substitution for a family L of languages that is closed under intersection with regular sets and for which emptiness and finiteness are decidable. This is the case, e.g., for the family of context-free languages, as in the proof of Proposition 5 below. Proposition 5. For an inverse deleting SRS R, Inf(R*) is effectively regular. Proof. Let R be a system over an alphabet Σ such that R⁻ is deleting. First we exclude some trivial cases. Since R is inverse deleting, we have ε ∉ rhs(R). And if R contains a rule ε → r with r ≠ ε, then Inf(R*) = Σ*. So from now on, we may assume ε ∉ lhs(R) ∪ rhs(R). By Theorem 1 we have R⁻* = (s ◦ C⁻*) ∩ (Σ* × Σ*), where s ⊆ Σ* × Γ* is a finite substitution into an extended alphabet Γ ⊇ Σ, and C is a context-free
rewriting system over Γ. Reviewing the construction in [10], we find that no ε can occur on either side of s and C, so C⁻* is a context-free substitution c ⊆ Γ* × Γ*, and s⁻ ⊆ Γ* × Σ* is the inverse of a finite and epsilon-free substitution. We have R* = e ◦ c ◦ s⁻, where e ⊆ Σ* × Γ* is the embedding of Σ* into Γ*, therefore the claim follows by Lemma 3 and Remark 2. Theorem 4. The following problem is decidable: Given: a regular language L over Σ and an inverse deleting SRS R over Σ. Question: Is there an infinite R-derivation starting from a string in L? Proof. A finitely branching binary relation ρ is well-founded if and only if ρ* is finitely branching and ρ⁺ is irreflexive. Note that if R⁻ is deleting then R⁻⁺ is well-founded, hence irreflexive. So there is an infinite R-derivation starting from a string in L if, and only if, Inf(R*) ∩ L ≠ ∅. By Proposition 5, Inf(R*) is regular, so emptiness of Inf(R*) ∩ L is decidable. Corollary 2. Termination and uniform termination are decidable for inverse deleting string rewriting systems. Proof. Choose L = {x} to decide whether there is an infinite derivation starting with the string x, and choose L = Σ* to decide uniform termination. Example 7. McNaughton [13] proves decidability of termination and of uniform termination for the following class of string rewriting systems: a system R is called an inhibitor system if there is a letter a ∉ Σ such that ℓ ∈ Σ⁺ and r ∈ (Σ ∪ {a})* \ Σ* for every rule ℓ → r in R. (Inhibitor systems play a vital role in solving the uniform termination problem of well-behaved SRSs [13].) We can give an alternative proof by observing that an inhibitor system R is inverse deleting for the ordering that makes a greater than every other letter. Hence decidability of (uniform) termination follows from Corollary 2. As a bonus, we get context-freeness of R*(x) for x ∈ Σ*, a result by Ginsburg and Greibach [8].
This shows once more that language classes and the uniform termination problem are intrinsically related. Theorem 5. Termination and uniform termination are decidable for string rewriting systems R for which R⁻ is match-bounded. Proof. Assume R⁻ is match-bounded by c. Then each derivation modulo R corresponds to a derivation modulo S := match_c(R⁻)⁻, by the remark before Definition 1. So termination of R and of S coincide. By Proposition 2, S is an inverse deleting system, and by Corollary 2, (uniform) termination of S is decidable. Example 8. Proving termination of the one-rule system Z = {aabb → bbbaaa} is known as Zantema's Problem. This is a "modern classic" in rewriting [3,4,12,19,20,22], as it provides a test case where most of the automated methods for termination proofs fail. The match-bound of Z⁻ is 2, therefore termination can be mechanically verified. (Recall that the fact that Z⁻ is match-bounded is in itself not a proof of termination for Z.) The computation of match(Z⁻)*(Σ*) according to Section 5 takes five iterations (i.e.,
the systems stabilize with (Z⁻)_4 = (Z⁻)_5 = (Z⁻)_6 = ···). In our implementation (Haskell code compiled with ghc-5.04.2), this takes about 70 CPU seconds on a 2.4 GHz Pentium. The resulting automaton has 199 states. The intermediate constructions according to Theorem 1 involve much larger automata (up to 1576 states with 15999 transitions), over much larger alphabets (up to 283 letters).
7 Discussion
If the flow of information during rewriting is suitably restricted, nice properties hold: termination, bounded derivational complexity, or preservation of regular languages. For instance, McNaughton [13] and, independently, Ferreira and Zantema [5] use extra letters to indicate the absence of information flow through certain positions. Kobayashi et al. [11] restrict derivations by using markers for the start and the end of a redex. Sénizergues [19] constructs finite automata to solve the termination problem for certain one-rule string rewriting systems. Moczydlowski and Geser [14,15] restrict the way the right hand side of a rule may be consumed in order to simulate the rewrite relation by the computation of a pushdown automaton. Our concepts of deleting and match-bounded rewriting aim at extending these approaches to a systematic theory of termination by language properties. The concept of match-bounded string rewriting opens two novel approaches to automated termination proofs: match-bounded systems are terminating, and for inverse match-bounded systems, termination is decidable. These methods can be further strengthened by considering match-boundedness not for all strings over the respective alphabet, but only for suitably chosen subsets. As we have demonstrated elsewhere [7], the right hand sides of forward closures are one such suitable subset. We expect these powerful tools to enable some major progress on the problem of deciding uniform termination for one-rule string rewriting systems, an open problem for 13 years [12]; see [18, Problem 21]. Single-player games like peg solitaire can be analyzed through the construction of reachability sets. It is very challenging to extend this approach to two-player rewriting games [21]. Instead of termination (which is required anyway for a well-defined game), for instance, one would like to know whether winning sets are regular. Even the impartial case is hard; here the central question is whether Grundy values are bounded.
It seems natural to carry over the notion of match-boundedness to term rewriting, in order to obtain both closure properties and new automated termination proof methods.
Acknowledgements This research was supported in part by the National Aeronautics and Space Administration (NASA) while the last two authors were visiting scientists at the Institute for Computer Applications in Science and Engineering (ICASE), NASA Langley Research Center (LaRC), Hampton, VA, in September 2002.
References
1. J. Berstel. Transductions and Context-Free Languages. Teubner, Stuttgart, 1979.
2. R. V. Book and F. Otto. String-Rewriting Systems. Texts and Monographs in Computer Science. Springer-Verlag, New York, 1993.
3. T. Coquand and H. Persson. A proof-theoretical investigation of Zantema's problem. In M. Nielsen and W. Thomas (Eds.), 11th Annual Conf. of the EACSL CSL-97, Lect. Notes Comp. Sci. Vol. 1414, pp. 177–188. Springer-Verlag, 1998.
4. N. Dershowitz and C. Hoot. Topics in termination. In C. Kirchner (Ed.), Proc. 5th Int. Conf. Rewriting Techniques and Applications RTA-93, Lect. Notes Comp. Sci. Vol. 690, pp. 198–212. Springer-Verlag, 1993.
5. M. C. F. Ferreira and H. Zantema. Dummy elimination: Making termination easier. In H. Reichel (Ed.), 10th Int. Symp. Fundamentals of Computation Theory FCT-95, Lect. Notes Comp. Sci. Vol. 965, pp. 243–252. Springer-Verlag, 1995.
6. T. Genet and F. Klay. Rewriting for Cryptographic Protocol Verification. In D. A. McAllester (Ed.), 17th Int. Conf. Automated Deduction CADE-17, Lect. Notes Artificial Intelligence Vol. 1831, pp. 271–290. Springer-Verlag, 2000.
7. A. Geser, D. Hofbauer, and J. Waldmann. Match-bounded string rewriting systems and automated termination proofs. 6th Int. Workshop on Termination WST-03, Valencia, Spain, 2003.
8. S. Ginsburg and S. A. Greibach. Mappings which preserve context sensitive languages. Inform. and Control, 9(6):563–582, 1966.
9. T. N. Hibbard. Context-limited grammars. J. ACM, 21(3):446–453, 1974.
10. D. Hofbauer and J. Waldmann. Deleting string rewriting systems preserve regularity. In Proc. 7th Int. Conf. Developments in Language Theory DLT-03, Lect. Notes Comp. Sci., Springer-Verlag, 2003. To appear.
11. Y. Kobayashi, M. Katsura, and K. Shikishima-Tsuji. Termination and derivational complexity of confluent one-rule string-rewriting systems. Theoret. Comput. Sci., 262(1-2):583–632, 2001.
12. W. Kurth. Termination und Konfluenz von Semi-Thue-Systemen mit nur einer Regel. Dissertation, Technische Universität Clausthal, Germany, 1990.
13. R. McNaughton. Semi-Thue systems with an inhibitor. J. Automat. Reason., 26:409–431, 2001.
14. W. Moczydlowski Jr. Jednoregułowe systemy przepisywania słów. Masters thesis, Warsaw University, Poland, 2002.
15. W. Moczydlowski Jr. and A. Geser. Termination of single-threaded one-rule Semi-Thue systems. Technical Report TR 02-08 (273), Warsaw University, Dec. 2002. Available at http://research.nianet.org/~geser/papers/single.html.
16. C. Moore and D. Eppstein. One-dimensional peg solitaire, and duotaire. In R. J. Nowakowski (Ed.), More Games of No Chance, Cambridge Univ. Press, 2003.
17. B. Ravikumar. Peg-solitaire, string rewriting systems and finite automata. In H.-W. Leong, H. Imai, and S. Jain (Eds.), Proc. 8th Int. Symp. Algorithms and Computation ISAAC-97, Lect. Notes Comp. Sci. Vol. 1350, pp. 233–242. Springer-Verlag, 1997.
18. The RTA list of open problems. http://www.lsv.ens-cachan.fr/rtaloop/.
19. G. Sénizergues. On the termination problem for one-rule semi-Thue systems. In H. Ganzinger (Ed.), Proc. 7th Int. Conf. Rewriting Techniques and Applications RTA-96, Lect. Notes Comp. Sci. Vol. 1103, pp. 302–316. Springer-Verlag, 1996.
20. E. Tahhan Bittar. Complexité linéaire du problème de Zantema. C. R. Acad. Sci. Paris Sér. I Inform. Théor., t. 323:1201–1206, 1996.
21. J. Waldmann. Rewrite games. In S. Tison (Ed.), Proc. 13th Int. Conf. Rewriting Techniques and Applications RTA-02, Lect. Notes Comp. Sci. Vol. 2378, pp. 144–158. Springer-Verlag, 2002.
22. H. Zantema and A. Geser. A complete characterization of termination of 0^p 1^q → 1^r 0^s. Appl. Algebra Engrg. Comm. Comput., 11(1):1–25, 2000.
Probabilistic and Nondeterministic Unary Automata
Gregor Gramlich
Institut für Informatik, Johann Wolfgang Goethe-Universität Frankfurt
Robert-Mayer-Straße 11-15, 60054 Frankfurt am Main, Germany
[email protected]
Fax: +49-69-798-28814
Abstract. We investigate unary regular languages and compare deterministic finite automata (DFA's), nondeterministic finite automata (NFA's) and probabilistic finite automata (PFA's) with respect to their size. Given a unary PFA with n states and an ε-isolated cutpoint, we show that the minimal equivalent DFA has at most n^(1/(2ε)) states in its cycle. This result is almost optimal, since for any α < 1 a family of PFA's can be constructed such that every equivalent DFA has at least n^(α/(2ε)) states. Thus we show that for the model of probabilistic automata with a constant error bound, there is only a polynomial blowup for cyclic languages. Given a unary NFA with n states, we show that efficiently approximating the size of a minimal equivalent NFA within the factor √n/ln n is impossible unless P = NP. This result even holds under the promise that the accepted language is cyclic. On the other hand, we show that we can approximate a minimal NFA within the factor ln n if we are given a cyclic unary n-state DFA.
1 Introduction
Regular languages, and finite state automata as their acceptance devices, are well studied objects. We consider DFA's, NFA's and PFA's with isolated cutpoint and compare their sizes. For an n-state PFA with ε-isolated cutpoint, the equivalent DFA needs at most (1 + 1/(2ε))^(n−1) states [9]. For a unary alphabet, Milani and Pighizzini [8] show the tight bound of Θ(e^√(n ln n)) for the number of states in the cycle of the minimal DFA. This result does not depend on the size of the isolation, and the proof of the lower bound actually relies on an isolation that tends to zero. We show that the isolation plays a crucial role, namely that L can be accepted by a DFA with at most n^(1/(2ε)) states in its cycle. Thus, for constant isolation ε, we improve the upper bound of Milani and Pighizzini to a polynomial in n.
Partially supported by DFG project SCHN503/2-1
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 460–469, 2003. © Springer-Verlag Berlin Heidelberg 2003
The minimization problem for DFA's can be solved efficiently. But for a given DFA, the problem of determining the minimal number of states of an equivalent NFA is PSPACE-complete [5]. A result of Stockmeyer and Meyer [10] shows that the problem of minimizing a given NFA is PSPACE-complete for a binary alphabet and NP-complete for a unary alphabet. We show that, given an n-state NFA accepting L, it is impossible to efficiently approximate the number of states of a minimal NFA accepting L within a factor of √n/ln n unless P = NP. This result holds even under the promise that L is a unary cyclic language, and it can be extended to PFA's with isolated cutpoint. On the other hand, we show that if we are given a unary cyclic n-state DFA accepting L, then we can efficiently construct an equivalent NFA with at most k · (1 + ln n) states, where k is the number of states of a minimal NFA accepting L. This contrasts with a result of Jiang et al. [4], who show that the number of states of a minimal NFA equivalent to a given unary DFA cannot be computed in polynomial time unless NP ⊆ DTIME(n^O(ln n)). This result even holds if we restrict the DFA to accept only cyclic languages. The next section gives a short introduction to unary NFA's and unary PFA's. Unary PFA's with ε-isolated cutpoint, resp. unary NFA's, are investigated in Sections 3 and 4, respectively.
2 Preliminaries
We consider unary languages L ⊆ {a}*. A unary regular language is recognized by a DFA that starts with a possibly empty path and ends in a non-empty cycle. A language L is ultimately d-cyclic if there is a μ ∈ ℕ₀ such that (a^j ∈ L ⇔ a^{j+d} ∈ L) holds for every j ≥ μ, and we say that d is an ultimate period of L. A smallest ultimate period is called the minimal ultimate period c(L), and every ultimate period is a multiple of the minimal ultimate period. L is called cyclic if the path of the minimal DFA for L is empty. For cyclic languages we use the term period instead of ultimate period, and d-cyclic (resp. minimally d-cyclic) instead of ultimately d-cyclic (resp. minimally ultimately d-cyclic). The size of an automaton A is the number of states of A. For a given regular language L, we use nsize(L) for the minimal size of an NFA accepting L. A normal form for unary NFA's is established by Chrobak in [1]. His construction converts a given NFA N with n states into an equivalent NFA N′ consisting of a deterministic path and several deterministic cycles. Only the last state of the path branches nondeterministically into one state of each cycle. The path of N′ has length O(n²), and the number of all states in the cycles is bounded by n. Chrobak proves that L(N′) is ultimately d-cyclic, where d is the least common multiple of the lengths of the cycles in N′. For cyclic languages we introduce union automata as automata in Chrobak normal form with an empty path. Definition 1. A union automaton U is described by a collection (A_1, ..., A_k) of cyclic DFA's. U accepts an input w iff there is an A_i such that A_i accepts w. The size of U is defined as Σ_{i=1}^k s_i, where s_i is the number of states of A_i.
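Since every period of a cyclic language is a multiple of the minimal one, the minimal period c(L) of a cyclic language can be found by checking the divisors of the cycle length; a minimal sketch (function name is ours, not from the paper), taking one full cycle of the characteristic sequence as input:

```python
def minimal_period(bits):
    # bits[j] == 1 iff a^j ∈ L, for one full cycle of a cyclic language L;
    # the minimal period divides the cycle length, so only divisors are tried
    n = len(bits)
    for d in range(1, n + 1):
        if n % d == 0 and all(bits[j] == bits[(j + d) % n] for j in range(n)):
            return d
    return n

print(minimal_period([1, 0, 1, 0, 1, 0]))  # 2
print(minimal_period([1, 1, 0, 1, 1, 0]))  # 3
```
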
To convert a union automaton U into an NFA with a single initial state, we simply add one state q₀ and transitions from q₀ to each state that succeeds an initial state of one of the deterministic automata that U consists of. Jiang, McDowell and Ravikumar [4] show a structural result about minimal unary NFA's accepting cyclic languages. Fact 1. [4] Let L be a minimally D-cyclic unary language. Every minimal NFA accepting L can be obtained by converting some minimal union automaton U accepting L into an NFA. Moreover, D is the least common multiple of the cycle lengths of U. Consider the prime factorization D = p₁^{α₁} · ... · p_r^{α_r}, where the p_i are distinct primes and α_i ∈ ℕ; then every NFA accepting L has at least p₁^{α₁} + ... + p_r^{α_r} states. This result offers some clues about the composition of the (ultimate) period of a unary language, which also apply to probabilistic finite automata, defined as follows. A unary PFA M with a set Q of n states is described by a stochastic n × n matrix A, a stochastic row vector π representing the initial distribution, and a column vector η ∈ {0, 1}ⁿ indicating the final states. Observe that πA^j η is the acceptance probability for input a^j. The language accepted by M with respect to a cutpoint λ ∈ [0, 1] is L(M, λ) = {a^j | πA^j η > λ}. We call the cutpoint λ ε-isolated if for every j ∈ ℕ₀: |πA^j η − λ| ≥ ε. We call a cutpoint isolated if there is an ε > 0 such that it is ε-isolated. We regard A as the stochastic matrix of a finite Markov chain M, with rows and columns indexed by states, and consider the representation of M as a directed graph G_A = (V, E) with V = Q. An arc from state q to state p exists in G_A if A_{p,q} > 0. We call a strongly connected component B ⊆ Q of G_A ergodic¹ if, starting in any state q ∈ B, we cannot reach any state outside of B. States within an ergodic component are called ergodic states; non-ergodic states are called transient.
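Fact 1's lower bound can be computed directly from the prime factorization of the minimal period D; a small sketch (function name is ours), summing the maximal prime power for each prime dividing D:

```python
def nfa_lower_bound(D):
    # sum of the maximal prime powers dividing D: by Fact 1, every NFA for a
    # minimally D-cyclic language has at least this many states
    total, d = 0, 2
    while d * d <= D:
        if D % d == 0:
            pk = 1
            while D % d == 0:
                D //= d
                pk *= d
            total += pk
        d += 1
    if D > 1:
        total += D   # remaining factor is a prime power p^1
    return total

print(nfa_lower_bound(12))   # 4 + 3 = 7
print(nfa_lower_bound(30))   # 2 + 3 + 5 = 10
```

This also explains why small NFA's favor periods built from several small distinct primes: the lcm D can be huge while the sum of prime powers stays small.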
For an ergodic component B, the period of q ∈ B is defined as dq = gcd{j| starting in q one can reach q with exactly j steps}. All states q ∈ B have the same period d = dq , which we call the period of B. Factorization and primality play an important role for (ultimate) periods. To estimate the size of the i-th prime number we use the following fact. Fact 2. [3] If pi is the i-th prime number, then i ln i ≤ pi ≤ 2i ln i for i ≥ 3.
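The period of an ergodic component can be computed by a standard search-tree argument: assign each state a level along a spanning tree from an arbitrary start state; the period is the gcd of level[u] + 1 − level[v] over all internal arcs u → v. This is our own sketch, not a construction from the paper:

```python
from math import gcd

def component_period(adj, comp):
    """Period of a strongly connected component `comp` of the digraph `adj`:
    gcd of level[u] + 1 - level[v] over all arcs u -> v inside the component."""
    start = next(iter(comp))
    level = {start: 0}
    queue, d = [start], 0
    while queue:
        u = queue.pop()
        for v in adj[u]:
            if v not in comp:
                continue
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
            d = gcd(d, level[u] + 1 - level[v])
    return d

# a pure 3-cycle has period 3
assert component_period({0: [1], 1: [2], 2: [0]}, {0, 1, 2}) == 3
```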
3
Unary PFA’s with ε-Isolated Cutpoint
In [8] Milani and Pighizzini show that the ergodic components of a unary PFA with isolated cutpoint basically play the same role as the cycles of an NFA in Chrobak normal form. The least common multiple D of the periods of these components is an ultimate period of the language L(M, λ) accepted by the PFA. This result does not take the isolation into account and yields an exponential upper bound for the ultimate period, namely c(L(M, λ)) = O(e^√(n ln n)), where n
¹ Unlike some authors we do not require an ergodic component to be aperiodic.
Probabilistic and Nondeterministic Unary Automata
is the number of states in the PFA. We show that the ultimate period c(L(M, λ)) decreases significantly with increasing isolation, and this results in a polynomial upper bound for c(L(M, λ)) if ε is a constant. As a first step, Lemma 1 shows that the period di of an ergodic component Bi with absorption probability ri < 2ε, where

ri := lim_{t→∞} Σ_{p∈Bi} (πA^t)_p = prob(a random walk is eventually absorbed into Bi),

does not play a role for c(L(M, λ)); neither do the periods of collections of ergodic components with small combined absorption probability.

Lemma 1. Let B1, . . . , Bm be the ergodic components of a Markov chain with periods di and absorption probabilities ri, respectively. If the corresponding PFA M accepts L := L(M, λ) with ε-isolated cutpoint, then for any I ⊆ {1, . . . , m} with Σ_{i∈I} ri > 1 − 2ε, D(I) := lcm{di | i ∈ I} is an ultimate period of L and thus a multiple of c(L).

Proof (Sketch). For an ultimate period D of L the limit A^∞ := lim_{t→∞} (A^D)^t exists, where we require convergence in each entry of the matrix. This can be shown by bringing the matrix A into a normal form (see Gantmacher [2]), so that the stochastic submatrix Ai for each ergodic component Bi forms a block within A. If Bi has period di, then lim_{t→∞} (Ai^{di})^t exists. Since D is a multiple of every di, the limit of (A^D)^t exists. As a consequence of [8] and of the existence of this limit, for every δ there must be a µδ ∈ IN, such that for every j ≥ µδ, a^j ∈ L ⇔ a^{j+D} ∈ L and

Σ_{q∈Q} |(πA^j)_q − (πA^{j mod D} A^∞)_q| < δ.
Let I ⊆ {1, . . . , m} be a set of indices with Σ_{i∈I} ri > 1 − 2ε. Assume that D(I) is not an ultimate period of L. Then there is some j > µδ with a^j ∈ L and a^{j+D(I)} ∉ L. So πA^j η ≥ λ + ε and πA^{j+D(I)} η ≤ λ − ε, and thus π(A^j − A^{j+D(I)}) η ≥ 2ε. Let (x)⁺ = x if x > 0, and let (x)⁺ = 0 otherwise. Remember that η ∈ {0, 1}^n. Then, with QI := ∪_{i∈I} Bi ∪ {q | q transient}, we have

2ε ≤ Σ_{q∈Q} (π(A^j − A^{j+D(I)}))⁺_q ≤ Σ_{q∈QI} (π(A^j − A^{j+D(I)}))⁺_q + Σ_{q∉QI} (π(A^j − A^{j+D(I)}))⁺_q .   (1)

The proof of the existence of A^∞ also shows that if we restrict the matrix A to all the states in QI and call the resulting substochastic matrix AI, then the limit lim_{t→∞} (AI^{D(I)})^t exists as well. And so, for δ = 2ε − Σ_{i∉I} ri and for any j ≥ µδ, we get

Σ_{q∈QI} (π(A^j − A^{j+D(I)}))⁺_q < δ .   (2)
But on the other hand, for any j ≥ 0,

Σ_{q∉QI} (π(A^j − A^{j+D(I)}))⁺_q ≤ Σ_{q∉QI} (πA^j)_q ≤ Σ_{i∉I} ri = 2ε − δ .   (3)
The second inequality follows, since the absorption probability is the limit of a monotonically increasing sequence. So we have reached a contradiction, since the sum of (3) and (2) does not satisfy (1).

We can now exclude some prime powers as potential divisors of c(L(M, λ)).

Definition 2. Let M be a PFA with ergodic periods di and absorption probabilities ri. We call a prime power q = p^s ε-essential (for M), if

Σ_{i: q divides di} ri ≥ 2ε   and   Σ_{i: q·p divides di} ri < 2ε.
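Definition 2 translates directly into a small computation. The sketch below (helper names are ours; we assume the periods and absorption probabilities are given explicitly) enumerates the prime powers dividing some period and multiplies together the ε-essential ones:

```python
def prime_powers(ds):
    """All prime powers (p, p^s) dividing some period in ds."""
    out = set()
    for d in ds:
        n, p = d, 2
        while p * p <= n:
            if n % p == 0:
                q = 1
                while n % p == 0:
                    n //= p
                    q *= p
                pw = p
                while pw <= q:
                    out.add((p, pw))
                    pw *= p
            p += 1
        if n > 1:
            out.add((n, n))
    return out

def essential_period(ds, rs, eps):
    """Product of the eps-essential prime powers (the D of Lemma 2)."""
    D = 1
    for p, q in prime_powers(ds):
        big = sum(r for d, r in zip(ds, rs) if d % q == 0)
        small = sum(r for d, r in zip(ds, rs) if d % (q * p) == 0)
        if big >= 2 * eps and small < 2 * eps:
            D *= q
    return D

assert essential_period([4, 6], [0.5, 0.5], 0.2) == 12   # D = lcm(4, 6)
assert essential_period([4, 6], [0.5, 0.5], 0.3) == 2    # stronger isolation, smaller D
```

For periods 4 and 6 with equal absorption probability, isolation ε = 0.2 keeps the full period 12, while ε = 0.3 shrinks the guaranteed ultimate period to 2, illustrating how the period decreases with increasing isolation.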
Lemma 2. If λ is ε-isolated for a PFA M, then

D = Π_{q is ε-essential} q

is an ultimate period of L = L(M, λ). Hence D is a multiple of c(L).

Proof. Assume that c(L) is a multiple of a prime power p^k which does not divide any ε-essential prime power. Let J = {i | p^k divides di}, and let I = {1, . . . , m} \ J be the complement of J. Then p^k does not divide any di with i ∈ I, and thus p^k does not divide D(I) = lcm{di | i ∈ I}. Since p^k does not divide any ε-essential prime power, we have that Σ_{i∈J} ri < 2ε, and so Σ_{i∈I} ri > 1 − 2ε. According to Lemma 1, D(I) is a multiple of c(L). But on the other hand D(I) is not a multiple of p^k. This is a contradiction, since p^k was assumed to divide c(L).

Now we show the tight upper bound for the minimal ultimate period of a language accepted by an ε-isolated PFA.

Theorem 1. a) For any unary PFA M with n states and ε-isolated cutpoint λ,

c(L(M, λ)) ≤ n^{1/(2ε)}.

b) For any 0 ≤ α < 1 and any ε = 1/(2m) with m ∈ IN, there is a PFA M with n states and ε-isolated cutpoint λ, such that c(L(M, λ)) > n^{α/(2ε)}.
Proof. a) Let M have m ergodic components with periods d1, . . . , dm. Set D := Π_{q is ε-essential} q and remember that Σ_{i: q divides di} ri ≥ 2ε for any ε-essential q. Then

D^{2ε} = Π_{q is ε-essential} q^{2ε} ≤ Π_{q is ε-essential} q^{Σ_{i: q divides di} ri} = Π_{i=1}^{m} Π_{q is ε-essential, q divides di} q^{ri} ≤ Π_{i=1}^{m} di^{ri}.

Now, since Σ_{i=1}^{m} ri = 1, the weighted arithmetic mean is at least as large as the geometric mean, and thus

Σ_{i=1}^{m} ri di ≥ Π_{i=1}^{m} di^{ri}.

Since D ≥ c(L(M, λ)) by Lemma 2, we obtain

n ≥ Σ_{i=1}^{m} di ≥ Σ_{i=1}^{m} ri di ≥ Π_{i=1}^{m} di^{ri} ≥ D^{2ε} ≥ c(L(M, λ))^{2ε}.
And the claim follows.

b) Let p1, p2, . . . be the sequence of prime numbers. We define the languages

Lk,m = { a^j | j ≡ 0 mod Π_{i=k}^{k+m−1} pi }

for k, m ≥ 1. Obviously c(Lk,m) = Π_{i=k}^{k+m−1} pi ≥ pk^m ≥ (k ln k)^m.

On the other hand Lk,m can be accepted by a PFA with isolation ε = 1/(2m) and cutpoint λ = 1 − 1/(2m) as follows. We define a "union automaton with an initial distribution" by setting up m disjoint cycles of length pk, pk+1, . . . , pk+m−1, respectively. The transition probability from one state to the next in a cycle is 1. There is exactly one final state in each cycle, and the initial distribution places probability 1/m on each final state. For every word a^z ∈ Lk,m we have z ≡ 0 (mod pi) for every k ≤ i ≤ k + m − 1, and for every word a^z ∉ Lk,m there is at least one i with z ≢ 0 (mod pi). Thus a word is either accepted with probability 1, or it can reach acceptance probability at most 1 − 1/m. Applying Fact 2, the number of states in the PFA is

nk,m = Σ_{i=k}^{k+m−1} pi ≤ 2 Σ_{i=k}^{k+m−1} i ln i ≤ 2 ∫_{k}^{k+m} x ln x dx
= 2 [ (x²/2) ln x − x²/4 ]_{x=k}^{x=k+m}
≤ (k² + 2km + m²) ln(k + m) − k² ln k
= k² ln(1 + m/k) + (2km + m²) ln(k + m).

But since k ln(1 + m/k) = ln (1 + m/k)^k ≤ ln e^m = m,

nk,m ≤ km + (2km + m²) ln(k + m) ≤ (3km + m²) ln(k + m).

Thus for any 0 ≤ α < 1, any constant m = 1/(2ε) and a sufficiently large k, we have

c(Lk,m) ≥ (k ln k)^m > ((3km + m²) ln(k + m))^{αm} ≥ nk,m^{αm},

and the claim follows.
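The construction behind part b) is easy to simulate. A sketch for the toy choice k = 2, m = 3 (primes 3, 5, 7), verifying both the cutpoint decision and the ε-isolation:

```python
# Sketch of the PFA from Theorem 1 b): m disjoint cycles of prime lengths
# p_k, ..., p_{k+m-1}, with initial probability 1/m on the final state of
# each cycle (toy instance, our own choice of parameters).
primes = [3, 5, 7]
m = len(primes)
lam, eps = 1 - 1 / (2 * m), 1 / (2 * m)

def accept_prob(z):
    # after reading a^z, cycle i sits on its final state iff z ≡ 0 (mod p_i)
    return sum(z % p == 0 for p in primes) / m

P = 3 * 5 * 7                           # the period of L_{k,m}
for z in range(2 * P):
    assert (accept_prob(z) > lam) == (z % P == 0)    # cutpoint decision
    assert abs(accept_prob(z) - lam) >= eps - 1e-12  # eps-isolation
```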
Our result shows that for a fixed isolation ε, the ultimate period of the language accepted by a PFA M with n states is only polynomial in n.
4
Approximating the Size of a Minimal NFA
Stockmeyer and Meyer [10] show that the universe problem L(N) = Σ∗ is NP-complete for regular expressions and NFA's N, even if we consider only unary languages. Since our argument is based on their construction, we show the proof.

Fact 3. [10] For a unary NFA N, it is NP-hard to decide if L(N) = {a}∗.

Proof. We reduce 3SAT to the universe problem for unary NFA's. Let Φ be a 3CNF-formula over n variables with m clauses. Let p1, . . . , pn be the first n primes and set D := Π_{i=1}^{n} pi. According to the Chinese remainder theorem, the function µ : IN0 → IN0^n with µ(x) = (x mod p1, . . . , x mod pn) is injective, if we restrict the domain to {0, . . . , D − 1}. We call x a code (for an assignment), if µ(x) ∈ {0, 1}^n. We construct a union automaton NΦ that accepts {a}∗ iff Φ is not satisfiable. We first make sure that L0,Φ = {a^k | k is not a code} is accepted. Therefore, for every prime pi (pi > 2) we construct a cycle that accepts the words a^j with j ≢ 0 (mod pi) ∧ j ≢ 1 (mod pi). So there are 2 non-final states and (pi − 2) final states in the cycle. For every clause C of Φ with variables xi1, xi2, xi3 we construct a cycle C∗ of length pi1 pi2 pi3. C∗ will accept {a^k | the assignment k mod pij for xij (j = 1, 2, 3) does not satisfy C}. Since the falsifying assignment is unique for the three variables in question, exactly one state is accepting in C∗. The construction can be done in time polynomial in the length of Φ. If there is a word a^j ∉ L(NΦ), then j is a code for a satisfying assignment. On the other hand every satisfying assignment has a code j, and a^j is not accepted by NΦ.

We set LΦ = L(NΦ) for the automaton NΦ constructed above. Observe that LΦ is a union of cyclic languages and hence itself cyclic. Obviously, if Φ ∉ 3SAT, then the minimal NFA for LΦ has size 1. We will show that for Φ ∈ 3SAT every NFA accepting LΦ must have at least Σ_{i=2}^{n} pi states, which implies Theorem 2.

Theorem 2.
Given an NFA N with n states, it is impossible to efficiently approximate nsize(L(N)) within a factor of √n / ln n, unless P = NP.

We first determine a lower bound for the period of LΦ.

Lemma 3. For any given 3CNF-formula Φ ∈ 3SAT the minimal period of LΦ is either D := Π_{i=2}^{n} pi or 2D.

Proof. LΦ is 2D-cyclic, since 2D is the least common multiple of the cycle lengths of NΦ. Assume that neither D nor 2D is the minimal period of LΦ. Then there is i ≥ 2, such that d = D/pi is a period of LΦ. We know that a^{q·pi+2} ∈ L0,Φ for every q ∈ IN, because q·pi + 2 does not represent a code. Since L0,Φ ⊆ LΦ and we assume that LΦ is d-cyclic, a^{q·pi+2+rd} belongs to LΦ for every r ∈ IN as well. On the other hand, since LΦ ≠ {a}∗, there is an a^l ∉ LΦ, and so a^{l+td} ∉ LΦ for every t ∈ IN. It is a contradiction, if we find q, r, t ∈ IN0, so that q·pi + 2 + rd = l + td, since the corresponding word has to be in LΦ because of the left-hand side of the equation and cannot be in LΦ because of the right-hand side.

∃q, r, t : q·pi + 2 + rd = l + td
⇔ ∃q, r, t : q·pi = l − 2 + (t − r)d
⇔ ∃q : q·pi ≡ l − 2 (mod d)
⇔ ∃q : q ≡ (l − 2)·pi^{−1} (mod d)

The multiplicative inverse of pi modulo d exists, since gcd(pi, d) = 1, and we have obtained the desired contradiction.

We will need a linear relation between the number of clauses and variables in the CNF-formula.

Fact 4. Let E3SAT−E5 be the satisfiability problem for formulae with exactly 3 literals in every clause and every variable appearing in exactly 5 distinct clauses; then E3SAT−E5 is NP-complete.

The following lemma determines a lower bound for the size of an NFA equivalent to NΦ, if Φ is satisfiable.

Lemma 4. Let Φ ∈ E3SAT−E5 and assume that Φ consists of m clauses. Then nsize(L(NΦ)) ≥ c·m² ln m for some constant c.

Proof. We know from Lemma 3 that L(NΦ) is either minimally D-cyclic or minimally 2D-cyclic with D = Π_{i=2}^{n} pi, where n is the number of variables in Φ. Applying Fact 1, the size of a minimal NFA accepting LΦ is at least Σ_{i=2}^{n} pi. We observe that

Σ_{i=2}^{n} pi ≥ Σ_{i=1}^{n} i ln i ≥ ∫_{1}^{n} x ln x dx ≥ (n²/4) ln n.

We have 5n = 3m and thus nsize(LΦ) ≥ c·m² ln m for some constant c.
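The reduction of Fact 3 can be sketched by deciding membership of a^j in L(NΦ) directly from the residues j mod pi; the formula below is a hypothetical example of ours, not taken from the paper:

```python
# Sketch of the reduction in Fact 3: a^j is rejected by N_Phi iff j is a
# code of a satisfying assignment (variable x_i gets the value j mod p_i).
primes = [2, 3, 5]                      # the first n primes, n = 3 variables
clauses = [(1, 2, -3), (-1, 2, 3)]      # hypothetical 3-CNF; +i / -i literals

def in_L(j):
    vals = [j % p for p in primes]
    if any(v > 1 for v in vals):
        return True                     # j is not a code: accepted via L_{0,Phi}
    assign = {i + 1: bool(v) for i, v in enumerate(vals)}
    sat = all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)
    return not sat                      # rejected iff j codes a sat. assignment

D = 2 * 3 * 5
universe = all(in_L(j) for j in range(D))
assert universe is False                # Phi is satisfiable, so L(N_Phi) != {a}*
```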
Finally we determine an upper bound for the size of the NFA NΦ.

Lemma 5. Let Φ be a 3CNF formula with m clauses and exactly 5 appearances of every variable. Then the NFA NΦ has size Θ(m⁴ (ln m)³).

Proof. The number of states in a cycle for a clause is a product of three primes. So there are at most m · pn³ = Θ(m (m ln m)³) states in all of these cycles. The cycles recognizing L0,Φ have Σ_{i=2}^{n} pi = Θ(n² ln n) states, where n is the number of variables of Φ. Since n = Θ(m), the claim follows.

Proof (of Theorem 2). Assume that the polynomial time deterministic algorithm A approximates nsize(L(N)) within the factor √s / ln s for an NFA N with s states. We show that the satisfiability problem can be decided in polynomial time. Let Φ be the given input for the E3SAT−E5 problem, where we assume that Φ has n variables and m clauses. We construct the NFA NΦ as in Fact 3. If Φ is not satisfiable, then nsize(LΦ) = 1, and according to Lemma 5 the algorithm A claims that an equivalent NFA with at most

√s / ln s = √(Θ(m⁴ (ln m)³)) / ln(Θ(m⁴ (ln m)³)) = o(m² ln m)
468
Gregor Gramlich
states exists. Since Σ_{i=2}^{n} pi = Θ(m² ln m), the claimed number of states is asymptotically smaller than nsize(LΨ) for any satisfiable formula Ψ with the same number of clauses as Φ. Hence with the help of A, we can decide if Φ is satisfiable within polynomial time.

Remark 1. For every 0 < ε ≤ 1 the same construction as in the proof of Theorem 2 can be used to show that it is not possible to approximate the size of a minimal PFA with isolation ε equivalent to a given n-state PFA with isolation c · n^{−1/4} within the factor √n / ln n.

For a given formula Φ with m clauses we construct the PFA MΦ with m cycles² and uniform initial distribution for the initial states of each cycle. We define the cutpoint as λ = 1/(2m). Hence a word is accepted by MΦ iff it is accepted by at least one cycle. Thus the cutpoint λ is δ-isolated with δ = 1/(2m) ≥ c · n^{−1/4} for some appropriate c, and MΦ behaves like a union automaton. Since L(MΦ, λ) is the same language as considered before, it is 1-cyclic if Φ is not satisfiable and has period D = Π_{i=2}^{n} pi or 2D if Φ is satisfiable. Every PFA with isolated cutpoint that accepts a language with period Π_{i=2}^{n} pi has at least Σ_{i=2}^{n} pi states [7], independent of the actual isolation.

The approximation complexity changes if a unary cyclic language is specified by a DFA M, although the decision problem, namely to decide whether there is a k-state NFA accepting the cyclic language L(M), is not efficiently solvable unless NP ⊆ DTIME(n^{O(ln n)}) [4].

Theorem 3. Given a unary cyclic DFA accepting L with D states, an NFA for L with at most nsize(L) · (1 + ln D) states can be computed in polynomial time. Observe that nsize(L) · (1 + ln D) = O(nsize(L)^{3/2} √(ln nsize(L))).

Proof. We reduce the optimization problem for a given cyclic DFA M to an instance of the weighted set cover problem. We can assume M to be a minimal cyclic D-state DFA with the set of states Q = {0, . . . , D − 1}, 0 as the initial state, and final states F ⊆ Q.
Then L(M) = {a^{j+kD} | j ∈ F, k ∈ IN0}. For every dl that divides D we construct a deterministic cycle Cl with period dl. The union automaton consisting of these cycles will accept L(M), if we choose the final states of Cl as follows: for each a^j ∈ L with 0 ≤ j < dl, we let Cl accept a^j iff a^{j+k·dl} ∈ L(M) for all 0 ≤ k < D/dl. Remember that we don't have to check a^x for x ≥ D, since L(M) is D-cyclic and dl divides D. At this stage the union automaton will have a lot of unnecessary cycles. Therefore we define an instance of the set cover problem, where we introduce a set Tl := {j | 0 ≤ j < D, a^j is accepted by Cl} of weight wl := dl for every cycle Cl. The universe is {j | 0 ≤ j < D, a^j ∈ L(M)}. The instance can be constructed in polynomial time, since the number of divisors of D is less than D, and thus the set cover problem consists of at most D sets with at most D elements. If N is a minimal NFA accepting L(M), then we know from Fact 1 that N is a union automaton (with an additional initial state) that consists of cycles²
² To check the validity of a code we can also use the clause cycles.
with periods that divide D. Every cycle C∗ of N corresponds to a set Tl, and the accepted words of C∗ up to length D − 1 are contained in Tl. So a minimal union automaton with n states can be expressed by a set cover of weight n. On the other hand, every set cover can be considered to be a union automaton. Thus a minimal set cover corresponds to a minimal NFA. The greedy algorithm for the weighted set cover problem approximates the optimal set cover within the factor H(k) = Σ_{i=1}^{k} 1/i ≤ 1 + ln k, where k is the size of the largest set [6]. For an n-state NFA N, Chrobak [1] bounds c(L(N)) by the Landau function and obtains D = O(e^√(n ln n)).
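The reduction in the proof of Theorem 3, together with the greedy set-cover step, can be sketched as follows (a toy implementation under our own naming; `final` plays the role of the set F of the DFA):

```python
def greedy_nfa_from_cyclic_dfa(D, final):
    """Theorem 3 sketch: cover {j < D : a^j in L} by divisor-cycle sets T_l,
    chosen greedily by weight/coverage ratio (weight w_l = d_l)."""
    divisors = [d for d in range(1, D + 1) if D % d == 0]
    sets = {}
    for d in divisors:
        # C_d accepts a^j (j < d) iff a^{j+k*d} is in L for all k < D/d
        residues = {j for j in range(d)
                    if all((j + k * d) in final for k in range(D // d))}
        sets[d] = {j for j in range(D) if j % d in residues}
    cover, uncovered = [], set(final)
    while uncovered:
        d = min((d for d in divisors if sets[d] & uncovered),
                key=lambda d: d / len(sets[d] & uncovered))
        cover.append(d)
        uncovered -= sets[d]
    return cover                        # cycle lengths of the constructed NFA

# L = {a^j : j ≡ 0 (mod 2) or j ≡ 0 (mod 3)}, given as a 6-state cyclic DFA
assert sorted(greedy_nfa_from_cyclic_dfa(6, {0, 2, 3, 4})) == [2, 3]
```

The cycle of length D always covers the whole universe, so the greedy loop terminates; the returned lengths give an NFA of size within the H(k) ≤ 1 + ln D factor of the optimum.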
5
Conclusions and Open Problems
In Theorem 1 we have shown that PFA's with constant isolation lead to only polynomially smaller automata in comparison to cyclic unary DFA's. It is not hard to observe that PFA's with constant isolation are negatively exponentially smaller than DFA's for non-cyclic unary languages. The size relation between minimal PFA's and minimal DFA's for non-cyclic unary languages is to be further explored.

The hardness result of Theorem 2 for minimizing unary NFA's is tight within a square, since size √n / ln n is excluded for a given NFA of size n. Is Theorem 2 "essentially" optimal?

Jiang and Ravikumar [5] state the open problem of approximating a minimal NFA given a DFA: specifically, to determine the complexity of designing an NFA accepting L(M) with at most nsize(L(M))^k states for a given DFA M and a given k. We have answered the question for the case of unary cyclic DFA's and k > 3/2 in Theorem 3.
References

1. Chrobak, M.: Finite automata and unary languages, Theoretical Computer Science 47, 1986, pp. 149–158.
2. Gantmacher, F.R.: Theory of Matrices, Vol. II, Chelsea, New York, 1959.
3. Graham, R., Knuth, D., Patashnik, O.: Concrete Mathematics, Addison Wesley, Reading, Massachusetts, 1989.
4. Jiang, T., McDowell, E., Ravikumar, B.: The structure and complexity of minimal NFA's over a unary alphabet, Int. J. Found. of Comp. Sci. 2, 1991, pp. 163–182.
5. Jiang, T., Ravikumar, B.: Minimal NFA problems are hard, SIAM Journal on Computing 22 (6), 1993, pp. 1117–1141.
6. Hochbaum, D. (editor): Approximation algorithms for NP-hard problems, PWS Publishing Company, Boston, 1997.
7. Mereghetti, C., Palano, B., Pighizzini, G.: On the succinctness of deterministic, nondeterministic, probabilistic and quantum finite automata, DCAGRS 2001.
8. Milani, M., Pighizzini, G.: Tight bounds on the simulation of unary probabilistic automata by deterministic automata, DCAGRS 2000.
9. Rabin, M.: Probabilistic automata, Information and Control, 1963, pp. 230–245.
10. Stockmeyer, L., Meyer, A.: Word Problems Requiring Exponential Time, Proc. of the 5th Ann. ACM Symposium on Theory of Computing, New York, 1973, pp. 1–9.
On Matroid Properties Definable in the MSO Logic

Petr Hliněný

Institute of Mathematics and Comp. Science (MÚ SAV), Matej Bel University and Slovak Academy of Sciences, Severná ul. 5, 974 00 Banská Bystrica, Slovakia
[email protected]
Abstract. It has been proved by the author that all matroid properties definable in the monadic second-order (MSO) logic can be recognized in polynomial time for matroids of bounded branch-width which are represented by matrices over finite fields. (This result extends the so-called "MS2-theorem" of graphs by Courcelle and others.) In this work we review the MSO theory of finite matroids and show some interesting matroid properties which are MSO-definable. In particular, all minor-closed properties are recognizable in such a way.

Keywords: matroid, branch-width, MSO logic, parametrized complexity.
1
Introduction
The theory of parametrized complexity provides a background for the analysis of difficult algorithmic problems which is finer than classical complexity theory. We postpone formal definitions till Section 3. Briefly speaking, a problem is called "fixed-parameter tractable" if there is an algorithm whose running time has the (possible) super-polynomial part separated in terms of some natural "parameter", which is supposed to be small even for large inputs in practice. (Successful practical applications of this concept are known, for example, in computational biology or in database theory.) We are interested in algorithmic problems that are parametrized by a "tree-like" structure of the input objects. Graph "branch-width" is closely related to the well-known tree-width [13], but a branch decomposition does not refer to vertices, and so branch-width directly generalizes from graphs to matroids. It follows from the works of Courcelle [2] and Bodlaender [1] that all graph problems definable in the monadic second-order logic can be solved in linear time for graphs of bounded tree-width. Those include many notoriously hard problems like 3-colouring, Hamiltonicity, etc.
Parts of this research have been done during author’s stay at the Victoria University of Wellington in New Zealand. From August 2003 also Department of Computer Science, Technical University Ostrava, Czech Republic.
B. Rovan and P. Vojt´ aˇ s (Eds.): MFCS 2003, LNCS 2747, pp. 470–479, 2003. c Springer-Verlag Berlin Heidelberg 2003
We study and present analogous results for matroids representable over finite fields. The motivation of our research is mainly theoretical — to show how the mentioned complexity phenomenon extends from graphs to a much larger class of combinatorial objects, and to stimulate further research interest in matroid branch-width and the complexity of matroid problems. (Unfortunately, wide generality of our approach leads to impractically huge constants involved in the algorithms, such as in Theorem 4.1.) Since not all computer scientists are familiar with structural matroid theory or with parametrized complexity, we give a basic overview of necessary concepts in the next two sections.
2
Matroids and Branch-Width
We refer to Oxley [12] for matroid terminology. A matroid is a pair M = (E, B) where E = E(M) is the ground set of M (elements of M), and B ⊆ 2^E is a nonempty collection of bases of M. Moreover, matroid bases satisfy the "exchange axiom": if B1, B2 ∈ B and x ∈ B1 − B2, then there is y ∈ B2 − B1 such that (B1 − {x}) ∪ {y} ∈ B. We consider only finite matroids. Subsets of bases are called independent sets, and the remaining sets are dependent. Minimal dependent sets are called circuits. All bases have the same cardinality, called the rank r(M) of the matroid. The rank function rM : 2^E → N of M tells the maximal cardinality rM(X) of an independent subset of a set X ⊆ E(M). If G is a graph, then its cycle matroid on the ground set E(G) is denoted by M(G). The bases of M(G) are the (maximal) spanning forests of G, and the circuits of M(G) are the cycles of G. Another example of a matroid is a finite set of vectors with usual linear dependency. If A is a matrix, then the matroid formed by the column vectors of A is called the vector matroid of A, and denoted by M(A). The matrix A is a representation of a matroid M ≃ M(A). We say that the matroid M(A) is F-represented if A is a matrix over a field F. The dual matroid M∗ of M is defined on the same ground set E, and the bases of M∗ are the set-complements of the bases of M. The dual rank function satisfies rM∗(X) = |X| − r(M) + rM(E − X). A set X is coindependent in M if it is independent in M∗. An element e of M is called a loop (a coloop), if {e} is dependent in M (in M∗). The matroid M \ e obtained by deleting a non-coloop element e is defined as (E − {e}, B−) where B− = {B : B ∈ B, e ∉ B}. The matroid M/e obtained by contracting a non-loop element e is defined using duality M/e = (M∗ \ e)∗. (This corresponds to contracting an edge in a graph.) A minor of a matroid is obtained by a sequence of deletions and contractions of elements.
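The vector-matroid notions above can be made concrete with a tiny GF(2) independence test (our own sketch, not from the paper; `gf2_rank` is an ad-hoc helper doing bit-packed Gaussian elimination):

```python
def gf2_rank(vectors):
    """Rank over GF(2) of 0/1 vectors, via bit-packed elimination."""
    basis = []                          # bitmasks with distinct leading bits
    for v in vectors:
        x = int("".join(map(str, v)), 2)
        for b in sorted(basis, reverse=True):
            if x ^ b < x:               # b's leading bit is set in x: reduce
                x ^= b
        if x:
            basis.append(x)
    return len(basis)

def indep(cols, X):
    """Is the subset X of column indices independent in M(A)?"""
    return gf2_rank([cols[i] for i in X]) == len(X)

# Columns of a matrix A over GF(2): the elements of the vector matroid M(A)
cols = [(1, 0), (0, 1), (1, 1)]
assert indep(cols, [0, 1]) and indep(cols, [1, 2])
assert not indep(cols, [0, 1, 2])      # the three columns form a circuit
```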
Since these operations naturally commute, a minor M′ of a matroid M can be uniquely expressed as M′ = M \ D/C, where D are the coindependent deleted elements and C are the independent contracted elements. A matroid family M is minor-closed if M ∈ M implies that all minors of M are in M. A matroid N is called an excluded minor (also known as "forbidden") for a minor-closed family M if N ∉ M but N′ ∈ M for all proper minors N′ of N. The connectivity function λM of a matroid M is defined for all subsets A ⊆ E = E(M) by λM(A) = rM(A) + rM(E − A) − r(M) + 1. Notice that λM(A) =
Fig. 1. Two examples of width-3 branch decompositions of the Pappus matroid (top left, rank 3) and of the binary affine cube (bottom left, rank 4). Here the lines depict linear dependencies between matroid elements.
λM(E − A). It is also routine to verify that λM(A) = λM∗(A), i.e. matroid connectivity is dual-invariant. A subset A ⊆ E is k-separating if λM(A) ≤ k. A partition (A, E − A) is called a k-separation if A is k-separating and both |A|, |E − A| ≥ k. For n > 1, the matroid M is called n-connected if it has no k-separation for k = 1, 2, . . . , n − 1, and |E(M)| ≥ 2n − 2. (A connected matroid corresponds to a vertex 2-connected graph. The geometric interpretation of a k-separation (A, B) is that the spans of A and of B intersect in a subspace of rank less than k.) Let ℓ(T) denote the set of leaves of a tree T. A branch decomposition of a matroid M is a pair (T, τ) where T is a tree of maximal degree three, and τ is a bijection of E(M) onto ℓ(T). Let f be an edge of T, and T1, T2 be the connected components of T − f. The width of the edge f in T is λM(A) = λM(B), where A = τ^{−1}(ℓ(T1)) and B = τ^{−1}(ℓ(T2)). The width of the branch decomposition (T, τ) is the maximum of the widths of all edges of T, and the branch-width of M is the minimal width over all branch decompositions of M. If T has no edge, then we take its width as 0. An example of a branch decomposition is presented in Fig. 1. Notice that matroid branch-width is invariant under duality. It is straightforward to verify that branch-width does not increase when taking minors: Let (T, τ) be a branch decomposition of a matroid M. Say, up to duality, that M′ = M \ e. We form T′ from T by deleting the leaf τ(e), and set τ′ to be τ restricted to E(M′). Then, for any partition (A, B) of E(M) given by an edge f in T, we have the obvious λM′(A − {e}) ≤ λM(A), and so the width of (T′, τ′) is not bigger than the width of (T, τ) for M.
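The width of a single branch-decomposition edge can be computed directly from the definition λM(A) = rM(A) + rM(E − A) − r(M) + 1. A sketch for F = GF(2) (helper names are ours; the rank routine is a bit-packed GF(2) elimination):

```python
def gf2_rank(vectors):
    basis = []
    for v in vectors:
        x = int("".join(map(str, v)), 2)
        for b in sorted(basis, reverse=True):
            if x ^ b < x:
                x ^= b
        if x:
            basis.append(x)
    return len(basis)

def width(cols, A_part):
    """lambda_M(A) = r_M(A) + r_M(E-A) - r(M) + 1 for the vector matroid M."""
    E = list(range(len(cols)))
    B_part = [i for i in E if i not in A_part]
    r = lambda S: gf2_rank([cols[i] for i in S])
    return r(A_part) + r(B_part) - r(E) + 1

# Cycle matroid of a triangle, represented over GF(2) by its edge vectors
cols = [(1, 1, 0), (0, 1, 1), (1, 0, 1)]
assert width(cols, [0]) == 2            # 1 + 2 - 2 + 1
assert width(cols, [0, 1, 2]) == 1      # the trivial partition
```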
We remark that branch-width of a graph G is defined analogously, using the connectivity function λG where λG(F) for F ⊆ E(G) is the number of vertices incident both with F and with E(G) − F. Clearly, branch-width of a graph G is never smaller than branch-width of its cycle matroid M(G). It is still an open conjecture that these numbers are actually equal. On the other hand, branch-width is within a constant factor of tree-width in graphs [13].

Lastly in this section we mention a few words about the relations of matroid theory to computer science. As the reader surely knows, a greedy algorithm on a matroid is one of the basic tools in combinatorial optimization. That is why matroids naturally arise in a number of optimization problems, such as the minimum spanning tree or job assignment problems. More involved applications of matroids in combinatorial optimization can be found in numerous works of Edmonds, Cunningham and others. Besides that, the concept of branch-width has attracted increasing attention among matroid theorists recently, and several deep results of Robertson-Seymour's graph minor theory have been extended from graphs to matroids representable over finite fields; such as [6]. Robertson-Seymour's theory has been followed by many interesting algorithmic applications on graphs (mostly related to tree-width or branch-width). Therefore we think it is the right time now to look at complexity aspects of branch-width in matroid problems. For example, we have given a straightforward polynomial algorithm for the computation of the Tutte polynomial [10] on a representable matroid of bounded branch-width. (It seems that matroids present a more suitable model than graphs for computing the Tutte polynomial on structures of bounded tree-/branch-width.) As yet another motivation we remark that linear codes over a finite field F are in a direct correspondence with F-represented matroids.
3
Parametrized Complexity
When speaking about parametrized complexity, we closely follow Downey and Fellows [4]. Here we present the basic definition of parametrized tractability. For simplicity, we restrict the definition to decision problems, although an extension to computation problems is straightforward. Let Σ be the input alphabet. A parametrized problem is an arbitrary subset Ap ⊆ Σ∗ × N. For an instance (x, k) ∈ Ap, we call k the parameter and x the input for the problem. (The parameter is sometimes implicit in the context.) We say that a parametrized problem Ap is (nonuniformly) fixed-parameter tractable if there is a sequence of algorithms {Ai : i ∈ N} and a constant c, such that (x, k) ∈ Ap iff the algorithm Ak accepts (x, k), and the running time of Ak on (x, k) is O(|x|^c) for each k. Similarly, a parametrized problem Ap is uniformly fixed-parameter tractable if there is an algorithm A, a constant c, and an arbitrary function f : N → N, such that (x, k) ∈ Ap iff the algorithm A accepts (x, k), and the running time of A on (x, k) is O(f(k) · |x|^c). There is a natural correspondence of a parametrized problem Ap to an ordinary problem A = {⟨x, k⟩ : (x, k) ∈ Ap} (for example, the problem of a
k-vertex cover in a graph), or to a problem A′ = {x : ∃k (x, k) ∈ Ap} if k is not "directly involved" in the question (such as a Hamiltonian cycle in a graph of tree-width k). On the other hand, an ordinary problem may have several natural parametrized versions respecting different parameters. We remark that the parameter is formally a natural number, but that may encode arbitrary finite structures in a standard way. As we have already noted above, our interest is in parametrized problems where the parameter is branch-width (tree-width). Inspired by the algorithm of Bodlaender [1], we have shown that branch-width of matroids represented over finite fields is fixed-parameter tractable, and that, moreover, we can efficiently construct a branch decomposition. Let Bt denote the class of all matroids of branch-width at most t. We have proved the following:

Theorem 3.1. (PH [9]) Let t ≥ 1 be fixed, and let F be a finite field. Suppose that A is an r × n matrix over F (r ≤ n) such that the represented matroid M(A) ∈ Bt. Then there is an algorithm that finds a branch decomposition of the matroid M(A) of width at most 3t in time O(n³).

Actually, our algorithm directly constructs a so called "parse tree" for the mentioned branch decomposition. Unfortunately, the algorithm in Theorem 3.1 does not necessarily produce the optimal branch decomposition. On the other hand, there are finitely many excluded minors for the class Bk for each k, and these excluded minors can be constructed algorithmically since they have size at most (6^{k+1} − 1)/5 by [5]. Hence, in this particular case, we can extend the idea in Theorem 5.2 to show:

Corollary 3.2. Let F be a finite field. Suppose that A is a given matrix over F. Then branch-width of the matroid M(A) is uniformly fixed-parameter tractable.
4
MSO Logic of Matroids
The monadic second-order (MSO) theory of matroids uses a language based on the monadic second-order logic. The syntax includes variables for matroid elements and element sets, the quantifiers ∀, ∃ applicable to these variables, the logical connectives ∧, ∨, ¬, and the following predicates:
1. =, the equality for elements and their sets,
2. e ∈ F, where e is an element variable and F is an element set variable,
3. indep(F), where F is an element set variable, and the predicate tells whether F is independent in the matroid.
Moreover, we write φ → ψ to stand for ¬φ ∨ ψ, and X ⊆ Y for ∀x(x ∉ X ∨ x ∈ Y). Notice that the "universe" of a formula (the model in logic terms) in the above theory is one particular matroid. To give a better feeling for the MSO theory of matroids, we provide a few simple predicates now. We write

basis(B) ≡ indep(B) ∧ ∀e (e ∈ B ∨ ¬indep(B ∪ {e})),

where indep(B ∪ {e}) is a shortcut for the obvious

∃X (indep(X) ∧ e ∈ X ∧ B ⊆ X ∧ ∀x(x = e ∨ x ∈ B ∨ x ∉ X)).

Similarly,
On Matroid Properties Definable in the MSO Logic
475
we write a predicate

circuit(C) ≡ ¬indep(C) ∧ ∀e (e ∈ C → indep(C − {e}))

where indep(C − {e}) is a shortcut for

∃X (indep(X) ∧ e ∉ X ∧ X ⊆ C ∧ ∀x(x = e ∨ x ∉ C ∨ x ∈ X)).

Let us now look at the (graph) property of being Hamiltonian. In matroid language, that means to have a circuit containing a basis. So we may write a sentence

hamilton ≡ ∃C (circuit(C) ∧ ∃e basis(C − {e})).

A related matroidal property is to be a paving matroid M, i.e., to have all circuits C in M of size |C| ≥ r(M). Let us explain this sample property in detail. Since C − {e} is independent for each e ∈ C by definition of a circuit, we have |C| ≤ r(M) + 1 for any circuit C in M. Considering a basis B ⊇ C − {e} and the inequality |C| ≥ r(M) = |B| valid in a paving matroid, we conclude that there is an element f such that B ⊆ C ∪ {f}. The converse also holds. Hence we express

paving ≡ ∀C (circuit(C) → ∃f, B (B ⊆ C ∪ {f} ∧ basis(B))).

The reason why we are looking for properties definable in the MSO logic of matroids is that such properties can be recognized in polynomial time for matroids of bounded branch-width over finite fields. The following result is based on a finite-state recognizability of matroidal MSO properties, proved by the author in [8], and on Theorem 3.1.

Theorem 4.1. (PH [7,8,9]) Let F be a finite field. Assume that M is a class of matroids defined in one of the following ways: (a) there is an MSO sentence φ such that M ∈ M iff φ is true on M, or (b) there is a sequence of MSO sentences {φk : k = 1, 2, . . .} and, for all k ≥ 1 and matroids M ∈ Bk, we have M ∈ M iff φk is true on M. Suppose that A is an n-column matrix over F such that M(A) ∈ Bt where t ≥ 1 is fixed. Then there is an algorithm deciding whether M(A) ∈ M in time O(n³), and this algorithm can be constructed from the given sentence(s) φ or φt for all t.

Remark.
In the language of parametrized complexity, Theorem 4.1 says that the class of F-represented matroids defined by MSO sentences φ or φt is fixed-parameter tractable with respect to the combined parameter ⟨F, t⟩. Moreover, in the case (a), or in the case (b) when the sentences φk are constructible by an algorithm, the class M is uniformly fixed-parameter tractable. So it follows that the properties of being Hamiltonian or a paving matroid can be efficiently recognized on F-represented matroids of bounded branch-width. Other simple matroidal properties definable in the MSO logic are, for example, the properties of being identically self-dual, or being a “free spike” [11]. Moreover, all properties definable in the extended MSO theory of graphs (MS2) are also MSO-definable over graphic matroids [8]. Several more interesting classical matroid properties are shown to be MSO-definable in the next sections.
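As a concrete aside (our own illustration, not part of the paper), the predicates of this section can be evaluated by brute force on a small matroid given as an independence oracle; all names and the example matroid U2,4 below are assumptions made for the sketch:

```python
from itertools import combinations

# Example matroid as an independence oracle: the uniform matroid U_{2,4}
# (4 elements, a set is independent iff it has at most 2 elements).
E = frozenset(range(4))
def indep(X):
    return len(X) <= 2

subsets = [frozenset(S) for r in range(len(E) + 1) for S in combinations(E, r)]
rank = max(len(S) for S in subsets if indep(S))       # r(M) = 2 here

def basis(B):
    # basis(B) = indep(B) and no outside element keeps B independent
    return indep(B) and all(not indep(B | {e}) for e in E - B)

def circuit(C):
    # circuit(C) = C is dependent but C - {e} is independent for every e in C
    return not indep(C) and all(indep(C - {e}) for e in C)

circuits = [C for C in subsets if circuit(C)]

# hamilton: some circuit C contains a basis C - {e}
hamilton = any(any(basis(C - {e}) for e in C) for C in circuits)
# paving: every circuit C has size at least r(M)
paving = all(len(C) >= rank for C in circuits)

print(len(circuits), hamilton, paving)  # 4 True True
```

Exhaustive search is of course exponential; the point of Theorem 4.1 is that on represented matroids of bounded branch-width the same questions are answered in time O(n³).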
5
Minor-Closed Properties
It is easy to see that the class of F-representable matroids is minor-closed, and so is the class Bt of matroids of branch-width at most t. We say that a set S is well-quasi-ordered (WQO) if there are neither infinite antichains nor infinite strictly
476
Petr Hliněný
descending chains in S. By a deep result of [6], matroids of bounded branch-width which are representable over a fixed finite field F are WQO in the minor order. (However, unlike graphs, matroids are not WQO in general.) So it follows that any minor-closed matroid family M has a finite number of F-representable excluded minors in Bt. We now show that the presence of one particular minor can be described by an MSO sentence.

Lemma 5.1. Let N be a matroid. There is a (computable) MSO sentence ψN such that ψN is true on a matroid M if and only if M has an N-minor.

Proof. N is a minor of M if and only if there are two sets C, D such that C is independent and D is coindependent in M, and N = M \ D/C. Suppose that N = M \ D/C holds. Then a set X ⊆ E(N) is dependent in N if and only if there is a dependent set Y ⊆ E(M) in M such that Y − X ⊆ C. (This simple claim may be more obvious when viewed over the dual matroid M* — a set is dependent in M iff it intersects each basis of M*, and N* = M*/D \ C.) Since N is fixed, we may identify the elements of the (supposed) N-minor in M by variables x1, . . . , xn in order, where n = |E(N)|. Then, knowing the contract set C (and implicit D), we are able to say which subsets of {x1, . . . , xn} are dependent in M \ D/C. For each J ⊆ [1, n], we write

mdep(xj : j ∈ J; C) ≡ ∃Y (¬indep(Y) ∧ ∀y (y ∉ Y ∨ y ∈ C ∨ ⋁_{j∈J} y = xj)).
Now, M \ D/C is isomorphic to N iff the dependent subsets of {x1 , . . . , xn } exactly match the dependent sets of N . Hence we express ψN as
ψN ≡ ∃C ∃x1, . . . , xn ( ⋀_{J∈J+} ¬mdep(xj : j ∈ J; C) ∧ ⋀_{J∈J−} mdep(xj : j ∈ J; C) ),
where J+ is the set of all J ⊆ [1, n] such that {xj : j ∈ J} actually is independent in N, and where J− is the complement of J+. □

Hence, in connection with Theorem 4.1, we conclude:

Theorem 5.2. Let t ≥ 1 be fixed, let F be a finite field, and let M be a minor-closed family. Given a matrix A over F with n columns such that M(A) ∈ Bt, one can decide whether the matroid M(A) belongs to M in time O(n³).

Proof. As already noted above, the family M has a finite number of F-representable excluded minors X1, . . . , Xp ∈ Bt. Keeping in mind that all minors of M(A) also belong to Bt, we see that M(A) ∈ M iff M(A) has no minors isomorphic to X1, . . . , Xp. (For formal completeness, we may verify M(A) ∈ Bt using Corollary 3.2.) We write φt ≡ ¬ψX1 ∧ . . . ∧ ¬ψXp using Lemma 5.1. Finally, we apply Theorem 4.1(b). □

Applications of this theorem include determining the exact branch-width (cf. Section 3) or tree-width of a matroid, or deciding matroid orientability and representability over another field.
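The N-minor test of Lemma 5.1 can likewise be prototyped by exhaustive search over the contract set C and the delete set D. A rough sketch (hypothetical names; tiny uniform matroids as examples, all our own):

```python
from itertools import combinations, permutations

def subsets(E):
    return [frozenset(S) for r in range(len(E) + 1) for S in combinations(E, r)]

def has_minor(E_M, indep_M, E_N, indep_N):
    # Try all contract sets C (independent) and delete sets D; D is not
    # forced to be coindependent here, which is harmless for existence.
    E_M, E_N = frozenset(E_M), sorted(E_N)
    for C in subsets(E_M):
        if not indep_M(C):
            continue
        for D in subsets(E_M - C):
            F = E_M - C - D
            if len(F) != len(E_N):
                continue
            # independence in M \ D / C: X is independent iff X ∪ C is in M
            def minor_indep(X):
                return indep_M(frozenset(X) | C)
            # try every identification of E(N) with F (fine for tiny matroids)
            for img in permutations(sorted(F)):
                to_M = dict(zip(E_N, img))
                if all(indep_N(X) == minor_indep({to_M[x] for x in X})
                       for X in subsets(frozenset(E_N))):
                    return True
    return False

indep2 = lambda X: len(X) <= 2
print(has_minor(range(5), indep2, range(4), indep2))  # True: U_{2,5} has a U_{2,4}-minor
print(has_minor(range(4), indep2, range(5), indep2))  # False: too few elements
```

The MSO sentence ψN replaces this doubly exponential search by a fixed formula that the algorithm of Theorem 4.1 can evaluate efficiently on bounded branch-width.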
Remark. Unfortunately, the proof of Theorem 5.2 is non-constructive: there is in general no way to compute the excluded minors X1, . . . , Xp, or even their number or size. So we cannot speak about uniform fixed-parameter tractability here.
6
Matroid Connectivity
Another interesting task is to describe matroid connectivity in the MSO logic. That can be done quite easily.

Lemma 6.1. Let M be a matroid on the ground set E, and let k ≥ 1. There is an MSO formula σk(X) which is true for X ⊆ E if and only if λM(X) ≥ k + 1.

Proof. By definition, λM(X) ≥ k + 1 iff rM(X) + rM(E − X) ≥ r(M) + k. Using standard matroidal arguments, this is equivalent to stating that there exist two bases B1, B2 of M such that B2 ∩ X ⊆ B1 and |(B1 − B2) ∩ X| ≥ k. We may formalize this statement as
σk(X) ≡ ∃B1, B2 ( basis(B1) ∧ basis(B2) ∧ ∀x ((x ∈ B2 ∧ x ∈ X) → x ∈ B1) ∧
∃z1, . . . , zk ( ⋀_{i≠j} zi ≠ zj ∧ ⋀_i zi ∈ X ∧ ⋀_i zi ∈ B1 ∧ ⋀_i zi ∉ B2 ) ). □
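For a quick sanity check of Lemma 6.1 on a small example, the connectivity function λM can be computed from a rank oracle; the helper names and the example matroid U2,4 are our own:

```python
from itertools import combinations

def make_rank(E, indep):
    # rank of a subset = size of a largest independent subset (brute force)
    def r(X):
        X = frozenset(X)
        return max(len(S) for k in range(len(X) + 1)
                   for S in combinations(X, k) if indep(frozenset(S)))
    return r

E = frozenset(range(4))
indep = lambda X: len(X) <= 2          # the uniform matroid U_{2,4} again
r = make_rank(E, indep)

def lam(X):
    # the connectivity function used in Lemma 6.1
    X = frozenset(X)
    return r(X) + r(E - X) - r(E) + 1

print(lam({0, 1}), lam(set()))  # 3 1
```

Here λ({0, 1}) = 2 + 2 − 2 + 1 = 3, so σ2({0, 1}) holds in U2,4, matching the lemma.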
So we may finish this section with the next immediate result: Corollary 6.2. For each n > 1, there is an MSO sentence κn which is true on a matroid M if and only if M is n-connected.
7
Transversal Matroids
A matroid M is transversal if there is a bipartite graph G with vertex parts V = E(M) and W, such that the rank of any set X in M equals the largest size of a matching incident with X in G. (Equivalently, a transversal matroid is a union of rank-1 matroids.) We consider transversal matroids here mainly because they have a long history of research, but not much is known about their relation to branch-width. Two elements e, f in a matroid M are parallel if {e, f} forms a circuit, and e, f are in series if e, f are parallel in the dual M*. A series minor of a matroid M is obtained by a sequence of contractions of series elements and arbitrary deletions of elements in M. A matroid having a representation over GF(2) is called a binary matroid. The trouble with transversal matroids is that they are not closed under taking minors or duals. However, series minors of transversal matroids are transversal again. We cannot use a “series” analogue of Theorem 5.2 since there is no well-quasi-ordering property of series minors, even at bounded branch-width. Still, we can say a bit:
Theorem 7.1. There is an MSO sentence τ which is true on a matroid M if and only if M is a binary transversal matroid.

Sketch of proof. Let Ck² denote the graph obtained from a cycle Ck of length k by adding one parallel edge to each edge of Ck. According to [3], the following is true: A matroid M is both binary and transversal if and only if M has no series minor isomorphic to either the 4-element line U2,4, or the graphic matroids M(K4) or M(Ck²) for k ≥ 3. Let N = M \ D/C be a minor of M, and let F = E(N). It is straightforward to express that N is a series minor of M, i.e. that C consists of series elements of M \ D. (For simplicity, we assume no coloops.) We write

∀x ∈ C ∃y ∈ F ∀Z ((Z ⊆ F ∪ C ∧ basis(Z)) → (x ∈ Z ∨ y ∈ Z)).

Now let P be a matroid. We may express whether P is isomorphic to M(Ck²) (regardless of the value of k) as follows:

∃Z (circuit(Z) ∧ ∀x ∈ Z ∃y ∉ Z circuit(x, y) ∧ ∀y ∉ Z ∃!x (x ∈ Z ∧ circuit(x, y))),

where ∃!x Π(x) is a shortcut for ∃x (Π(x) ∧ ∀x, x′ (x = x′ ∨ ¬Π(x) ∨ ¬Π(x′))). The rest of the proof proceeds by combining the previous formulas with the ideas in the proof of Lemma 5.1. (Considering matroid P as a minor of M, we use the predicate mdep from that proof to express circuit in the above formula.) We leave the technical details to the reader. □

Since the proof of Theorem 7.1 is very specific to binary matroids, we doubt that it could be extended to all matroids. Thus we ask:

Problem 7.2. Is the property of being a transversal matroid MSO-definable?
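For completeness, the defining rank function of a transversal matroid can be evaluated directly by bipartite matching; a minimal augmenting-path sketch (illustrative only; the bipartite graph and all names are our own assumptions):

```python
def transversal_rank(adj, X):
    # adj maps each element of E(M) to its neighbours in W; the rank of X is
    # the size of a maximum matching using elements of X only
    match = {}                              # W-vertex -> matched element
    def augment(v, seen):
        for w in adj.get(v, ()):
            if w not in seen:
                seen.add(w)
                if w not in match or augment(match[w], seen):
                    match[w] = v
                    return True
        return False
    return sum(augment(v, set()) for v in X)

# elements a, b, c; a and b see only w1, while c sees w1 and w2
adj = {"a": {"w1"}, "b": {"w1"}, "c": {"w1", "w2"}}
print(transversal_rank(adj, {"a", "b", "c"}))  # 2
print(transversal_rank(adj, {"a", "b"}))       # 1
```

In this example a and b are parallel (the pair {a, b} is a circuit of rank 1), which is exactly the kind of local structure the formulas in the proof of Theorem 7.1 speak about.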
Acknowledgement

I would like to thank Prof. Geoff Whittle from Victoria University for introducing me to the beauties of structural matroid theory, and Prof. Rod Downey for pointing my research towards the parametrized complexity of matroid problems. Moreover, I am grateful to the NZ Marsden Fund and the Victoria University of Wellington for supporting my stay in New Zealand.
References

1. H.L. Bodlaender, A Linear Time Algorithm for Finding Tree-Decompositions of Small Treewidth, SIAM J. Computing 25 (1996), 1305–1317.
2. B. Courcelle, The Monadic Second-Order Logic of Graphs I. Recognizable Sets of Finite Graphs, Information and Computation 85 (1990), 12–75.
3. J. de Sousa, D.J.A. Welsh, A Characterisation of Binary Transversal Matroids, J. Math. Anal. Appl. 40 (1972), 55–59.
4. R.G. Downey, M.R. Fellows, Parametrized Complexity, Springer-Verlag, 1999.
5. J.F. Geelen, A.H.M. Gerards, N. Robertson, G.P. Whittle, On the Excluded Minors for the Matroids of Branch-Width k, J. Combin. Theory Ser. B, to appear (2003).
6. J.F. Geelen, A.H.M. Gerards, G.P. Whittle, Branch-Width and Well-Quasi-Ordering in Matroids and Graphs, J. Combin. Theory Ser. B 84 (2002), 270–290.
7. P. Hliněný, Branch-Width, Parse Trees, and Monadic Second-Order Logic for Matroids (Extended Abstract), In: STACS 2003, Lecture Notes in Computer Science 2607, Springer-Verlag (2003), 319–330.
8. P. Hliněný, Branch-Width, Parse Trees, and Monadic Second-Order Logic for Matroids, submitted, 2002.
9. P. Hliněný, A Parametrized Algorithm for Matroid Branch-Width, submitted, 2002.
10. P. Hliněný, The Tutte Polynomial for Matroids of Bounded Branch-Width, submitted, 2002.
11. P. Hliněný, It is Hard to Recognize Free Spikes, submitted, 2002.
12. J.G. Oxley, Matroid Theory, Oxford University Press, 1992, 1997.
13. N. Robertson, P.D. Seymour, Graph Minors X. Obstructions to Tree-Decomposition, J. Combin. Theory Ser. B 52 (1991), 153–190.
Characterizations of Catalytic Membrane Computing Systems (Extended Abstract)

Oscar H. Ibarra¹, Zhe Dang², Omer Egecioglu¹, and Gaurav Saxena¹

¹ Department of Computer Science, University of California, Santa Barbara, CA 93106, USA. [email protected], Fax: 805-893-8553
² School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164, USA
Abstract. We look at 1-region membrane computing systems which only use rules of the form Ca → Cv, where C is a catalyst, a is a noncatalyst, and v is a (possibly null) string of noncatalysts. There are no rules of the form a → v. Thus, we can think of these systems as “purely” catalytic. We consider two types: (1) when the initial configuration contains only one catalyst, and (2) when the initial configuration contains multiple (not necessarily distinct) catalysts. We show that systems of the first type are equivalent to communication-free Petri nets, which are also equivalent to commutative context-free grammars. They define precisely the semilinear sets. This partially answers an open question in [19]. Systems of the second type define exactly the recursively enumerable sets of tuples (i.e., Turing machine computable). We also study an extended model where the rules are of the form q : (p, Ca → Cv) (where q and p are states), i.e., the application of the rules is guided by a finite-state control. For this generalized model, type (1) as well as type (2) with some restriction correspond to vector addition systems. Keywords: membrane computing, catalytic system, semilinear set, vector addition system, reachability problem.
1
Introduction
In recent years, there has been a burst of research in the area of membrane computing [16], which identifies an unconventional computing model (namely a P system) from natural phenomena of cell evolutions and chemical reactions [2]. Due to the built-in nature of maximal parallelism inherent in the model, P systems have a great potential for implementing massively concurrent systems in an efficient way, once future biotechnology (or silicon technology) gives way to a practical bio-realization (or a chip-realization). In this sense, it is important to study the computing power of the model.
This research was supported in part by NSF Grants IIS-0101134 and CCR02-08595.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 480–489, 2003.
© Springer-Verlag Berlin Heidelberg 2003
Two fundamental questions one can ask of any computing device (such as a Turing machine) are: (1) What kinds of restrictions/variations can be placed on the device without reducing its computing power? (2) What kinds of restrictions/variations can be placed on the device which will reduce its computing power? For Turing machines, the answer to (1) is that Turing machines (as well as variations like multitape, nondeterministic, etc.) accept exactly the recursively enumerable (r.e.) languages. For (2), there is a wide spectrum of well-known results concerning various sub-Turing computing models that have been introduced during the past half century: to list a few, there are finite automata, pushdown automata, linearly bounded automata, various restricted counter automata, etc. Undoubtedly, these sub-Turing models have enhanced our understanding of the computing power of Turing machines and have provided important insights into the analysis and complexity of many problems in various areas of computer science.

We believe that studying the computing power of P systems would lend itself to the discovery of new results if a similar methodology is followed. Indeed, much research work has shown that P systems and their many variants are universal (i.e., equivalent to Turing machines) [4,16,17,3,6,8,19] (surveys are found in [12,18]). However, there is little work addressing the sub-Turing computing power of restricted P systems. To this end, we present some new results in this paper, specifically focusing on catalytic P systems.

A P system S consists of a finite number of membranes, each of which contains a multiset of objects (symbols). The membranes are organized as a Venn diagram or a tree structure where one membrane may contain zero or more membranes. The dynamics of S is governed by a set of rules associated with each membrane. Each rule specifies how objects evolve and move into neighboring membranes.
The rule set can also be associated with priority: a lower priority rule does not apply if one with a higher priority is applicable. A precise definition of S can be found in [16]. Since, from a recent result in [19], P systems with one membrane (i.e., 1-region P systems) and without priority are already able to simulate two counter machines and hence are universal [14], for the purposes of this paper we focus on catalytic 1-region P systems, or simply catalytic systems (CS’s) [16,19]. A CS S operates on two types of symbols: catalytic symbols called catalysts (denoted by capital letters C, D, etc.) and noncatalytic symbols called noncatalysts (denoted by lower case letters a, b, c, d, etc.). An evolution rule in S is of the form Ca → Cv, where C is a catalyst, a is a noncatalyst, and v is a (possibly null) string (an obvious representation of a multiset) of noncatalysts. A CS S is specified by a finite set of rules together with an initial multiset (configuration) w0, which is a string of catalysts and noncatalysts. As with the standard semantics of P systems [16], each evolution step of S is a result of applying all the rules in S in a maximally parallel manner. More precisely, starting from the initial configuration w0, the system goes through a sequence of configurations, where each configuration is derived from the directly preceding configuration in one step by the application of a subset of rules, which are chosen nondeterministically. Note that a rule Ca → Cv is applicable if there is a C and an a in the preceding configuration. The result of applying this rule is the replacement of a by v. If there is another occurrence of C and another occurrence of a, then the same rule or another rule with Ca on the left hand side can be applied. We require that the chosen subset of rules to apply must be
maximally parallel in the sense that no other applicable rule can be added to the subset. A configuration w is reachable if it appears in some execution sequence; w is halting if none of the rules is applicable. The set of all reachable configurations is denoted by R(S). The set of all halting reachable configurations (which is a subset of R(S)) is denoted by Rh(S).

We show that CS’s whose initial configuration contains only one catalyst are equivalent to communication-free Petri nets, which are also equivalent to commutative context-free grammars [5,11]. They define precisely the semilinear sets. Hence R(S) and Rh(S) are semilinear. This partially answers an open problem in [19], where it was shown that when the initial configuration contains six catalysts, S is universal; [19] raised the question of what is the optimal number of catalysts for universality. Our result shows that one catalyst is not enough. We also study an extended model where the rules are of the form q : (p, Ca → Cv) (where q and p are states), i.e., the application of the rules is guided by a finite-state control. For this generalized model, systems with one catalyst in the initial configuration, as well as systems with multiple catalysts in the initial configuration but with some restriction, correspond to vector addition systems.

We conclude this section by recalling the definitions of semilinear sets and Parikh maps [15]. Let N be the set of nonnegative integers and k be a positive integer. A set S ⊆ N^k is a linear set if there exist vectors v0, v1, . . . , vt in N^k such that S = {v | v = v0 + a1v1 + . . . + atvt, ai ∈ N}. The vectors v0 (referred to as the constant vector) and v1, v2, . . . , vt (referred to as the periods) are called the generators of the linear set S. A set S ⊆ N^k is semilinear if it is a finite union of linear sets. The empty set is a trivial (semi)linear set, where the set of generators is empty.
Every finite subset of N^k is semilinear – it is a finite union of linear sets whose generators are constant vectors. Clearly, semilinear sets are closed under union and projection. It is also known that semilinear sets are closed under intersection and complementation. Let Σ = {a1, a2, . . . , an} be an alphabet. For each string w in Σ*, define the Parikh map of w to be ψ(w) = (|w|a1, |w|a2, . . . , |w|an), where |w|ai is the number of occurrences of ai in w. For a language (set of strings) L ⊆ Σ*, the Parikh map of L is ψ(L) = {ψ(w) | w ∈ L}.
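The Parikh map and linear sets recalled above are straightforward to experiment with; a small sketch (the function names and the bounded enumeration are our own):

```python
from itertools import product

def parikh(w, alphabet):
    # the Parikh map: count the occurrences of each alphabet symbol in w
    return tuple(w.count(a) for a in alphabet)

def linear_set(v0, periods, bound):
    # enumerate v0 + a1*v1 + ... + at*vt with every coefficient ai <= bound
    pts = set()
    for coeffs in product(range(bound + 1), repeat=len(periods)):
        v = list(v0)
        for c, p in zip(coeffs, periods):
            v = [x + c * y for x, y in zip(v, p)]
        pts.add(tuple(v))
    return pts

print(parikh("abba", "ab"))                      # (2, 2)
print(sorted(linear_set((1, 0), [(2, 1)], 3)))   # [(1, 0), (3, 1), (5, 2), (7, 3)]
```

A full linear set is infinite; the `bound` parameter simply truncates the enumeration of coefficient vectors.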
2
1-Region Catalytic Systems
In this section, we study 1-region membrane computing systems which use only rules of the form Ca → Cv, where C is a catalyst, a is a noncatalyst, and v is a (possibly null) string of noncatalysts. Note that we do not allow rules of the form a → v as in a P System. Thus, we could think of these systems as “purely” catalytic. As defined earlier, we denote such a system by CS. Let S be a CS and w be an initial configuration (string) representing a multiset of catalysts and noncatalysts. A configuration x is a reachable configuration if S can reach x starting from the initial configuration w. Call x a halting configuration if no rule is applicable on x. Unless otherwise specified, “reachable configuration” will mean any reachable configuration, halting or not. Note that a non-halting reachable configuration x is an intermediate configuration in a possibly infinite computation. We denote by R(S)
the set of Parikh maps of reachable configurations with respect to noncatalysts only. Since catalysts do not change in a computation, we do not include them in the Parikh map. Also, for convenience, when we talk about configurations, we sometimes do not include the catalysts. R(S) is called the reachability set of S. Rh(S) will denote the set of all halting reachable configurations.

2.1 The Initial Configuration Has Only One Catalyst

In this subsection, we assume that the initial configuration of the CS has only one catalyst C. A noncatalyst a is evolutionary if there is a rule in the system of the form Ca → Cv; otherwise, a is non-evolutionary. Call a CS simple if each rule Ca → Cv has at most one evolutionary noncatalyst in v. Our first result shows that semilinear sets and simple CS’s are intimately related.

Theorem 1. 1. Let Q ⊆ N^k. If Q is semilinear, then there is a simple CS S such that Q is definable by S, i.e., Q is the projection of Rh(S) on k coordinates. 2. Let S be a simple CS. Then Rh(S) and R(S) are semilinear.

Later, in Section 4, we will see that, in fact, the above theorem holds for any CS whose initial configuration has only one catalyst. Suppose that we extend the model of a CS so that the rules are now of the form q : (p, Ca → Cv), i.e., the application of the rules is guided by a finite-state control. The rule means that if the system is in state q, application of Ca → Cv will land the system in state p. We call this system a CS with states, or CSS. In addition, we allow the rules to be prioritized, i.e., there is a partial order on the rules: a rule r cannot be applied if some applicable rule r′ has higher priority than r. We refer to such a system as a CSSP. For both systems, the computation starts at (q0, w), where q0 is a designated start state, and w is the initial configuration consisting of catalyst C and noncatalysts. In Section 4, we will see that a CSS can define only a recursive set of tuples.
In contrast, the following result shows that a CSSP can simulate a Turing machine.

Theorem 2. Let S be a CSSP with one catalyst and two noncatalysts. Then S can simulate a Turing machine.

Directly from Theorem 2, we have:

Corollary 1. Let S be a CSSP with one catalyst and two noncatalysts. Then R(S) ⊆ N² need not be a semilinear set.

We will see later that, in contrast to the above result, when the rules are not prioritized, i.e., we have a CSS S with one catalyst and two noncatalysts, R(S) is semilinear.

2.2 The Initial Configuration Has Multiple Catalysts

In this subsection, we assume that the initial configuration of the CS can have multiple catalysts.
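To make the maximally parallel semantics with several catalysts concrete, here is a small single-step simulator (a sketch under our own naming, not from the paper): every occurrence of a catalyst serves at most one rule instance per step, rules are chosen nondeterministically until none is applicable, and the products of a step only become available afterwards.

```python
from collections import Counter
import random

def step(config, rules, rng):
    # one maximally parallel step of a catalytic system; a rule is a triple
    # (C, a, v) standing for Ca -> Cv
    cfg = Counter(config)
    free = Counter({C: cfg[C] for (C, _, _) in rules})   # catalyst availability
    produced = Counter()
    while True:
        applicable = [(C, a, v) for (C, a, v) in rules
                      if free[C] > 0 and cfg[a] > 0]
        if not applicable:
            break
        C, a, v = rng.choice(applicable)                 # nondeterministic choice
        cfg[a] -= 1                                      # consume the noncatalyst
        free[C] -= 1                                     # this catalyst is now busy
        produced.update(v)                               # products appear later
    cfg.update(produced)
    return {s: n for s, n in cfg.items() if n > 0}

# two catalysts: Ca -> Cbb doubles a into two b's, Db -> D erases one b
rules = [("C", "a", "bb"), ("D", "b", "")]
print(step("CDab", rules, random.Random(0)))  # {'C': 1, 'D': 1, 'b': 2}
```

In this example the outcome is independent of the nondeterministic choices, since the two catalysts work on disjoint noncatalysts within one step.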
In general, we say that a noncatalyst is k-bounded if it appears at most k times in any reachable configuration. It is bounded if it is k-bounded for some k. Consider a CSSP whose initial configuration has multiple catalysts. Assume that except for one noncatalyst, all other noncatalysts are bounded or make at most r (for some fixed r) alternations between nondecreasing and nonincreasing multiplicity in any computation. Call this a reversal-bounded CSSP. Corollary 2. If S is a reversal-bounded CSSP, then Rh (S) and R(S) are semilinear. Without the reversal-bounded restriction, a CSSP can simulate a TM. In fact, a CS (with multiple catalysts in its initial configuration) can simulate a TM. It was shown in [19] that a CS augmented with noncooperating rules of the form a → v, where a is a noncatalyst and v is a (possibly null) string of noncatalysts is universal in the sense that such an augmented system with 6 catalysts can define any recursively enumerable set of tuples. A close analysis of the proof in [19] shows that all the rules can be made purely catalytic (i.e., of the form Ca → Cv) using at most 8 catalysts. Actually, this number 8 can be improved further using the newest results in [7]: Corollary 3. A CS with 7 catalysts can define any recursively enumerable set of tuples. There is another restriction on a CSSP S that makes it define only a semilinear set. Let T be a sequence of configurations corresponding to some computation of S starting from a given initial configuration w (which contains multiple catalysts). A noncatalyst a is positive on T if the following holds: if a occurs in the initial configuration or does not occur in the initial configuration but later appears as a result of some catalytic rule, then the number of occurrences (multiplicity) of a in any configuration after the first time it appears is at least 1. 
(There is no bound on the number of times the multiplicity of a alternates between nondecreasing and nonincreasing, as long as it stays at least 1.) We say that a is negative on T if it is not positive on T, i.e., the number of occurrences of a in configurations in T can be zero. Any sequence T of configurations for which every noncatalyst is bounded or positive is called a positive computation.

Corollary 4. Any semilinear set is definable by a CSSP where every computation path is positive.

Conversely, we have:

Corollary 5. Let S be a CSSP. Suppose that every computation path of S is positive. Then Rh(S) and R(S) are semilinear.

The previous corollary can be strengthened further.

Corollary 6. Let S be a CSSP. Suppose we allow one (and only one) noncatalyst, say a, to be negative. This means that a configuration with a positive occurrence (multiplicity) of a can lead to a configuration with no occurrence of a. Suppose that every computation path of S is positive, except for a. Then Rh(S) and R(S) are semilinear.
3
Characterizations in Terms of Vector Addition Systems
An n-dimensional vector addition system (VAS) is a pair G = ⟨x, W⟩, where x ∈ N^n is called the start point (or start vector) and W is a finite set of vectors in Z^n, where Z is the set of all integers (positive, negative, zero). The reachability set of the VAS ⟨x, W⟩ is the set R(G) = {z | for some j, z = x + v1 + ... + vj, where, for all 1 ≤ i ≤ j, each vi ∈ W and x + v1 + ... + vi ≥ 0}. The halting reachability set is Rh(G) = {z | z ∈ R(G), z + v ≱ 0 for every v in W}. An n-dimensional vector addition system with states (VASS) is a VAS ⟨x, W⟩ together with a finite set T of transitions of the form p → (q, v), where q and p are states and v is in W. The meaning is that such a transition can be applied at point y in state p and yields the point y + v in state q, provided that y + v ≥ 0. The VASS is specified by G = ⟨x, T, p0⟩, where p0 is the starting state. The reachability problem for a VASS (respectively, VAS) G is to determine, given a vector y, whether y is in R(G). The equivalence problem is to determine, given two VASS (respectively, VAS) G and G′, whether R(G) = R(G′). Similarly, one can define the reachability problem and equivalence problem for halting configurations. We summarize the following known results concerning VAS and VASS [20,9,1,10,13]:

Theorem 3.
1. Let G be an n-dimensional VASS. We can effectively construct an (n + 3)-dimensional VAS G′ that simulates G.
2. If G is a 2-dimensional VASS, then R(G) is an effectively computable semilinear set.
3. There is a 3-dimensional VASS G such that R(G) is not semilinear.
4. If G is a 5-dimensional VAS, then R(G) is an effectively computable semilinear set.
5. There is a 6-dimensional VAS G such that R(G) is not semilinear.
6. The reachability problem for VASS (and hence also for VAS) is decidable.
7. The equivalence problem for VAS (and hence also for VASS) is undecidable.

Clearly, it follows from part 6 of the theorem above that the halting reachability problem for VASS (respectively, VAS) is decidable.
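Although R(G) may be infinite, a truncated breadth-first search illustrates the definition of the VAS reachability set directly (our own sketch; the cap and the example system are assumptions):

```python
from collections import deque

def reachable(start, W, cap):
    # breadth-first enumeration of the VAS reachability set R(G), truncated
    # to vectors whose coordinates are all <= cap (R(G) itself may be infinite)
    seen = {start}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for v in W:
            y = tuple(a + b for a, b in zip(x, v))
            if min(y) >= 0 and max(y) <= cap and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen

# a 2-dimensional VAS: start (1, 0), addition vectors (-1, 2) and (1, -1)
R = reachable((1, 0), [(-1, 2), (1, -1)], cap=2)
print(sorted(R))  # [(0, 2), (1, 0), (1, 1), (1, 2), (2, 0), (2, 1)]
```

Truncation only removes points, so everything enumerated is genuinely in R(G); deciding membership in the full set is exactly the (decidable, but hard) reachability problem of Theorem 3.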
3.1 The Initial Configuration Has Only One Catalyst

We first consider CSS (i.e., CS with states) whose initial configuration has only one catalyst. There is an example of a 3-dimensional VASS G in [10] such that R(G) is not semilinear: G = ⟨x, T, p⟩, where x = (0, 0, 1), and the transitions in T are:

p → (p, (0, 1, −1))
p → (q, (0, 0, 0))
q → (q, (0, −1, 2))
q → (p, (1, 0, 0))

Thus, there are only two states p and q. The following was shown in [10]:
1. (x1, x2, x3) is reachable in state p if and only if 0 < x2 + x3 ≤ 2^{x1}.
2. (x1, x2, x3) is reachable in state q if and only if 0 < 2x2 + x3 ≤ 2^{x1+1}.
Hence R(G) is not semilinear. From this example, we can show:

Corollary 7. There is a CSS S with 1 catalyst, 3 noncatalysts, and two states such that R(S) is not semilinear.
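The 3-dimensional VASS above is easy to simulate, and a bounded exploration lets one spot-check the stated characterization for state p (a sketch with assumed names; truncation only shrinks the explored set, so every explored pair is genuinely reachable):

```python
from collections import deque

# the 3-dimensional VASS of [10]: transitions as (state, vector, next_state)
T = [("p", (0, 1, -1), "p"), ("p", (0, 0, 0), "q"),
     ("q", (0, -1, 2), "q"), ("q", (1, 0, 0), "p")]

def explore(cap):
    # truncated breadth-first search from the start configuration (p, (0,0,1))
    seen = {("p", (0, 0, 1))}
    queue = deque(seen)
    while queue:
        s, x = queue.popleft()
        for (s0, v, s1) in T:
            if s0 != s:
                continue
            y = tuple(a + b for a, b in zip(x, v))
            if min(y) >= 0 and max(y) <= cap and (s1, y) not in seen:
                seen.add((s1, y))
                queue.append((s1, y))
    return seen

# spot-check the characterization for state p: 0 < x2 + x3 <= 2**x1
ok = all(0 < x2 + x3 <= 2 ** x1
         for (s, (x1, x2, x3)) in explore(8) if s == "p")
print(ok)  # True
```

The exponential bound 2^{x1} is exactly what rules out semilinearity of R(G).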
In fact, as shown below, each CSS corresponds to a VASS and vice versa.

Lemma 1. 1. Let S be a CSS. We can effectively construct a VASS G such that R(G) = R(S). 2. Every VASS can be simulated by a CSS.

From Theorem 3 part 6, we have:

Corollary 8. The reachability problem for CSS is decidable.

Clearly a reachable configuration is halting if no rule is applicable on the configuration. It follows from the above result that the halting reachability problem (i.e., determining if a configuration is in Rh(S)) is also decidable. A VASS is communication-free if for each transition q → (p, (j1, ..., jk)) in the VASS, at most one ji is negative, and if negative its value is −1. From Lemma 1 and the observation that the VASS constructed for the proof of Lemma 1 can be made communication-free, we have:

Theorem 4. The following systems are equivalent in the sense that each system can simulate the others: CSS, VASS, communication-free VASS.

Now consider a communication-free VASS without states, i.e., a VAS where in every transition at most one component is negative, and if negative, its value is −1. Call this a communication-free VAS. Communication-free VAS’s are equivalent to communication-free Petri nets, which are also equivalent to commutative context-free grammars [5,11]. It is known that they have effectively computable semilinear reachability sets [5]. It turns out that communication-free VAS’s characterize CS’s.

Theorem 5. Every communication-free VAS G can be simulated by a CS, and vice versa.

Corollary 9. If S is a CS, then R(S) and Rh(S) are effectively computable semilinear sets.

The following is obvious, as we can easily construct a VAS from the specification of the linear set.

Corollary 10. If Q is a linear set, then we can effectively construct a communication-free VAS G such that R(G) = Q. Hence, every semilinear set is a union of the reachability sets of communication-free VAS’s.
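One direction of Theorem 5 amounts to reading each rule Ca → Cv as a vector over the noncatalysts: −1 for the consumed symbol a and the multiplicities of v elsewhere, which is communication-free by construction. A sketch (our own encoding, not the paper's construction; with a single catalyst, at most one rule instance fires per maximally parallel step, so the sequential VAS semantics matches):

```python
def cs_to_vas(rules, alphabet):
    # each rule Ca -> Cv becomes a vector over the noncatalysts: -1 for the
    # consumed symbol a plus the multiplicities of v (the catalyst is implicit)
    vectors = []
    for (a, v) in rules:
        vec = tuple(v.count(s) - (1 if s == a else 0) for s in alphabet)
        vectors.append(vec)
    return vectors

# rules Ca -> Cbb and Cb -> C over the noncatalysts {a, b}
print(cs_to_vas([("a", "bb"), ("b", "")], "ab"))  # [(-1, 2), (0, -1)]
```

Each produced vector has at most one negative entry, and that entry is −1, matching the definition of a communication-free VAS.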
From the NP-completeness of the reachability problem for communication-free Petri nets (which are equivalent to commutative context-free grammars) [11,5], we have:

Corollary 11. The reachability problem for CS is NP-complete.

We have already seen that a CSS S with prioritized rules (CSSP) and with two noncatalysts can simulate a TM (Theorem 2); hence R(S) need not be semilinear. Interestingly, if we drop the requirement that the rules are prioritized, such a system has a semilinear reachability set.

Corollary 12. Let S be a CSS with two noncatalysts. Then R(S) and Rh(S) are effectively computable semilinear sets.

Open Problem: Suppose S has only rules of the form Ca → Cv and its initial configuration has exactly one catalyst. Suppose the rules are prioritized. How is R(S) related to VASS?
Characterizations of Catalytic Membrane Computing Systems
3.2 The Initial Configuration Has Multiple Catalysts

We have seen that a CS with multiple catalysts can simulate a TM. Consider the following restricted version: instead of "maximal parallelism" in the application of the rules at each step of the computation, we only allow "limited parallelism" by organizing the rules to apply in one step in the following form (called a matrix rule):

    (D1 b1 → D1 v1, ..., Ds bs → Ds vs)

where the Di's are catalysts (need not be distinct), the bi's are noncatalysts (need not be distinct), the vi's are strings of noncatalysts (need not be distinct), and s is the degree of the matrix. The matrix rules in a given system may have different degrees. The meaning of a matrix rule is that it is applicable if and only if each component of the matrix is applicable. The system halts if no matrix rule is applicable. Call this system a matrix CS, or MCS for short. We shall also consider MCS with states (called MCSS), where now the matrix rules have states and are of the form:

    p : (q, (D1 b1 → D1 v1, ..., Ds bs → Ds vs))

Now the matrix is applicable if the system is in state p and all the matrix components are applicable. After the application of the matrix, the system enters state q.

Lemma 2. Given a VAS (VASS) G, we can effectively construct an MCS (MCSS) S such that R(S) = R(G) × {1}.

Lemma 3. Given an MCSS S over n noncatalysts, we can effectively construct an (n + 1)-dimensional VASS G such that R(S) = projn(R(G) ∩ (N^n × {1})).

The VASS in Lemma 3 can be converted to a VAS. It was shown in [10] that if G is an n-dimensional VASS with states q1, ..., qk, then we can construct an (n + 3)-dimensional VAS G′ with the following property: if the VASS G is at (i1, ..., in) in state qj, then the VAS G′ will be at (i1, ..., in, aj, bj, 0), where aj = j for j = 1 to k, bk = k + 1, and bj = bj+1 + k + 1 for j = 1 to k − 1.
The last three coordinates keep track of the state changes, and G′ has additional transitions for updating these coordinates. However, these additional transitions only modify the last three coordinates. Define the finite set of tuples Fk = {(j, (k − j + 1)(k + 1)) | j = 1, ..., k} (note that k is the number of states of G). Then we have:

Corollary 13. Given an MCSS S over n noncatalysts, we can effectively construct an (n + 4)-dimensional VAS G′ such that R(S) = projn(R(G′) ∩ (N^n × {1} × Fk × {0})), for some effectively computable k (which depends only on the number of states and number of rules in G).

From Theorem 4, Lemmas 2 and 3, and the above corollary, we have:

Theorem 6. The following systems are equivalent in the sense that each system can simulate the others: CSS, MCS, MCSS, VAS, VASS, communication-free VASS.

Corollary 14. It is decidable to determine, given an MCSS S and a configuration α, whether α is a reachable configuration (halting or not).
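As a concrete illustration of the matrix rules defined above (a sketch under our own assumptions, not code from the paper), a configuration can be represented as a multiset of noncatalyst symbols and a matrix rule as a list of components (b_i, v_i); the catalysts D_i are left implicit since they are never consumed. The components are applied in sequence here, which is one reasonable reading of the condition that every component must be applicable.

```python
from collections import Counter

def apply_matrix_rule(config, rule):
    """Apply a matrix rule to a multiset configuration (a Counter over
    noncatalyst symbols).  `rule` is a list of components (b_i, v_i): each
    consumes one copy of the noncatalyst b_i and produces the noncatalysts
    in the string v_i.  Returns the new configuration, or None if some
    component is not applicable (so the whole matrix is not applicable)."""
    new = Counter(config)
    for b, v in rule:
        if new[b] == 0:          # component not applicable => matrix fails
            return None
        new[b] -= 1              # consume one copy of b_i
        new.update(v)            # produce the symbols of v_i
    return +new                  # drop zero counts
```

Whether the components consume symbols simultaneously or sequentially does not matter when the b_i's are distinct; the sequential reading above is only our modeling choice.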
Oscar H. Ibarra et al.
Corollary 15. It is decidable to determine, given an MCSS S and a configuration α, whether α is a halting reachable configuration.

From Lemma 2 and Theorem 3 part 7, we have:

Corollary 16. The equivalence and containment problems for MCSS are undecidable.
4 Closure Properties
Let S be a catalytic system of any type introduced in the previous sections. For the purposes of investigating closure properties, we will say that S defines a set Q ⊆ N^k (or Q is definable by S) if Rh(S) = Q × {0}^r for some given r. Thus, the last r coordinates of the (k + r)-tuples in Rh(S) are zero, and the first k components are exactly the tuples in Q. Fix the noncatalysts to be a1, a2, a3, .... Thus, any system S has noncatalysts a1, ..., at for some t. We say that a class of catalytic systems of a given type is closed under:

1. Intersection if given two systems S1 and S2, which define sets Q1 ⊆ N^k and Q2 ⊆ N^k, respectively, there exists a system S′ which defines Q = Q1 ∩ Q2.
2. Union if given two systems S1 and S2, which define sets Q1 ⊆ N^k and Q2 ⊆ N^k, respectively, there exists a system S′ which defines Q = Q1 ∪ Q2.
3. Complementation if given a system S which defines a set Q ⊆ N^k, there exists a system S′ which defines Q′ = N^k − Q.
4. Concatenation if given two systems S1 and S2, which define sets Q1 ⊆ N^k and Q2 ⊆ N^k, respectively, there exists a system S′ which defines Q = Q1 Q2, where Q1 Q2 = {(i1 + j1, ..., ik + jk) | (i1, ..., ik) ∈ Q1, (j1, ..., jk) ∈ Q2}.
5. Kleene + if given a system S which defines a set Q ⊆ N^k, there exists a system S′ which defines Q′ = ∪_{n≥1} Q^n.
6. Kleene * if given a system S which defines a set Q ⊆ N^k, there exists a system S′ which defines Q′ = ∪_{n≥0} Q^n.

Other unary and binary operations can be defined similarly.

Theorem 7. The class CS with only one catalyst in the initial configuration is closed under intersection, union, complementation, concatenation, and Kleene+ (or Kleene∗).

Investigation of closure properties of other types of catalytic systems is a subject for future research.
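The concatenation operation defined above is just componentwise addition of one tuple from each set; a minimal sketch:

```python
from itertools import product

def concat(Q1, Q2):
    """Concatenation of two subsets of N^k in the sense of the text:
    the set of componentwise sums of one tuple from each set."""
    return {tuple(i + j for i, j in zip(u, v)) for u, v in product(Q1, Q2)}

Q1 = {(1, 0), (0, 2)}
Q2 = {(0, 1)}
assert concat(Q1, Q2) == {(1, 1), (0, 3)}
```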
Acknowledgment We would like to thank Dung Huynh and Hsu-Chun Yen for their comments and for pointing out some of the references concerning vector addition systems. We also appreciate the comments and encouragement of Gheorghe Paun and Petr Sosik on this work.
References

1. H. G. Baker. Rabin's proof of the undecidability of the reachability set inclusion problem for vector addition systems. C.S.C. Memo 79, Project MAC, MIT, 1973.
2. G. Berry and G. Boudol. The chemical abstract machine. In POPL'90, pages 81–94. ACM Press, 1990.
3. P. Bottoni, C. Martin-Vide, Gh. Paun, and G. Rozenberg. Membrane systems with promoters/inhibitors. Acta Informatica, 38(10):695–720, 2002.
4. J. Dassow and Gh. Paun. On the power of membrane computing. Journal of Universal Computer Science, 5(2):33–49, 1999.
5. J. Esparza. Petri nets, commutative context-free grammars, and basic parallel processes. In FCT'95, volume 965 of LNCS, pages 221–232. Springer, 1995.
6. R. Freund and M. Oswald. P systems with activated/prohibited membrane channels. In WMC-CdeA'02, volume 2597 of LNCS, pages 261–269. Springer, 2003.
7. R. Freund, M. Oswald, and P. Sosik. Reducing the number of catalysts needed in computationally universal P systems without priorities. In the 5th Descriptional Complexity of Formal Systems Workshop (DCFS), July 12–14, 2003, Budapest, Hungary.
8. P. Frisco and H. Jan Hoogeboom. Simulating counter automata by P systems with symport/antiport. In WMC-CdeA'02, volume 2597 of LNCS, pages 288–301. Springer, 2003.
9. M. H. Hack. The equality problem for vector addition systems is undecidable. C.S.C. Memo 121, Project MAC, MIT, 1975.
10. J. Hopcroft and J.-J. Pansiot. On the reachability problem for 5-dimensional vector addition systems. TCS, 8(2):135–159, 1979.
11. D. T. Huynh. Commutative grammars: The complexity of uniform word problems. Information and Control, 57:21–39, 1983.
12. C. Martin-Vide and Gh. Paun. Computing with membranes (P systems): Universality results. In MCU, volume 2055 of LNCS, pages 82–101. Springer, 2001.
13. E. Mayr. Persistence of vector replacement systems is decidable. Acta Informatica, 15:309–318, 1981.
14. M. Minsky. Recursive unsolvability of Post's problem of Tag and other topics in the theory of Turing machines. Ann. of Math., 74:437–455, 1961.
15. R. Parikh. On context-free languages. Journal of the ACM, 13:570–581, 1966.
16. Gh. Paun. Computing with membranes. JCSS, 61(1):108–143, 2000.
17. Gh. Paun. Computing with membranes (P systems): A variant. International Journal of Foundations of Computer Science, 11(1):167–181, 2000.
18. Gh. Paun and G. Rozenberg. A guide to membrane computing. TCS, 287(1):73–100, 2002.
19. P. Sosik and R. Freund. P systems without priorities are computationally universal. In WMC-CdeA'02, volume 2597 of LNCS, pages 400–409. Springer, 2003.
20. J. van Leeuwen. A partial solution to the reachability problem for vector addition systems. In STOC'74, pages 303–309.
Augmenting Local Edge-Connectivity between Vertices and Vertex Subsets in Undirected Graphs

Toshimasa Ishii and Masayuki Hagiwara

Department of Information and Computer Sciences, Toyohashi University of Technology, Aichi 441-8580, Japan
{ishii,masa}@algo.ics.tut.ac.jp
Abstract. Given an undirected multigraph G = (V, E), a family W of sets W ⊆ V of vertices (areas), and a requirement function rW : W → Z+ (where Z+ is the set of positive integers), we consider the problem of augmenting G by the smallest number of new edges so that the resulting graph has at least rW(W) edge-disjoint paths between v and W for every pair of a vertex v ∈ V and an area W ∈ W. So far this problem was shown to be NP-hard in the uniform case of rW(W) = 1 for each W ∈ W, and polynomially solvable in the uniform case of rW(W) = r ≥ 2 for each W ∈ W. In this paper, we show that the problem can be solved in O(m + p r∗ n^5 log(n/r∗)) time, even in the general case of rW(W) ≥ 3 for each W ∈ W, where n = |V|, m = |{{u, v}|(u, v) ∈ E}|, p = |W|, and r∗ = max{rW(W) | W ∈ W}. Moreover, we give an approximation algorithm which finds a solution with at most one surplus edge over the optimal value, in the same time complexity, in the general case of rW(W) ≥ 2 for each W ∈ W.
1 Introduction
In a communication network, graph connectivity is a fundamental measure of its robustness. The problem of achieving high connectivity between every (or specified) two vertices has been extensively studied as the network design problem and so on (see [2,12] for surveys). Most of those studies have dealt with connectivity between two vertices in a graph. However, in many real-world networks, the connectivity between every two vertices is not necessarily required. For example, in a multimedia network, for a set W of vertices offering a certain service i, such as mirror servers, a user at a vertex v can use service i by communicating with one vertex w ∈ W through a path between w and v. In such networks, it is desirable that the network have some pairwise disjoint paths from the vertex v to at least one of the vertices in W. This means that the measure of reliability is the connectivity between a vertex and a set of vertices rather than that between two vertices.

B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 490–499, 2003. © Springer-Verlag Berlin Heidelberg 2003

From this point of view, H. Ito et al. considered the node-to-area connectivity (NA-connectivity, for
Fig. 1. Illustration of an instance of rW -NA-ECAP. (i) An initial graph G = (V, E) with a family W = {W1 = {v4 , v7 , v11 }, W2 = {v1 , v8 , v9 }, W3 = {v1 , v2 , v10 }} of areas, where a requirement function rW : W → Z + satisfies rW (W1 ) = 2, rW (W2 ) = 3, and rW (W3 ) = 4. (ii) An rW -NA-edge-connected graph obtained from G by adding a set of edges drawn as broken lines; there are at least rW (W ) edge-disjoint paths between every pair of a vertex v ∈ V and an area W ∈ W.
short) as a concept that represents the connectivity between vertices and sets of vertices (areas) in a graph [5,6,7].

In this paper, given a multigraph G = (V, E) with a family W of sets W of vertices (areas) and a requirement function rW : W → Z+, we consider the problem of augmenting G by adding the smallest number of new edges so that the resulting graph has at least rW(W) pairwise edge-disjoint paths between v and W for every pair of a vertex v ∈ V and an area W ∈ W. We call this problem the rW-NA-edge-connectivity augmentation problem (for short, rW-NA-ECAP). Figure 1 gives an instance of rW-NA-ECAP with rW(W1) = 2, rW(W2) = 3, and rW(W3) = 4.

So far, r-NA-ECAP in the uniform case that rW(W) = r holds for every area W ∈ W has been studied, and several algorithms have been developed. It was shown by H. Miwa et al. [9] that 1-NA-ECAP is NP-hard, whereas r-NA-ECAP is polynomially solvable in the case of r = 2 by H. Miwa et al. [9], and in the case of r ≥ 3 by T. Ishii et al. [4]. However, it was still open whether the problem with general requirements rW(W) ≥ 2, W ∈ W, is polynomially solvable or not. The above two algorithms for r-NA-ECAP are based on algorithms for solving the classical edge-connectivity augmentation problem, which augments the edge-connectivity of a graph, but they are essentially different; the former follows the method based on the minimum cut structure by T. Watanabe et al. [13], and the latter follows the so-called 'splitting off' method by A. Frank [1].

In this paper, by extending the approach in [4] and establishing a min-max formula for rW-NA-ECAP, we show that rW-NA-ECAP with general requirements rW(W) ≥ 3 for each W ∈ W can be solved in O(m + p r∗ n^5 log(n/r∗)) time, where n = |V|, m = |{{u, v}|(u, v) ∈ E}|, p = |W|, and r∗ = max{rW(W) | W ∈ W}.
We also give an approximation algorithm for rW -NA-ECAP with general requirements rW (W ) ≥ 2, W ∈ W which delivers a solution with at most one edge over the optimal in the same time complexity. Some of the proofs will be omitted from this extended abstract.
2 Problem Definition
Let G = (V, E) stand for an undirected graph with a set V of vertices and a set E of edges. An edge with end vertices u and v is denoted by (u, v). We denote |V| by n and |{{u, v}|(u, v) ∈ E}| by m. A singleton set {x} may be simply written as x, and "⊂" implies proper inclusion while "⊆" means "⊂" or "=". In G = (V, E), its vertex set V and edge set E may be denoted by V(G) and E(G), respectively. For a subset V′ ⊆ V in G, G[V′] denotes the subgraph induced by V′. For an edge set E′ with E′ ∩ E = ∅, we denote the augmented graph (V, E ∪ E′) by G + E′. For an edge set E′, we denote by V[E′] the set of all end vertices of edges in E′.

An area graph is defined as a graph G = (V, E) with a family W of vertex subsets W ⊆ V which are called areas (see Figure 1). We denote an area graph G with W by (G, W). In the sequel, we may denote (G, W) simply by G if no confusion arises.

For two disjoint subsets X, Y ⊂ V of vertices, we denote by EG(X, Y) the set of edges e = (x, y) such that x ∈ X and y ∈ Y, and denote |EG(X, Y)| by dG(X, Y). A cut is defined as a subset X of V with ∅ ≠ X ≠ V, and the size of a cut X is defined by dG(X, V − X), which may also be written as dG(X). Moreover, we define d(∅) = 0. For two cuts X, Y ⊂ V with X ∩ Y = ∅ in G, we denote by λG(X, Y) the minimum size of cuts which separate X and Y, i.e., λG(X, Y) = min{dG(S) | S ⊇ X, S ⊆ V − Y}. For two cuts X, Y ⊂ V with X ∩ Y ≠ ∅ in G, we define λG(X, Y) = ∞. The edge-connectivity of G, denoted by λ(G), is defined as min_{X⊂V, Y⊂V} λG(X, Y).

For a vertex v ∈ V and a set W ⊆ V of vertices, the node-to-area edge-connectivity (NA-edge-connectivity, for short) between v and W is defined as λG(v, W). Note that λG(v, W) = ∞ holds for v ∈ W. Also note that by Menger's theorem, λG(v, W) ≥ r holds if and only if there exist at least r edge-disjoint paths between v and W.
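By the Menger-type characterization above, λG(v, W) can be computed as a maximum flow from v to a super-sink obtained by contracting the area W, with one unit of capacity per parallel edge. The following is a minimal sketch (our naming and edge-list representation, not the paper's), using a simple augmenting-path search:

```python
from collections import defaultdict, deque

def na_edge_connectivity(edges, v, W):
    """lambda_G(v, W) for a multigraph given as a list of undirected edges
    (u, w): the minimum size of a cut separating v from the area W, computed
    as a max flow from v to a super-sink standing for all of W (Menger).
    Returns float('inf') when v already lies in W."""
    if v in W:
        return float('inf')
    sink = object()                      # super-sink contracting the area W
    cap = defaultdict(int)
    for a, b in edges:
        a = sink if a in W else a
        b = sink if b in W else b
        if a != b:
            cap[(a, b)] += 1             # each parallel edge adds unit capacity
            cap[(b, a)] += 1
    flow = 0
    while True:
        parent = {v: None}               # BFS for an augmenting path
        queue = deque([v])
        while queue and sink not in parent:
            x = queue.popleft()
            for (a, b), c in list(cap.items()):
                if a == x and c > 0 and b not in parent:
                    parent[b] = x
                    queue.append(b)
        if sink not in parent:
            return flow
        node = sink                      # augment by one unit along the path
        while parent[node] is not None:
            cap[(parent[node], node)] -= 1
            cap[(node, parent[node])] += 1
            node = parent[node]
        flow += 1
```

This augmenting-path sketch is far from the O(mn log(n^2/m)) bound quoted later in the paper; it only illustrates the definition.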
For an area graph (G, W) and a function rW : W → Z+ ∪ {0}, we say that (G, W) is rW-NA-edge-connected if λG(v, W) ≥ rW(W) holds for every pair of a vertex v ∈ V and an area W ∈ W. Note that the area graph (G, W) in Figure 1(ii) is rW-NA-edge-connected, where rW(W1) = 2, rW(W2) = 3, and rW(W3) = 4. In this paper, we consider the following problem, called rW-NA-ECAP.

Problem 1. (rW-NA-edge-connectivity augmentation problem, rW-NA-ECAP)
Input: An area graph (G, W) and a requirement function rW : W → Z+.
Output: A set E∗ of new edges with the minimum cardinality such that G + E∗ is rW-NA-edge-connected.
3 Lower Bound on the Optimal Value
For an area graph (G, W) and a fixed function rW : W → Z + , let opt(G, W, rW ) denote the optimal value to rW -NA-ECAP in (G, W), i.e., the minimum size |E ∗ | of a set E ∗ of new edges such that G + E ∗ is rW -NA-edge-connected. In this section, we derive lower bounds on opt(G, W, rW ) to rW -NA-ECAP with (G, W). In the sequel, let W = {W1 , W2 , . . . , Wp }, rW (Wi ) = ri , and r1 ≤ r2 ≤ · · · ≤ rp if no confusion occurs.
A family X = {X1, . . . , Xt} of cuts in G is called a subpartition of V if every two cuts Xi, Xj ∈ X with i ≠ j satisfy Xi ∩ Xj = ∅ and ∪X∈X X ⊆ V holds. For an area graph (G, W) and an area Wi ∈ W, a cut X with X ∩ Wi = ∅ is called type (Ai), and a cut X with X ⊇ Wi is called type (Bi) (note that a cut X of type (Bi) satisfies X ≠ V by the definition of a cut). We easily see the following property.

Lemma 1. An area graph (G, W) is rW-NA-edge-connected if and only if all cuts X ⊂ V of type (Ai) or (Bi) satisfy dG(X) ≥ ri for each area Wi ∈ W.
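The subpartition condition above (pairwise disjoint cuts whose union lies in V) is straightforward to check; a small sketch under our own set-of-sets representation:

```python
def is_subpartition(family, V):
    """Check the subpartition conditions from the text: the cuts in `family`
    are pairwise disjoint and their union is contained in the vertex set V."""
    union = set()
    for X in family:
        if union & X:            # overlaps an earlier cut
            return False
        union |= X
    return union <= V            # union contained in V
```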
Let X be a cut in (G, W). If X is a cut of type (Ai) or (Bi) with dG(X) < ri for some area Wi ∈ W, then it is necessary to add at least ri − dG(X) edges between X and V − X. This follows since if X is of type (Ai) (resp., type (Bi)), then the NA-edge-connectivity between a vertex in X (resp., in V − X) and an area Wi ∈ W with Wi ∩ X = ∅ (resp., Wi ⊆ X) needs to be augmented to at least ri. Here we define αG,W,rW(X) as follows; it indicates the number of necessary edges which join two vertices, one from a cut X and one from the cut V − X (note that r1 ≤ r2 ≤ · · · ≤ rp holds).

Definition 1. For each cut X of type (Aj) or (Bj) for some area Wj, we define iX as the maximum index i such that X is of type (Ai) or (Bi), and define αG,W,rW(X) = max{0, riX − dG(X)}. For any other cut X, we define αG,W,rW(X) = 0.

Lemma 2. For each cut X, it is necessary to add at least αG,W,rW(X) edges between X and V − X.
Let

    α(G, W, rW) = max_X { Σ_{X∈X} αG,W,rW(X) },   (1)
where the maximization is taken over all subpartitions X of V. Then any feasible solution to rW-NA-ECAP with (G, W) must contain, for each cut X with αG,W,rW(X) > 0, at least αG,W,rW(X) edges joining X and V − X. Since adding one edge can contribute to at most two such 'cut deficiencies' in a subpartition of V, we see the following lemma.

Lemma 3. opt(G, W, rW) ≥ α(G, W, rW)/2 holds.
The area graph (G, W) in Figure 1(i) satisfies α(G, W, rW) = 8. We have Σ_{X∈X} αG,W,rW(X) = 8 for the subpartition X = {{v1}, {v2}, {v4}, {v6, v7, v8}, {v9, v11}, {v10}} of V. We remark that there is an area graph (G, W) with opt(G, W, rW) > α(G, W, rW)/2. Figure 2 gives an instance for r = r1 = r2 = r3 = 2. Each cut {vi}, i = 1, 2, 4, is of type (A3) and satisfies r − dG(vi) = 1, and the cut {v3} is of type (A1) and satisfies r − dG(v3) = 1. Then we see α(G, W, rW)/2 = 2. In order to make (G, W) rW-NA-edge-connected by adding two new edges, we must add e = (v1, v2) and e′ = (v3, v4) without loss of generality. However, G + {e, e′} is not rW-NA-edge-connected, since λ_{G+{e,e′}}(v1, W3) = 1. We will show that all such instances can be completely characterized as follows.
Fig. 2. Illustration of an area graph (G, W) with opt(G, W, rW) = α(G, W, rW)/2 + 1, where rW(Wi) = 2 holds for i = 1, 2, 3.
Definition 2. We say that an area graph (G, W) has property (P) if α(G, W, rW) is even and there is a subpartition X = {X1, . . . , Xt} of V with Σ_{X∈X} αG,W,rW(X) = α(G, W, rW) satisfying the following conditions (P1)–(P3):
(P1) Each cut X ∈ X is of type (Ai) for some Wi ∈ W.
(P2) The cut X1 satisfies αG,W,rW(X1) = 1 and X1 ⊂ C1 for some component C1 of G with Xℓ ∩ C1 = ∅ for each ℓ = 2, 3, . . . , t.
(P3) For each ℓ = 2, 3, . . . , t, there is a cut Yℓ of type (Bj) with some Wj ∈ W such that we have Xℓ ∪ X1 ⊆ Yℓ and Σ_{X∈X, X⊂Yℓ} αG,W,rW(X) ≤ (rj + 1) − dG(Yℓ), and that every cut X ∈ X satisfies X ⊂ Yℓ or X ∩ Yℓ = ∅.
Intuitively, the above condition (P3) indicates that for any feasible solution E′, if the number of edges e ∈ E′ incident to Yℓ is equal to Σ_{X∈X, X⊂Yℓ} αG,W,rW(X), then any edge e ∈ E′ must have its end vertex also in V − Yℓ, from dG+E′(Yℓ) ≥ rj. Note that (G, W) in Figure 2 has property (P) because α(G, W, rW) = 4 holds and the subpartition X = {X1 = {v4}, X2 = {v1}, X3 = {v2}, X4 = {v3}} of V satisfies Y2 = C1 ∪ {v1}, Y3 = C1 ∪ {v2}, and Y4 = C1 ∪ {v3} for the component C1 of G containing v4.

Lemma 4. If (G, W) has property (P), then opt(G, W, rW) ≥ α(G, W, rW)/2 + 1 holds.

Proof. Assume by contradiction that (G, W) has property (P) and there is an edge set E∗ with |E∗| = α(G, W, rW)/2 such that G + E∗ is rW-NA-edge-connected (note that α(G, W, rW) is even). Let X = {X1, . . . , Xt} denote a subpartition of V satisfying Σ_{X∈X} αG,W,rW(X) = α(G, W, rW) and the above (P1)–(P3). Since |E∗| = α(G, W, rW)/2 holds, each cut X ∈ X satisfies dG+E∗(X) = riX, and hence dG′(X) = riX − dG(X) = αG,W,rW(X), where G′ = (V, E∗). Therefore, any edge (x′, x″) ∈ E∗ satisfies x′ ∈ X′ and x″ ∈ X″ for some two cuts X′, X″ ∈ X with X′ ≠ X″. From this, there exists a cut Xs ∈ X with s ≠ 1 and EG′(Xs, X1) ≠ ∅. Since (G, W) satisfies property (P), there is a cut Ys of type (Bj) which satisfies (P3), and hence Σ_{v∈Ys} dG′(v) ≤ (rj + 1) − dG(Ys). Since G′[Ys] contains one edge in EG′(Xs, X1), we have dG′(Ys) ≤ (rj − 1) − dG(Ys), which implies that dG+E∗(Ys) = dG(Ys) + dG′(Ys) ≤ rj − 1. Hence a vertex v ∈ V − Ys satisfies λG+E∗(v, Wj) ≤ rj − 1, contradicting that G + E∗ is rW-NA-edge-connected (note that Ys is of type (Bj) and hence we have Wj ⊆ Ys).
In this paper, we prove that rW-NA-ECAP enjoys the following min-max theorem and is polynomially solvable.

Theorem 1. For rW-NA-ECAP with rW(W) ≥ 3 for each area W ∈ W, opt(G, W, rW) = α(G, W, rW)/2 holds if (G, W) does not have property (P), and opt(G, W, rW) = α(G, W, rW)/2 + 1 holds otherwise. Moreover, a solution E∗ with |E∗| = opt(G, W, rW) can be obtained in O(m + p rp n^5 log(n/rp)) time.
Theorem 2. For rW-NA-ECAP with rW(W) ≥ 2 for each area W ∈ W, a solution E∗ with |E∗| ≤ opt(G, W, rW) + 1 can be obtained in O(m + p rp n^5 log(n/rp)) time.
4 Algorithm
Based on the lower bounds in the previous section, we give an algorithm, called rW-NAEC-AUG, which finds a feasible solution E′ to rW-NA-ECAP with |E′| = opt(G, W, rW), for a given area graph (G, W) and a requirement function rW : W → Z+ − {1, 2}. It finds a feasible solution E′ with |E′| = α(G, W, rW)/2 + 1 if (G, W) has property (P), and with |E′| = α(G, W, rW)/2 otherwise.

For a graph H = (V ∪ {s}, E) and a designated vertex s ∉ V, an operation called edge-splitting (at s) is defined as deleting two edges (s, u), (s, v) ∈ E and adding one new edge (u, v). That is, the graph H′ = (V ∪ {s}, (E − {(s, u), (s, v)}) ∪ {(u, v)}) is obtained from such an edge-splitting operation. Then we say that H′ is obtained from H by splitting the pair of edges (s, u) and (s, v). A sequence of splittings is complete if in the resulting graph H′ the vertex s has no remaining neighbor. Conversely, we say that H′ is obtained from H by hooking up an edge (u, v) ∈ E(H − s) at s, if we construct H′ by replacing the edge (u, v) with the two edges (s, u) and (s, v) in H. The edge-splitting operation is known to be a useful tool for solving connectivity augmentation problems [1].

An outline of our algorithm is as follows. We first add a new vertex s and the minimum number of new edges between s and the area graph (G, W) to construct an rW-NA-edge-connected graph H, and we then convert H into an rW-NA-edge-connected graph without s by splitting off the edges incident to s and eliminating s. More precisely, we describe the algorithm below, and introduce three theorems necessary to justify the algorithm, whose proofs are omitted due to space limitation. An example of the computational process of rW-NAEC-AUG is shown in Figure 3.

Algorithm rW-NAEC-AUG.
Input: An area graph (G = (V, E), W = {W1, W2, . . . , Wp}) and a requirement function rW : W → Z+ − {1, 2}.
Output: A set E∗ of new edges with |E∗| = opt(G, W, rW) such that G + E∗ is rW-NA-edge-connected.
Step 1: We add a new vertex s and a set F1 of new edges between s and V such that in the resulting graph H = (V ∪ {s}, E ∪ F1),

    all cuts X ⊂ V of type (Ai) or (Bi) satisfy dH(X) ≥ ri for each Wi ∈ W,   (2)
Fig. 3. Computational process of algorithm rW-NAEC-AUG applied to the area graph (G, W) in Figure 1 and (rW(W1), rW(W2), rW(W3)) = (2, 3, 4). The lower bound in Section 3 is α(G, W, rW)/2 = 4. (i) H = (V ∪ {s}, E ∪ F1) obtained by Step 1. Edges in F1 are drawn as broken lines. Then λH(v, W) ≥ rW(W) holds for every pair of v ∈ V and W ∈ W. (ii) H1 = (H − {(s, v1), (s, v2)}) ∪ {(v1, v2)} obtained from H by an admissible splitting of (s, v1) and (s, v2). (iii) H2 = (H1 − {(s, v3), (s, v4)}) ∪ {(v3, v4)} obtained from H1 by an admissible splitting of (s, v3) and (s, v4). (iv) H3 obtained from H2 by a complete admissible splitting at s. The graph G3 = H3 − s is rW-NA-edge-connected.
and no F′ ⊂ F1 satisfies this property (as will be shown, |F1| = α(G, W, rW) holds). If dH(s) is odd, then we add to F1 one extra edge between s and V.

Step 2: We split two edges incident to s while preserving (2) (such a splitting pair is called admissible). We continue to execute admissible edge-splittings at s until no pair of two edges incident to s is admissible. Let H2 = (V ∪ {s}, E ∪ E2 ∪ F2) be the resulting graph, where F2 = EH2(s, V) and E2 denotes the set of split edges. If F2 = ∅ holds, then halt after outputting E∗ := E2. Otherwise dH2(s) = 4 holds and the graph H2 − s has two components C1 and C2 with dH2(s, C1) = 3 and dH2(s, C2) = 1, where EH2(s, C2) = {(s, u∗)}. We have the following four cases (a)–(d).

(a) The vertex u∗ is contained in no cut X ⊆ C2 of type (Ai) with dH2(X) = ri for any i. Then after replacing (s, u∗) with a new edge (s, v) for some vertex v ∈ C1 while preserving (2), execute a complete admissible splitting at s. Output the set E∗ of all split edges, where |E∗| = α(G, W, rW)/2 holds.
(b) E2 ∩ E(H2[V − C1]) ≠ ∅ holds. Then after hooking up one edge e ∈ E2 ∩ E(H2[V − C1]), execute a complete admissible splitting at s. Output the set E∗ of all split edges, where |E∗| = α(G, W, rW)/2 holds.

(c) There is a set E′ ⊆ E2 of at most two split edges such that the graph H3 resulting from hooking up the set E′ of edges in H2 has an admissible pair {(s, u∗), f} for some f ∈ EH3(s, V). After a complete admissible splitting at s in H3, output the set E∗ of all split edges, where |E∗| = α(G, W, rW)/2 holds.

(d) None of (a)–(c) holds. Then we can prove that (G, W) has property (P). After adding one new edge e∗ to EH2(C1, C2), execute a complete admissible splitting at s in H2 + {e∗}. Output the edge set E∗ := E3 ∪ {e∗}, where E3 denotes the set of all split edges and |E∗| = α(G, W, rW)/2 + 1 holds.
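The edge-splitting and hooking-up operations used throughout Steps 1 and 2 can be sketched on an edge-list representation as follows (our sketch; the admissibility checks, which require the cut conditions (2), are not modeled here):

```python
def split_off(edges, s, u, v):
    """Edge-splitting at s: replace the two edges (s,u),(s,v) by one new
    edge (u,v).  `edges` is a list of undirected edges; a ValueError is
    raised when the required edges at s are missing."""
    edges = list(edges)
    for x in (u, v):
        try:
            edges.remove((s, x))
        except ValueError:
            edges.remove((x, s))     # the edge may be stored reversed
    edges.append((u, v))
    return edges

def hook_up(edges, s, u, v):
    """Inverse operation: replace an edge (u,v) by the pair (s,u),(s,v)."""
    edges = list(edges)
    try:
        edges.remove((u, v))
    except ValueError:
        edges.remove((v, u))
    return edges + [(s, u), (s, v)]
```

Since the two operations are inverse to each other, hooking up a previously split pair restores the earlier graph, which is exactly how the algorithm backtracks in cases (b) and (c).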
To justify the algorithm rW-NAEC-AUG, it suffices to show the following three theorems.

Theorem 3. Let (G = (V, E), W = {W1, . . . , Wp}) be an area graph, and 0 ≤ r1 ≤ · · · ≤ rp be integers. Let H = (V ∪ {s}, E ∪ F1) be a graph with s ∉ V and F1 = EH(s, V) such that H satisfies (2) and no F′ ⊂ F1 satisfies this property. Then |F1| = α(G, W, rW) holds.
Theorem 4. Let (G = (V, E), W = {W1, . . . , Wp}) be an area graph, and 2 ≤ r1 ≤ · · · ≤ rp be integers. Let H = (V ∪ {s}, E ∪ F) with F = EH(s, V) ≠ ∅, s ∉ V, and dH(s) even, satisfy (2). If no pair of two edges in F is admissible, then we have dH(s) = 4 and G has two components C1 and C2 with dH(s, C1) = 3 and dH(s, C2) = 1. Moreover, in the graph H + e∗ obtained by adding one arbitrary new edge e∗ to EG(C1, C2), there is a complete admissible splitting at s.
Theorem 5. Let (G, W) and H satisfy the assumption of Theorem 4, and 3 ≤ r1 ≤ · · · ≤ rp be integers. Let H∗ be a graph obtained by a sequence of admissible splittings at s from H such that EH∗(s, V) ≠ ∅ holds and no pair of two edges in EH∗(s, V) is admissible in H∗. Let C1 and C2 be the two components in H∗ − s with dH∗(s, C1) = 3 and dH∗(s, C2) = 1 (they exist by Theorem 4). Then if H∗ satisfies one of the following conditions (a)–(c), then H has a complete admissible splitting at s after replacing at most one edge in EH(s, V); otherwise (G, W) has property (P).
(a) For {(s, u∗)} = EH∗(s, C2), u∗ is contained in no cut X ⊆ C2 of type (Ai) with dH∗(X) = ri for any i.
(b) E1 ∩ E(H∗[V − C1]) ≠ ∅ holds, where E1 denotes the set of all split edges.
(c) There is a set E′ ⊆ E1 of at most two split edges such that the graph H′ resulting from hooking up the set E′ of edges in H∗ has an admissible pair {(s, u∗), f} for some f ∈ EH′(s, V).
By Theorems 4 and 5, for the set E∗ of edges obtained by algorithm rW-NAEC-AUG, the graph H∗ = (V ∪ {s}, E ∪ E∗) satisfies (2), i.e., all cuts X ⊂ V of type (Ai) or (Bi) satisfy dH∗(X) ≥ ri for each area Wi ∈ W. By dH∗(s) = 0, all cuts X ⊂ V satisfy dG+E∗(X) = dH∗(X). By Lemma 1, this implies that G + E∗ is rW-NA-edge-connected. By Theorems 3 and 5, we have |E∗| =
α(G, W, rW)/2 + 1 in the case where the initial area graph (G, W) has property (P), and |E∗| = α(G, W, rW)/2 otherwise. By Lemmas 3 and 4, we have |E∗| = opt(G, W, rW).

Finally, we analyze the time complexity of algorithm rW-NAEC-AUG. By the maximum flow technique in [3], we can compute λG(v, W) for a vertex v ∈ V and an area W ∈ W in O(mn log(n^2/m)) time. Hence it can be checked in O(mpn^2 log(n^2/m)) time whether H satisfies (2) or not. In Step 1, for each vertex v ∈ V, after deleting all edges between s and v, we check whether the resulting graph H′ satisfies (2) or not. If (2) is violated, then we add max_{x∈V, Wi∈W} {ri − λH′(x, Wi)} edges between s and v in H′. In Step 2, for each pair {u, v} ⊆ V, after splitting min{dH(s, u), dH(s, v)} pairs {(s, u), (s, v)}, we check whether the resulting graph H′ satisfies (2) or not. If (2) is violated, then we hook up ⌈max_{x∈V, Wi∈W} {ri − λH′(x, Wi)}/2⌉ pairs in H′. The procedures (a)–(d) can also be executed in polynomial time since the number of hooking-up operations is O(n^4). By further analysis, we can prove that hooking up split edges O(n^2) times suffices for these procedures, but we omit the details here. Therefore, we see that algorithm rW-NAEC-AUG can be implemented to run in O(mpn^4 log(n^2/m)) time. As a result, this total complexity can be reduced to O(m + p rp n^5 log(n/rp)) by applying the procedure to a sparse spanning subgraph of G with O(rp n) edges, where such sparsification takes O(m + n log n) time [10,11]. Summarizing the argument given so far, Theorem 1 is now established.

Notice that the assumption of r1 ≥ 3 is necessary only for Theorem 5. Therefore, even in the case of r1 = 2, we see by Theorem 4 that we can obtain a feasible solution E′ to rW-NA-ECAP with |E′| ≤ α(G, W, rW)/2 + 1 ≤ opt(G, W, rW) + 1. This implies Theorem 2. Actually, we remark that there are some differences between the case of r1 = 2 and the case of r1 ≥ 3.
For example, the graph (G′ = (V ∪ {v}, E), W) obtained from the graph (G, W) in Figure 2 by adding an isolated vertex v does not have property (P), but satisfies opt(G′, W, rW) > α(G′, W, rW)/2.
5 Conclusion
In this paper, given an area multigraph (G = (V, E), W) and a requirement function rW : W → Z+, we have proposed a polynomial time algorithm for rW-NA-ECAP in the case where each area W ∈ W satisfies rW(W) ≥ 3. The time complexity of our algorithm is O(m + p r∗ n^5 log(n/r∗)). Moreover, we have shown that in the case of rW(W) ≥ 2, W ∈ W, a solution with at most one edge over the optimal can be found in the same time complexity. However, it is still open whether the problem in the case of rW(W) ≥ 2, W ∈ W, is polynomially solvable. We finally remark that our method in this paper cannot be applied to the problem of augmenting a given simple graph while preserving the simplicity of the graph. For such simplicity-preserving problems, it was shown [8] that even the edge-connectivity augmentation problem is NP-hard.
Acknowledgments This research is supported by a Grant-in-Aid for the 21st Century COE Program “Intelligent Human Sensing” from the Ministry of Education, Culture, Sports, Science, and Technology.
References
1. A. Frank, Augmenting graphs to meet edge-connectivity requirements, SIAM J. Discrete Math., 5(1), (1992), 25–53.
2. A. Frank, Connectivity augmentation problems in network design, in Mathematical Programming: State of the Art 1994, J.R. Birge and K.G. Murty (Eds.), The University of Michigan, Ann Arbor, MI, (1994), 34–63.
3. A. V. Goldberg and R. E. Tarjan, A new approach to the maximum flow problem, J. Assoc. Comput. Mach., 35, (1988), 921–940.
4. T. Ishii, Y. Akiyama, and H. Nagamochi, Minimum augmentation of edge-connectivity between vertices and sets of vertices in undirected graphs, Electr. Notes Theo. Comp. Sci., vol. 78, Computing Theory: The Australian Theory Symposium (CATS'03), (2003).
5. H. Ito, Node-to-area connectivity of graphs, Transactions of the Institute of Electrical Engineers of Japan, 11C(4), (1994), 463–469.
6. H. Ito, Node-to-area connectivity of graphs, in M. Fushimi and K. Tone, editors, Proceedings of APORS94, World Scientific Publishing, (1995), 89–96.
7. H. Ito and M. Yokoyama, Edge connectivity between nodes and node-subsets, Networks, 31(3), (1998), 157–164.
8. T. Jordán, Two NP-complete augmentation problems, Preprint no. 8, Department of Mathematics and Computer Science, Odense University, (1997).
9. H. Miwa and H. Ito, Edge augmenting problems for increasing connectivity between vertices and vertex subsets, 1999 Technical Report of IPSJ, 99-AL-66(8), (1999), 17–24.
10. H. Nagamochi and T. Ibaraki, A linear-time algorithm for finding a sparse k-connected spanning subgraph of a k-connected graph, Algorithmica, 7, (1992), 583–596.
11. H. Nagamochi and T. Ibaraki, Computing edge-connectivity of multigraphs and capacitated graphs, SIAM J. Discrete Math., 5, (1992), 54–66.
12. H. Nagamochi and T. Ibaraki, Graph connectivity and its augmentation: applications of MA orderings, Discrete Applied Mathematics, 123(1), (2002), 447–472.
13. T. Watanabe and A. Nakamura, Edge-connectivity augmentation problems, J. Comput. System Sci., 35, (1987), 96–144.
Scheduling and Traffic Allocation for Tasks with Bounded Splittability

Piotr Krysta¹, Peter Sanders¹, and Berthold Vöcking²

¹ Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85, Saarbrücken, Germany
{krysta,sanders}@mpi-sb.mpg.de
² Dept. of Computer Science, Universität Dortmund, Baroper Str. 301, 44221 Dortmund, Germany
[email protected]
Abstract. We investigate variants of the problem of scheduling tasks on uniformly related machines to minimize the makespan. In the k-splittable scheduling problem each task can be broken into at most k ≥ 2 pieces to be assigned to different machines. In the more general SAC problem each task j has its own splittability parameter kj ≥ 2. These problems are NP-hard, and previous research focuses mainly on approximation algorithms. Our motivation to study these scheduling problems is traffic allocation for server farms based on a variant of the Internet Domain Name Service (DNS) that uses a stochastic splitting of request streams. We show that the traffic allocation problem with standard latency functions from Queueing Theory cannot be approximated in polynomial time within any finite factor because of the extreme behavior of these functions. Our main result is a polynomial time, exact algorithm for the k-splittable scheduling problem as well as the SAC problem with a fixed number of machines. The running time of our algorithm is exponential in the number of machines but only linear in the number of tasks. This result is the first proof that bounded splittability reduces the complexity of scheduling, since unsplittable scheduling is known to be NP-hard already for two machines. Furthermore, since our algorithm solves the scheduling problem exactly, it also solves the traffic allocation problem.
1 Introduction
A server farm is a collection of servers delivering data to a set of clients. Large scale server farms are distributed all over the Internet and deliver various types of site content including graphics, streaming media, downloadable files, and HTML on behalf of other content providers who pay for an efficient and reliable delivery of their site data. To satisfy these requirements, one needs advanced traffic management that takes care of the assignment of traffic streams to individual servers. Such streams can be formed, e.g., by traffic directed to the same page,
Partially supported by DFG grants Vo889/1-1, Sa933/1-1, and the IST program of the EU under contract IST-1999-14186 (ALCOM-FT).
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 500–510, 2003. © Springer-Verlag Berlin Heidelberg 2003
traffic directed to pages of the same content provider, by the traffic requested from clients in the same geographical region or domain, or by combinations of these criteria. The objective is to distribute these streams as evenly as possible over all servers in order to ensure site availability and optimal performance. For each traffic stream there is a corresponding stream of requests sent from the clients to the server farm. Current implementations of commercial Web server farms use the Internet Domain Name Service (DNS) to direct the requests to the server that is responsible for delivering the data of the corresponding traffic stream. The DNS can answer a query such as "What is www.uni-dortmund.de?" with a short list of IP addresses rather than only a single IP address. The original idea behind returning this list is that, in case of failures, clients can redirect their requests to alternative servers. Nowadays, slightly deviating from this idea, these lists are also used for the purpose of load balancing among replicated servers (cf., e.g., [8]). When clients make a DNS query for a name mapped to a list of addresses, the server responds with the entire list of IP addresses, rotating the ordering of addresses for each reply. As clients typically send their HTTP requests to the IP address listed first, DNS rotation distributes the requests more or less evenly among all the replicated servers in the list. Suppose each request stream is formed by a sufficiently large number of clients, so that it is reasonably well described by a Poisson process. Let λj denote the rate of stream j, i.e., the expected number of requests in some specified time interval. Under this assumption, rotating a list of ℓ servers corresponds to splitting stream j into ℓ substreams, each having rate λj/ℓ.
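The round-robin effect of DNS rotation can be sketched in a few lines; the function name and toy setup below are ours, not part of any DNS implementation:

```python
from collections import Counter

def dns_rotate(servers, num_requests):
    """Simulate DNS rotation: the address list is rotated after each reply
    and clients contact the first listed address, so requests are spread
    (roughly) evenly over the list."""
    counts = Counter()
    order = list(servers)
    for _ in range(num_requests):
        counts[order[0]] += 1          # client uses the first listed address
        order = order[1:] + order[:1]  # rotate the list for the next reply
    return counts
```

With ℓ servers and a multiple of ℓ requests, every server receives exactly a 1/ℓ fraction, which mirrors the equal-rate splitting described above.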
We propose a slightly more sophisticated stochastic splitting policy that allows for better load balancing and additionally preserves the Poisson property of the request streams. Suppose the DNS attaches a vector (pj1, . . . , pjℓ) with Σi pji = 1 to the list of each stream j. In this way, every individual request in stream j can be directed to the ith server on this list with probability pji. This policy breaks Poisson stream j into ℓ Poisson streams of rates pj1 λj, . . . , pjℓ λj, respectively. The possibility to split streams into smaller substreams can obviously reduce the maximum latency. It is not obvious, however, whether it is easier or more difficult to find an optimal assignment if every stream is allowed to be broken into a bounded number of substreams. Observe that the allocation problem above is a variant of machine scheduling in which streams correspond to jobs and servers to machines. In the context of machine scheduling, bounded splittability has been investigated before with the motivation to speed up the execution of parallel programs. We first introduce the relevant background in scheduling. Scheduling on Uniformly Related Machines. Suppose a set of jobs [n] = {1, . . . , n} needs to be scheduled on a set of machines [m] = {1, . . . , m}. Jobs are described by sizes λ1, . . . , λn ∈ ℝ>0, and machines are described by their speeds s1, . . . , sm ∈ ℝ>0. In the classical, unsplittable scheduling problem on uniformly related machines, every job must be assigned to exactly one machine. This mapping can be described by an assignment matrix (xij)i∈[m],j∈[n], where xij is an indicator variable with xij = 1 if job j is assigned to machine i and 0 otherwise. The objective is to minimize the makespan z = max_{i∈[m]} Σ_{j∈[n]} λj xij / si. It is
well known that this problem is strongly NP-hard. Hochbaum and Shmoys [5,6] gave the first polynomial time approximation schemes (PTAS) for this problem. If the number of machines is fixed, then the problem is only weakly NP-hard and admits a fully polynomial time approximation scheme (FPTAS) [7]. A fractional relaxation of the problem leads to splittable scheduling. In the fully splittable scheduling problem the variables xij can take arbitrary real values from [0, 1] subject to the constraints Σ_{i∈[m]} xij ≥ 1, for every j ∈ [n]. This problem is trivially solvable, e.g., by assigning to each machine a piece of each job whose size is proportional to the speed of the machine. k-Splittable Machine Scheduling and the SAC Problem. In the k-splittable machine scheduling problem each job can be broken into at most k ≥ 2 pieces that must be placed on different machines, i.e., at most k of the variables xij ∈ [0, 1], for every j, are allowed to be positive. Recently, Shachnai and Tamir [12] introduced a generalization of this problem, called scheduling with machine allotment constraints (SAC). In this problem, each job j has its own splittability parameter kj ≥ 1. In our study, we will mostly assume kj ≥ 2, for every j ∈ [n]. Shachnai and Tamir [12] prove that, in contrast to the fully splittable scheduling problem, the k-splittable machine scheduling problem is strongly NP-hard even on identical machines. They also give a PTAS for the SAC problem whose running time, however, is impractical, as the splittability appears doubly exponentially in it. As a more practical result, they present a very fast maxj(1 + 1/kj)-approximation algorithm. This result suggests that, in fact, approximation should get easier when the splittability is increased.
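To make the definitions concrete, here is a small sketch (our own naming) of the makespan objective and of the trivial proportional solution for the fully splittable case:

```python
def makespan(x, lam, s):
    """Makespan z = max_i sum_j lam[j]*x[i][j] / s[i] for assignment matrix x,
    job sizes lam, and machine speeds s."""
    m, n = len(s), len(lam)
    return max(sum(lam[j] * x[i][j] for j in range(n)) / s[i] for i in range(m))

def fully_splittable(lam, s):
    """Trivial optimum when jobs are fully splittable: give machine i a
    fraction s[i]/sum(s) of every job, which equalizes all machine loads."""
    total_speed = sum(s)
    return [[si / total_speed for _ in lam] for si in s]
```

The proportional assignment makes every machine finish at time sum(lam)/sum(s), the obvious lower bound for any fractional schedule.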
We should mention here that there is a related scheduling problem in which preemption is allowed, that is, jobs can be split arbitrarily but pieces of the same job cannot be processed at the same time on different machines. Shachnai and Tamir also study combinations of SAC and scheduling with preemption in which jobs can be broken into a bounded number of pieces and additionally there are bounds on the number of pieces that can be executed at the same time. Further variants of scheduling with different notions of splittability, with motivations from parallel computing and production planning, can be found in [12] and [13]. Scheduling with Non-linear Latency Functions. The only difference between the k-splittable scheduling and traffic allocation problems is that the latency occurring at servers may not be linear. A typical example of a latency function at a server of speed s with an incoming Poisson stream at rate λ is fs(λ) = λ/(s(s − min{s, λ})). This family of functions can be derived from the formula for the waiting time on an M/M/1 queueing system. Of course, M/M/1 waiting time is only one out of many examples of latency functions that can be obtained from Queueing Theory. In fact, a typical property of such functions is that the latency goes to infinity when the injection rate approaches the service rate. Instead of focusing on particular latency functions, we will set up a more general framework to analyze the effects of non-linearity. The k-splittable traffic allocation problem is a variant of k-splittable scheduling. Streams are described by rates λ1, . . . , λn, and servers by bandwidths or service rates s1, . . . , sm. Hence, traffic streams can be identified with jobs and servers with machines.
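As an illustration, the M/M/1 waiting-time function mentioned above can be written down directly; the function name is ours:

```python
def mm1_latency(s, lam):
    """M/M/1 waiting-time latency f_s(lam) = lam / (s * (s - min(s, lam))),
    taken as infinite once the injection rate reaches the service rate."""
    if lam >= s:
        return float('inf')
    return lam / (s * (s - lam))
```

Note the extreme behavior: the value blows up as lam approaches s, which is exactly what rules out finite approximation factors in Section 5.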
The latencies occurring at the servers are described by a family of latency functions F = {fs : ℝ≥0 → ℝ≥0 ∪ {∞} | s ∈ ℝ>0}, where fs denotes a non-decreasing latency function for a server with service rate s. Scheduling under non-linear latency functions has been considered before. Alon et al. [2] give a PTAS for makespan minimization on identical machines with certain well-behaved latency functions. This was extended to a PTAS for makespan minimization on uniformly related machines by Epstein and Sgall [4]. In both studies, the latency functions must fulfill some analytical properties like convexity and uniform continuity under a logarithmic scale. Unfortunately, the uniform continuity condition excludes typical functions from Queueing Theory. Our Results. The main result of this paper is a fixed-parameter tractable algorithm for the k-splittable scheduling problem and the more general SAC problem with splittability at least two for every job. Our algorithm has polynomial running time for every fixed number of machines. This result is remarkable as unsplittable scheduling is known to be NP-hard already on two machines. In fact, our result is the first proof that bounded splittability reduces the complexity of scheduling. In more detail, given any upper bound T on the makespan of an optimal assignment, our algorithm computes a feasible assignment with makespan at most T in time O(n + m^(m+m/(k0−1))) with k0 = min{k1, k2, . . . , kn}. Furthermore, despite the possibility to split the jobs into pieces of non-rational size, we prove that the optimal makespan can be represented by a rational number with only a polynomial number of bits. Thus the optimal makespan can be found by using binary search techniques over the rationals. This yields an exact, polynomial-time algorithm for SAC with a fixed number of machines. (We have recently improved the running time in the case of identical machines [1].)
Note that this problem is strongly NP-hard when the number of machines is not fixed and k0 ≥ 2 [12]. In addition, we study the effects due to the non-linearity of latency functions. The algorithm above can be adapted to work efficiently for a wide class of latency functions containing even such extreme functions as M/M/1 waiting time. On the negative side, we prove that latency functions like M/M/1 do not admit polynomial time approximation algorithms with finite approximation ratio if the number of machines is unbounded. The latter result is an ultimate rationale for our approach of devising efficient algorithms for a fixed number of machines.
2 An Exact Algorithm for SAC with Given Makespan
We present here an exact algorithm for SAC with kj ≥ 2 for every job. Our algorithm has polynomial running time for any fixed number of machines. We assume that an upper bound on the optimal makespan is given. This upper bound defines a capacity for each machine. The capacity of machine i is denoted by ci. The computed schedule has to satisfy Σ_{j∈[n]} λj xij ≤ ci, for every i ∈ [m]. A difficult subproblem to be solved is to decide into which pieces of which size the jobs should be cut. In principle, the number of possible cuts is unbounded. We will show that it suffices to consider only those cuts that "saturate" a machine.
Let πij = λj xij denote the size of the piece of job j allocated to machine i. Machine i is saturated by job j if πij = ci. Our algorithm (Algorithm 1) schedules the bulkiest job j first, where the bulkiness of j is λj/(kj − 1). Using backtracking, it tries all ways to cut one piece from job j such that a machine is saturated. The saturated machine is removed from the problem; the splittability and size of j are reduced accordingly. The remaining problem is solved recursively. Two special cases arise. If j is too small to saturate kj machines, all remaining jobs can be scheduled using a simple greedy approach known as McNaughton's rule [10]. Since the splittability kj of a job is decreased whenever a piece is cut off, a remaining piece can eventually become unsplittable. Since this remaining piece will be infinitely bulky, it will be scheduled next. In this case, all machines that can accommodate the piece are tried. For the precise description see Fig. 1.

I := [m]; J := [n]                        -- Machines to be saturated; Jobs to be scheduled
if Σ_{j∈J} λj > Σ_{i∈I} ci ∨ ¬solve() then output "no solution possible"
else output nonzero πij values

Function solve() : Boolean
  if J = ∅ then return true
  find a j ∈ J that maximizes λj/(kj − 1)
  if kj = 1 then                          -- Unsplittable remaining piece
    forall i ∈ I with ci ≥ λj do
      πij := λj; ci := ci − λj; J := J \ {j}          -- (*)
      if solve() then return true
      undo changes made in line (*)
  else                                    -- Job j is splittable
    if λj/(kj − 1) ≤ min{ci : i ∈ I} then McNaughton(); return true
    forall i ∈ I with ci < λj do
      πij := ci; λj := λj − ci; kj := kj − 1; I := I \ {i}     -- (**)
      if solve() then return true
      undo changes made in line (**)
  return false

Procedure McNaughton()                    -- Schedule greedily
  pick any i ∈ I
  foreach j ∈ J do
    while ci ≤ λj do
      πij := ci; λj := λj − ci; I := I \ {i}; pick any new i ∈ I
    πij := λj; ci := ci − λj

Fig. 1. Algorithm 1: Find a schedule of n jobs with splittabilities kj on m machines.
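The pseudocode of Fig. 1 can be turned into a compact backtracking program. The sketch below is our own simplified rendering (dictionary-based piece matrix, no priority queue for selecting the bulkiest job), not the authors' implementation:

```python
import math

def sac_schedule(lam, k, cap):
    """Backtracking search in the spirit of Algorithm 1: jobs with sizes
    lam[j] and splittabilities k[j], machine capacities cap[i]. Returns a
    dict {(machine, job): piece size} or None if no schedule exists."""
    lam, k, cap = list(lam), list(k), list(cap)
    I, J, pi = set(range(len(cap))), set(range(len(lam))), {}
    if sum(lam) > sum(cap):
        return None

    def mcnaughton():
        # Greedy wrap-around fill; valid under the precondition of Lemma 1
        # (exact integer/rational arithmetic assumed).
        avail = sorted(I)
        i = avail.pop()
        for j in sorted(J):
            while lam[j] > 0:
                piece = min(cap[i], lam[j])
                pi[(i, j)] = pi.get((i, j), 0) + piece
                cap[i] -= piece
                lam[j] -= piece
                if cap[i] == 0 and avail:
                    i = avail.pop()

    def solve():
        if not J:
            return True
        # bulkiest job; an unsplittable remainder (k=1) is infinitely bulky
        j = max(J, key=lambda t: math.inf if k[t] == 1 else lam[t] / (k[t] - 1))
        if k[j] == 1:                       # place the unsplittable piece
            for i in sorted(I):
                if cap[i] >= lam[j]:
                    pi[(i, j)] = lam[j]; cap[i] -= lam[j]; J.remove(j)
                    if solve():
                        return True
                    J.add(j); cap[i] += lam[j]; del pi[(i, j)]   # undo (*)
            return False
        if lam[j] / (k[j] - 1) <= min(cap[i] for i in I):
            mcnaughton()
            return True
        for i in sorted(I):                 # cut a piece that saturates i
            if cap[i] < lam[j]:
                saved = cap[i]
                pi[(i, j)] = saved; lam[j] -= saved; k[j] -= 1; I.remove(i)
                if solve():
                    return True
                I.add(i); k[j] += 1; lam[j] += saved; del pi[(i, j)]  # undo (**)
        return False

    return pi if solve() else None
```

For instance, two jobs of size 3 with splittability 2 fit on three machines of capacity 2, while a single unsplittable job of size 5 does not fit on two machines of capacity 2.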
Theorem 1. Algorithm 1 finds a feasible solution for SAC with a given possible makespan, provided that the splittability of each job is at least two. It can be implemented to run in time O(n + m^(m+m/(k0−1))), where k0 = min{k1, . . . , kn}.

Proof. All the necessary data structures can be initialized in time O(m + n) if we use a representation of the piece size matrix (πij) that only stores nonzero entries. There can be at most m recursive calls that saturate a machine and at
most m/(k0 − 1) recursive calls made for unsplittable pieces that remain after a job j was split kj − 1 times. All in all, the backtrack tree considers no more than m! · m^(m/(k0−1)) possibilities. The selection of the bulkiest job can be implemented to run in time O(log m) independent of n: only the m largest jobs can ever be candidates. Hence it suffices to select these jobs initially using an O(n) time algorithm [3] and keep them in a priority queue data structure. Greedy scheduling using McNaughton's rule takes time O(n + m). Overall, we get an execution time of O(n + m + m! · m^(m/(k0−1)) log m) = O(n + m^(m+m/(k0−1))).

The algorithm also produces only correct schedules. In particular, solve() maintains the invariant Σ_{j∈J} λj ≤ Σ_{i∈I} ci, and when McNaughton's rule is called with λj/(kj − 1) ≤ min{ci : i ∈ I}, it can complete the schedule because no remaining job is large enough to saturate more than kj − 1 of the remaining machines:

Lemma 1. McNaughton's rule computes a correct schedule if Σ_{j∈J} λj ≤ Σ_{i∈I} ci and ∀i ∈ I, j ∈ J : λj/(kj − 1) ≤ ci.

Proof. The only thing that can go wrong is that a job j is split more than kj − 1 times, i.e., into ≥ kj + 1 pieces. Then it completely fills at least kj − 1 machines with capacity at least min_{i∈I} ci, contradicting λj/(kj − 1) ≤ ci.

Now we come to the interesting part of the proof. We have to show that the search succeeds if a feasible schedule exists. We show the stronger claim that the algorithm is correct even if unsplittable jobs are present. (In this case only the above running time analysis would fail.) The proof is by induction on m. For m = 1 this is trivial since no splits are necessary. Consider the case m > 1. If there are unsplittable jobs, they are infinitely bulky and so are scheduled first. Since all possible placements for them are tried, nothing can be missed for them.
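McNaughton's greedy wrap-around rule referenced in Lemma 1 admits a very short standalone implementation; this sketch (our naming) assumes the capacity precondition Σλj ≤ Σci and exact arithmetic:

```python
def mcnaughton(lam, cap):
    """Greedy wrap-around rule [10]: fill machines one after another,
    spilling the remainder of a job over to the next machine.
    Returns pieces as {(machine, job): size}."""
    pieces = {}
    lam, cap = list(lam), list(cap)
    i = 0
    for j in range(len(lam)):
        while lam[j] > 0:
            piece = min(cap[i], lam[j])   # fill machine i as far as possible
            pieces[(i, j)] = piece
            cap[i] -= piece
            lam[j] -= piece
            if cap[i] == 0 and i + 1 < len(cap):
                i += 1                    # machine saturated; move on
    return pieces
```

Each job is cut only when a machine boundary is crossed, so under the lemma's precondition no job is split into more than kj pieces.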
When a splittable job is bulkiest, only those splits are considered that saturate one machine. Lemma 2 shows that if there is a feasible schedule, there must also be one with this property. The recursive call leaves a problem with one machine less and the induction hypothesis is applicable.

Lemma 2. If a feasible schedule exists and the bulkiest job is large enough to saturate a machine, then there is a feasible schedule where the bulkiest job saturates a machine.

Our approach to proving Lemma 2 is to show that any feasible schedule can be transformed into a feasible schedule where the bulkiest job saturates a machine. To simplify this task, we first establish a toolbox of simpler transformations. We begin with two very simple transformations that affect only two jobs and obviously maintain feasibility. See Fig. 2-(a) and 2-(b) for illustrations.

Lemma 3. For any feasible schedule, consider two jobs p and q sharing machine i′, i.e., π_{i′p} > 0 and π_{i′q} > 0. For any machine i such that π_{i′q} < π_{ip} there is a feasible schedule where the overlapping piece of q is moved to machine i, i.e., (π_{i′p}, π_{ip}, π_{i′q}, π_{iq}) := (π_{i′p} + π_{i′q}, π_{ip} − π_{i′q}, 0, π_{iq} + π_{i′q}).
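The move of Lemma 3 is easy to check mechanically: both machine loads are invariant and q loses one piece. A small sketch with our own naming:

```python
def move_piece(pi, p, q, i_shared, i_dest):
    """Apply the Lemma 3 move: shift q's piece on the shared machine
    i_shared to i_dest, compensated by p's piece on i_dest, so that both
    machine loads are unchanged. `pi` maps (machine, job) -> piece size.
    Requires pi[(i_shared, q)] < pi[(i_dest, p)]."""
    delta = pi[(i_shared, q)]
    assert delta < pi[(i_dest, p)]
    pi[(i_shared, p)] = pi.get((i_shared, p), 0) + delta
    pi[(i_dest, p)] -= delta
    pi[(i_shared, q)] = 0
    pi[(i_dest, q)] = pi.get((i_dest, q), 0) + delta
    return pi
```

The assertion mirrors the lemma's precondition; after the move, job p still occupies both machines while q has vacated the shared one.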
Fig. 2. Manipulating schedules. Lines represent jobs. (Bent) boxes represent machines. (a): The move from Lemma 3; (b): The swap from Lemma 4; (c): Saturation using Lemma 5; (d): The rotation from Lemma 6; (e): Moving j away from r.
Lemma 4. For any feasible schedule, consider two jobs p and q sharing machine i, i.e., π_{ip} > 0 and π_{iq} > 0. Furthermore, consider two other pieces π_{i_p p} and π_{i_q q} of p and q on machines i_p and i_q, respectively. If π_{i_p p} ≤ π_{iq} + π_{i_q q} and π_{i_q q} ≤ π_{ip} + π_{i_p p}, then there is a feasible schedule where the pieces π_{i_p p} and π_{i_q q} are swapped as follows:

(π_{i_p p}, π_{ip}, π_{i_q p}, π_{i_p q}, π_{iq}, π_{i_q q}) := (0, π_{ip} + π_{i_p p} − π_{i_q q}, π_{i_q p} + π_{i_q q}, π_{i_p q} + π_{i_p p}, π_{iq} + π_{i_q q} − π_{i_p p}, 0)

As a first application of Lemma 3 we now explain how a large job j allocated to at most kj − 1 machines can "take over" a small machine.

Lemma 5. Consider a job j and a machine i such that λj/(kj − 1) ≥ ci. If there is a feasible schedule where j is scheduled on at most kj − 1 machines, then there is a feasible schedule where j saturates machine i.

Proof. Let i′ denote a machine index that maximizes π_{i′j} and note that π_{i′j} ≥ λj/(kj − 1) ≥ ci. We can now apply Lemma 3 to subsequently move all the pieces on machine i to machine i′. Lemma 3 remains applicable because π_{i′j} is large enough to saturate machine i. See Fig. 2-(c) for an illustration.

After the above local transformations, we come to a global transformation that greatly simplifies the kind of schedules we have to consider.

Definition 1. Job j is called split if |{i : πij > 0}| > 1. The split graph corresponding to a schedule is an undirected hypergraph G = ([m], E) where each split job j corresponds to a hyperedge {i : πij > 0} ∈ E.

Lemma 6. If a feasible schedule exists, then there is a feasible schedule whose split graph is a forest.

Proof. It suffices to show that for a feasible schedule whose split graph G contains a cycle there is also a feasible schedule whose corresponding split graph has a smaller value of Σ_{e∈E} |e|. Then it follows that a feasible schedule that minimizes Σ_{e∈E} |e| is a forest. So suppose G contains a cycle involving ℓ edges. Let succ(j) stand for (j mod ℓ) + 1. By appropriately renumbering machines and jobs we can assume
without loss of generality that this cycle is made up of jobs 1 to ℓ and machines 1 to ℓ such that, for j ∈ [ℓ], π_{jj} > 0, π_{succ(j)j} > 0, and δ = π_{11} = min_{j∈[ℓ]} min{π_{jj}, π_{succ(j)j}}. Fig. 2-(d) depicts this normalized situation. Now we rotate the pieces in the cycle by decreasing π_{jj} by δ and increasing π_{succ(j)j} by the same amount. The schedule remains feasible since the load of the machines in the cycle remains unchanged. Since the first job is now split into one piece less, Σ_{e∈E} |e| decreases.

Now we have all the necessary tools to establish Lemma 2.

Proof. Consider any feasible schedule; let j denote the bulkiest job and let there be a machine i0 with λj/(kj − 1) ≥ c_{i0}. We transform this schedule in several steps. We first apply Lemma 6 to obtain a schedule whose split graph is a forest. We now concentrate on the tree T where j is allocated. If job j is allocated to at most kj − 1 machines, we can saturate i0 using Lemma 5 and we are done. If one piece of j is allocated to a leaf i of T, then all other jobs mapped to machine i are allocated there entirely. Let i′ denote another machine j is mapped to. We apply Lemma 3 to move small jobs from i to i′. When this is no longer possible, either job j saturates machine i and we are done, or there is a job j′ with λ_{j′} = π_{ij′} > π_{i′j}. Now we can apply Lemma 4 to the pieces π_{ij}, π_{i′j}, π_{ij′}, and a zero size piece of job j′. This transformation reduces the number of pieces of job j so that we can saturate machine i0 using Lemma 5. Finally, j could be allocated to machines that are all interior nodes of T. We focus on the two largest pieces π_{ij} and π_{i′j}, so that π_{ij} + π_{i′j} ≥ 2λj/kj. Now fix a leaf r that is connected to i′ via a path that does not involve j as an edge. This is possible since j is connected to interior nodes only. Now we intend to move job j away from r, i.e., we transform the schedule such that the path between node r and job j becomes longer.
(The path between a node v and a job e in a tree starts at v and uses edges e′ ≠ e until a node is reached that has e as an incident edge.) We do this iteratively until j is incident to a leaf in T. Then we can apply the transformations described above and we are done. We first apply Lemma 3 to move small pieces of jobs allocated to machine i to machine i′. Although this changes the shape of T, it leaves the distance between job j and r invariant unless j ends up in machine i completely, in which case we can apply Lemma 5 and we are done. When Lemma 3 is no longer applicable, either j saturates machine i and we are done, or there is a job q with π_{iq} > π_{i′j}. In that case we consider the smallest other piece π_{i_q q} of job q. More precisely, if q is split into at most kq − 1 nonzero pieces we pick some i_q with π_{i_q q} = 0. Otherwise we pick i_q = min{ℓ ≠ i : π_{ℓq} > 0}. In either case π_{i_q q} ≤ λq/(kq − 1). Recall that π_{ij} + π_{i′j} ≥ 2λj/kj since this sum is invariant under the move operations we have performed. Furthermore, j is the bulkiest job, so that

π_{i_q q} ≤ λq/(kq − 1) ≤ λj/(kj − 1) = 2λj/((kj − 1) + (kj − 1)) ≤ 2λj/((kj − 1) + 1) = 2λj/kj ≤ π_{ij} + π_{i′j}.
So we can apply Lemma 4 to the pieces of job j on machines i and i′ and to the pieces of job q on machines i and i_q. This increases the distance from job j to machine r, as desired. Fig. 2-(e) gives an example where we apply Lemma 3 once and then Lemma 4.
3 Finding the Optimal Makespan
We assumed so far that an upper bound on the optimal makespan is known. The obvious idea now is to find the optimal makespan using binary search. In order to show that this search terminates, one needs to prove that the optimal makespan is a rational number. This is not completely obvious as, in principle, jobs might be broken into pieces of non-rational size. The following lemma, however, shows that the optimal makespan can be represented by rational numbers of polynomial length. Let Qℓ denote the set consisting of the non-negative rational numbers that can be represented by an ℓ-bit numerator and an ℓ-bit denominator, together with the symbol ∞. The proof of the next lemma is omitted in this extended abstract.

Lemma 7. There is a constant κ > 0 s.t. the value of an optimum solution to the SAC problem with kj ≥ 2 (for all j) is in Q_{N^κ}, where N is the problem size.

By Lemma 7, the optimal makespan can be found by binary search methods over the rationals (see, e.g., [9,11]) with Algorithm 1 as a decision oracle. Thus:

Corollary 1. For every fixed number of machines, there is an exact polynomial time optimization algorithm for the SAC problem with splittability at least two.
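The binary search suggested above can be sketched as follows. This is a plain bisection that snaps to a nearby short rational rather than one of the exact rational-search methods of [9,11], and all names are ours:

```python
from fractions import Fraction

def optimal_makespan(feasible, upper, max_denominator=10**6):
    """Binary-search the smallest makespan T in [0, upper] for which the
    monotone decision oracle `feasible(T)` (e.g. Algorithm 1) succeeds,
    then snap the result to a nearby short rational. Lemma 7 justifies
    that the true optimum is a rational of polynomial length."""
    lo, hi = Fraction(0), Fraction(upper)
    for _ in range(60):          # fixed number of bisection rounds
        mid = (lo + hi) / 2
        if feasible(mid):
            hi = mid             # invariant: feasible(hi) holds
        else:
            lo = mid
    return hi.limit_denominator(max_denominator)
```

With an optimum of 3/7, sixty rounds leave hi within 2^-60 of the optimum, so `limit_denominator` recovers 3/7 exactly.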
4 Solving the Traffic Allocation Problem
In this section, we show how to apply the binary search approach to the traffic allocation problem, i.e., we solve the SAC problem with non-linear latency functions. We need to make some very modest assumptions about these functions. A latency function is monotone if it is positive and non-decreasing. The functions need not be continuous or strictly increasing, e.g., step functions are covered. For a monotone function f : ℝ≥0 → ℝ≥0 ∪ {∞}, let the inverse of f be defined by f⁻¹(y) = sup{λ | f(λ) ≤ y}, for y ≥ f(0), and f⁻¹(y) = 0, for y < f(0). We say that a function f is polynomially length-bounded if for every λ ∈ Qℓ, f(λ) ∈ Q_{poly(ℓ)}. For example, the M/M/1 waiting time function is polynomially length-bounded although lim_{λ→s⁻} fs(λ) = ∞. This is because, for λ, s ∈ Qℓ with λ < s, one can show (s − λ) ∈ Q_{2ℓ}, s(s − λ) ∈ Q_{4ℓ}, and λ/(s(s − λ)) ∈ Q_{8ℓ}, so that fs(λ) ∈ Q_{8ℓ}. We say that a family of latency functions F is efficiently computable if, for every s, λ ∈ Qℓ, fs(λ) and fs⁻¹(λ) can be calculated in time polynomial in ℓ. Observe that the functions from an efficiently computable family must also be polynomially length-bounded. It is easy to check that the M/M/1 waiting time family and other typical function families from Queueing Theory are efficiently computable. We obtain the following result, whose proof is omitted.
Theorem 2. Let F be any efficiently computable family of monotone functions. Consider the SAC problem with latency functions from F and splittability at least two. Suppose the best possible maximum latency can be represented by a number from Qℓ. Then an optimal solution can be found in time O(poly(ℓ) · (n + m^(m+m/(k0−1)))) with k0 = min{kj : j = 1, 2, . . . , n}.

Note that ℓ is an obvious lower bound on the running time of any algorithm computing the exact, optimal makespan. It is unclear if there exist latency functions s.t. ℓ cannot be bounded polynomially in the input length. If an appropriate upper bound on ℓ is not known in advance, we can use geometric search, which can be stopped after computing the optimal latency with the desired precision.
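For a concrete family such as M/M/1, the inverse f⁻¹ required of an efficiently computable family is available in closed form; the sketch below (our naming) solves λ = y · s(s − λ) for λ:

```python
def mm1_inverse(s, y):
    """Inverse of the M/M/1 waiting-time function f_s(lam) = lam/(s*(s-lam)):
    f^{-1}(y) = sup{lam : f_s(lam) <= y}. Solving lam = y*s*(s - lam) gives
    lam = y*s^2 / (1 + y*s); for y below f_s(0) = 0 the inverse is 0."""
    if y < 0:
        return 0.0
    return y * s * s / (1 + y * s)
```

Note that the result is always strictly below s and tends to s as y grows, matching the pole of the latency function at the service rate.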
5 Non-approximability for Non-linear Scheduling
The M/M/1 waiting time cost function family defined in Section 1 has a pole going to infinity as λ → s⁻. Intuitively, this pole reflects the capacity restriction on the servers, and it is typical also for other families that can be derived from Queueing Theory. The following theorem, whose proof is omitted, shows that non-linear k-splittable scheduling, even with identical servers, is completely inapproximable.

Theorem 3. Let F be an efficiently computable family of monotone latency functions. Suppose there is s ∈ ℝ>0 s.t. lim_{λ→s⁻} fs(λ) = ∞. Then there does not exist a polynomial time approximation algorithm with finite approximation ratio for the non-linear k-splittable scheduling problem under F, provided P ≠ NP.
References
1. A. Agarwal, T. Agarwal, S. Chopra, A. Feldmann, N. Kammenhuber, P. Krysta and B. Vöcking. An Experimental Study of k-Splittable Scheduling for DNS-Based Traffic Allocation. To appear in Proc. of the 9th EUROPAR, 2003.
2. N. Alon, Y. Azar, G. J. Woeginger and T. Yadid. Approximation schemes for scheduling on parallel machines. Journal of Scheduling, 1:55–66, 1998.
3. M. Blum, R. Floyd, V. Pratt, R. Rivest, and R. Tarjan. Time bounds for selection. J. Computer and System Science, 7(4):448–461, August 1973.
4. L. Epstein and J. Sgall. Approximation schemes for scheduling on uniformly related and identical parallel machines. Proc. of the 7th ESA, 151–162, 1999.
5. D. S. Hochbaum and D. B. Shmoys. Using dual approximation algorithms for scheduling problems: theoretical and practical results. J. ACM, 34:144–162, 1987.
6. D. S. Hochbaum and D. B. Shmoys. A polynomial approximation scheme for scheduling on uniform processors: using the dual approximation approach. SIAM Journal on Computing, 17:539–551, 1988.
7. E. Horowitz and S. K. Sahni. Exact and approximate algorithms for scheduling nonidentical processors. J. ACM, 23:317–327, 1976.
8. J. F. Kurose and K. W. Ross. Computer networking: a top-down approach featuring the Internet. Addison-Wesley, 2001.
9. St. Kwek and K. Mehlhorn. Optimal search for rationals. Information Processing Letters, 86:23–26, 2003.
510
Piotr Krysta, Peter Sanders, and Berthold V¨ ocking
10. R. McNaughton. Scheduling with deadlines and loss functions. Management Science, 6:1–12, 1959. 11. C. H. Papadimitriou. Efficient search for rationals. Information Processing Letters, 8:1–4, 1979. 12. H. Shachnai and T. Tamir. Multiprocessor Scheduling with Machine Allotment and Parallelism Constraints. Algorithmica, 32(4): 651–678, 2002. 13. W. Xing and J. Zhang. Parallel machine scheduling with splitting jobs. Discrete Applied Mathematics, 103: 259–269, 2000.
Computing Average Value in Ad Hoc Networks

Mirosław Kutyłowski¹ and Daniel Letkiewicz²

¹ Inst. of Mathematics, Wrocław University of Technology, [email protected]
² Inst. of Engineering Cybernetics, Wrocław University of Technology, [email protected]
Abstract. We consider a single-hop sensor network with n = Θ(N ) stations using R independent communication channels. Communication between the stations can fail at random or be scrambled by an adversary so that it cannot be distinguished from a random noise. Assume that each station Si holds an integer value Ti . The problem that we consider is to replace the values Ti by their average (rounded to integer values). A typical situation is that we have a local sensor network that needs to make a decision based on the values read by sensors by computing the average value or some kind of voting. We design a protocol that solves this problem in O(N/R · log N ) steps. The protocol is robust: a constant random fraction of messages can be lost (by communication channel failure, by action of an adversary or by synchronization problems). Also a constant fraction of stations may go down (or be destroyed by an adversary) without serious consequences for the rest. The algorithm is well suited for dynamic systems, for which the values Ti may change and the protocol once started works forever. Keywords: mobile computing, radio network, sensor network
1 Introduction

Ad hoc networks that communicate via radio channels gain importance due to many new application areas: sensor networks used to monitor the environment, self-organizing networks of mobile devices, and mobile networks used in military and rescue operations. Ad hoc networks provide many features that are very interesting from a practical point of view. They have no global control (which could be either attacked or accidentally destroyed) and should keep working if some stations leave or join the network. So systems based on ad hoc networks are robust (once they work). However, it is quite difficult to design efficient algorithms for ad hoc networks. Classical distributed algorithms have been designed for wired environments with quite different communication features. For instance, in many cases one can assume that an ad hoc network works synchronously (due to GPS signals); if the network works in a small area, then two stations may communicate directly (single-hop model) and there is no communication latency. On the other hand, stations compete for access to a limited number of radio channels. They may disturb each other, making the transmission
This research was partially supported by Komitet Badań Naukowych grant 8T11C 04419.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 511–520, 2003. © Springer-Verlag Berlin Heidelberg 2003
unreadable if they broadcast at the same time on the same channel. Therefore quite a different algorithmic approach is necessary. Recently, there has been a lot of research on fundamental issues for ad hoc networks (as a starting point for references see [7]).

Problem Statement. In this paper we consider the following task. Each station Si of a network initially holds an integer value Ti. The goal is to compute the average value of all numbers Ti so that each station changes its Ti into the average value. We demand that the numbers held by the stations remain integers and that their sum is not changed (so at the end some small differences may be inevitable, and the stations hold not exactly the average value but values close to it). The intuition is that Ti might be a physical value measured by a sensor, or a preference of Si expressed as an integer value (for instance, 0 meaning totally against, 50 meaning undecided, and 100 a fully supporting voice). The network may have to compute the average in order to get an output of a group of sensors or to make a common decision regarding its behavior. This task, which can be solved trivially in most computing systems (for instance, by collecting data, simple arithmetic, and broadcasting the result), becomes nontrivial in ad hoc networks.

Computation Model. We consider networks consisting of identical stations with no IDs (the idea is that it is unpredictable which stations appear in the network and the devices are bulk produced). However, we assume that the stations know n, the number of stations in the network, within a constant factor. This parameter is called N for the rest of the paper. (Let us remark that N can be derived by an efficient algorithm [5].) Communication between stations is accomplished through R independent communication channels labeled by the numbers 0 through R−1.
A station may either send a message through a chosen channel or listen to a chosen channel (but not both at the same time, according to the IEEE 802.11 standard). If more than one station is sending on the same channel, then a collision occurs and the messages on this channel are scrambled. We assume that a station listening to a channel on which there is a collision receives noise and cannot even recognize that a collision has occurred (no-collision-detection model). In this paper we consider only networks that are concentrated in a local area: we assume that if a station sends a message, then every station (except the sender) can hear it. So we talk about a single-hop network. The computation of each station consists of steps. During a step a station may perform a local computation and either send or receive messages through a chosen channel. For the sake of simplicity of presentation we assume that the computation is synchronous. However, our results also hold for asynchronous systems, provided that the stations work with comparable speeds and lack of synchronization may only result in failure of a constant fraction of communication rounds. We do not use any global clock available to all stations. (In fact, our algorithm can be used to agree upon a common time for other purposes.)

Design Goals. We design a protocol that has to remain stable and efficient in the following sense:
– Each station may break down or leave the network. However, we assume that Ω(N) stations remain active.
– A message sent by one station is not received by a listening station with probability not lower than p, where p < 1 is fixed.
– An adversary, who knows all details of the algorithm, may scramble communication for a constant fraction of the total communication time over all communication channels. (So no "hot spots" in the communication pattern of the protocol are admissible; they would be easily attacked by an adversary.)
– The protocol has to be suited for dynamic systems, which once started compute the average forever. So it has to be applicable in systems where the values Ti may change. (For a discussion of dynamic systems see [8].) Preferably, a solution should be periodic, with the same code executed repeatedly.
– The protocol should ensure that the number of steps at which a station transmits a message or listens is minimized. Also, there should be no station that is involved in communication substantially longer than an average station. This is because energy is consumed mainly by radio communication, and battery-operated devices have only limited energy resources.

Former Results. Computing an average value is closely related to load balancing in distributed systems. (In fact, in the latter case we have not only to compare the loads but also to forward some tasks.) However, those algorithms are designed for wired networks. An exception is a solution proposed by Ghosh and Muthukrishnan [4]: they propose a simple protocol based on random matchings in the connection graph. In one round of their protocol the load is balanced between the nodes connected by an edge of a matching. This approach is very different from straightforward algorithms that try to gather information and then make decisions based on it.
They provide an estimation of the convergence of this algorithm to the equal load based on global characteristics of the graph (its degree and the second eigenvalue). Their proof shows the decrease of a certain potential describing how much the load is unbalanced. The protocol of Ghosh and Muthukrishnan has been reused for permuting at random in a distributed system [2]. However, in this case the analysis is based on a rapid-mixing property of a corresponding Markov chain and a refined path coupling approach [3].

New Results. We extend the results from [4] and adapt them to the case of a single-hop ad hoc network. The issue is that for the protocol of Ghosh and Muthukrishnan one needs to show not only that a certain potential decreases fast, but also that there are no "bad points" in the network that have results far from the valid ones (if the potential is low, we can only guarantee that the number of bad points is small).

Theorem 1. Consider an ad hoc network consisting of Θ(N) stations using R communication channels. Let D denote the maximum difference of the form Ti − Tj at the beginning of the protocol. With high probability, i.e. probability at least 1 − 1/N, after executing O(N/R · (log N + log D)) steps of the protocol:
– the sum of all values Ti remains unchanged and each station keeps one value,
– either all stations keep the same value Ti, or these values differ by at most 1, or they differ by at most 2 and the number of stations that keep the biggest and the smallest values is bounded by ε · N for a small constant ε.
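For intuition, the matching-based balancing of [4] can be sketched in a few lines, specialized to the complete graph (the single-hop setting); the floor/ceil split on odd sums is an arbitrary sum-preserving choice, not necessarily the paper's exact rule.

```python
import random

def balance_round(vals: list[int], rng: random.Random) -> None:
    """One round: pair the nodes by a random perfect matching and
    average each matched pair, keeping values integral and the sum fixed."""
    idx = list(range(len(vals)))
    rng.shuffle(idx)
    for a, b in zip(idx[::2], idx[1::2]):
        s = vals[a] + vals[b]
        vals[a], vals[b] = s // 2, s - s // 2   # floor / ceil split

rng = random.Random(1)
vals = [rng.randrange(100) for _ in range(64)]
total = sum(vals)
for _ in range(40):
    balance_round(vals, rng)
print(sum(vals) == total, max(vals) - min(vals))
```

After a logarithmic number of rounds the spread is down to a couple of units while the sum is untouched, which is exactly the shape of the guarantee in Theorem 1.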
2 Protocol Description

The protocol repeats a single stage consisting of 3N/R steps; three consecutive steps are called a round. A stage is in fact one step of the protocol of Ghosh and Muthukrishnan.

Description of a Stage.
1. Each station Si chooses t, t′ ∈ {1, . . . , N} and a bit b uniformly at random (the choices of t, t′ and b are stochastically independent).
2. Station Si performs the following actions during round ⌈t/R⌉ on the channel t mod R.
– If b = 0, then Si transmits Ti at step 1. Otherwise, it transmits Ti at step 2.
– At step 3, station Si listens. If a message comes from another station with the transmitted Ti and another value, say Tj, then Si changes Ti as follows:
• if Ti + Tj is even, then station Si puts Ti := (Ti + Tj)/2,
• if Ti + Tj is odd, then station Si puts Ti := ⌈(Ti + Tj)/2⌉ if its b equals 0, and Ti := ⌊(Ti + Tj)/2⌋ otherwise.
3. If t′ ≠ t, then during round ⌈t′/R⌉ station Si uses channel t′ mod R:
– it listens during the first two steps,
– it concatenates the messages heard and sends them during the third step.

The idea of the protocol is that the 3N/R steps of a single stage are used as N slots in which pairs of stations can balance their values. If everything works fine, then for a given channel at a given round:
– during step 1 a station Su for which b = 0 sends Tu,
– during step 2 a station Sv for which b = 1 sends Tv,
– step 3 is used to avoid Byzantine problems [6]: another station Sw repeats Tu and Tv. Otherwise, neither Su nor Sv could be sure that its message came through. (An update of only one of the values Tu or Tv would violate the condition that the sum of all values must not change.)

Of course, such a situation happens only for some slots. However, standard considerations (see for instance [3]) show the following fact:

Lemma 1. With high probability, during a stage balancing does occur at step 3 for at least c · N slots, where c is a fixed constant, 0 < c < 1.
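A runnable sketch of a stage (the channel/round bookkeeping is collapsed to one loop over N slots, and the rule that the b = 0 station takes the ceiling on an odd sum is an assumption; only sum preservation matters):

```python
import random

def stage(vals: list[int], N: int, rng: random.Random) -> None:
    """One stage: N slots; a slot balances one b=0 sender with one b=1
    sender, provided exactly one third station repeats both values."""
    n = len(vals)
    t = [rng.randrange(N) for _ in range(n)]    # slot used for sending
    tp = [rng.randrange(N) for _ in range(n)]   # slot used for repeating
    b = [rng.randrange(2) for _ in range(n)]
    for slot in range(N):
        s0 = [i for i in range(n) if t[i] == slot and b[i] == 0]
        s1 = [i for i in range(n) if t[i] == slot and b[i] == 1]
        rep = [i for i in range(n) if tp[i] == slot and t[i] != slot]
        if len(s0) == len(s1) == 1 and len(rep) == 1:
            i, j = s0[0], s1[0]
            tot = vals[i] + vals[j]
            # assumed tie-break: the b=0 station takes the ceiling
            vals[i], vals[j] = -(-tot // 2), tot // 2

rng = random.Random(7)
vals = [rng.randrange(50) for _ in range(32)]
total = sum(vals)
for _ in range(300):
    stage(vals, 32, rng)
print(sum(vals) == total, max(vals) - min(vals))
```

Only slots with exactly one sender of each bit and exactly one repeater succeed; a constant fraction of the N slots per stage does, which is the content of Lemma 1.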
Note that Lemma 1 also holds if, for a constant fraction of slots, a communication failure occurs or an adversary scrambles the messages. Since the stations communicate at randomly chosen moments, it is difficult for an adversary to attack only some group of stations.
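A back-of-the-envelope estimate for the constant c of Lemma 1, assuming n = N and a Poisson approximation of the slot occupancies (the exact constant in the paper may differ):

```python
import math

# Poisson back-of-envelope for the per-slot success probability when
# n = N: a slot balances iff exactly two stations pick it for sending,
# their bits b differ, and exactly one station picks it for repeating.

def p_success(lam_send: float = 1.0, lam_repeat: float = 1.0) -> float:
    p_two_senders = math.exp(-lam_send) * lam_send ** 2 / 2   # Poisson, k = 2
    p_bits_differ = 0.5
    p_one_repeater = math.exp(-lam_repeat) * lam_repeat       # Poisson, k = 1
    return p_two_senders * p_bits_differ * p_one_repeater

print(p_success())
```

Under these assumptions the value comes out to e⁻²/4, a small but fixed constant, so a constant fraction of the N slots balances per stage.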
3 Analysis of the Protocol

The analysis consists of three different phases (even though the stations behave in exactly the same way all the time). In Phase I we show that some potential function reaches a certain low level; this part is borrowed from [4]. In Phase II we guarantee with high probability that all stations deviate by at most β from the average value. Then Phase III is used to show that with high probability all stations hold one of at most 3 consecutive values. In order to simplify the presentation we call the stations S1, . . . , Sn, even though the algorithm does not use any IDs of the stations.
3.1 Phase I

Let $T_{t,j}$ denote the value of $T_j$ held by $S_j$ immediately after executing stage $t$. Let $\bar{T}$ denote the average value, that is, $\bar{T} = \frac{1}{n}\sum_{i=1}^n T_{0,i}$. We examine the values $x_{t,j} = T_{t,j} - \bar{T}$. In order to examine the differences from the average value we consider the following potential function:
$$\Delta_t = \sum_{i=1}^n x_{t,i}^2.$$
Claim A. $E[\Delta_{t+1}] \le \rho \cdot \Delta_t + \frac{n}{4}$ for some constant $\rho < 1$.
Proof. We first assume that the new values held by two stations after balancing become equal (possibly reaching non-integer values). Then we make an adjustment to the real situation, where the values must remain integers. By linearity of expectation,
$$E[\Delta_{t+1}] = E\left[\sum_{i=1}^n x_{t+1,i}^2\right] = \sum_{i=1}^n E\left[x_{t+1,i}^2\right].$$
So now we inspect a single $E[x_{t+1,i}^2]$. As already mentioned, with a constant probability $\delta$ station $S_i$ balances $T_{t,i}$ with some other station, say with $S_s$. Assume that the values held by $S_i$ and $S_s$ become equal. So $x_{t+1,i}$ and $x_{t+1,s}$ become equal to $z = \frac{1}{2}(x_{t,i} + x_{t,s})$. Therefore
$$E\left[x_{t+1,i}^2\right] = (1-\delta)\cdot x_{t,i}^2 + \delta\cdot E\left[\left(\frac{x_{t,i}+x_{t,s}}{2}\right)^2\right]
= (1-\delta)\cdot x_{t,i}^2 + \delta\cdot E\left[\tfrac{1}{4}x_{t,i}^2 + \tfrac{1}{4}x_{t,s}^2 + \tfrac{1}{2}x_{t,i}\,x_{t,s}\right]
= (1-\tfrac{3}{4}\delta)\cdot x_{t,i}^2 + \tfrac{1}{4}\delta\cdot E\left[x_{t,s}^2\right] + \tfrac{1}{2}\delta\cdot E[x_{t,i}\cdot x_{t,s}].$$
Since $s$ is uniformly distributed over $\{1,\ldots,n\}$, we get
$$E\left[x_{t,s}^2\right] = \frac{1}{n}\sum_{j=1}^n x_{t,j}^2 = \frac{1}{n}\cdot\Delta_t.$$
The next expression we have to evaluate is
$$E[x_{t,i}\cdot x_{t,s}] = \frac{1}{n}\sum_{j=1}^n (x_{t,i}\cdot x_{t,j}) = \frac{1}{n}\cdot x_{t,i}\cdot\sum_{j=1}^n x_{t,j}.$$
Obviously, $\sum_{j=1}^n x_{t,j} = 0$. So $E[x_{t,i}\cdot x_{t,s}]$ equals 0 and finally
$$E\left[x_{t+1,i}^2\right] = (1-\tfrac{3}{4}\delta)\cdot x_{t,i}^2 + \tfrac{1}{4}\delta\cdot\frac{1}{n}\cdot\Delta_t.$$
When we sum up all the expectations $E[x_{t+1,i}^2]$ we get
$$E[\Delta_{t+1}] = (1-\tfrac{3}{4}\delta)\cdot\sum_{i=1}^n x_{t,i}^2 + \tfrac{1}{4}\delta\cdot\Delta_t = (1-\tfrac{3}{4}\delta)\cdot\Delta_t + \tfrac{1}{4}\delta\cdot\Delta_t = (1-\tfrac{1}{2}\delta)\cdot\Delta_t.$$
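The per-station identity at the heart of the computation above can be checked numerically (a sanity check, not part of the proof):

```python
import random

# Numeric check of the per-station step in the proof of Claim A:
#   (1-d)*x_i^2 + d*E_s[((x_i+x_s)/2)^2] = (1-(3/4)d)*x_i^2 + (1/4)d*Delta/n
# whenever the x_j sum to zero and s is uniform over all n stations.

rng = random.Random(0)
n, d = 50, 0.3
x = [rng.uniform(-10, 10) for _ in range(n)]
mean = sum(x) / n
x = [v - mean for v in x]                # enforce sum(x) = 0
Delta = sum(v * v for v in x)

for xi in x:
    lhs = (1 - d) * xi ** 2 + d * sum(((xi + xs) / 2) ** 2 for xs in x) / n
    rhs = (1 - 0.75 * d) * xi ** 2 + 0.25 * d * Delta / n
    assert abs(lhs - rhs) < 1e-9
print("per-station identity verified")
```

The cross term vanishes precisely because the deviations sum to zero, which is what makes the potential contract by a constant factor.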
Now, let us consider the case when $T_{t,i} + T_{t,s}$ (or equivalently $x_{t,i} + x_{t,s}$) is odd. Let us see how it contributes to the change of $\Delta_{t+1}$ compared to the value computed previously. In the simplified case, $S_i$ and $S_s$ contribute
$$2\cdot\left(\frac{x_{t,i}+x_{t,s}}{2}\right)^2$$
to the value of $\Delta_{t+1}$. Now, this contribution could be
$$\left(\frac{x_{t,i}+x_{t,s}+1}{2}\right)^2 + \left(\frac{x_{t,i}+x_{t,s}-1}{2}\right)^2.$$
For every $y$,
$$\left(\frac{y+1}{2}\right)^2 + \left(\frac{y-1}{2}\right)^2 = 2\cdot\left(\frac{y}{2}\right)^2 + \frac{1}{2},$$
so we have to increase the value computed for $\Delta_{t+1}$ by at most $\frac{1}{2}$ for each pair that has established a link. It follows finally that
$$E[\Delta_{t+1}] \le \rho\cdot\Delta_t + \frac{n}{4} \qquad (1)$$
for $\rho = (1 - \frac{1}{2}\delta)$.
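The rounding identity used above is elementary but easy to verify mechanically:

```python
# Mechanical check of the identity behind the n/4 slack in (1):
#   ((y+1)/2)^2 + ((y-1)/2)^2 = 2*(y/2)^2 + 1/2   for every y.

def rounding_slack(y: float) -> float:
    return ((y + 1) / 2) ** 2 + ((y - 1) / 2) ** 2 - 2 * (y / 2) ** 2

for k in range(-1000, 1001):
    assert abs(rounding_slack(k / 10) - 0.5) < 1e-9
print("rounding identity holds")
```

Since at most n/2 pairs balance in a stage, the total integrality loss per stage is at most n/4, which is the additive term in (1).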
Claim B. After $\tau_0 = O(\log D + \log n)$ stages, $\Delta_{\tau_0} \le \alpha \cdot n + 1$ with probability $1 - O(\frac{1}{n^2})$.

Proof. Let $\nabla_t = \Delta_t - \alpha n$, for $\alpha = \frac{1}{4(1-\rho)}$. By inequality (1),
$$E[\nabla_{t+1}] = E[\Delta_{t+1} - \alpha n] \le \rho\cdot\Delta_t + \frac{n}{4} - \alpha n = \rho\cdot\nabla_t + \rho\alpha n + \frac{n}{4} - \alpha n = \rho\cdot\nabla_t.$$
It follows that $E[\nabla_{t+1}] \le \rho\cdot E[\nabla_t]$. Let $\tau_0 = \lceil\log_{\rho^{-1}}(\nabla_0\cdot n^2)\rceil$; since $\nabla_0 \le \Delta_0 \le nD^2$, we have $\tau_0 = O(\log D + \log n)$. Then $E[\nabla_{\tau_0}] \le n^{-2}$. So by the Markov inequality, $\Pr[\nabla_{\tau_0} \ge 1] \le n^{-2}$. We conclude that $\Pr[\Delta_{\tau_0} < 1 + \alpha n]$ is at least $1 - n^{-2}$.
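To see the $O(\log D + \log n)$ behavior of $\tau_0$ concretely, one can evaluate the bound with the source's $\rho = 1 - \delta/2$ and the crude estimate $\Delta_0 \le nD^2$; the value $\delta = 0.1$ is an illustrative assumption, the paper only says $\delta$ is a constant.

```python
import math

# Evaluating tau0 = ceil(log_{1/rho}(Delta0 * n^2)) from Claim B,
# with rho = 1 - delta/2 and the crude estimate Delta0 <= n * D^2.

def tau0(n: int, D: int, delta: float = 0.1) -> int:
    rho = 1 - delta / 2
    delta0_bound = n * D * D
    return math.ceil(math.log(delta0_bound * n * n) / math.log(1 / rho))

for n, D in [(100, 10), (10_000, 10), (100, 10_000)]:
    print(f"n={n:>6} D={D:>6} tau0={tau0(n, D)}")
```

Squaring n or D only adds a constant multiple of the logarithm, so the stage count grows logarithmically in both parameters.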
3.2 Phase II
We assume that $\Delta_{\tau_0} < 1 + \alpha n$. Let $\beta = 2\sqrt{\alpha}$. Let $B = B_t$ be the set of stations $S_i$ such that $|x_{t,i}| > \beta$, and $G = G_t$ be the set of stations $S_j$ such that $|x_{t,j}| = \beta$.

Claim C. $|B \cup G| < \frac{n}{4} + O(1)$ for each $t \ge \tau_0$.

Proof. The stations from $B \cup G$ contribute at least $|B \cup G|\cdot\beta^2 = |B \cup G|\cdot 4\alpha$ to $\Delta_t$. Since $\Delta_t \le \Delta_{\tau_0} < \alpha n + 1$, we must have $|B \cup G| < \frac{n}{4} + \frac{1}{4\alpha}$.

Now we define a potential function $\Delta'$ used to measure the size of $B_t$.

Definition 1. For a station $S_i \in B_t$ define $\tilde{x}_{i,t} = |x_{i,t}| - \beta$, and $\tilde{x}_{i,t} = 0$ otherwise. Then
$$\Delta'_t = \sum_{i \in B_t} \tilde{x}_{i,t}^2.$$
Claim D. $E[\Delta'_{t+1}] \le \mu \cdot \Delta'_t$ for some constant $\mu < 1$.
Proof. We consider a single station $S_i \in B_t$, with $x_{i,t} > \beta$. By Claim C, with a constant probability it communicates with a station $S_j \notin B_t \cup G_t$. In this case, station $S_j$ may join $B$, but let us consider the contribution of stations $S_i$ and $S_j$ to $\Delta'_{t+1}$. Let $\delta' = \beta - x_{j,t} > 0$. Then:
$$\tilde{x}_{i,t+1}^2 + \tilde{x}_{j,t+1}^2 \le 2\cdot\left(\frac{\tilde{x}_{i,t}-\delta'}{2}\right)^2 < 2\cdot\left(\frac{\tilde{x}_{i,t}}{2}\right)^2 = \frac{\tilde{x}_{i,t}^2}{2}.$$
If $S_i$ communicates with a station $S_j \in B_t \cup G_t$, then obviously $\tilde{x}_{i,t+1}^2 + \tilde{x}_{j,t+1}^2 \le \tilde{x}_{i,t}^2 + \tilde{x}_{j,t}^2$. We may apply a simple bookkeeping technique assigning the contributions of the stations to $\Delta'_{t+1}$ so that the expected value of the contribution of $S_i$ is bounded by $\mu\cdot\tilde{x}_{i,t}^2$ for some constant $\mu < 1$ (we assign to $S_i$ the contribution of $S_i$ and $S_j$ in the first case, and the contribution of $S_i$ alone in the second case). Since by linearity of expectation $E[\Delta'_{t+1}]$ is the sum of the expected values of these contributions, $E[\Delta'_{t+1}] \le \mu\cdot E[\Delta'_t]$.

Claim E. For $t \ge \tau_0 + T$, where $T = O(\log n)$, the set $B_t$ is empty with probability $1 - O(\frac{1}{n^2})$.

Proof. By Claim D, $E[\Delta'_{\tau_0+t}] \le \mu^t\cdot E[\Delta'_{\tau_0}]$. On the other hand, $E[\Delta'_{\tau_0}] \le E[\Delta_{\tau_0}] < \alpha\cdot n + 1$. Let $T = \log_{\mu^{-1}}(2\alpha\cdot n^3)$ and $t \ge T$. Then $E[\Delta'_{\tau_0+t}] \le n^{-2}$. So by the Markov inequality, $\Pr[\Delta'_{\tau_0+t} \ge 1] \le n^{-2}$. Observe that $\Delta'$ takes only integer values, so $B_{\tau_0+t}$ is empty if $\Delta'_{\tau_0+t} < 1$.

3.3 Phase III

Let $\tau_1 = \tau_0 + T$, where $T$ satisfies Claim E. Then for $t > \tau_1$ we may assume that no station has $|x_{i,t}| \ge \beta$. Our goal now is to reduce the maximal value of $|x_{i,t}|$. We achieve this in at most $2\beta - 1$ subphases, each consisting of $O(\log n)$ stages: during each subphase we "cut off" one of the values that can be taken by $x_{i,t}$, always the smallest or the biggest one. Let $V(s) = V_t(s)$ be the set of stations for which $x_{i,t}$ takes the value $s$. Consider $t_1 > \tau_1$. Let $l = \min\{x_{i,t_1} : i \le n\}$ and $g = \max\{x_{i,t_1} : i \le n\}$. Assume that $l + 1 < g - 1$ (so there are at least four values of the numbers $x_{i,t_1}$). We show that for $t = t_1 + O(\log n)$ either $V_t(l) = \emptyset$ or $V_t(g) = \emptyset$.
Obviously, no station may join $V_t(l)$ or $V_t(g)$, so their sizes are non-increasing. Now consider a single stage. Observe that $|V_t(l) \cup V_t(l+1)| \le \frac{n}{2}$ or $|V_t(g) \cup V_t(g-1)| \le \frac{n}{2}$. W.l.o.g. we may assume that $|V_t(g) \cup V_t(g-1)| \le \frac{n}{2}$. Consider a station $S_i \in V_t(g)$. With a constant probability $S_i$ communicates with a station $S_j$ that does not belong to $V_t(g) \cup V_t(g-1)$. Then station $S_i$ leaves $V_t(g)$ and $S_j$ remains outside $V_t(g)$. Indeed, the values $x_{i,t}$ and $x_{j,t}$ differ by at least 2, so $x_{i,t+1}, x_{j,t+1} \le x_{i,t} - 1$. It follows that $E[|V_{t+1}(g)|] \le \psi\cdot|V_t(g)|$ for some $\psi < 1$. We see that in a single stage we expect either $|V_t(l)|$ or $|V_t(g)|$ to shrink by a constant factor. Using the Markov inequality as in the proof of Claim E we may then easily derive the following property:
Claim F. For some $T = O(\log n)$, if $t > T$, then with probability $1 - O(\frac{1}{n^2})$ either the set $V_{\tau_1+t}(l)$ or the set $V_{\tau_1+t}(g)$ is empty.

By Claim F, after $O(\beta\log n) = O(\log n)$ stages we end up in a situation in which there are at most three values taken by $x_{i,t}$. Even then, we may proceed in the same way as before in order to reduce the sizes of $V_t(l)$ or $V_t(g)$, as long as one of these sets has size $\Omega(n)$. So we can derive the following claim, which concludes the proof of Theorem 1:

Claim G. For some $T = O(\log n)$, for $\tau_2 = \tau_1 + T$ and $t \ge \tau_2$, with probability $1 - O(\frac{1}{n^2})$ either $x_{i,t}$ takes only two values, or there are three values and the number of stations holding the smallest and the largest values is at most $\gamma\cdot n$.
4 Properties of the Protocol and Discussion

Changes in the Network. By a simple examination of the proof we get the following additional properties:
– The result holds even if a constant fraction of messages is lost. This only increases the number of stages by a constant factor.
– If some number of stations goes down during the execution of the protocol, then the final values do not concentrate around the average of the original values of the stations that have survived, but they still differ by at most 2. If a new station joins the network and its value deviates from the values held by the rest of the stations, then we may proceed with the same analysis. Conclusions regarding the rate of convergence and the rate at which new stations emerge can be derived as before.

Energy Efficiency. Time complexity is not the most important complexity measure for mobile networks. Since the devices are powered by batteries, it is important to design algorithms that are energy efficient (otherwise, the network may fail due to exhaustion of batteries). The main usage of energy is for transmitting messages and listening to the communication channel. Energy usage of internal computations and sensors is substantially smaller and can be neglected. Surprisingly, comparable amounts of energy are necessary for transmitting and for listening. A properly designed algorithm should require a small number of messages (not only messages sent, but also messages awaited by the stations). Additionally, the differences between the numbers of messages sent or awaited by different stations should be as small as possible. The reason is that with similar energy resources no station should be at a higher risk of going down due to energy exhaustion. In our algorithm the energy usage of each station is Θ(log n).
This is optimal, since that many sending trials are needed to ensure that a station's value has been transmitted successfully with high probability in the presence of a constant probability of transmission failure.

Protocol Extensions – Getting Exactly One Value. Our algorithm leaves the network in a state in which there might be 3 different values. The bound from Theorem 1 regarding the behavior of the algorithm cannot be improved. Indeed, consider the following example:
assume that initially exactly one station holds value T − 1, exactly one station holds T + 1, and the rest hold value T. Then, in order to get into the state where all stations hold value T, the station with value T − 1 must communicate with the station with value T + 1. However, the probability that this happens during a single stage is Θ(1/N). Therefore, the probability that these two stations encounter each other within a logarithmic number of stages is O(log N/N).

Once we are left with values that differ from the average value by less than two, it is quite reasonable to start a procedure for computing the minimum over all active stations. In fact, it suffices to redefine part 3 of a stage: instead of computing the average of two values, both stations are assigned the smaller of their two values. Arguing as in Claim E, we may easily show that after O(log n) stages with high probability all stations know the minimum.

If each station knows the number n of the active stations, a simple trick may be applied to compute the average value exactly. At the beginning, each value is multiplied by n. Then the average value becomes $s = \sum_i T_i$ and it is an integer. So after executing the protocol we end up in the second situation described in Theorem 1, or all stations hold the same value. In order to get rid of the former situation, a simple protocol may broadcast the minimal and maximal values to all stations within O(log N) steps. Then all stations may find s and thereby the average s/n.

Dynamic Processes. For a dynamic process, in which the values considered are changing (think for instance about the output of sensors), we may observe that the protocol works quite well. For the sake of discussion assume that the values may only increase. If we wish to ignore the effect of increments of the values, we may think about old units of the values, existing at the beginning of the protocol, and new ones, due to incrementing the values.
When executing part 3 of a stage and "allocating" the units to stations A and B, we may assume that first the same (up to 1) amount of old units is given to A and B, and afterwards the new units are assigned. In this way, the new units do not influence the behavior of the old ones. So a good distribution of the old units will be achieved, as stated by Theorem 1, despite the fact that the values have changed.

Security Issues. Since the stations exchange information at random moments, an adversary which only disturbs communication can only slow down the rate of convergence. However, if it knows that there is a station with a value X that differs a lot from the average, it may increase its chances a little bit: if X has not occurred so far during a stage, then it might be advantageous for the adversary to scramble the rest of the stage, making sure that X remains untouched. Of course, serious problems occur when an adversary can fake messages of the legitimate stations. If the legitimate stations have a common secret, say K, then these problems can be avoided. Faking messages becomes hard when the messages are secured with a MAC code using K. In order to avoid the first problem it is necessary to encipher all messages (together with random nonces). In this case an adversary cannot tell which values are exchanged by the algorithm. The only information that might be derived is the fact that somebody has transmitted at a given time. But this seems not to bring any substantial advantage, except that then it could be advantageous to attack the third step
of a round. Encryption with a symmetric algorithm should be no problem regarding speed differences between transmission and internal computations.
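The exact-average trick described earlier (multiply every value by n, agree on the integer s = ΣTi, divide locally) can be sketched as follows; the distributed agreement itself is abstracted into exact arithmetic here.

```python
from fractions import Fraction

# The multiply-by-n trick: scale every value by n so that the target
# average s = sum(T_i) becomes an integer, agree on s (the distributed
# protocol is abstracted away here), then divide by n locally at every
# station.

def exact_average(values: list[int]) -> Fraction:
    n = len(values)
    scaled = [v * n for v in values]   # each station multiplies locally
    s = sum(scaled) // n               # = sum(values): the integer all agree on
    return Fraction(s, n)

print(exact_average([3, 4, 6]))        # stations agree on s = 13, answer 13/3
```

The point of the scaling is that the average of the scaled values is an exact integer, so the "all stations equal" outcome of the protocol becomes achievable.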
Acknowledgment We thank Artur Czumaj for some ideas developed together and contained in this paper.
References
1. Chlebus, B. S.: Randomized communication in radio networks. In: Handbook of Randomized Computing (P. M. Pardalos, S. Rajasekaran, J. H. Reif, J. D. P. Rolim, Eds.), Kluwer Academic Publishers, 2001, vol. I, 401–456
2. Czumaj, A., Kanarek, P., Kutyłowski, M., Loryś, K.: Distributed stochastic processes for generating random permutations. ACM-SIAM SODA '99, 271–280
3. Czumaj, A., Kutyłowski, M.: Generating random permutations and delayed path coupling method for mixing time of Markov chains. Random Structures and Algorithms 17 (2000), 238–259
4. Ghosh, B., Muthukrishnan, S.: Dynamic load balancing in parallel and distributed networks by random matchings. JCSS 53(3) (1996), 357–370
5. Jurdziński, T., Kutyłowski, M., Zatopiański, J.: Energy-efficient size approximation for radio networks with no collision detection. COCOON 2002, LNCS 2387, Springer-Verlag, 279–289
6. Lamport, L., Shostak, R., Pease, M.: The Byzantine generals problem. ACM TOPLAS 4 (1982), 382–401
7. Stojmenović, I. (Ed.): Handbook of Wireless Networks and Mobile Computing, Wiley, 2002
8. Upfal, E.: Design and analysis of dynamic processes: a stochastic approach. ESA '98, LNCS 1461, Springer-Verlag, 26–34
A Polynomial-Time Algorithm for Deciding True Concurrency Equivalences of Basic Parallel Processes

Sławomir Lasota

Institute of Informatics, Warsaw University, Poland
[email protected]

Abstract. A polynomial-time algorithm is presented to decide distributed bisimilarity of Basic Parallel Processes. As a direct conclusion, several other non-interleaving semantic equivalences are also decidable in polynomial time for this class of processes, since they coincide with distributed bisimilarity.
1 Introduction

One important problem in the verification of concurrent systems is to check whether two given systems P and Q are equivalent under a chosen notion of equivalence. For process algebras generating infinite-state systems the equivalence-checking problem cannot be decidable in general; therefore restricted classes of processes have been defined and investigated. We study here the class of Basic Parallel Processes [9] (BPP), an extension of recursively defined finite-state systems by parallel composition. Strong bisimilarity [25] is a well-accepted behavioural equivalence, which often remains decidable for infinite-state systems. An elegant proof of decidability of bisimilarity for BPP, and even for BPPτ, the extension of BPP by communication, was given in [10]. A PSPACE lower bound has recently been proved in [26], followed by the PSPACE-completeness result of Jančar [18]. On the other hand, all other equivalences in van Glabbeek's spectrum are undecidable [17]. BPP is the natural class of processes in which to investigate non-interleaving equivalences, intended to capture truly concurrent computations of a system. One of the bisimulation-like non-interleaving equivalences is distributed bisimilarity [6], taking into account the spatial distribution of a process. Already in [8] distributed bisimilarity was shown to be decidable on BPPτ by means of a sound and complete tableau proof system. Concerning complexity, the tableau depth was only bounded exponentially. In this paper we design a polynomial-time decision procedure for distributed bisimilarity. It strongly relies on the polynomial-time algorithm for deciding strong bisimilarity on normed BPP processes proposed in [16]. Distributed bisimilarity is therefore very likely to be computationally more feasible than interleaving bisimilarity, in the light of the recent PSPACE lower bound for the latter by Srba [26]. Further interesting conclusions follow from the fact that many non-interleaving equivalences coincide on BPP.
As mentioned in [12], Kiehn proved [21] that location equivalence [7], causal equivalence [11] and distributed bisimilarity all coincide
A part of this work was performed during a post-doc stay at Laboratoire Spécification et Vérification, ENS Cachan. Partially supported by the KBN grant 7 T11C 002 21 and the EC Research Training Network "Games and Automata for Synthesis and Validation" (GAMES).
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 521–530, 2003. © Springer-Verlag Berlin Heidelberg 2003
on CPP, a sublanguage of BPPτ without explicit τ, hence also on BPP. Furthermore, causal equivalence and history preserving bisimilarity [13] coincide on BPP by the result of Aceto [1]; moreover, Fröschle showed the coincidence of distributed and history preserving bisimilarity on BPP [12]. The coincidence with performance equivalence, a timed bisimulation equivalence proposed by [14], has been shown in [23], completing the picture¹. As a direct conclusion from all these results, all the mentioned equivalences can also be decided in polynomial time on BPP. Related results are the decision procedures of [15] for causal equivalence, location equivalence and ST-bisimulation equivalence of BPPτ, as well as for their weak versions on a subset of BPPτ. However, complexity issues were not addressed there. Furthermore, the polynomial-time complexity of performance equivalence extends the result of [5], shown only for timed BPP in full standard form [9]. Surprisingly, the polynomial-time complexity of history preserving bisimilarity on BPP can be contrasted with its EXPTIME-completeness on finite-state systems (finite 1-safe nets) [19]. Similarly, the decidability of hereditary history preserving bisimilarity [4] on BPP, proved in [12], can be contrasted with its undecidability on finite 1-safe nets, shown by Jurdziński and Nielsen in [20]. We start with Section 2, containing definitions and some basic facts, and then outline our algorithm in Section 3. The algorithm works for BPP processes in standard form, similarly as in [9]. A polynomial-time preprocessing procedure transforming a process into standard form can be found in the full version of the paper [24].
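Before the formal definitions, a minimal executable view of the BPP syntax may help: the grammar given in the next section (empty process, constants, action prefix, finite choice, parallel composition, left merge) as a Python AST, plus the guardedness check the algorithm assumes; all class names are illustrative.

```python
from dataclasses import dataclass

# Illustrative AST for BPP process expressions.
@dataclass
class Nil:                      # the empty process 0
    pass

@dataclass
class Const:                    # a process constant X
    name: str

@dataclass
class Prefix:                   # action prefix a.P
    action: str
    cont: object

@dataclass
class Choice:                   # finite nondeterministic choice
    branches: tuple

@dataclass
class Par:                      # parallel composition
    left: object
    right: object

@dataclass
class LeftMerge:                # left merge: first action on the left
    left: object
    right: object

def guarded(e) -> bool:
    """True iff every constant occurrence is under an action prefix."""
    if isinstance(e, Nil):
        return True
    if isinstance(e, Const):
        return False            # bare constant at top level: unguarded
    if isinstance(e, Prefix):
        return True             # everything below the prefix is guarded
    if isinstance(e, Choice):
        return all(guarded(p) for p in e.branches)
    return guarded(e.left) and guarded(e.right)   # Par, LeftMerge

# X = a.(Y || X) + b.Z is guarded; X = (a.Y) || X + b.Z is not.
good = Choice((Prefix("a", Par(Const("Y"), Const("X"))), Prefix("b", Const("Z"))))
bad = Choice((Par(Prefix("a", Const("Y")), Const("X")), Prefix("b", Const("Z"))))
print(guarded(good), guarded(bad))
```

Guardedness is exactly the assumption placed on input process definitions below, so a checker like this would be the natural first step of a preprocessing pass.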
2 Basic Definitions and Facts Let Act be a finite set of actions, ranged over by a, b, etc. and let Const be a finite set of process constants, ranged over by X, Y , etc. The set of BPP process expressions [9] over Act and Const is given by: Pi | P P | P P (1) P ::= 0 | X | a.P | i∈I
where 0 denotes the empty process, a. is an action prefix, ∑_{i∈I} Pi is a finite nondeterministic choice for a finite nonempty set I, and ‖ stands for parallel composition. The only operator not present in CCS [25] is the left merge ⌊, which differs from the parallel composition only in that the very first action must be performed in the left argument. The purpose of considering ⌊ here is the standard form (3) below. A BPP process definition ∆ consists of a finite set Act(∆) of actions, a finite set Const(∆) of constants and a finite number of recursive process equations

X def= P,
one for each constant X ∈ Const(∆), where P is a process expression over Act(∆) and Const(∆). The sets Const(∆) and Act(∆) are often not even mentioned explicitly, as they can be deduced from the process equations. In the sequel we shall assume that the process definition our algorithm inputs is guarded, i.e., that each occurrence of a constant on the right-hand side is within the scope of an action prefix. For instance, X def= a.Y ‖ X + b.Z is not guarded.

¹ Related results are [2], where Aceto proved that distributed bisimilarity coincides with timed bisimilarity on BPP without recursion, and [22], where decidability of strong bisimilarity for timed BPP was shown, which generalizes the result of [10].

By a BPP process we mean a pair (P, ∆), where ∆ is a BPP process definition and P is a BPP process expression over Act(∆) and Const(∆). When ∆ is evident from the context, P itself is called a process too. Distributed bisimilarity was introduced in [6], but here we follow [9]. Given a BPP process definition ∆, consider the following SOS transition rules:
a.P →a [P, 0]

if (X def= P) ∈ ∆ and P →a [P′, P′′], then X →a [P′, P′′]

if Pj →a [P′, P′′] for some j ∈ I, then ∑_{i∈I} Pi →a [P′, P′′]

if P →a [P′, P′′], then P ⌊ Q →a [P′, P′′ ‖ Q]

if P →a [P′, P′′], then P ‖ Q →a [P′, P′′ ‖ Q]

if Q →a [Q′, Q′′], then P ‖ Q →a [Q′, P ‖ Q′′]    (2)
We write P →a [P′, P′′] if this transition can be derived from the above rules. The rules reflect a view of a process as distributed in space. Each transition P →a [P′, P′′] gives rise to a local derivative P′, which intuitively records the location at which the action is observed, and a concurrent derivative P′′, recording the part of the process separated from the local component. BPP processes (P1, ∆1) and (P2, ∆2) are distributed bisimilar, denoted (P1, ∆1) ∼ (P2, ∆2), if they are related by some distributed bisimulation R, i.e., a binary relation over BPP process expressions such that whenever (P, Q) ∈ R, for each a ∈ Act:

– if P →a [P′, P′′] then Q →a [Q′, Q′′] for some Q′, Q′′ such that (P′, Q′) ∈ R and (P′′, Q′′) ∈ R,
– if Q →a [Q′, Q′′] then P →a [P′, P′′] for some P′, P′′ such that (P′, Q′) ∈ R and (P′′, Q′′) ∈ R.

In the next section we prove polynomial-time complexity of the problem of checking distributed bisimilarity for a given pair of constants. We do not lose generality, as checking P ∼ Q for arbitrary P, Q is equivalent to checking X_P ∼ X_Q, where X_P and X_Q are fresh constants with defining equations X_P def= a.P and X_Q def= a.Q, for an arbitrary a. Moreover, w.l.o.g. we assume that both constants share a process definition.

Problem: DISTRIBUTED BISIMILARITY FOR BPP
Instance: A BPP process definition ∆ and X, Y ∈ Const(∆)
Question: (X, ∆) ∼ (Y, ∆)?

Christensen presented in [8] a sound and complete tableau proof system for ∼ on BPPτ and proved an exponential upper bound for the depth of a tableau.

Theorem 1 ([8, 9]). Distributed bisimilarity is decidable on BPP.
Christensen [9] showed also that each BPP process definition ∆ can be effectively transformed into an equivalent process definition ∆′ in standard form, i.e., consisting exclusively of process equations in the restricted form

X def= ∑_{i∈I} (ai.Pi) ⌊ Qi,    (3)
where all Pi and Qi are merely a parallel composition of constants, i.e., of the form

X1 ‖ X2 ‖ … ‖ Xn,  for n > 0 and X1, …, Xn ∈ Const(∆′).    (4)
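As an aside not from the paper: since a basic process is, up to ∼, determined by the multiplicities of its constants, Python's Counter multisets model basic processes and their parallel composition directly (the function name par is ours, for illustration):

```python
from collections import Counter

def par(*procs):
    """Parallel composition of basic processes: multiset union of constants."""
    out = Counter()
    for p in procs:
        out += p
    return out

# X1 ‖ X1 ‖ X2, written in two different orders:
P = Counter({'X1': 2, 'X2': 1})
Q = Counter({'X2': 1, 'X1': 2})
assert P == Q  # ‖ is associative and commutative, so only multiplicities matter
```

Storing these multiplicities in binary is also the idea behind the succinct representation used in Section 3.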
Note that (3) is not guarded in general. We omit brackets in (4) as ‖ is associative and commutative w.r.t. ∼ (and w.r.t. any other known semantical equivalence, in fact). A parallel composition of constants (4) is called a basic process (expression) in the sequel. Observe that the processes Pi in (3) are precisely the local derivatives of X and the processes Qi are precisely its concurrent derivatives. Hence the left merge operator allows one to syntactically separate the two kinds of derivatives. Consequently, in the next section we will only consider basic processes, since both local and concurrent derivatives of a basic process are basic again. Since ∆ is guarded, the process definition ∆′ produced by Christensen's transformation is bounded in the following sense:

Definition 1. Define a binary relation ≺1 over Const(∆′) as follows: Y ≺1 X iff Y appears in some concurrent derivative of X. We say that ∆′ is bounded if the transitive closure ≺1⁺ of ≺1 is irreflexive, i.e., no constant satisfies X ≺1⁺ X.
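Boundedness amounts to acyclicity of ≺1, so it can be checked by ordinary depth-first cycle detection; a sketch under the assumption that ≺1 has been extracted into a dictionary (the encoding is ours, for illustration):

```python
def is_bounded(prec1):
    """prec1 maps each constant to the set of constants related to it by ≺1.
    The definition is bounded iff this directed graph has no cycle,
    i.e. the transitive closure of ≺1 is irreflexive."""
    nodes = set(prec1) | {y for ys in prec1.values() for y in ys}
    color = {x: 0 for x in nodes}  # 0 = unvisited, 1 = on the DFS stack, 2 = done

    def visit(x):
        color[x] = 1
        for y in prec1.get(x, ()):
            if color[y] == 1 or (color[y] == 0 and not visit(y)):
                return False  # found some constant X with X ≺1⁺ X
        color[x] = 2
        return True

    return all(color[x] != 0 or visit(x) for x in nodes)
```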
Christensen's transformation produces a ∆′ that contains only basic processes consisting of at most three constants. The price for this constant bound is the exponential size of ∆′ w.r.t. the size of ∆, defined as the number of process equations in ∆ plus the sum of the lengths of the right-hand sides. This is why we proved (Theorem 2 below) that the transformation to the standard form (3) can be done in polynomial time.
Theorem 2. There exists a polynomial-time algorithm that transforms a guarded process definition ∆ into a process definition ∆′ such that:
1. ∆′ is in standard form (3),
2. ∆′ is bounded,
3. Const(∆) ⊆ Const(∆′) and for each X ∈ Const(∆), (X, ∆) ∼ (X, ∆′).
(The proof is given in the full version of this paper [24].)

Strong bisimilarity is defined w.r.t. single-derivative transitions P →a P′ obtained from the rules (2) when the distribution of a process is ignored – for details we refer e.g. to [25]. A remarkable result is that strong bisimilarity can be decided for BPP processes in polynomial time [16], but only when all the constants are normed. A constant, or generally a process P, is normed if an inactive process is reachable from P, i.e., if there is a finite sequence of transitions P →a1 P1 →a2 … →an Pn, n ≥ 0, such that there is no further transition from Pn. The norm of P is the length of the shortest such sequence. Strong bisimilarity is less restrictive than distributed bisimilarity. Hence a process definition ∆ can be transformed into a strongly bisimilar process definition ∆′ in full standard form [9], which is more restrictive than (3). Full standard form admits exclusively process equations of the form X def= ∑_{i∈I} ai.Pi, where all Pi are basic again.

Theorem 3 ([16]). There exists a polynomial-time algorithm to decide strong bisimilarity on normed BPP in full standard form.
3 Algorithm

Throughout this section we fix ∆, and hence also the sets of constants and actions. ∆ is assumed to be in standard form and bounded. A more succinct representation is possible for ∆ in standard form: due to associativity and commutativity of ‖, it is sufficient to remember the number of occurrences of each constant in each basic expression Pi and Qi in the right-hand sides of the equations (3) of ∆, encoded in binary. In this section, complexity is measured w.r.t. the size of ∆, defined as the number of process equations plus the sum of the lengths of the succinct representations of the right-hand sides. Theorem 3 is still valid for this definition of size.

The reachability relation over process expressions is defined as the smallest transitive relation such that each P is related to all its local and concurrent derivatives, i.e., whenever P →a [P′, P′′], the relation contains the pairs (P, P′) and (P, P′′). We say that Q is reachable from P if the pair (P, Q) is in the reachability relation. Let us denote by P the set of all process expressions reachable from the constants. As observed previously, all P ∈ P are basic. Unless stated otherwise, all the relations mentioned below are binary relations over P. This includes also ∼, which is from now on restricted to P. Unless explicitly stated otherwise, P, Q, etc., range over P.

An exponential-time algorithm can be easily derived as follows. First define two monotone operators. The operator B2 acts on a pair of relations: (P, Q) ∈ B2(R′, R′′) iff for each a ∈ Act,

– if P →a [P′, P′′] then Q →a [Q′, Q′′] for some Q′, Q′′ such that (P′, Q′) ∈ R′ and (P′′, Q′′) ∈ R′′,
– if Q →a [Q′, Q′′] then P →a [P′, P′′] for some P′, P′′ such that (P′, Q′) ∈ R′ and (P′′, Q′′) ∈ R′′.

Then the operator B1 is defined by B1(R) := B2(R, R). Now, define the approximating equivalences ∼i as follows:

– P ∼0 Q for all P and Q,
– ∼i+1 := B1(∼i), for i ≥ 0.

Distributed bisimulations R are exactly the post-fixed points of B1, i.e., relations satisfying R ⊆ B1(R). Hence ∼, being the union of all distributed bisimulations, is the greatest fixed point of B1, by the Knaster-Tarski theorem. Recall that BPP is image-finite, that is, each process expression has only finitely many local and concurrent derivatives. Thus by a standard argument the decreasing chain {∼i} converges to ∼:

∼ = ⋂_{i∈N} ∼i.    (5)
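On an explicitly given finite transition structure this chain can be computed directly: start from the full relation ∼0 and apply B1 until a fixed point is reached. The sketch below is the naive exponential procedure of this paragraph, not the polynomial algorithm developed later; the encoding of transitions as triples (action, local derivative, concurrent derivative) is our own.

```python
from itertools import product

def distributed_bisimilarity(procs, trans):
    """procs: finite collection of process ids;
    trans[p]: set of triples (a, local, concurrent) for p's transitions.
    Returns the greatest fixed point of B1 as a set of pairs."""
    rel = set(product(procs, procs))  # ~0 relates everything

    def matches(p, q, rel):
        # every move of p is answered by a move of q with related derivatives
        return all(any(b == a and (lp, lq) in rel and (cp, cq) in rel
                       for (b, lq, cq) in trans[q])
                   for (a, lp, cp) in trans[p])

    while True:
        # one application of B1 (rel stays symmetric, so two matches suffice)
        new = {(p, q) for (p, q) in rel
               if matches(p, q, rel) and matches(q, p, rel)}
        if new == rel:
            return rel
        rel = new
```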
Furthermore, observe that each local derivative of a basic process X1 ‖ … ‖ Xn is a local derivative of some Xj, i.e., is equal to some process Pi appearing in a process equation (3) of ∆. In consequence, the number of local derivatives of all basic processes is polynomial. Let us denote the set of all of them by L. Moreover, there are only exponentially many processes reachable from each P ∈ L – this follows easily from the boundedness of ∆. Consequently, the cardinality N of the whole P is exponential. Hence ∼
can be computed over P in exponential time, e.g., as the limit of the sequence {∼i} of equivalences, since the sequence stabilizes after at most N−1 steps. We have not focused on the details of the exponential-time algorithm, as in the rest of this section we argue that one can do better: the problem can be solved in polynomial time. Essentially, this is possible due to a "quicker" convergence to the greatest fixed point, as explained in Lemma 2 below and thereafter. Then, in the crucial Lemma 4 we reduce distributed bisimilarity to strong bisimilarity between normed processes. To this aim we incorporate local derivatives into actions and obtain single-derivative transitions. We start with a couple of definitions and simple facts.

Definition 2. Given a binary relation S, a distributed bisimulation w.r.t. S is any binary relation R such that R ⊆ B2(S, R). P and Q are distributed bisimilar w.r.t. S, denoted P ∼^S Q, if they are related by some distributed bisimulation w.r.t. S.

Definition 3. We say that a relation R is a distributed bisimilarity w.r.t. itself if R = ∼^R. Let ≈ denote the greatest distributed bisimilarity w.r.t. itself.

A relation is a distributed bisimilarity w.r.t. itself precisely if it is a fixed point of the monotone mapping R ↦ ∼^R. Hence the greatest distributed bisimilarity w.r.t. itself always exists.

Lemma 1. ∼ and ≈ coincide.

Proof. For one inclusion, recall that ∼ is the union of all distributed bisimulations while ∼^∼ is the union of all distributed bisimulations w.r.t. ∼. Since each distributed bisimulation is a distributed bisimulation w.r.t. ∼, we have ∼ ⊆ ∼^∼, i.e., ∼ is a post-fixed point of the mapping R ↦ ∼^R. As ≈ is the greatest fixed point of that mapping, we obtain ∼ ⊆ ≈.

For the other inclusion, assume a relation S is a distributed bisimilarity w.r.t. itself, S = ∼^S. Since ∼^S is the union of all distributed bisimulations w.r.t. S, it is the (greatest) fixed point of the monotone mapping R ↦ B2(S, R), i.e., ∼^S = B2(S, ∼^S).
Substituting S in place of ∼^S we get S = B2(S, S), i.e., S is a fixed point of B1. Hence S ⊆ ∼. As S was chosen arbitrarily, we have shown that each distributed bisimilarity w.r.t. itself is included in ∼. In particular ≈ ⊆ ∼. □

We have proved that ≈ is just another formulation of ∼. But ≈ gives rise to another sequence of approximating equivalences {≈i} that converges more rapidly than {∼i}, namely after a polynomial number of iterations:

Lemma 2. ≈ = ⋂_{i∈N} ≈i, where the sequence {≈i} is defined by:
– P ≈0 Q for all P and Q,
– ≈i+1 := ∼^{≈i}.

Proof. Obviously ≈ ⊆ ⋂_{i∈N} ≈i, so we only need to show the opposite inclusion. Similarly to ∼, the relation ∼^S is the greatest fixed point of the monotone mapping R ↦ B2(S, R), for any fixed S. So ∼^S is also the limit of a decreasing sequence of approximations:

∼^S = ⋂_{i∈N} ∼^S_i,    (6)
where the relations ∼^S_i are defined by:
– P ∼^S_0 Q for all P and Q,
– ∼^S_{i+1} := B2(S, ∼^S_i).

Having this, by an easy induction we show ≈i ⊆ ∼i, for all i ≥ 0. As the induction assumption suppose ≈i ⊆ ∼i, for a fixed i ≥ 0. Now substitute ≈i in place of S in (6) and in the definition of ∼^S_j, j ≥ 0. Due to monotonicity of B2 we derive, by another easy induction on j ≤ i, that ∼^{≈i}_{j+1} = B2(≈i, ∼^{≈i}_j) ⊆ B2(∼j, ∼j) = ∼j+1, since ≈i ⊆ ∼i ⊆ ∼j by the induction assumption and by j ≤ i. Hence ∼^{≈i}_{i+1} ⊆ ∼i+1. By (6) we know that ≈i+1 = ∼^{≈i} ⊆ ∼^{≈i}_{i+1}, hence we conclude ≈i+1 ⊆ ∼i+1. This completes the induction step. Having ≈i ⊆ ∼i for all i ≥ 0, we apply Lemma 1 and (5). □

Equipped with Lemmas 1 and 2, we are ready to describe the polynomial-time algorithm. Recall that L denotes the set of all local derivatives. The algorithm consists of two phases, outlined in the figure below. By R ∩ (L×L) we mean here the restriction of a relation R to pairs from L.

PHASE 1:  let ≅0 := L×L
          REPEAT FOR n = 0, 1, …: compute ≅n+1 ⊆ L×L as ≅n+1 := ∼^{≅n} ∩ (L×L)
          UNTIL ≅n = ≅n+1
PHASE 2:  decide whether X ∼^{≅n} Y

The first phase is the crucial one, but it amounts simply to computing an initial part of {≈i} up to the position n where it eventually stabilizes. The trick is that ≈i is computed only for local derivatives. Then, in the second phase, we only need to check whether the input pair (X, Y) belongs to ∼^{≅n}. Assuming that the first phase of the algorithm terminates, the outcome ∼^{≅n} coincides with ≈.

Lemma 3. If ≅n = ≅n+1 then ∼^{≅n} = ≈.

Proof. Assuming ≅n = ≅n+1, we will show ∼^{≅n} ⊆ ≈ and ∼^{≅n} ⊇ ≈. For both inclusions we will silently use an obvious fact: when two relations S1 and S2 coincide on L×L, i.e., S1 ∩ (L×L) = S2 ∩ (L×L), then ∼^{S1} = ∼^{S2}.

For ∼^{≅n} ⊆ ≈, it is sufficient to show that ∼^{≅n} is a distributed bisimilarity w.r.t. itself. Indeed, ∼^{∼^{≅n}} = ∼^{∼^{≅n} ∩ (L×L)} = ∼^{≅n+1} = ∼^{≅n}.

For ∼^{≅n} ⊇ ≈, we show by induction that for all i ≥ 0, (a) ≅i = ≈i ∩ (L×L) and (b) ∼^{≅i} = ≈i+1. For i = 0 this is obvious, so assume i > 0 and that (a) and (b) hold for i−1. We prove (a) first: ≅i = ∼^{≅i−1} ∩ (L×L) = ≈i ∩ (L×L), where the second equality is by (b) for i−1. Then (b) follows easily from (a): ∼^{≅i} = ∼^{≈i ∩ (L×L)} = ∼^{≈i} = ≈i+1. Now ∼^{≅n} ⊇ ≈ follows from (b), since ≈n+1 ⊇ ≈ by Lemma 2. □
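The two phases can be sketched as follows, treating the procedure of Lemma 4 as an oracle decide_sim(S, p, q) for p ∼^S q (both function names are ours, for illustration):

```python
def phase1(L, decide_sim):
    """Iterate  next := (pairs of L related by ~ w.r.t. cur)  until stabilisation,
    starting from the full relation on L."""
    cur = {(p, q) for p in L for q in L}
    while True:
        nxt = {(p, q) for p in L for q in L if decide_sim(cur, p, q)}
        if nxt == cur:
            return cur  # the stable relation of Lemma 3
        cur = nxt

def distributed_bisimilar(X, Y, L, decide_sim):
    """Phase 2: by Lemmas 1 and 3, X ~ Y iff the oracle accepts (X, Y)
    w.r.t. the stable relation computed in phase 1."""
    return decide_sim(phase1(L, decide_sim), X, Y)
```

Since the computed sequence is non-increasing over the pairs of L×L, the loop body runs at most |L×L| + 1 times.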
Termination of the first phase of the algorithm after a polynomial number of iterations of the main loop is guaranteed, as the sequence {≅i} is non-increasing, ≅0 ⊇ ≅1 ⊇ …, and each ≅i contains only polynomially many pairs. What we still need to show is that a single iteration of the loop body, i.e., the computation of ≅n+1 from ≅n, can be done in polynomial time. To this aim we will prove the following Lemma 4; in its proof we profit from Theorem 3.

Lemma 4. Let S ⊆ L×L be an equivalence such that there exists a polynomial-time algorithm (w.r.t. the size of ∆) to decide whether (P, Q) ∈ S, for given P, Q ∈ L. Then there exists a polynomial-time algorithm to decide P ∼^S Q, for given P, Q ∈ P.

Proof. As the first stage, we construct from ∆ a new process definition ∆′ in full standard form, equivalent to ∆ in the following sense: for all P, Q ∈ P,

(P, ∆) ∼^S (Q, ∆)  iff  (P, ∆′) is strongly bisimilar to (Q, ∆′).    (7)
The construction of ∆′ is as follows:

Const(∆′) := Const(∆),    Act(∆′) := Act(∆) × L/S,
where L/S denotes the set of equivalence classes of S and can be computed in polynomial time. Furthermore, whenever ∆ contains a process equation

X def= ∑_{i∈I} (ai.Pi) ⌊ Qi,    (8)
∆′ contains

X def= ∑_{i∈I} (ai, [Pi]_S).Qi,    (9)
where [Pi]_S denotes the equivalence class of Pi in S. Having this, (7) is clear by the very definitions of the two bisimilarities involved. Now, the crucial point is that ∆′ is always normed, since ∆ is bounded. We have Y ≺1 X (cf. Section 2) iff Y appears in some Qi on the right-hand side of the process equation (8) defining X. As the transitive closure ≺1⁺ is irreflexive, the following equations (10) and (11) are well-defined and give the norms of all the constants. First, for each constant X defined by (9) in ∆′,

norm(X) = 1 + min_{i∈I} norm(Qi).    (10)
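Together with the additivity stated as (11) just below, equation (10) yields a short mutually recursive computation of the norms; a sketch, with each constant assumed to map to the list of its concurrent derivatives Qi, each a tuple of constants (boundedness guarantees that the recursion terminates):

```python
from functools import lru_cache

def make_norm(eqs):
    """eqs[X]: list of the concurrent derivatives Qi of the summands
    defining X, each Qi a tuple of constants (a basic process)."""
    @lru_cache(maxsize=None)
    def norm_const(X):
        # (10): norm(X) = 1 + minimum over the summands of norm(Qi)
        return 1 + min(norm_basic(Q) for Q in eqs[X])

    def norm_basic(Q):
        # (11): the norm is additive w.r.t. parallel composition
        return sum(norm_const(X) for X in Q)

    return norm_const, norm_basic
```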
Second, the norm is additive w.r.t. parallel composition:

norm(P ‖ Q) = norm(P) + norm(Q),    (11)
and this implies that the norm of each concurrent derivative Qi in equation (10) is the sum of the norms of all its parallel components. Now we apply Theorem 3 to ∆′ and by (7) get a polynomial-time procedure to decide ∼^S. □

Evidently each ≅i is an equivalence, hence the lemma applies and we conclude that the body of the main loop of the first phase requires only polynomial time: it amounts to invoking the decision procedure for strong bisimilarity on normed BPP polynomially many times, since the set L×L has polynomial cardinality. By Lemma 4 the second phase can be computed in polynomial time as well. Correctness of the algorithm follows by Lemmas 3 and 1. This completes the proof of the following:
Theorem 4. There exists a polynomial-time algorithm to decide distributed bisimilarity for BPP processes.
4 Final Remarks

We have proposed a polynomial-time decision procedure for distributed bisimilarity on BPP. As mentioned in the introduction, many non-interleaving equivalences coincide on BPP. Therefore, we directly conclude from Theorem 4:

Corollary 1. There exists a polynomial-time algorithm to decide the following equivalences for BPP processes: location equivalence, causal equivalence, history preserving equivalence, performance equivalence.

Consider BPPτ, an extension of BPP with communication between parallel components, expressed by one additional rule:

if P →a [P′, P′′] and Q →ā [Q′, Q′′], then P ‖ Q →τ [P′ ‖ Q′, P′′ ‖ Q′′].    (12)
A local derivative of a τ-transition can be composed of two local derivatives of parallel components. Hence local derivatives cannot be encoded directly into actions, and the reduction of distributed bisimilarity to strong bisimilarity in the proof of Lemma 4 fails.

A crucial ingredient of our decision procedure is the polynomial-time transformation of a process definition to the standard form, described in the full version of this paper [24]. It is different from the transformation proposed by Christensen in [9], since the process definition in standard form yielded by the latter is of exponential size.

Our algorithm needs Θ(n²) calls to the polynomial-time algorithm of [16] in each iteration of the first phase, where n stands for the size of ∆. At most n iterations are needed, since all ≅i are equivalences, so the total cost is Θ(n³) calls to the procedure of [16]. On the other hand, P-completeness of the problem follows easily, since it subsumes strong bisimilarity for finite-state systems and the latter is P-complete [3]. An interesting continuation of this work would be to develop a more efficient direct algorithm, not referring to the procedure of [16].
Acknowledgements The author is very grateful to Philippe Schnoebelen for many fruitful discussions.
References

1. L. Aceto. History preserving, causal and mixed-ordering equivalence over stable event structures. Fundamenta Informaticae, 17:319–331, 1992.
2. L. Aceto. Relating distributed, temporal and causal observations of simple processes. Fundamenta Informaticae, 17:369–397, 1992.
3. J. Balcázar, J. Gabarró, and M. Sántha. Deciding bisimilarity is P-complete. Formal Aspects of Computing, (6A):638–648, 1992.
4. M. Bednarczyk. Hereditary history preserving bisimulation or what is the power of the future perfect in program logics. Technical report, Polish Academy of Sciences, Gdańsk, 1991.
5. B. Bérard, A. Labroue, and P. Schnoebelen. Verifying performance equivalence for timed Basic Parallel Processes. In Proc. FOSSACS'00, LNCS 1784, pages 35–47, 2000.
6. I. Castellani. Bisimulations for Concurrency. PhD thesis, University of Edinburgh, 1988.
7. I. Castellani. Process algebras with localities. In J. Bergstra, A. Ponse, S. Smolka, eds., Handbook of Process Algebra, chapter 15, pages 945–1046, 2001.
8. S. Christensen. Distributed bisimilarity is decidable for a class of infinite state systems. In Proc. 3rd Int. Conf. Concurrency Theory (CONCUR'92), LNCS 630, pages 148–161, 1992.
9. S. Christensen. Decidability and Decomposition in Process Algebras. PhD thesis, Dept. of Computer Science, University of Edinburgh, UK, 1993.
10. S. Christensen, Y. Hirshfeld, and F. Moller. Bisimulation equivalence is decidable for Basic Parallel Processes. In Proc. CONCUR'93, LNCS 713, pages 143–157, 1993.
11. P. Darondeau and P. Degano. Causal trees. In Proc. ICALP'89, LNCS 372, pages 234–248, 1989.
12. S. Fröschle. Decidability of plain and hereditary history-preserving bisimulation for BPP. In Proc. EXPRESS'99, volume 27 of ENTCS, 1999.
13. R. van Glabbeek and U. Goltz. Equivalence notions for concurrent systems and refinement of actions. In Proc. MFCS'89, LNCS 379, pages 237–248, 1989.
14. R. Gorrieri, M. Roccetti, and E. Stancampiano. A theory of processes with durational actions. Theoretical Computer Science, 140(1):73–94, 1995.
15. M. Hennessy and A. Kiehn. On the decidability of non-interleaving process equivalences. In Proc. 5th Int. Conf. Concurrency Theory (CONCUR'94), pages 18–33, 1994.
16. Y. Hirshfeld, M. Jerrum, and F. Moller. A polynomial time algorithm for deciding bisimulation equivalence of normed basic parallel processes. Mathematical Structures in Computer Science, 6:251–259, 1996.
17. H. Hüttel. Undecidable equivalences for basic parallel processes. In Proc. TACS'94, LNCS 789, pages 454–464, 1994.
18. P. Jančar. Bisimilarity of basic parallel processes is PSPACE-complete. In Proc. LICS'03, to appear, 2003.
19. L. Jategaonkar and A. R. Meyer. Deciding true concurrency equivalences on safe, finite nets. Theoretical Computer Science, 154:107–143, 1996.
20. M. Jurdziński and M. Nielsen. Hereditary history preserving bisimilarity is undecidable. In Proc. STACS'00, LNCS 1770, pages 358–369, 2000.
21. A. Kiehn. A note on distributed bisimulations. Unpublished draft, 1999.
22. S. Lasota. Decidability of strong bisimilarity for timed BPP. In Proc. 13th Int. Conf. on Concurrency Theory (CONCUR'02), LNCS 2421, pages 562–578. Springer-Verlag, 2002.
23. S. Lasota. On coincidence of distributed and performance equivalence for Basic Parallel Processes. http://www.mimuw.edu.pl/~sl/papers/, unpublished draft, 2002.
24. S. Lasota. A polynomial-time algorithm for deciding true concurrency equivalences of Basic Parallel Processes. Research Report LSV-02-13, LSV, ENS de Cachan, France, 2002.
25. R. Milner. Communication and Concurrency. Prentice Hall, 1989.
26. J. Srba. Strong bisimilarity and regularity of Basic Parallel Processes is PSPACE-hard. In Proc. STACS'02, LNCS 2285, 2002.
Solving the Sabotage Game Is PSPACE-Hard

Christof Löding and Philipp Rohde
Lehrstuhl für Informatik VII, RWTH Aachen
{loeding,rohde}@informatik.rwth-aachen.de
Abstract. We consider the sabotage game as presented by van Benthem. In this game one player moves along the edges of a finite multi-graph and the other player takes out a link after each step. One can consider usual algorithmic tasks like reachability, Hamilton path, or complete search as winning conditions for this game. As the game necessarily ends after at most as many steps as there are edges, it is easy to see that solving the sabotage game for the mentioned tasks takes at most PSPACE in the size of the graph. In this paper we establish the PSPACE-hardness of this problem. Furthermore, we introduce a modal logic over changing models to express tasks corresponding to the sabotage games, and we show that model checking this logic is PSPACE-complete.
1 Introduction

In some fields of computer science, especially the control of reactive systems, an interesting sort of task arises, concerning temporal changes of a system itself. In contrast to the usual tasks over reactive systems, where movements within a system are considered, an additional process comes into play: the dynamic change of the system itself. Hence we have two different processes: a local movement within the system and a global change of the system. Consider, for example, a network where connections or servers may break down. Some natural questions arise for such a system: is it possible – regardless of the removed connections – to interchange information between two designated servers? Is there a protocol which guarantees that the destination can be reached? Another example of a task of this kind was recently given by van Benthem [1]; it can be described as the real Travelling Salesman Problem: is it possible to find your way between two cities within a railway network where a malevolent demon starts cancelling connections?

As usual, one can model this kind of reactive system as a two-person game, where one player tries to achieve a certain goal given by a winning condition and the other player tries to prevent this. As winning conditions one can consider algorithmic tasks over graphs such as, e.g., reachability, Hamilton path, or complete search. Determining the winner of these games gives us the answers to our original tasks. In this paper we show that solving sabotage games where one player (the Runner) moves along edges in a multi-graph and the other player (the Blocker) removes an edge in each round is PSPACE-hard for the three mentioned winning
conditions. The main aspect of the sabotage game is that the Runner can only act locally, by moving one step further from his actual position, whereas the Blocker can act globally on the arena of the game. So the sabotage game is in fact a match between a local and a global player. This distinguishes the sabotage game from the classical games studied in combinatorial game theory (see [2] for an overview).

In Sect. 2 we introduce the basic notions of the sabotage game. In Sect. 3 we show the PSPACE-hardness for the sabotage game with the reachability condition on undirected graphs by giving a polynomial-time reduction from the PSPACE-complete problem of Quantified Boolean Formulas to these games. In Sect. 4 we give polynomial-time reductions from sabotage games with the reachability condition to the other winning conditions. In the last section we introduce the extension SML of modal logic over transition systems which captures the concept of removing edges, i.e., SML is a modal logic over changing models. We give the syntax and semantics of SML and provide a translation to first-order logic. By applying the results of the first part we show that model checking for this logic is PSPACE-complete.

We would like to thank Johan van Benthem and Peter van Emde Boas for several ideas and comments on the topic.
2 The Sabotage Game

In this section we give the definition of the sabotage game and we recall three algorithmic tasks over graphs which can be considered as winning conditions for this game.

A multi-graph is a pair (V, e) where V is a non-empty, finite set of vertices and e : V × V → N is an edge multiplicity function, i.e., e(u, v) denotes the number of edges between the vertices u and v; e(u, v) = 0 means that u and v are not connected. In the case of an undirected graph we have in addition e(u, v) = e(v, u) for all u, v ∈ V. A single-graph is given by a multiplicity function with e(u, v) ≤ 1 for all vertices u, v ∈ V. The size of a multi-graph (V, e) is given by |V| + |E|, where we set |E| := ∑_{u,v∈V} e(u, v) for directed graphs and |E| := ½ ∑_{u,v∈V} e(u, v) for undirected graphs.

Let (V, e0) be a multi-graph and v0 ∈ V an initial vertex. The two-person sabotage game is played as follows: initially the game arena is A0 = (V, e0, v0). The two players, whom we call Runner and Blocker, move alternately, and the Runner starts his run from vertex v0. At the start of round n the Runner moves one step further along an existing edge of the graph, i.e., if vn is his actual position, he chooses a vn+1 ∈ V with en(vn, vn+1) > 0 and moves to vn+1. Afterwards the Blocker removes one edge of the graph, i.e., he chooses two vertices u and v somewhere in the graph with en(u, v) > 0. In the directed case we define en+1(u, v) := en(u, v) − 1 and en+1(·, ·) := en(·, ·) otherwise. In the undirected case we let en+1(u, v) := en+1(v, u) := en(u, v) − 1. The multi-graph An+1 = (V, en+1, vn+1) becomes the arena for the next round. The game ends if either the Runner cannot make a move, i.e., there is no link starting from his actual position, or if the winning condition is fulfilled.
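These rules are easy to prototype. The following minimal solver for the reachability condition (reach a designated goal vertex) searches all plays exhaustively; since one edge disappears per round, plays have length at most |E|, which also illustrates the PSPACE upper bound mentioned in the abstract. The dictionary encoding of the multi-graph is our own choice:

```python
def one_edge_removals(edges):
    """All graphs the Blocker can produce by removing a single edge."""
    for e, m in edges.items():
        d = dict(edges)
        if m == 1:
            del d[e]
        else:
            d[e] = m - 1
        yield d

def runner_wins(edges, pos, goal):
    """Sabotage reachability on an undirected multi-graph.
    edges: dict {(u, v): multiplicity} with u <= v."""
    if pos == goal:
        return True
    for (u, v), m in edges.items():
        if m > 0 and pos in (u, v):
            nxt = v if pos == u else u
            if nxt == goal:
                return True  # goal reached before the Blocker can move
            # Runner commits to nxt and must survive every Blocker removal
            if all(runner_wins(rest, nxt, goal)
                   for rest in one_edge_removals(edges)):
                return True
    return False
```

On a two-edge path to the goal the Blocker wins by cutting the far edge, while doubling the edge at the goal saves the Runner.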
As winning conditions for the sabotage game on an undirected or directed graph one can consider the usual tasks over graphs, for example:

1. Reachability: the Runner wins iff he can reach a given vertex (which we call the goal).
2. Hamilton Path or Travelling Salesman: the Runner wins iff he can move along a Hamilton path, i.e., he visits each vertex exactly once.
3. Complete Search: the Runner wins iff he can visit each vertex (possibly more than once).

It is easy to see that for the reachability game with one single goal the use of multi-graphs is crucial, but we can bound the multiplicity uniformly by two or, if we allow a second goal vertex, we can even transform every multi-graph game into a single-graph game:

Lemma 1. Let G be a sabotage game with reachability condition on a multi-graph arena A. Then there are games G′, G′′ on arenas A′, A′′ with a size polynomial in the size of A such that the Runner wins G iff he wins G′, resp. G′′, where A′ is a single-graph with two goals and A′′ is a multi-graph with one goal and only single or double edges, the double edges occurring only at the goal.

Proof. We only sketch the proof for directed graphs. To obtain A′ one adds a new goal and replaces each edge between vertices u and v with multiplicity k > 0 by the construction depicted in Fig. 1 (with k new vertices). We actually need a new goal if v is the original goal. The arena A′′ is constructed similarly: if v is not the original goal we apply the same construction (Fig. 1), but reusing the existing goal instead of adding a new one. If v is the goal then we add double edges from the new vertices to v (see Fig. 2). Note that the Blocker does not gain additional moves because all new vertices are directly connected to the goal.
•
···
u •
•
v Fig. 1. Replacement for A
•
•
• 2
2
··· v
2
•
•
2
Fig. 2. Replacement for A
Since edges are only deleted but not added during the play, the following fact is easy to see:

Lemma 2. If the Runner has a winning strategy in the sabotage game with reachability condition, then he can win without visiting any vertex twice.

In the sequel we will introduce several game arenas where we use edges with a multiplicity 'high enough' to ensure that the Blocker cannot win the game
by reducing these edges. In figures these edges are represented by a curly link between two vertices. For the moment we can consider these links to be 'unremovable'. Due to the previous lemma we have: if the Runner can win the reachability game at all, then he can do so within at most |V| − 1 rounds. Hence we can set the multiplicity of the 'unremovable' edges to |V| − 1. To bound the multiplicity of edges uniformly one can apply Lemma 1.
3
PSPACE-Hardness for Sabotage Reachability Games
In this section we prove that the PSPACE-complete problem of Quantified Boolean Formulas (cf. [3]), QBF for short, can be reduced by a polynomial time reduction to sabotage games on undirected graphs with the reachability condition. Let ϕ ≡ ∃x1 ∀x2 ∃x3 . . . Qxn ψ be an instance of QBF, where Q is ∃ for n odd and ∀ otherwise, and ψ is a quantifier-free Boolean formula in conjunctive normal form. We will construct an undirected game arena for a sabotage game Gϕ with a reachability condition such that the Runner has a winning strategy in the game iff the formula ϕ is satisfiable. A reduction like the classical one from QBF to the Geography Game (cf. [3]) does not work here, since the Blocker may destroy connections in a part of the graph which should be visited only later in the game. This could be solved by blowing up the distances, but that approach results in an arena with a size exponential in the size n of ϕ. So we have to restrict the liberty of the Blocker in a more sophisticated way, i.e., to force him to remove edges only 'locally'. The game arena of Gϕ consists of two parts: a chain of n gadgets where first the Runner chooses an assignment for x1, then the Blocker chooses an assignment for x2 before the Runner chooses an assignment for x3, and so on. The second part gives the Blocker the possibility to select one of the clauses of ψ. The Runner must certify that this clause is indeed satisfied by the chosen assignment: he can reach the goal vertex and win the game iff at least one literal in the clause is true under the assignment. Figure 5 shows an example of the sabotage game Gϕ for the formula ϕ ≡ ∃x1 ∀x2 ∃x3 (c1 ∧ c2 ∧ c3 ∧ c4), where we assume that each clause consists of exactly three literals. In the following we describe in detail the several components of Gϕ and their arrangement. The main difficulty of the construction is to take care of the Blocker's opportunity to remove edges anywhere in the graph.

The ∃-Gadget.
The gadget where the Runner chooses an assignment for the xi with i odd is displayed in Fig. 3. We assume that the run reaches this gadget at vertex A for the first time. Vertex B is intended to be the exit. In the complete construction there are also edges from Xi, resp. X̄i, leading to the last gadget of the graph, represented as dotted lines labelled by back. We will see later that taking these edges as a shortcut, i.e., moving from the ∃-gadget directly to the last gadget, is useless for the Runner. The only meaningful direction is coming from the last gadget back to the ∃-gadget. So we temporarily assume that
Solving the Sabotage Game Is PSPACE-Hard

[Fig. 3. ∃-gadget for xi with i odd. Fig. 4. ∀-gadget for xi with i even. Fig. 5. The arena for ∃x1 ∀x2 ∃x3 (c1 ∧ c2 ∧ c3 ∧ c4); types of edges: unremovable link, edge of multiplicity n, single edge.]
536
Christof L¨ oding and Philipp Rohde
the Runner does not take these edges. In the sequel we further assume, due to Lemma 2, that the Runner does not move backwards. The Runner makes his choice simply by moving from A either to the left or to the right: he moves towards Xi if he wants xi to be false, or towards X̄i if he wants xi to be true. We consider only the first case. The Blocker has exactly four steps to remove all the links between Xi and the goal before the Runner reaches this vertex. On the other hand, the Blocker cannot remove edges anywhere else in the graph without losing the game. Why we use four steps here will be clarified later on. If the Runner has reached Xi and moves towards B, then the Blocker has to delete the edge between B and X̄i, since otherwise the Runner could reach the goal this way (there are still four edges left between X̄i and the goal).

The ∀-Gadget. The gadget where the Blocker chooses an assignment for the xi with i even is a little more sophisticated. Figure 4 shows the construction. If the Blocker wants xi to be false he tries to lead the Runner towards Xi. In this case he simply removes the three edges between C and X̄i during the first three steps. Then the Runner has to move across D, and in the meantime the Blocker deletes the four edges between Xi and the goal to ensure that the Runner cannot win directly. As above, he removes in the last step the link between B and X̄i to prevent a premature end of the game. If the Blocker wants to assign true to xi he should lead the Runner towards X̄i. To achieve this aim he removes three of the four links between X̄i and the goal before the Runner reaches C. Nevertheless the Runner has the free choice at vertex C whether he moves towards Xi or towards X̄i, i.e., the Blocker cannot guarantee that the run goes across X̄i. But let us consider the two possible cases: first we assume that the Runner moves as intended and uses an edge between C and X̄i.
In this round the Blocker removes the last link from X̄i to the goal. Then the Runner moves to B and the Blocker deletes the edge from B to Xi. Now assume that the Runner 'misbehaves' and moves from C to D and further towards Xi. Then the Blocker first removes the four edges between Xi and the goal. When the Runner now moves from Xi to B, the Blocker has to take care that the Runner cannot reach the goal via the link between B and X̄i (there is still one edge left from X̄i to the goal). For that he can delete the last link between X̄i and the goal and isolate the goal completely within this gadget.

The Verification Gadget. The last component of the arena is a gadget where the Blocker can choose one of the clauses of the formula ψ. Before we give the representation of this gadget let us explain the idea. If the Blocker chooses the clause c then the Runner can select for his part one literal xi of c. There is an edge back to the ∃-gadget if i is odd, or to the ∀-gadget if i is even, namely to Xi if xi occurs positively in c, resp. to X̄i if xi occurs negatively in c. So if the chosen assignment satisfies ψ, then every clause of ψ contains at least one literal which is true. Since the path through the assignment gadgets visits the opposite truth values, this means that there is at least one edge back to an Xi, resp. X̄i, which itself is connected to the goal by an edge with a multiplicity of four (assuming
that the Runner did not misbehave in the ∀-gadget). Therefore the Runner can reach the goal and wins the game. Conversely, if the chosen assignment does not satisfy ψ, then there is a clause c in ψ such that every literal in c is assigned false. If the Blocker chooses this clause c then every edge back to the assignment gadgets ends in an Xi, resp. X̄i, which is disconnected from the goal. If we show that there is no other way to reach the goal, this means that the Runner loses the game. But we have to be very careful neither to allow any shortcuts for the Runner nor to give the Blocker too much liberty. Figure 5 contains the verification gadget for ψ ≡ c1 ∧ c2 ∧ c3 ∧ c4, where each clause ci has exactly three literals. The curly edges at the bottom of the gadget lead back to the corresponding literals of each clause. The Blocker chooses the clause ck by first removing the edges from Aj to Cj for j < k one after the other. Then he cuts the link between Ak and Ak+1, resp. between Ak and the goal if ck is the last clause. By Lemma 2 it is useless for the Runner to go back, thus he can only follow the given path to Ck. If he reaches this vertex the Blocker must remove the link from Ck to the goal to prevent the win for the opponent. In the next step the Runner selects a literal xi, resp. ¬xi, in ck, moves towards the corresponding vertex and afterwards along the curly edge back to the assignment gadgets as described above. At this point the Blocker has exactly two moves left, i.e., he is allowed to remove two edges somewhere in the graph. But we have: if the 'right' assignment for this literal has been chosen then there are exactly four edges left connecting the corresponding vertex and the goal. So the Blocker does not have the opportunity to isolate the goal and the Runner wins the game. Otherwise, if the 'wrong' assignment has been chosen then there is no link from Xi, resp. X̄i, to the goal left.
Any continuation the Runner could take either leads him back to an already visited vertex (which is a loss by Lemma 2) or, by taking another back-edge in the 'wrong' direction, to another vertex in the verification gadget. We handle the latter case in general: if the Runner uses a shortcut starting from a literal vertex and moves directly to the bottom of the verification gadget, then the Blocker can prevent the continuation of the run by removing the corresponding single edge between the clause vertex Ck and the vertex beneath it, and the Runner has to move back. So the Runner wins the game if and only if he wins it without using any shortcut. If the Runner reaches a vertex Ak and the Blocker removes either the edge between Ak and Ck, or the one between Ck and the goal, or one of the edges leading to the vertices beneath Ck (one for each literal in ck), then the Runner moves towards Ak+1, resp. towards the goal if ck is the last clause. The Runner has to do so since, in the latter two cases, entering the 'damaged' area around Ck could be a disadvantage for him. Finally we consider the case that the Blocker removes an edge somewhere else in the graph instead. This behaviour is only reasonable if the chosen assignment satisfies ψ. So consider the round when the Runner reaches for the first time an Ak such that the edge from Ak to Ak+1, resp. to the goal, as well as all edges connected to Ck, are still left. If ck is the last clause then the Runner just reaches
the goal and wins the game. Otherwise he moves to Ck and chooses an appropriate literal xi, resp. ¬xi, such that at least three edges from the corresponding vertex are still left (at least one literal of this kind exists in each clause). Since Ak is the first vertex with this property, the Blocker has gained only one additional move, so at least one edge from the vertex Xi, resp. X̄i, to the goal nevertheless remains. So if the Runner can choose a satisfying assignment at all, then the Blocker cannot prevent the win for the Runner by this behaviour. This explains the multiplicity of four within the assignment gadgets. This completes the construction of the game Gϕ. Obviously, this construction can be done in polynomial time. Therefore, we obtain the following results.

Lemma 3. The Runner has a winning strategy in the sabotage game Gϕ iff ϕ is satisfiable.

Theorem 4. There is a polynomial time reduction from QBF to sabotage games with reachability winning condition on undirected graphs. In particular, solving these games is PSPACE-hard.

Since each edge of the game Gϕ has an 'intended direction', it is straightforward to check that a similar construction works for directed graphs as well. The construction can also be adapted to prove the PSPACE-hardness of other variants of the game, e.g., if the Blocker is allowed to remove up to n edges in each round for a fixed number n, or if the Blocker removes vertices instead of edges. For the details we refer the reader to [4].
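For contrast with the PSPACE-hardness target, the source problem of the reduction is easy to state operationally. A toy recursive evaluator for QBF instances of the shape used above (strictly alternating quantifiers starting with ∃, ψ in CNF; the literal encoding +i for xi and −i for ¬xi is our own convention):

```python
# Toy QBF evaluator (illustration only, not part of the reduction): decides
# ∃x1 ∀x2 ∃x3 ... ψ with strictly alternating quantifiers starting with ∃,
# where ψ is a CNF given as a list of clauses of integer literals.
def qbf_true(n, clauses, assignment=()):
    if len(assignment) == n:                       # all variables fixed: check ψ
        return all(any((lit > 0) == assignment[abs(lit) - 1] for lit in c)
                   for c in clauses)
    branch = lambda v: qbf_true(n, clauses, assignment + (v,))
    if len(assignment) % 2 == 0:                   # x1, x3, ...: ∃-quantified
        return branch(True) or branch(False)
    return branch(True) and branch(False)          # x2, x4, ...: ∀-quantified

# ∃x1 ∀x2 ((x1 ∨ x2) ∧ (x1 ∨ ¬x2)) is true: set x1 = true.
print(qbf_true(2, [[1, 2], [1, -2]]))   # True
```

The naive evaluator runs in exponential time but only polynomial space — the same profile the sabotage game inherits through the reduction.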
4
The Remaining Winning Conditions
In this section we give polynomial time reductions from sabotage games with reachability condition to the ones with complete search condition and with Hamilton path condition. We only consider games on undirected graphs. Let G be a sabotage game on an undirected arena A = (V, e, v0) with the reachability condition. We present an arena B such that the Runner wins G iff he wins the game G′ on B with the complete search condition iff he wins the game G′′ on B with the Hamilton path condition. To obtain B we add several vertices to A: let m := |V| − 2 and let v1, . . . , vm be an enumeration of all vertices in A except the initial vertex and the goal. We add a sequence P1, . . . , Pm of new vertices to A together with several chains of new vertices such that each chain has length max{|V|, |E|} and its nodes are linked among each other by 'unremovable' edges. We add these chains from Pi as well as from Pi+1 to vertex vi for i < m, and one chain from Pm to vertex vm. Furthermore, we add for i < m shortcuts from the last vertices in the chains between Pi and vi to the last vertices in the chains between Pi+1 and vi, to give the Runner the possibility to skip the visit of vi. Additionally there is one link with multiplicity |V| from P1 to the goal in A, see Fig. 6. If the Runner can reach the goal in the original game G then by Lemma 2 he can do so within at most |V| − 1 steps. In this case there is at least one link
[Fig. 6. Game arena B: the original arena A with start and goal, the vertices v1, v2, v3, the new vertices P1, P2, P3 joined to them by chains of length max{|V|, |E|}, and a link of multiplicity |V| from P1 to the goal.]
to P1, which he uses to reach P1. He follows the chain to v1. If he has already visited v1 on his way to the goal he uses the shortcut at the last vertex in the chain, otherwise he visits v1. Afterwards he moves to P2 using the next chain. Continuing like this he reaches Pm and moves towards the last vertex vm. If he has already visited vm he just stops one vertex before; otherwise he stops at vm. Moving this way he visits each vertex of B exactly once and wins both games G′ and G′′. For the converse: if the Runner cannot reach the goal in G then he cannot do so in the games G′ and G′′ either. If he tries to use a shortcut via some Pi, the Blocker has enough time on the way to Pi to cut all the links between the goal and P1. On the Runner's way back from some Pj to a vertex in A he is able to remove all edges in the original game arena A to isolate the goal completely. Thus the Runner loses both games G′ and G′′ on B. So we have:

Theorem 5. There is a polynomial time reduction from sabotage games with reachability condition to sabotage games with complete search condition, resp. with Hamilton path condition. In particular, solving these games is PSPACE-hard.
5
A Sabotage Modal Logic
In [1] van Benthem considered a 'sabotage modal logic', i.e., a modal logic over changing models, to express tasks corresponding to sabotage games. He introduced a cross-model modality referring to submodels from which objects have been removed. In this section we give a formal definition of a sabotage modal logic with a 'transition-deleting' modality and show how to apply the results of the previous sections to determine the complexity of uniform model checking for this logic. To capture multi-graphs we interpret the logic over edge-labelled transition systems. By applying Lemma 1 the complexity results for the reachability game can be obtained for multi-graphs with a uniformly bounded multiplicity. Hence a finite alphabet Σ suffices.
540
Christof L¨ oding and Philipp Rohde
Definition 6. Let p be a unary predicate symbol and a ∈ Σ. Formulae of the sabotage modal logic SML over transition systems are defined by

ϕ ::= p | ¬ϕ | ϕ ∨ ϕ | ♦a ϕ | ♦̄a ϕ

The dual modality □a and the label-free versions ♦, □ are defined as usual. The modalities □̄a, ♦̄ and □̄ are defined analogously. Let T = (S, {Ra | a ∈ Σ}, L) be a transition system. For t, t′ ∈ S and a ∈ Σ we define the submodel T^a_(t,t′) := (S, {Rb | b ∈ Σ \ {a}} ∪ {Ra \ {(t, t′)}}, L). For a given state s ∈ S the semantics of SML is defined as for usual modal logic together with

(T, s) |= ♦̄a ϕ iff there is (t, t′) ∈ Ra such that (T^a_(t,t′), s) |= ϕ
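The semantics above can be turned into a naive model checker almost verbatim (a sketch; the tuple encoding of formulae and the dict-of-sets representation of the relations Ra are our own ad-hoc choices, and the blow-up on nested sabotage modalities reflects the PSPACE bound rather than avoiding it):

```python
# Naive SML model checker.  Formulae as nested tuples:
# ("prop", "p"), ("not", phi), ("or", phi, psi),
# ("dia", a, phi) for ♦a, and ("sdia", a, phi) for the sabotage modality ♦̄a.
def sml_holds(phi, R, label_of, s):
    op = phi[0]
    if op == "prop":
        return phi[1] in label_of(s)
    if op == "not":
        return not sml_holds(phi[1], R, label_of, s)
    if op == "or":
        return (sml_holds(phi[1], R, label_of, s)
                or sml_holds(phi[2], R, label_of, s))
    if op == "dia":                                   # ♦a: follow an a-edge
        _, a, sub = phi
        return any(sml_holds(sub, R, label_of, t2)
                   for (t1, t2) in R.get(a, ()) if t1 == s)
    if op == "sdia":                                  # ♦̄a: delete one a-edge
        _, a, sub = phi
        for e in set(R.get(a, ())):
            reduced = {b: set(ts) for b, ts in R.items()}
            reduced[a].discard(e)                     # evaluate in the submodel
            if sml_holds(sub, reduced, label_of, s):
                return True
        return False
    raise ValueError(f"unknown operator {op!r}")
```

For example, with R = {"a": {(0, 1)}} and p holding only in state 1, ♦a p holds at 0, but ♦̄a ♦a p does not: deleting the only a-edge destroys the witness.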
For a transition system T let T̂ be the corresponding FO-structure. Similar to usual modal logic, one can translate SML into first-order logic. Since FO-model checking is in PSPACE we obtain (see [4] for a proof):

Theorem 7. For every SML-formula ϕ there is an effectively constructible FO-formula ϕ̂(x) such that for every transition system T and state s of T one has (T, s) |=SML ϕ iff T̂ |=FO ϕ̂[s]. The size of ϕ̂(x) is polynomial in the size of ϕ. In particular, SML-model checking is in PSPACE.

We can express the winning of the Runner in the sabotage game G on directed graphs with the reachability condition by an SML-formula. For that we consider the game arena as a transition system T(G) such that the multiplicity of edges is captured by the edge labelling and such that the goal vertex of the game is viewed as the only state with predicate p. We inductively define the SML-formulae γi by γ0 := p and γi+1 := (♦□̄γi) ∨ p. Then we obtain the following lemma (see [4] for a proof) and, in combination with Theorem 4, the PSPACE-completeness of SML model checking.

Lemma 8. The Runner has a winning strategy from vertex s in the sabotage game G iff (T(G), s) |= γn, where n is the number of edges of the game arena.

Theorem 9. Model checking for the sabotage logic SML is PSPACE-complete.
References

1. van Benthem, J.: An essay on sabotage and obstruction. In Hutter, D., Werner, S., eds.: Festschrift in Honour of Prof. Jörg Siekmann. LNAI. Springer (2002)
2. Demaine, E.D.: Playing games with algorithms: Algorithmic combinatorial game theory. In: Proceedings of MFCS 2001. Volume 2136 of LNCS, Springer (2001) 18–32
3. Papadimitriou, C.H.: Computational Complexity. Addison-Wesley (1994)
4. Löding, C., Rohde, P.: Solving the sabotage game is PSPACE-hard. Technical Report AIB-05-2003, RWTH Aachen (2003)
The Approximate Well-Founded Semantics for Logic Programs with Uncertainty

Yann Loyer¹ and Umberto Straccia²

¹ PRiSM, Université de Versailles, 45 Avenue des Etats-Unis, 78035 Versailles, France
² I.S.T.I. - C.N.R., Via G. Moruzzi 1, I-56124 Pisa, Italy
Abstract. The management of uncertain information in logic programs becomes important whenever the real world information to be represented is of imperfect nature and the classical crisp {true, false} approximation is not adequate. A general framework, called Parametric Deductive Databases with Uncertainty (PDDU) [10], was proposed as a unifying umbrella for many existing approaches towards the manipulation of uncertainty in logic programs. We extend PDDU with (non-monotonic) negation, a well-known and important feature of logic programs. We show that, dealing with uncertain and incomplete knowledge, atoms should be assigned only approximations of uncertainty values, unless some assumption is used to complete the knowledge. We rely on the closed world assumption to infer as much default "false" knowledge as possible. Our approach also leads to novel characterizations, both epistemic and operational, of the well-founded semantics in PDDU, and preserves the continuity of the immediate consequence operator, a major feature of the classical PDDU framework.
1
Introduction
[B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 541–550, 2003. © Springer-Verlag Berlin Heidelberg 2003]

The management of uncertainty within deduction systems is an important issue whenever the real world information to be represented is of imperfect nature. In logic programming, the problem has attracted the attention of many researchers and numerous frameworks have been proposed. Essentially, they differ in the underlying notion of uncertainty (e.g. probability theory [9,13,14,15], fuzzy set theory [16,17,19], multivalued logic [7,8,10], possibilistic logic [2]) and in how uncertainty values, associated to rules and facts, are managed. Lakshmanan and Shiri have recently proposed a general framework [10], called Parametric Deductive Databases with Uncertainty (PDDU), that captures and generalizes many of the preceding approaches. In [10], a rule is of the form A ←^α B1, ..., Bn. Computationally, given an assignment I of certainties to the Bi, the certainty of A is computed by taking the "conjunction" of the certainties I(Bi) and then somehow "propagating" it to the rule head, taking into account the certainty α of the implication. However, despite its generality, one fundamental issue that remains unaddressed in PDDU is non-monotonic negation, a well-known and important feature in logic programming. In this paper, we extend PDDU [10] to normal logic programs, i.e. logic programs with negation. In order to deal with knowledge that is usually not only uncertain but also incomplete, we believe that one should rely on approximations of uncertainty values only. Then we study the problem of assigning a semantics to a normal logic program in such
542
Yann Loyer and Umberto Straccia
a framework. We first consider the least model and show that it extends the Kripke-Kleene semantics [4] from Datalog programs to normal logic programs, but that it is usually too weak. We then explain how one should try to determine approximations as precise as possible, by completing the program's knowledge with a kind of default reasoning based on the well-known Closed World Assumption (CWA). Our approach consists in determining how much knowledge "extracted" from the CWA can "safely" be used to "complete" a logic program. Our approach leads to novel characterizations, both epistemic and operational, of the well-founded semantics [3] for logic programs and extends that semantics to PDDU. Moreover we show that the continuity of the immediate consequence operator, used for inferring information from the program, is preserved. This is important as it is a major feature of classical PDDU, as opposed to classical frameworks like [8]. Negation has already been considered in some deductive databases with uncertainty frameworks. In [13,14], the stable semantics has been considered, but limited to the case where the underlying uncertainty formalism is probability theory. That semantics has been considered also in [19], where a semi-possibilistic logic has been proposed, a particular negation operator has been introduced, and a fixed min/max-evaluation of conjunction and disjunction is adopted. To the best of our knowledge, there is no work dealing with default negation within PDDU, except our previous attempt [11]. The semantics defined in [11] is weaker than the one presented in this paper, as in the latter approach more knowledge can be extracted from a program; moreover, [11] has no epistemic characterization and relies on a less natural treatment of negation. In the remainder, we proceed as follows. In the following section, the syntax of PDDU with negation, called normal parametric programs, is given; Section 3 contains the definitions of interpretation and model of a program.
In Section 4, we present the fundamental notion of support of a program provided by the CWA with respect to (w.r.t.) an interpretation. Then we propose novel characterizations of the well-founded semantics and compare our approach with the usual semantics. Section 5 concludes.
2
Preliminaries
Consider an arbitrary first-order language that contains infinitely many variable symbols, finitely many constants and predicate symbols, but no function symbols. The predicate symbol π(A) of an atomic formula A given by A = p(X1, . . . , Xn) is defined by π(A) = p. The truth-space is given by a complete lattice: atomic formulae are mapped into elements of a certainty lattice L = ⟨T, ⪯, ⊗, ⊕⟩ (a complete lattice), where T is the set of certainty values, ⪯ is a partial order, and ⊗ and ⊕ are the meet and join operators, respectively. With ⊥ and ⊤ we denote the least and greatest element in T. With B(T) we denote the set of finite multisets (denoted {| · |}) over T. For instance, a typical certainty lattice is L[0,1] = ⟨T, ⪯, ⊗, ⊕⟩, where T = [0, 1], α ⪯ β iff α ≤ β, α ⊗ β = min(α, β), α ⊕ β = max(α, β), ⊥ = 0 and ⊤ = 1. While the language does not contain function symbols, it contains symbols for families of conjunction (Fc), propagation (Fp) and disjunction functions (Fd), called combination functions. Roughly, as we will see below, the conjunction function (e.g. ⊗) determines the certainty of the conjunction of L1, ..., Ln (the body) of a logic program rule like A ←^α L1, ..., Ln; a propagation function (e.g. ⊗) determines how to "propagate" the certainty, resulting from the evaluation of the body L1, ..., Ln, to the head A, by taking into account the certainty α of the implication,
The Approximate Well-Founded Semantics for Logic Programs with Uncertainty
543
while the disjunction function (e.g. ⊕) dictates how to combine the certainties in case an atom appears in the heads of several rules (it evaluates a disjunction). Examples of conjunction, propagation and disjunction functions over L[0,1] are fc(x, y) = min(x, y), fp(x, y) = xy and fd(x, y) = x + y − xy. Formally, a propagation function is a mapping from T × T to T, and a conjunction or disjunction function is a mapping from B(T) to T. Each combination function is monotonic and continuous w.r.t. each one of its arguments. Conjunction and disjunction functions are commutative and associative. Additionally, each kind of function must verify some of the following properties¹: (i) bounded-above: f(α1, α2) ⪯ αi, for i = 1, 2, ∀α1, α2 ∈ T; (ii) bounded-below: f(α1, α2) ⪰ αi, for i = 1, 2, ∀α1, α2 ∈ T; (iii) f({α}) = α, ∀α ∈ T; (iv) f(∅) = ⊥; (v) f(∅) = ⊤; and (vi) f(α, ⊤) = α, ∀α ∈ T. The following should be satisfied: a conjunction function in Fc should satisfy properties (i), (iii), (v) and (vi); a propagation function in Fp should satisfy properties (i) and (vi); a disjunction function in Fd should satisfy properties (ii), (iii) and (iv). We also assume that there is a function from T to T, called negation function, denoted ¬, that is anti-monotone w.r.t. ⪯ and satisfies ¬¬α = α, ∀α ∈ T. E.g., in L[0,1], ¬α = 1 − α is quite typical. Finally, a literal is an atomic formula or its negation.

Definition 1 (Normal Parametric Program [10]). A normal parametric program P (np-program) is a 5-tuple ⟨L, R, C, P, D⟩, whose components are defined as follows: (i) L = ⟨T, ⪯, ⊗, ⊕⟩ is a complete lattice, where T is a set of certainties partially ordered by ⪯, ⊗ is the meet operator and ⊕ the join operator; (ii) R is a finite set of normal parametric rules (np-rules), each of which is a statement of the form r : A ←^αr L1, ..., Ln, where A is an atomic formula, L1, ..., Ln are literals or values in T, and αr ∈ T \ {⊥} is the certainty of the rule; (iii) C maps each np-rule to a conjunction function in Fc; (iv) P maps each np-rule to a propagation function in Fp; (v) D maps each predicate symbol in P to a disjunction function in Fd.
For ease of presentation, we write r : A ←^αr L1, ..., Ln; ⟨fd, fp, fc⟩ to represent an np-rule in which fd ∈ Fd is the disjunction function associated with π(A), and fc ∈ Fc and fp ∈ Fp are respectively the conjunction and propagation functions associated with r. Note that, by Definition 1, rules with the same head must have the same associated disjunction function. The following example illustrates the notion of np-program.

Example 1. Consider an insurance company, which has information about its customers used to determine the risk coefficient of each customer. The company has: (i) data grouped into a set F of facts; and (ii) a set R of rules. Suppose the company has the following database (which is an np-program P = F ∪ R), where a value of the risk coefficient may already be known, but has to be re-evaluated (the client may be a new client and his risk coefficient is given by his previous insurance company). The certainty lattice is L[0,1], with fp(x, y) = xy.
F = { Experience(John) ←^1 0.7 ⟨⊕, fp, ⊗⟩,
      Risk(John) ←^1 0.5 ⟨⊕, fp, ⊗⟩,
      Sport car(John) ←^1 0.8 ⟨⊕, fp, ⊗⟩ }

¹ For simplicity, we formulate the properties treating any function as a binary function on T.
R=
1 Good driver(X) ← Experience(X), ¬Risk(X) ⊕, ⊗, ⊗ 0.8 Risk(X)
Risk(X) Risk(X)
← Young(X), ⊕, fp , ⊗
0.8
← Sport car(X) ⊕, fp , ⊗ 1
← Experience(X), ¬Good driver(X) ⊕, fp , ⊗
Using another disjunction function for the rules with head Risk, such as fd(x, y) = x + y − xy, might have been more appropriate in this example (i.e. we would accumulate the risk factors, rather than take the max only), but we will use ⊕ in order to facilitate the reader's comprehension later on when we compute the semantics of P. We further define the Herbrand base BP of an np-program P as the set of all instantiated atoms corresponding to atoms appearing in P, and define P* to be the Herbrand instantiation of P, i.e. the set of all ground instantiations of the rules in P (P* is finite). Note that a Datalog program with negation P is equivalent to the np-program constructed by replacing each rule in P of the form A ← L1, ..., Ln by the rule A ←^t L1, ..., Ln; ⟨⊕, ⊗, ⊗⟩, where the classical certainty lattice L{t,f} = ⟨T, ⪯, ⊗, ⊕⟩ is considered, with T = {t, f}, ⪯ defined by f ⪯ t, ⊕ = max, ⊗ = min, ¬f = t and ¬t = f, ⊥ = f and ⊤ = t.
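As a quick numeric check on Example 1 (an illustration only; the two rules involving negation are deliberately left out, since their treatment is the subject of the rest of the paper), the combination functions over L[0,1] can be exercised directly:

```python
# Combination functions of Example 1 over L_[0,1]:
# fc = min (the meet ⊗), fp(x, y) = x*y, and the disjunction for Risk is ⊕ = max.
def fp(alpha, body):
    return alpha * body

facts = {"Experience(John)": 0.7, "Risk(John)": 0.5, "Sport car(John)": 0.8}

# Two negation-free contributions to Risk(John):
from_fact = fp(1.0, 0.5)                                   # the stored fact, 0.5
from_sport_car = fp(0.8, min([facts["Sport car(John)"]]))  # 0.8 * 0.8 ≈ 0.64
risk_lower_bound = max(from_fact, from_sport_car)
print(round(risk_lower_bound, 2))   # 0.64
```

With the accumulating disjunction fd(x, y) = x + y − xy mentioned above, the same two contributions would combine to 0.5 + 0.64 − 0.5·0.64 = 0.82 instead.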
3
Interpretations of Programs
The semantics of a program P is determined by selecting a particular interpretation of P in the set of models of P, where an interpretation I of an np-program P is a function that assigns to all atoms of the Herbrand base of P a value in T. In Datalog programs, as well as in PDDU, the chosen model is usually the least model of P w.r.t. ⪯.² Unfortunately, the introduction of negation may have the consequence that some logic programs do not have a unique minimal model, as shown in the following example.

Example 2. Consider the certainty lattice L[0,1] and the program P = {(A ← ¬B), (B ← ¬A), (A ← 0.2), (B ← 0.3)}. Informally, an interpretation I is a model of the program if it satisfies every rule, while I satisfies a rule X ← Y if I(X) ⪰ I(Y).³ So this program has an infinite number of models Ixy, where 0.2 ⪯ x ⪯ 1, 0.3 ⪯ y ⪯ 1, y ≥ 1 − x, Ixy(A) = x and Ixy(B) = y. There is also an infinite number of minimal models: the minimal models Ixy are exactly those with y = 1 − x.

Concerning the previous example we may note that the certainty of A in the minimal models is in the interval [0.2, 0.7], while for B the interval is [0.3, 0.8]. An obvious question is: what should be the answer to a query A to the program proposed in Example 2? There are at least two answers: (i) the certainty of A is undefined, as there is no unique minimal model. This is clearly a conservative approach, which in case of ambiguity prefers to leave A unspecified; (ii) the certainty of A is in [0.2, 0.7], which means that even if there is no unique value for A, in all minimal models the certainty of A is in [0.2, 0.7]. In this approach we still try to provide some information. Of course, some
² ⪯ is extended to the set of interpretations as follows: I ⪯ J iff for all atoms A, I(A) ⪯ J(A).
³ Roughly, X ← Y dictates that "X should be at least as true as Y".
care should be used. Indeed from I(A) ∈ [0.2, 0.7] and I(B) ∈ [0.3, 0.8] we should not conclude that I(A) = 0.2 and I(B) = 0.3 is a model of the program. Applying a usual approach, like the well-founded semantics [18] or the Kripke-Kleene semantics [4], would lead us to choose the conservative solution 1. This was also the approach in our early attempt to deal with normal parametric programs [11]. Such a semantics seems to be too weak, in the sense that it loses some knowledge (e.g. the value of A should be at least 0.2). In this paper we address solution 2. To this end, we propose to rely on T × T . Any element of T × T is denoted by [a; b] and interpreted as an interval on T , i.e. [a; b] is interpreted as the set of elements x ∈ T such that a x b. For instance, turning back to Example 2 above, in the intended model of P , the certainty of A is “approximated” with [0.2; 0.7], i.e. the certainty of A lies in between 0.2 and 0.7 (similarly for B). Formally, given a complete lattice L = T , , ⊗, ⊕, we construct a bilattice over T ×T , according to a well-known construction method (see [3,6]). We recall that a bilattice is a triple B, t , k , where B is a nonempty set and t , k are both partial orderings giving to B the structure of a lattice with a top and a bottom [6]. We consider B = T ×T with orderings: (i) the truth ordering t , where [a1 ; b1 ] t [a2 ; b2 ] iff a1 a2 and b1 b2 ; and (ii) the knowledge ordering k , where [a1 ; b1 ] k [a2 ; b2 ] iff a1 a2 and b2 b1 . The intuition of those orders is that truth increases if the interval contains greater values (e.g. [0.1; 0.4] t [0.2; 0.5]), whereas the knowledge increases when the interval (i.e. in our case the approximation of a certainty value) becomes more precise (e.g. [0.1; 0.4] k [0.2; 0.3], i.e. we have more knowledge). The least and greatest elements of T × T are respectively (i) f = [⊥; ⊥] (false) and t = [; ] (true), w.r.t. t ; and (ii) ⊥ = [⊥; ] (unknown – the less precise interval, i.e. 
the atom’s certainty value is unknown) and ⊤ = [⊤; ⊥] (inconsistent – the empty interval) w.r.t. ⪯k. The meet, join and negation on T × T w.r.t. both orderings are defined by extending the meet, join and negation from T to T × T in the natural way: let [a1; b1], [a2; b2] ∈ T × T, then
– [a1; b1] ⊗t [a2; b2] = [a1 ⊗ a2; b1 ⊗ b2] and [a1; b1] ⊕t [a2; b2] = [a1 ⊕ a2; b1 ⊕ b2];
– [a1; b1] ⊗k [a2; b2] = [a1 ⊗ a2; b1 ⊕ b2] and [a1; b1] ⊕k [a2; b2] = [a1 ⊕ a2; b1 ⊗ b2];
– ¬[a1; b1] = [¬b1; ¬a1].
⊗t and ⊕t (resp. ⊗k and ⊕k) denote the meet and join operations on T × T w.r.t. the truth (resp. knowledge) ordering. For instance, taking L[0,1], [0.1; 0.4] ⊕t [0.2; 0.5] = [0.2; 0.5], [0.1; 0.4] ⊗t [0.2; 0.5] = [0.1; 0.4], [0.1; 0.4] ⊕k [0.2; 0.5] = [0.2; 0.4], [0.1; 0.4] ⊗k [0.2; 0.5] = [0.1; 0.5] and ¬[0.1; 0.4] = [0.6; 0.9]. Finally, we extend in a similar way the combination functions from T to T × T. Let fc (resp. fp and fd) be a conjunction (resp. propagation and disjunction) function over T and [a1; b1], [a2; b2] ∈ T × T: (i) fc([a1; b1], [a2; b2]) = [fc(a1, a2); fc(b1, b2)]; (ii) fp([a1; b1], [a2; b2]) = [fp(a1, a2); fp(b1, b2)]; and (iii) fd([a1; b1], [a2; b2]) = [fd(a1, a2); fd(b1, b2)]. It is easy to verify that these extended combination functions preserve the original properties of combination functions. The following theorem holds.

Theorem 1. Consider T × T with the orderings ⪯t and ⪯k. Then (i) ⊗t, ⊕t, ⊗k, ⊕k and the extensions of combination functions are continuous (and, thus, monotonic) w.r.t. ⪯t and ⪯k; (ii) any extended negation function is monotonic w.r.t. ⪯k; and (iii) if the negation function satisfies the de Morgan laws, i.e. ∀a, b ∈ T. ¬(a ⊕ b) = ¬a ⊗ ¬b, then the extended negation function is continuous w.r.t. ⪯k.
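To make the operations concrete, here is a small Python sketch (ours, not from the paper) of the bilattice operations over the certainty lattice L[0,1], where ⊗ = min, ⊕ = max and ¬x = 1 − x; the printed values reproduce the worked examples above.

```python
# Intervals [a; b] over L[0,1] are modeled as pairs (a, b).

def meet_t(x, y):  # [a1;b1] ⊗t [a2;b2] = [a1⊗a2; b1⊗b2]
    return (min(x[0], y[0]), min(x[1], y[1]))

def join_t(x, y):  # [a1;b1] ⊕t [a2;b2] = [a1⊕a2; b1⊕b2]
    return (max(x[0], y[0]), max(x[1], y[1]))

def meet_k(x, y):  # [a1;b1] ⊗k [a2;b2] = [a1⊗a2; b1⊕b2]  (less knowledge: wider interval)
    return (min(x[0], y[0]), max(x[1], y[1]))

def join_k(x, y):  # [a1;b1] ⊕k [a2;b2] = [a1⊕a2; b1⊗b2]  (more knowledge: tighter interval)
    return (max(x[0], y[0]), min(x[1], y[1]))

def neg(x):        # ¬[a;b] = [¬b; ¬a]
    return (1 - x[1], 1 - x[0])

# The examples from the text:
print(join_t((0.1, 0.4), (0.2, 0.5)))  # (0.2, 0.5)
print(meet_t((0.1, 0.4), (0.2, 0.5)))  # (0.1, 0.4)
print(join_k((0.1, 0.4), (0.2, 0.5)))  # (0.2, 0.4)
print(meet_k((0.1, 0.4), (0.2, 0.5)))  # (0.1, 0.5)
print(neg((0.1, 0.4)))                 # (0.6, 0.9)
```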
Yann Loyer and Umberto Straccia
Proof: We prove only the last item, as the others are immediate. Consider a chain of intervals x0 ⪯k x1 ⪯k . . ., where xj = [aj; bj] with aj, bj ∈ T. To show the continuity of the extended negation function w.r.t. ⪯k, we show that ¬⊕k_{j≥0} xj = ⊕k_{j≥0} ¬xj: ¬⊕k_{j≥0} xj = ¬[⊕_{j≥0} aj; ⊗_{j≥0} bj] = [¬⊗_{j≥0} bj; ¬⊕_{j≥0} aj] = [⊕_{j≥0} ¬bj; ⊗_{j≥0} ¬aj] = ⊕k_{j≥0} [¬bj; ¬aj] = ⊕k_{j≥0} ¬[aj; bj] = ⊕k_{j≥0} ¬xj. We can now extend interpretations over T to the above specified “interval” bilattice.

Definition 2 (Approximate Interpretation). Let P be an np-program. An approximate interpretation of P is a total function I from the Herbrand base BP to the set T × T. The set of all the approximate interpretations of P is denoted CP.

Intuitively, assigning the logical value [a; b] to an atom A means that the exact certainty value of A lies in between a and b with respect to ⪯. Our goal will be to determine for each atom of the Herbrand base of P the most precise interval that can be inferred. First, we extend the two orderings on T × T to the set of approximate interpretations CP in the usual way: let I1 and I2 be in CP, then (i) I1 ⪯t I2 iff I1(A) ⪯t I2(A), for all ground atoms A; and (ii) I1 ⪯k I2 iff I1(A) ⪯k I2(A), for all ground atoms A. Under these two orderings CP becomes a complete bilattice. The meet and join operations over T × T for both orderings are extended to CP in the usual way (e.g. for any atom A, (I ⊕k J)(A) = I(A) ⊕k J(A)). Negation is extended similarly: for any atom A, ¬I(A) = I(¬A), and approximate interpretations are extended to T : for any α ∈ T, I(α) = [α; α]. Second, we identify the models of a program. The definition extends the one given in [10] to intervals.

Definition 3 (Models of a Logic Program). Let P be an np-program and let I be an approximate interpretation of P.
1. I satisfies a ground np-rule r : A ←^{αr} L1, ..., Ln; fd, fp, fc in P, denoted |=I r, iff fp([αr; αr], fc({|I(L1), . . . , I(Ln)|})) ⪯t I(A);
2. I is a model of P, or I satisfies P, denoted |=I P, iff for all atoms A ∈ BP, fd(X) ⪯t I(A), where fd is the disjunction function associated with π(A) and X = {|fp([αr; αr], fc({|I(L1), . . . , I(Ln)|})) : A ←^{αr} L1, ..., Ln; fd, fp, fc ∈ P∗|}.

Third, among all possible models of an np-program, we have now to specify which one is the intended model. The characterization of that model will require the definition of an immediate consequence operator that will be used to infer knowledge from a program. That operator is a simple extension from T to T × T of the immediate consequence operator defined in [10] to give semantics to classical PDDU.

Definition 4. Let P be any np-program. The immediate consequence operator TP is a mapping from CP to CP, defined as follows: for every interpretation I and every ground atom A, TP(I)(A) = fd(X), where fd is the disjunction function associated with π(A) and X = {|fp([αr; αr], fc({|I(L1), . . . , I(Ln)|})) : A ←^{αr} L1, ..., Ln; fd, fp, fc ∈ P∗|}.

Note that from the property (iv) of combination functions satisfied by all disjunction functions, it follows that if an atom A does not appear as the head of a rule, then TP(I)(A) = f. Note also that any fixpoint of TP is a model of P. We have

Theorem 2. For any np-program P, TP is monotonic and, if the de Morgan laws hold, continuous w.r.t. ⪯k.
Proof: The proof of monotonicity is easy. To prove the continuity w.r.t. ⪯k, consider a chain of interpretations I0 ⪯k I1 ⪯k . . .. We show that for any A ∈ BP, TP(⊕k_{j≥0} Ij)(A) = ⊕k_{j≥0} TP(Ij)(A) (Eq. 1). As CP is a complete lattice, the sequence I0 ⪯k I1 ⪯k . . . has a least upper bound, say Ī = ⊕k_{j≥0} Ij. For any B ∈ BP, we have ⊕k_{j≥0} Ij(B) = Ī(B) and, from Theorem 1, ⊕k_{j≥0} Ij(¬B) = ⊕k_{j≥0} ¬Ij(B) = ¬⊕k_{j≥0} Ij(B) = ¬Ī(B); thus, for any literal or certainty value L, ⊕k_{j≥0} Ij(L) = Ī(L) (Eq. 2). Now, consider the finite set (P∗ is finite) of all ground rules r1, . . . , rk having A as head, where ri = A ←^{αi} L^i_1, . . . , L^i_{ni}; fd, f^i_p, f^i_c. Let us evaluate the left hand side of Equation (1): TP(⊕k_{j≥0} Ij)(A) = TP(Ī)(A) = fd({|f^i_p([αi; αi], f^i_c({|Ī(L^i_1), . . . , Ī(L^i_{ni})|})) : 1 ≤ i ≤ k|}). On the other side, ⊕k_{j≥0} TP(Ij)(A) = ⊕k_{j≥0} fd({|f^i_p([αi; αi], f^i_c({|Ij(L^i_1), . . . , Ij(L^i_{ni})|})) : 1 ≤ i ≤ k|}). But fd, f^i_p and f^i_c are continuous and, thus, by Equation (2), ⊕k_{j≥0} TP(Ij)(A) = fd({|⊕k_{j≥0} f^i_p([αi; αi], f^i_c({|Ij(L^i_1), . . . , Ij(L^i_{ni})|})) : 1 ≤ i ≤ k|}) = fd({|f^i_p([αi; αi], ⊕k_{j≥0} f^i_c({|Ij(L^i_1), . . . , Ij(L^i_{ni})|})) : 1 ≤ i ≤ k|}) = fd({|f^i_p([αi; αi], f^i_c({|⊕k_{j≥0} Ij(L^i_1), . . . , ⊕k_{j≥0} Ij(L^i_{ni})|})) : 1 ≤ i ≤ k|}) = fd({|f^i_p([αi; αi], f^i_c({|Ī(L^i_1), . . . , Ī(L^i_{ni})|})) : 1 ≤ i ≤ k|}). Therefore, Equation (1) holds and, thus, TP is continuous.
4 Semantics of Normal Logic Programs
Usually, the semantics of a normal logic program is the least model of the program w.r.t. the knowledge ordering. That model always exists and coincides with the least fixed-point of TP with respect to ⪯k (which exists as TP is monotonic w.r.t. ⪯k). Note that this least model with respect to ⪯k corresponds to an extension of the classical Kripke-Kleene semantics [4] of Datalog programs with negation to normal parametric programs: if we restrict our attention to Datalog with negation, then we have to deal with four values [f; f], [t; t], [f; t] and [t; f] that correspond to the truth values false, true, unknown and inconsistent, respectively. Then, our bilattice coincides with Belnap’s logic [1] and for any Datalog program with negation P, the least fixed-point of TP w.r.t. ⪯k is a model of P that coincides with the Kripke-Kleene semantics of P. To illustrate the different notions introduced in the paper, we rely on Example 3.

Example 3 (Running example). The certainty lattice is L[0,1] and the np-program is

P = {(A ←^1 B, 0.6; ⊕, ⊗, ⊗), (B ←^1 B; ⊕, ⊗, ⊗), (A ←^1 0.3; ⊕, ⊗, ⊗)}.
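A possible Python encoding of TP for this program (a hypothetical sketch, not the authors' code; representing each rule as a (head, rule certainty, body) triple is our assumption). Over L[0,1] we take ⊕ = max and ⊗ = min; iterating TP from I⊥, which maps every atom to [0; 1], yields the Kripke-Kleene semantics KKP computed in the text.

```python
# Each rule: (head, certainty alpha_r, body literals); a float constant c in a
# body stands for the interval [c; c].
RULES = [
    ("A", 1.0, ["B", 0.6]),
    ("B", 1.0, ["B"]),
    ("A", 1.0, [0.3]),
]
ATOMS = ["A", "B"]

def val(I, lit):
    return (lit, lit) if isinstance(lit, float) else I[lit]

def conj(ivs):   # fc = min, componentwise on intervals
    return (min(a for a, _ in ivs), min(b for _, b in ivs))

def prop(x, y):  # fp = min, componentwise
    return (min(x[0], y[0]), min(x[1], y[1]))

def tp(I):       # TP(I)(A) = fd over all rules with head A (fd = max); no rule -> f = [0; 0]
    J = {}
    for A in ATOMS:
        cs = [prop((alpha, alpha), conj([val(I, l) for l in body]))
              for head, alpha, body in RULES if head == A]
        J[A] = (max(a for a, _ in cs), max(b for _, b in cs)) if cs else (0.0, 0.0)
    return J

# Kripke-Kleene semantics: iterate TP from the k-least interpretation [0; 1].
I = {A: (0.0, 1.0) for A in ATOMS}
while True:
    J = tp(I)
    if J == I:
        break
    I = J
print(I)  # {'A': (0.3, 0.6), 'B': (0.0, 1.0)}
```

The fixpoint matches KKP = {A: [0.3; 0.6], B: [0; 1]} from the text; note that both placements of the 0.6 (as a body constant or as the rule certainty) yield the same intervals on this example.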
For ease of presentation, we represent an interpretation as a set of expressions of the form A: [x; y], where A is a ground atom, indicating that I(A) = [x; y]. E.g. the following sequence of interpretations I0, I1, I2 shows how the Kripke-Kleene semantics, KKP, of the running Example 3 is computed (as the iterated fixed-point of TP, starting from I0 = I⊥, the ⪯k-minimal interpretation that maps any A ∈ BP to [⊥; ⊤], and In+1 = TP(In)): I0 = {A: [0; 1], B: [0; 1]}, I1 = {A: [0.3; 0.6], B: [0; 1]}, I2 = I1 = KKP. In that model, which is minimal w.r.t. ⪯k and contains only the knowledge provided by P, the certainty of B lies between 0 and 1, i.e. is unknown, and the certainty of A then lies between 0.3 and 0.6. As is well known, that semantics is usually considered too weak. We propose to consider the Closed World Assumption (CWA) to complete our knowledge (the CWA assumes that all atoms whose value cannot be inferred from the program are false by default). This is done by defining the notion of support, introduced
in [12], of a program w.r.t. an interpretation. Given a program P and an interpretation I, the support of P w.r.t. I, denoted CP(I), determines in a principled way how much false knowledge, i.e. how much knowledge provided by the CWA, can “safely” be joined to I w.r.t. the program P. Roughly speaking, a part of the CWA is an interpretation J such that J ⪯k If, where If maps any A ∈ BP to [⊥; ⊥], and we consider that such an interpretation can be safely added to I if J ⪯k TP(I ⊕k J), i.e. if J does not contradict the knowledge represented by P and I.

Definition 5. The support of an np-program P w.r.t. an interpretation I, denoted CP(I), is the maximal interpretation J w.r.t. ⪯k such that J ⪯k If and J ⪯k TP(I ⊕k J).

It is easy to note that CP(I) = ⊕k{J | J ⪯k If and J ⪯k TP(I ⊕k J)}. The following theorem provides an algorithm for computing the support.

Theorem 3. CP(I) coincides with the iterated fixpoint of the function FP,I, beginning the computation with If, where FP,I(J) = If ⊗k TP(I ⊕k J).

From Theorems 1 and 2, it can be shown that FP,I is monotone and, if the de Morgan laws hold, continuous w.r.t. ⪯k. It follows that the iteration of the function FP,I starting from If decreases w.r.t. ⪯k. We will refer to CP as the closed world operator.

Corollary 1. Let P be an np-program. The closed world operator CP is monotone and, if the de Morgan laws hold, continuous w.r.t. the knowledge order ⪯k.

The following sequence of interpretations J0, J1, J2 shows the computation of CP(KKP), i.e. the additional knowledge that can be considered using the CWA on the Kripke-Kleene semantics KKP of the running Example 3 (I = KKP, J0 = If and Jn+1 = FP,I(Jn)): J0 = {A: [0; 0], B: [0; 0]}, J1 = {A: [0; 0.3], B: [0; 0]}, J2 = J1 = CP(KKP). CP(KKP) asserts that, according to the CWA and w.r.t. P and KKP, the certainty of A should be at most 0.3, while that of B is exactly 0.
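The iterated-fixpoint computation of Theorem 3 can be sketched as follows (a hypothetical Python encoding, ours; it reuses a (head, certainty, body) rule representation for the running example, with fc = fp = min and fd = max over L[0,1]).

```python
RULES = [("A", 1.0, ["B", 0.6]), ("B", 1.0, ["B"]), ("A", 1.0, [0.3])]
ATOMS = ["A", "B"]

def join_k(x, y): return (max(x[0], y[0]), min(x[1], y[1]))   # ⊕k
def meet_k(x, y): return (min(x[0], y[0]), max(x[1], y[1]))   # ⊗k

def tp(I):  # immediate consequence operator on interval interpretations
    J = {}
    for A in ATOMS:
        cs = []
        for head, alpha, body in RULES:
            if head != A:
                continue
            ivs = [(l, l) if isinstance(l, float) else I[l] for l in body]
            fc = (min(a for a, _ in ivs), min(b for _, b in ivs))   # fc = min
            cs.append((min(alpha, fc[0]), min(alpha, fc[1])))       # fp = min
        J[A] = (max(a for a, _ in cs), max(b for _, b in cs)) if cs else (0.0, 0.0)
    return J

def support(I):  # iterated fixpoint of F_{P,I}(J) = I_f ⊗k TP(I ⊕k J), from I_f
    I_f = {A: (0.0, 0.0) for A in ATOMS}   # CWA: everything false
    J = I_f
    while True:
        T = tp({A: join_k(I[A], J[A]) for A in ATOMS})
        J2 = {A: meet_k(I_f[A], T[A]) for A in ATOMS}
        if J2 == J:
            return J
        J = J2

KK = {"A": (0.3, 0.6), "B": (0.0, 1.0)}    # Kripke-Kleene semantics of P
print(support(KK))  # {'A': (0.0, 0.3), 'B': (0.0, 0.0)}
```

The result reproduces CP(KKP) = {A: [0; 0.3], B: [0; 0]} from the text.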
We now have two ways to infer information from an np-program P and an approximate interpretation I: using TP and using CP. To maximize the knowledge derived from P and the CWA, but without introducing any other extra knowledge, we propose to choose as the semantics of P the least model of P containing its own support, i.e. the least model that cannot be completed anymore according to the CWA. This consideration leads to the following epistemic definition of the semantics of a program P.

Definition 6. The approximate well-founded semantics of an np-program P, denoted WP, is the least model I of P w.r.t. ⪯k such that CP(I) ⪯k I.

Now we provide a fixpoint characterization and, thus, a way to compute the approximate well-founded semantics. It is based on an operator, called the approximate well-founded operator, that combines the two operators defined above. Given an interpretation I, we complete it with its support provided by the CWA, and then activate the rules of the program on the obtained interpretation using the immediate consequence operator.

Definition 7. Let P be an np-program. The approximate well-founded operator, denoted AWP, takes in input an approximate interpretation I ∈ CP and returns AWP(I) ∈ CP defined by AWP(I) = TP(I ⊕k CP(I)).
From [12], the following theorems can be shown.

Theorem 4. Let P be an np-program. Any fixed-point I of AWP is a model of P.

Using the properties of monotonicity and continuity of TP and CP w.r.t. the knowledge order ⪯k over CP, and the fact that CP is a complete lattice w.r.t. ⪯k, by the well-known Knaster-Tarski theorem it follows that

Theorem 5. Let P be an np-program. The approximate well-founded operator AWP is monotone and, if the de Morgan laws hold, continuous w.r.t. the knowledge order ⪯k. Therefore, AWP has a least fixed-point w.r.t. ⪯k. Moreover, that least fixpoint coincides with the approximate well-founded semantics WP of P.

The following sequence of interpretations shows the computation of WP of Example 3 (I0 = I⊥ and In+1 = AWP(In)). The certainty of A is 0.3 and the certainty of B is 0. Note that KKP ⪯k WP, i.e. the well-founded semantics contains more knowledge than the Kripke-Kleene semantics, since it was completed with some default knowledge from the CWA. I0 = {A: [0; 1], B: [0; 1]}, CP(I0) = {A: [0; 0.3], B: [0; 0]}, I1 = {A: [0.3; 0.3], B: [0; 0]}, CP(I1) = {A: [0; 0.3], B: [0; 0]}, I2 = I1 = WP.

Example 4. Consider the program P = R ∪ F given in Example 1. The computation of the approximate well-founded semantics WP of P gives the following result4: WP = {R(J): [0.64; 0.7], S(J): [0.8; 0.8], Y(J): [0; 0], G(J): [0.3; 0.36], E(J): [0.7; 0.7]}, which establishes that John’s degree of Risk is in between 0.64 and 0.7.

Finally, our approach captures and extends the usual semantics of logic programs.

Theorem 6. If we restrict our attention to PDDU, then for any program P the approximate well-founded semantics WP assigns exact values to all atoms and coincides with the semantics of P proposed in [10].

Theorem 7.
If we restrict our attention to Datalog with negation, then we have to deal with Belnap’s bilattices [1] and for any Datalog program with negation P , (i) any stable model [5] of P is a fixpoint of AWP , and (ii) the approximate well-founded semantics WP coincides with the well-founded semantics of P [18].
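Putting the pieces together, the least fixpoint of AWP for the running Example 3 can be computed as below (again a hypothetical Python sketch of ours, not the authors' code; same rule encoding and lattice operations as in the earlier sketches).

```python
RULES = [("A", 1.0, ["B", 0.6]), ("B", 1.0, ["B"]), ("A", 1.0, [0.3])]
ATOMS = ["A", "B"]

def join_k(x, y): return (max(x[0], y[0]), min(x[1], y[1]))   # ⊕k
def meet_k(x, y): return (min(x[0], y[0]), max(x[1], y[1]))   # ⊗k

def tp(I):  # immediate consequence operator (fc = fp = min, fd = max)
    J = {}
    for A in ATOMS:
        cs = []
        for head, alpha, body in RULES:
            if head != A:
                continue
            ivs = [(l, l) if isinstance(l, float) else I[l] for l in body]
            fc = (min(a for a, _ in ivs), min(b for _, b in ivs))
            cs.append((min(alpha, fc[0]), min(alpha, fc[1])))
        J[A] = (max(a for a, _ in cs), max(b for _, b in cs)) if cs else (0.0, 0.0)
    return J

def support(I):  # closed world operator C_P (iterated fixpoint of F_{P,I})
    I_f = {A: (0.0, 0.0) for A in ATOMS}
    J = I_f
    while True:
        T = tp({A: join_k(I[A], J[A]) for A in ATOMS})
        J2 = {A: meet_k(I_f[A], T[A]) for A in ATOMS}
        if J2 == J:
            return J
        J = J2

def awp(I):  # AWP(I) = TP(I ⊕k CP(I))
    C = support(I)
    return tp({A: join_k(I[A], C[A]) for A in ATOMS})

I = {A: (0.0, 1.0) for A in ATOMS}   # I0 = I_bot
while True:
    J = awp(I)
    if J == I:
        break
    I = J
print(I)  # {'A': (0.3, 0.3), 'B': (0.0, 0.0)}
```

The fixpoint matches WP from the text: A is exactly 0.3 and B is exactly 0.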
5 Conclusions
We presented a novel characterization, both epistemic and operational, of the well-founded semantics in PDDU [10], a unifying umbrella for many existing approaches towards the manipulation of uncertainty in logic programs, and we extended it with non-monotonic (default) negation. The main features of our extension are: (i) to deal with uncertain and incomplete knowledge, atoms are assigned approximations of uncertainty values; (ii) the CWA is used to complete the knowledge and to infer the most precise approximations possible, relying on a natural management of negation; (iii) the continuity of the immediate consequence operator is preserved (a major feature of the classical PDDU framework); and (iv) our approach extends to PDDU with negation not only the semantics proposed in [10] for PDDU, but also the usual semantics of Datalog with negation: the well-founded semantics and the Kripke-Kleene semantics.
4 For ease of presentation, we use the first letter of predicates and constants only.
References
1. N. D. Belnap. How a computer should think. In Gilbert Ryle, editor, Contemporary Aspects of Philosophy, pages 30–56. Oriel Press, Stocksfield, GB, 1977.
2. D. Dubois, J. Lang, and H. Prade. Towards possibilistic logic programming. In Proc. of the 8th Int. Conf. on Logic Programming (ICLP-91), pages 581–595, 1991.
3. M. Fitting. The family of stable models. J. of Logic Programming, 17:197–225, 1993.
4. M. Fitting. A Kripke-Kleene semantics for general logic programs. J. of Logic Programming, 2:295–312, 1985.
5. M. Gelfond and V. Lifschitz. The stable model semantics for logic programming. In Proc. of the 5th Int. Conf. on Logic Programming, pages 1070–1080, 1988.
6. M. L. Ginsberg. Multi-valued logics: a uniform approach to reasoning in artificial intelligence. Computational Intelligence, 4:265–316, 1988.
7. M. Kifer and A. Li. On the semantics of rule-based expert systems with uncertainty. In Proc. of the Int. Conf. on Database Theory (ICDT-88), LNCS 326, pages 102–117, 1988.
8. M. Kifer and V. S. Subrahmanian. Theory of generalized annotated logic programming and its applications. J. of Logic Programming, 12:335–367, 1992.
9. L. V. S. Lakshmanan and N. Shiri. Probabilistic deductive databases. In Int. Logic Programming Symposium, pages 254–268, 1994.
10. L. V. S. Lakshmanan and N. Shiri. A parametric approach to deductive databases with uncertainty. IEEE Transactions on Knowledge and Data Engineering, 13(4):554–570, 2001.
11. Y. Loyer and U. Straccia. The well-founded semantics in normal logic programs with uncertainty. In Proc. of the 6th Int. Symposium on Functional and Logic Programming (FLOPS-2002), LNCS 2441, pages 152–166, 2002.
12. Y. Loyer and U. Straccia. The well-founded semantics of logic programs over bilattices: an alternative characterisation. Technical Report ISTI-2003-TR-05, Istituto di Scienza e Tecnologie dell'Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy, 2003. Submitted.
13. T. Lukasiewicz. Fixpoint characterizations for many-valued disjunctive logic programs with probabilistic semantics. In LNCS 2173, pages 336–350, 2001.
14. R. Ng and V. S. Subrahmanian. Stable model semantics for probabilistic deductive databases. In Proc. of the 6th Int. Symposium on Methodologies for Intelligent Systems (ISMIS-91), LNAI 542, pages 163–171, 1991.
15. R. Ng and V. S. Subrahmanian. Probabilistic logic programming. Information and Computation, 101(2):150–201, 1993.
16. E. Y. Shapiro. Logic programs with uncertainties: A tool for implementing rule-based systems. In Proc. of the 8th Int. Joint Conf. on Artificial Intelligence (IJCAI-83), pages 529–532, 1983.
17. M. H. van Emden. Quantitative deduction and its fixpoint theory. J. of Logic Programming, 4(1):37–53, 1986.
18. A. van Gelder, K. A. Ross, and J. S. Schlipf. The well-founded semantics for general logic programs. J. of the ACM, 38(3):620–650, January 1991.
19. G. Wagner. Negation in fuzzy and possibilistic logic programs. In T. Martin and F. Arcelli, editors, Logic Programming and Soft Computing. Research Studies Press, 1998.
Which Is the Worst-Case Nash Equilibrium?

Thomas Lücking1, Marios Mavronicolas2, Burkhard Monien1, Manuel Rode1, Paul Spirakis3,4, and Imrich Vrto5

1 Faculty of Computer Science, Electrical Engineering and Mathematics, University of Paderborn, Fürstenallee 11, 33102 Paderborn, Germany, {luck,bm,rode}@uni-paderborn.de
2 Department of Computer Science, University of Cyprus, P. O. Box 20537, Nicosia CY-1678, Cyprus, [email protected]
3 Computer Technology Institute, P. O. Box 1122, 261 10 Patras, Greece, [email protected]
4 Department of Computer Engineering and Informatics, University of Patras, Rion, 265 00 Patras, Greece
5 Institute of Mathematics, Slovak Academy of Sciences, 841 04 Bratislava 4, Dúbravská 9, Slovak Republic, [email protected]
Abstract. A Nash equilibrium of a routing network represents a stable state of the network where no user finds it beneficial to unilaterally deviate from its routing strategy. In this work, we investigate the structure of such equilibria within the context of a certain game that models selfish routing for a set of n users each shipping its traffic over a network consisting of m parallel links. In particular, we are interested in identifying the worst-case Nash equilibrium – the one that maximizes social cost. Worst-case Nash equilibria were first introduced and studied in the pioneering work of Koutsoupias and Papadimitriou [9]. More specifically, we continue the study of the Conjecture of the Fully Mixed Nash Equilibrium, henceforth abbreviated as FMNE Conjecture, which asserts that the fully mixed Nash equilibrium, when existing, is the worst-case Nash equilibrium. (In the fully mixed Nash equilibrium, the mixed strategy of each user assigns (strictly) positive probability to every link.) We report substantial progress towards identifying the validity, methodologies to establish, and limitations of, the FMNE Conjecture.
1 Introduction
Motivation and Framework. Nash equilibrium [12,13] is arguably the most important solution concept in (non-cooperative) Game Theory1 . It represents
This work has been partially supported by the IST Program of the European Union under contract numbers IST-1999-14186 (ALCOM-FT) and IST-2001-33116 (FLAGS), by funds from the Joint Program of Scientific and Technological Collaboration between Greece and Cyprus, by research funds at University of Cyprus, and by the VEGA grant No. 2/3164/23.
Graduate School of Dynamic Intelligent Systems.
1 See [14] for a concise introduction to contemporary Game Theory.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 551–561, 2003. © Springer-Verlag Berlin Heidelberg 2003
Thomas Lücking et al.
a stable state of the play of a strategic game in which each player holds an accurate opinion about the (expected) behavior of other players and acts rationally. Understanding the combinatorial structure of Nash equilibria is a necessary prerequisite to either designing efficient algorithms to compute them, or establishing corresponding hardness and thereby designing (efficient) approximation algorithms2 . In this work, we embark on a systematic study of the combinatorial structure of Nash equilibria in the context of a simple routing game that models selfish routing over a non-cooperative network such as the Internet. This game was originally introduced in a pioneering work of Koutsoupias and Papadimitriou [9]; that work defined coordination ratio (also known as price of anarchy [15]) as a worst-case measure of the impact of the selfish behavior of users on the efficiency of routing over a non-cooperative network operating at a Nash equilibrium. As a worst-case measure, the coordination ratio bounds the maximum loss of efficiency due to selfish behavior of users at the worst-case Nash equilibrium; in sharp contrast, the principal motivation of our work is to identify the actual worst-case Nash equilibrium of the selfish routing game. Within the framework of the selfish routing game of Koutsoupias and Papadimitriou [9], we assume a collection of n users, each employing a mixed strategy, which is a probability distribution over m parallel links, to control the shipping of its own assigned traffic. For each link, a capacity specifies the rate at which the link processes traffic. In a Nash equilibrium, each user selfishly routes its traffic on those links that minimize its expected latency cost, given the network congestion caused by the other users. The social cost of a Nash equilibrium is the expectation, over all random choices of the users, of the maximum, over all links, latency through a link. The worst-case Nash equilibrium is one that maximizes social cost. 
Our study distinguishes between pure Nash equilibria, where each user chooses exactly one link (with probability one), and mixed Nash equilibria, where the choices of each user are modeled by a probability distribution over links. Of special interest to our work is the fully mixed Nash equilibrium [10], where each user chooses each link with non-zero probability; henceforth, denote F the fully mixed Nash equilibrium. We will also introduce and study disjointly mixed Nash equilibria, where (loosely speaking) mixed strategies of different users do not intersect. Allowing link capacities to vary arbitrarily gives rise to the standard model of related links, also known as model of uniform links in the scheduling literature (cf. Gonzales et al. [5]); the name is due to the fact that the order of the delays a user experiences on each of the links is the same across all users. A special case of the model of related links is the model of identical links, where all link capacities are equal (cf. Graham [6]); thus, in this model, each user incurs the same delay on all links. We also consider the model of unrelated links, where instead of associating a traffic and a capacity with each user and link, respec2
Computation of Nash equilibria has been long observed to be a very challenging, yet notoriously hard algorithmic problem; see [15] for an advocation.
tively, we assign a delay for each pair of a user and a link in an arbitrary way (cf. Horowitz and Sahni [7]); thus, in the unrelated links model, there is no relation between the delays incurred to a user on different links. Reciprocally, in the model of identical traffics, all user traffics are equal; they may vary arbitrarily in the model of arbitrary traffics. We are interested in understanding the impact of model assumptions on links and users on the patterns of the worst-case Nash equilibria for the selfish routing game we consider. Results and Contribution. In this work, we embark on a systematic study of a natural conjecture due to Gairing et al. [4], which asserts that the fully mixed Nash equilibrium is the worst-case Nash equilibrium (with respect to social cost). Fully Mixed Nash Equilibrium Conjecture [4]. Consider the model of arbitrary traffics and related links. Then, for any traffic vector w such that the fully mixed Nash equilibrium F exists, and for any Nash equilibrium P, SC (w, P) ≤ SC (w, F). Henceforth, abbreviate the Fully Mixed Nash Equilibrium Conjecture as the FMNE Conjecture. Our study reports substantial progress towards the settlement of the FMNE Conjecture: – We prove the FMNE Conjecture for several interesting special cases of it (within the model of related links). – In doing so, we provide proof techniques and tools which, while applicable to interesting special cases of it, may suffice for the general case as well. – We reveal limitations of the FMNE Conjecture by establishing that it is not, in general, valid over the model of unrelated links; we present both positive and negative instances for the conjecture. Related Work, Comparison and Significance. The selfish routing game considered in this paper was first introduced and studied in the pioneering work of Koutsoupias and Papadimitriou [9]. 
This game was subsequently studied in the work of Mavronicolas and Spirakis [10], where fully mixed Nash equilibria were introduced and analyzed. Both works focused mainly on proving bounds on coordination ratio. Subsequent works that provided bounds on coordination ratio include [1,2,8]. The work of Fotakis et al. [3] was the first to study the combinatorial structure and the computational complexity of Nash equilibria for the selfish routing game we consider; that work was subsequently extended by Gairing et al. [4]. (See details below.) The closest to our work are the one by Fotakis et al. [3] and the one by Gairing et al. [4]. – The FMNE Conjecture has been inspired by two results due to Fotakis et al. [3] that confirm or support the conjecture. First, Fotakis et al. [3, Theorem 6] establish the Fully Mixed Nash Equilibrium Conjecture for the model of identical links and assuming that n = 2; Theorem 3 in this work extends this
result to the model of related links, still assuming that n = 2 while assuming, in addition, that traffics are identical. Second, Fotakis et al. [3, Theorem 7] prove that, for the model of related links and of identical traffics, the social cost of any Nash equilibrium is no more than 49.02 times the social cost of the fully mixed Nash Equilibrium. – The FMNE Conjecture was explicitly stated in the work of Gairing et al. [4, Conjecture 1.1]. In the same paper, two results are shown that confirm or support the conjecture. First, Gairing et al. [4, Theorem 4.2] establish the validity of the FMNE Conjecture when restricted to pure Nash equilibria. Second, Gairing et al. [4, Theorem 5.1] prove that for the model of identical links, the social cost of any Nash equilibrium is no more than 6 + ε times the social cost of the fully mixed Nash equilibrium, for any constant ε > 0. (Note that since this result does not assume identical traffics, it is incomparable to the related result by Fotakis et al. [3, Theorem 7] (for the model of related links) which does.) The ultimate settlement of the FMNE Conjecture (for the model of related links) may reveal an interesting complexity-theoretic contrast between the worstcase pure and the worst-case mixed Nash equilibria. On one hand, identifying the worst-case pure Nash equilibrium is an N P-hard problem [3, Theorem 4]; on the other hand, if the FMNE Conjecture is valid, identification of the worstcase mixed Nash equilibrium is immediate in the cases where the fully mixed Nash equilibrium exists. (In addition, the characterization of the fully mixed Nash equilibrium shown in [10, Theorem 14] implies that such existence can be checked in polynomial time.) Road Map. The rest of this paper is organized as follows. Section 2 presents our definitions and some preliminaries. The case of disjointly mixed Nash equilibria is treated in Section 3. Section 4 considers the case of identical traffics and related links with n = 2. 
The reciprocal case of identical traffics and identical links with m = 2 is studied in Section 5. Section 6 examines the case of unrelated links. We conclude, in Section 7, with a discussion of our results and some open problems.
2 Framework
Most of our definitions are patterned after those in [10, Section 2], [3, Section 2] and [4, Section 2], which, in turn, were based on those in [9, Sections 1 & 2]. Mathematical Preliminaries and Notation. Throughout, denote for any integer m ≥ 2, [m] = {1, . . . , m}. For a random variable X, denote E(X) the expectation of X. General. We consider a network consisting of a set of m parallel links 1, 2, . . . , m from a source node to a destination node. Each of n network users 1, 2, . . . , n, or users for short, wishes to route a particular amount of traffic along a (non-fixed) link from source to destination. (Throughout, we will be using subscripts for users and superscripts for links.) In the model of related links, denote wi the
traffic of user i ∈ [n], and W = Σi∈[n] wi. Define the n × 1 traffic vector w in the natural way. Assume throughout that m > 1 and n > 1. Assume also, without loss of generality, that w1 ≥ w2 ≥ . . . ≥ wn. In the model of unrelated links, denote Cij the cost of user i ∈ [n] on link j ∈ [m]. Define the n × m cost matrix C in the natural way. A pure strategy for user i ∈ [n] is some specific link. A mixed strategy for user i ∈ [n] is a probability distribution over pure strategies; thus, a mixed strategy is a probability distribution over the set of links. The support of the mixed strategy for user i ∈ [n], denoted support(i), is the set of those pure strategies (links) to which i assigns positive probability. A pure strategy profile is represented by an n-tuple ⟨ℓ1, ℓ2, . . . , ℓn⟩ ∈ [m]^n; a mixed strategy profile is represented by an n × m probability matrix P of nm probabilities pji, i ∈ [n] and j ∈ [m], where pji is the probability that user i chooses link j. For a probability matrix P, define indicator variables Iij ∈ {0, 1}, where i ∈ [n] and j ∈ [m], such that Iij = 1 if and only if pji > 0. Thus, the support of the mixed strategy for user i ∈ [n] is the set {j ∈ [m] | Iij = 1}. For each link j ∈ [m], define the view of link j, denoted view(j), as the set of users i ∈ [n] that potentially assign their traffics to link j; so, view(j) = {i ∈ [n] | Iij = 1}. For each link j ∈ [m], denote V j = |view(j)|. Syntactic Classes of Mixed Strategies. A mixed strategy profile P is disjointly mixed if for all links j ∈ [m], |{i ∈ view(j) : pji < 1}| ≤ 1, that is, there is at most one non-pure user on each link. A mixed strategy profile P is fully mixed [10, Section 2.2] if for all users i ∈ [n] and links j ∈ [m], Iij = 1.³ Throughout, we will cast a pure strategy profile as a special case of a mixed strategy profile in which all (mixed) strategies are pure. System, Models and Cost Measures.
In the model of related links, denote c_ℓ > 0 the capacity of link ℓ ∈ [m], representing the rate at which the link processes traffic, and C = Σ_{ℓ∈[m]} c_ℓ. So, the latency for traffic w through link ℓ equals w/c_ℓ. In the model of identical capacities, all link capacities are equal to c, for some constant c > 0; link capacities may vary arbitrarily in the model of arbitrary capacities. Assume throughout, without loss of generality, that c1 ≥ c2 ≥ . . . ≥ cm. In the model of identical traffics, all user traffics are equal to 1; user traffics may vary arbitrarily in the model of arbitrary traffics. For a pure strategy profile ⟨ℓ1, ℓ2, . . . , ℓn⟩, the latency cost for user i ∈ [n], denoted λ_i, is the latency cost of the link it chooses, that is, (Σ_{k:ℓ_k=ℓ_i} w_k)/c_{ℓ_i}. For a mixed strategy profile P, denote δ^ℓ the actual traffic on link ℓ ∈ [m]; so, δ^ℓ is a random variable. For each link ℓ ∈ [m], denote θ^ℓ the expected traffic on link ℓ; thus, θ^ℓ = E(δ^ℓ) = Σ_{i=1}^n p_i^ℓ w_i. For a mixed strategy profile P, the expected latency cost for user i ∈ [n] on link ℓ ∈ [m], denoted λ_i^ℓ, is the expectation, over all random choices of the remaining users, of the latency cost for user i had its traffic been assigned to link ℓ; thus,
³ An earlier treatment of fully mixed strategies in the context of bimatrix games can be found in [16], called there completely mixed strategies. See also [11] for a subsequent treatment in the context of strategically zero-sum games.
Thomas Lücking et al.
$$\lambda_i^\ell \;=\; \frac{w_i + \sum_{k=1,\,k\neq i}^{n} p_k^\ell\, w_k}{c_\ell} \;=\; \frac{(1-p_i^\ell)\,w_i + \theta^\ell}{c_\ell}.$$
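The two forms of this expression can be checked against each other numerically; a minimal Python sketch (function and variable names are ours, purely illustrative):

```python
# Expected latency cost of user i on link l: once directly from the definition,
# once via the closed form ((1 - p_i^l) w_i + theta^l) / c_l. Illustrative names.

def expected_latency(P, w, c, i, l):
    others = sum(P[k][l] * w[k] for k in range(len(w)) if k != i)
    return (w[i] + others) / c[l]

def expected_latency_closed(P, w, c, i, l):
    theta = sum(P[k][l] * w[k] for k in range(len(w)))  # expected traffic on l
    return ((1 - P[i][l]) * w[i] + theta) / c[l]

P = [[0.5, 0.5], [0.25, 0.75]]   # 2 users, 2 links
w, c = [2.0, 1.0], [1.0, 2.0]
for i in range(2):
    for l in range(2):
        assert abs(expected_latency(P, w, c, i, l)
                   - expected_latency_closed(P, w, c, i, l)) < 1e-12
```

The equality holds because θ^ℓ already contains the term p_i^ℓ w_i, which the factor (1 − p_i^ℓ) removes again.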
For each user i ∈ [n], the minimum expected latency cost, denoted λ_i, is the minimum, over all links ℓ ∈ [m], of the expected latency cost for user i on link ℓ; thus, λ_i = min_{ℓ∈[m]} λ_i^ℓ. Associated with a traffic vector w and a mixed strategy profile P is the social cost [9, Section 2], denoted SC(w, P), which is the expectation, over all random choices of the users, of the maximum (over all links) latency of traffic through a link; thus,
$$\mathsf{SC}(\mathbf{w},\mathbf{P}) \;=\; \mathbf{E}\!\left[\max_{\ell\in[m]} \frac{\sum_{k:\ell_k=\ell} w_k}{c_\ell}\right] \;=\; \sum_{\langle \ell_1,\ldots,\ell_n\rangle\in[m]^n} \left(\prod_{k=1}^{n} p_k^{\ell_k}\right)\cdot \max_{\ell\in[m]} \frac{\sum_{k:\ell_k=\ell} w_k}{c_\ell}\,.$$
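As a sanity check, the social cost can be evaluated by direct enumeration of all pure profiles in [m]^n, exactly as in the sum above; a minimal sketch (helper names are ours, not the paper's):

```python
# Social cost SC(w, P): expected maximum link latency, computed by enumerating
# all pure profiles in [m]^n weighted by their probabilities (illustrative).
from itertools import product

def social_cost(P, w, c):
    n, m = len(P), len(c)
    total = 0.0
    for profile in product(range(m), repeat=n):
        prob = 1.0
        for k, l in enumerate(profile):
            prob *= P[k][l]
        loads = [sum(w[k] for k, lk in enumerate(profile) if lk == l) / c[l]
                 for l in range(m)]
        total += prob * max(loads)
    return total

# Two identical users, two identical links, mixing uniformly: with probability
# 1/2 they collide (max load 2), with probability 1/2 they do not (max load 1).
P = [[0.5, 0.5], [0.5, 0.5]]
assert abs(social_cost(P, [1, 1], [1, 1]) - 1.5) < 1e-12
```

The enumeration has m^n terms, so this is only a checking device for small instances, not an efficient algorithm.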
Note that SC(w, P) reduces to the maximum latency through a link in the case of pure strategies. On the other hand, the social optimum [9, Section 2] associated with a traffic vector w, denoted OPT(w), is the least possible maximum (over all links) latency of traffic through a link. Note that while SC(w, P) is defined in relation to a mixed strategy profile P, OPT(w) refers to the optimum pure strategy profile. In the model of unrelated links, the latency of user i on link ℓ is its cost C_i^ℓ. Thus, the expected latency cost of user i on link ℓ translates to λ_i^ℓ = C_i^ℓ + Σ_{k=1,k≠i}^n p_k^ℓ C_k^ℓ, and the social cost, now depending on C and the strategy profile P, is defined by

$$\mathsf{SC}(\mathbf{C},\mathbf{P}) \;=\; \sum_{\langle \ell_1,\ldots,\ell_n\rangle\in[m]^n} \left(\prod_{k=1}^{n} p_k^{\ell_k}\right)\cdot \max_{\ell\in[m]} \sum_{k:\ell_k=\ell} C_k^{\ell}\,.$$

Nash Equilibria. We are interested in a special class of mixed strategies called Nash equilibria [13] that we describe below. Formally, the probability matrix P is a Nash equilibrium [9, Section 2] if for all users i ∈ [n] and links ℓ ∈ [m], λ_i^ℓ = λ_i if I_i^ℓ = 1, and λ_i^ℓ ≥ λ_i if I_i^ℓ = 0. Thus, each user assigns its traffic with positive probability only to links for which its expected latency cost is minimized; this implies that there is no incentive for a user to unilaterally deviate from its mixed strategy in order to avoid links on which its expected latency cost is higher than necessary. The coordination ratio [9] is the maximum value, over all traffic vectors w and Nash equilibria P, of the ratio SC(w, P)/OPT(w). In the model of unrelated links, the coordination ratio translates to the maximum value of SC(C, P)/OPT(C). Mavronicolas and Spirakis [10, Lemma 15] show that in the model of identical capacities, all links are equiprobable in a fully mixed Nash equilibrium.

Lemma 1 (Mavronicolas and Spirakis [10]). Consider the fully mixed case under the model of identical capacities. Then, there exists a unique Nash equilibrium with associated Nash probabilities p_i^ℓ = 1/m, for any user i ∈ [n] and link ℓ ∈ [m].

Gairing et al.
[4, Lemma 4.1] show that in the model of related links, the minimum expected latency cost of any user i ∈ [n] in a Nash equilibrium P is bounded by its minimum expected latency cost in the fully mixed Nash equilibrium F.
Lemma 2 (Gairing et al. [4]). Fix any traffic vector w, mixed Nash equilibrium P and user i. Then, λi (w, P) ≤ λi (w, F).
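The Nash-equilibrium condition above (positive probability only on latency-minimizing links) is easy to check numerically; a minimal sketch with illustrative helper names, not from the paper:

```python
# Check the Nash condition: for every user, every link used with positive
# probability attains that user's minimum expected latency (related links).

def expected_latency(P, w, c, i, l):
    others = sum(P[k][l] * w[k] for k in range(len(w)) if k != i)
    return (w[i] + others) / c[l]

def is_nash(P, w, c, eps=1e-9):
    n, m = len(P), len(c)
    for i in range(n):
        lam = [expected_latency(P, w, c, i, l) for l in range(m)]
        lam_min = min(lam)
        for l in range(m):
            if P[i][l] > 0 and lam[l] > lam_min + eps:
                return False  # positive probability on a non-minimal link
    return True

# The uniform fully mixed profile on identical links is a Nash equilibrium;
# both users pure on the same link is not.
assert is_nash([[0.5, 0.5], [0.5, 0.5]], [1, 1], [1, 1])
assert not is_nash([[1, 0], [1, 0]], [1, 1], [1, 1])
```

The tolerance `eps` only guards against floating-point noise; the condition itself is exact.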
3
Disjointly Mixed versus Fully Mixed Nash Equilibria
In this section, we restrict ourselves to the case of disjointly mixed Nash equilibria, and we establish the FMNE Conjecture for this case. We prove:

Theorem 1. Fix any traffic vector w such that F exists, and any disjointly mixed Nash equilibrium P. Then, SC(w, P) ≤ SC(w, F).

Corollary 1. Consider the model of related links, and assume that n = 2 and m = 2. Then, the FMNE Conjecture is valid.
4
Identical Traffics, Related Links and n = 2
In this section we restrict to 2 users with identical traffics, that is, w1 = w2. Without loss of generality we assume w1 = w2 = 1 and c1 ≥ · · · ≥ cm. In the following, we denote by support(1) and support(2) the supports of users 1 and 2, respectively, and by p_i^j and f_i^j the probabilities for user i to choose link j in P and F, respectively. Since we consider two users with identical traffics, we have f_1^j = f_2^j for all j ∈ [m], and we write f^j = f_i^j. In order to prove the FMNE Conjecture for this type of Nash equilibria we will use the following formula for the social cost of any Nash equilibrium P in this setting.

Theorem 2. In case of two users with identical traffics on m related links, the social cost of any Nash equilibrium P is
$$\mathsf{SC}(\mathbf{w},\mathbf{P}) \;=\; \lambda_2(\mathbf{P}) \;+\; \sum_{1\le i<j\le m} p_2^{i}\, p_1^{j} \left(\frac{1}{c_j} - \frac{1}{c_i}\right).$$
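Theorem 2 can be spot-checked against a brute-force evaluation of the social cost; a sketch under stated assumptions (the fully mixed probability f1 below is derived from the two-user equilibrium condition, and all helper names are ours):

```python
# Compare the Theorem 2 formula with brute-force SC for two identical users
# on m related links, at the fully mixed Nash equilibrium (illustrative).
from itertools import product

def social_cost(P, w, c):
    n, m = len(P), len(c)
    total = 0.0
    for prof in product(range(m), repeat=n):
        prob = 1.0
        for k, l in enumerate(prof):
            prob *= P[k][l]
        total += prob * max(
            sum(w[k] for k, lk in enumerate(prof) if lk == l) / c[l]
            for l in range(m))
    return total

def theorem2_sc(P, c):
    m = len(c)
    lam2 = min((1 + P[0][l]) / c[l] for l in range(m))   # lambda_2(P), w = 1
    corr = sum(P[1][i] * P[0][j] * (1.0 / c[j] - 1.0 / c[i])
               for i in range(m) for j in range(i + 1, m))
    return lam2 + corr

# Fully mixed equilibrium of 2 identical users on 2 links, c = [1.5, 1]:
# equating the two expected latencies gives f1 = (2 c1 - c2) / (c1 + c2).
c = [1.5, 1.0]
f1 = (2 * c[0] - c[1]) / (c[0] + c[1])   # = 0.8
P = [[f1, 1 - f1], [f1, 1 - f1]]
assert abs(social_cost(P, [1, 1], c) - theorem2_sc(P, c)) < 1e-9
```

On identical capacities the correction sum vanishes, and the formula reduces to λ₂(P) alone.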
We now show that we only have to consider Nash equilibria P of a certain structure.

Lemma 3. For any Nash equilibrium P ≠ F of two users with identical traffics on m related links the following holds:
1. The supports of the two users are support(1) = [r] ∪ I1 and support(2) = [r] ∪ I2, where I1, I2 are disjoint sets of links not containing a link i ∈ [r], such that [r] ∪ I1 ∪ I2 = [r + |I1| + |I2|].
2. All links in I1 (I2) have the same capacity.
In order to prove the FMNE Conjecture for two users with identical traffics on m related links in Theorem 3, we show that the following lemma holds.

Lemma 4. Let G be the fully mixed Nash equilibrium of two users with identical traffics on m related links with capacities c1 ≥ . . . ≥ cm. Furthermore, let the last s ≥ 1 links have the same capacity, and let F be the fully mixed Nash equilibrium of the instance obtained by increasing the capacities of the last s links to c_{m−s}. Then SC(w, F) ≤ SC(w, G).

Theorem 3. Consider the model of identical traffics and related links, and assume that n = 2. Then, the FMNE Conjecture is valid.
5
Identical Traffics, Identical Links and m = 2
We show:

Theorem 4. Consider the model of identical traffics and identical links, and assume that m = 2 and n is even. Then, the FMNE Conjecture is valid.

Proof. Since both the traffics and the link capacities are identical, we can assume without loss of generality that w_i = 1 for all i ∈ [n] and c_j = 1 for all j ∈ [m]. Recall that in the case of identical capacities, the fully mixed Nash equilibrium F always exists (that is, for all traffic vectors w). Hence, we will show that for any other Nash equilibrium P, SC(w, P) ≤ SC(w, F). Fix any Nash equilibrium P. We can identify three sets of users in P: U1 = {i : support(i) = {1}}, U2 = {i : support(i) = {2}} and U12 = {i : support(i) = {1, 2}}. There are u = min(|U1|, |U2|) (pure) users which choose link 1 and link 2, respectively, with probability 1. Therefore, SC(w, P) = SC(w, P′) + u, where P′ is the Nash equilibrium derived from P by omitting those 2u users. We will show that SC(w, F′) ≥ SC(w, P′) for the fully mixed Nash equilibrium F′ of n − 2u users. As SC(w, F) > SC(w, F′) + u (Lemma 5), this will prove the theorem. Without loss of generality, we can assume that P′ is of the following form: r (pure) users go on link 1 with probability 1, and n − r users choose both links with positive probability. We write P_r for this kind of Nash equilibrium.
Lemma 5. For the fully mixed Nash equilibrium F,

$$\mathsf{SC}(\mathbf{w},\mathbf{F}) \;=\; \frac{n}{2} \;+\; \frac{n}{2}\,\binom{n-1}{n/2}\,\frac{1}{2^{\,n-1}}.$$

Lemma 6. For the Nash equilibrium P_r with two sets of users U1 = {i : support(i) = {1}} and U12 = {i : support(i) = {1, 2}} with |U1| = r < n and |U12| = n − r, the Nash probabilities are

$$p := p_i^1 = \frac{1}{2} - \frac{r}{2(n-r-1)} \qquad\text{and}\qquad q := p_i^2 = \frac{1}{2} + \frac{r}{2(n-r-1)}$$

for all users i ∈ U12. Furthermore, n > 2r + 1 holds.
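Lemmas 5 and 6 can be verified numerically for small even n; a minimal sketch (helper names are ours, and the equilibrium check follows the identical-links latency definition):

```python
# Check Lemma 5's closed form against direct enumeration, and Lemma 6's
# probabilities against the Nash condition (illustrative sketch).
from itertools import product
from math import comb

def sc_identical(probs):
    """Expected max load on 2 identical links; probs[k] = P(user k picks link 1)."""
    n = len(probs)
    sc = 0.0
    for prof in product((0, 1), repeat=n):
        pr = 1.0
        for k, l in enumerate(prof):
            pr *= probs[k] if l == 0 else 1 - probs[k]
        load1 = sum(1 for l in prof if l == 0)
        sc += pr * max(load1, n - load1)
    return sc

n = 6  # even, as Theorem 4 requires
lemma5 = n / 2 + (n / 2) * comb(n - 1, n // 2) / 2 ** (n - 1)
assert abs(sc_identical([0.5] * n) - lemma5) < 1e-12

# Lemma 6: r pure users on link 1, n - r users mixing with probability p.
r = 1
p = 0.5 - r / (2 * (n - r - 1))
lam1 = 1 + r + (n - r - 1) * p        # expected latency of a mixed user on link 1
lam2 = 1 + (n - r - 1) * (1 - p)      # ... and on link 2
assert abs(lam1 - lam2) < 1e-12       # both support links are latency-minimal
assert n > 2 * r + 1                  # the side condition of Lemma 6
```

For n = 6 both sides of Lemma 5 equal 3.9375, the expected maximum of a Binomial(6, 1/2) load and its complement.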
Lemma 7. The social cost of the Nash equilibrium P_r is given by

$$\mathsf{SC}(\mathbf{w},\mathbf{P}_r) \;=\; \frac{n}{2}\binom{n-r}{\frac{n}{2}-r}\, p^{\frac{n}{2}-r} q^{\frac{n}{2}} \;+\; \sum_{i=\frac{n}{2}+1}^{n} i\binom{n-r}{i-r} p^{\,i-r} q^{\,n-i} \;+\; \sum_{i=\frac{n}{2}+1}^{n-r} i\binom{n-r}{i} p^{\,n-r-i} q^{\,i}.$$
The proof is completed by showing that ∆ := SC (w, F) − SC (w, Pr ) ≥ 0.
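The Lemma 7 formula and the sign of Δ can both be spot-checked by summing over the binomially distributed number of mixed users that join link 1; a sketch with illustrative helper names:

```python
# Check Lemma 7 against direct enumeration, and Delta = SC(F) - SC(P_r) >= 0
# for small even n (illustrative sketch, identical traffics and links, m = 2).
from math import comb

def sc_pr_direct(n, r, p):
    """r pure users on link 1; n - r users on link 1 w.p. p. Expected max load."""
    q = 1 - p
    return sum(comb(n - r, b) * p ** b * q ** (n - r - b)
               * max(r + b, n - r - b)
               for b in range(n - r + 1))

def sc_pr_lemma7(n, r, p):
    q, h = 1 - p, n // 2
    total = h * comb(n - r, h - r) * p ** (h - r) * q ** h          # tie at n/2
    total += sum(i * comb(n - r, i - r) * p ** (i - r) * q ** (n - i)
                 for i in range(h + 1, n + 1))                       # link 1 is max
    total += sum(i * comb(n - r, i) * p ** (n - r - i) * q ** i
                 for i in range(h + 1, n - r + 1))                   # link 2 is max
    return total

for n in (4, 6, 8):
    for r in range((n - 1) // 2):          # Lemma 6 requires n > 2r + 1
        p = 0.5 - r / (2 * (n - r - 1))
        assert abs(sc_pr_direct(n, r, p) - sc_pr_lemma7(n, r, p)) < 1e-9
        full = n / 2 + (n / 2) * comb(n - 1, n // 2) / 2 ** (n - 1)  # Lemma 5
        assert full - sc_pr_direct(n, r, p) >= -1e-9                 # Delta >= 0
```

The three terms of Lemma 7 correspond to the tie (both links carry n/2), to link 1 carrying the maximum load i = r + b, and to link 2 carrying the maximum load i = n − r − b, respectively.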
6
Unrelated Links
In this section, we consider the case of unrelated links. We prove:

Proposition 1. Consider the model of unrelated links. Fix any cost matrix C for which F exists, and a pure Nash equilibrium P. Assume that n ≤ m. Then, for any user i, λ_i(P) < λ_i(F).

Theorem 5. Consider the model of unrelated links. Assume that n ≤ m. Consider any cost matrix C such that the fully mixed Nash equilibrium F exists, and any pure Nash equilibrium P. Then, SC(C, P) ≤ SC(C, F).

Proof. Clearly, the social cost of any pure Nash equilibrium P is equal to the selfish cost of some user, while the social cost of a fully mixed Nash equilibrium F is at least the selfish cost of any user. Hence, Proposition 1 implies the claim.

Proposition 2. Consider the model of unrelated links. Assume that n = 2. Fix any cost matrix C for which F exists, and any Nash equilibrium P. Then, for any user i ∈ [2], λ_i(P) ≤ λ_i(F).

Theorem 6. Consider the model of unrelated links. Assume that n = 2 and m = 2. Then, the FMNE Conjecture is valid.

We remark that Theorem 6 generalizes Corollary 1 to the case of unrelated links. We finally prove:

Theorem 7 (Counterexample to the FMNE Conjecture). Consider the model of unrelated links. Then, the FMNE Conjecture is not valid even if n = 3 and m = 2.
7
Conclusion and Directions for Further Research
We have verified the FMNE Conjecture over several interesting restrictions of the selfish routing game we considered for the case of related links. We have also investigated the FMNE Conjecture in the case of unrelated links, for which we have identified instances of the game that validate and falsify the FMNE Conjecture, respectively. The most obvious problem left open by our work is to
establish the FMNE Conjecture in its full generality for the case of related links. We hope that several of the combinatorial techniques introduced in this work for settling special cases of the conjecture may be handy for the general case. The FMNE Conjecture attempts to study a possible order on the set of Nash equilibria (for the specific selfish routing game we consider) that is defined with respect to their social costs; in the terminology of partially ordered sets, the FMNE Conjecture asserts that the fully mixed Nash equilibrium is a maximal element of the defined order. We feel that this order deserves further study. For example, what are the minimal elements of the order? More generally, is there a characterization of measures on Nash equilibria such that the fully mixed Nash equilibrium is a maximal element of the order defined with respect to any specific measure? (Our study considers the social cost as one such measure of interest.)
Acknowledgments We thank Rainer Feldmann and Martin Gairing for several helpful discussions.
References
1. A. Czumaj and B. Vöcking, "Tight Bounds for Worst-Case Equilibria", Proceedings of the 13th Annual ACM Symposium on Discrete Algorithms, pp. 413–420, 2002.
2. R. Feldmann, M. Gairing, T. Lücking, B. Monien and M. Rode, "Nashification and the Coordination Ratio for a Selfish Routing Game", Proceedings of the 30th International Colloquium on Automata, Languages and Programming, 2003.
3. D. Fotakis, S. Kontogiannis, E. Koutsoupias, M. Mavronicolas and P. Spirakis, "The Structure and Complexity of Nash Equilibria for a Selfish Routing Game", Proceedings of the 29th International Colloquium on Automata, Languages and Programming, LNCS 2380, pp. 123–134, 2002.
4. M. Gairing, T. Lücking, M. Mavronicolas, B. Monien and P. Spirakis, "Extreme Nash Equilibria", submitted for publication, March 2003. Also available as Technical Report FLAGS-TR-02-5, Computer Technology Institute, Patras, Greece, November 2002.
5. T. Gonzalez, O. H. Ibarra and S. Sahni, "Bounds for LPT schedules on uniform processors", SIAM Journal on Computing, Vol. 6, No. 1, pp. 155–166, 1977.
6. R. L. Graham, "Bounds on Multiprocessing Timing Anomalies", SIAM Journal on Applied Mathematics, Vol. 17, pp. 416–426, 1969.
7. E. Horowitz and S. Sahni, "Exact and approximate algorithms for scheduling nonidentical processors", Journal of the Association for Computing Machinery, Vol. 23, No. 2, pp. 317–327, 1976.
8. E. Koutsoupias, M. Mavronicolas and P. Spirakis, "Approximate Equilibria and Ball Fusion", Proceedings of the 9th International Colloquium on Structural Information and Communication Complexity, 2002; accepted to Theory of Computing Systems.
9. E. Koutsoupias and C. H. Papadimitriou, "Worst-case Equilibria", Proceedings of the 16th Annual Symposium on Theoretical Aspects of Computer Science, LNCS 1563, pp. 404–413, 1999.
Which Is the Worst-Case Nash Equilibrium?
561
10. M. Mavronicolas and P. Spirakis, "The Price of Selfish Routing", Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, pp. 510–519, 2001.
11. H. Moulin and L. Vial, "Strategically Zero-Sum Games: The Class of Games whose Completely Mixed Equilibria Cannot be Improved Upon", International Journal of Game Theory, Vol. 7, Nos. 3/4, pp. 201–221, 1978.
12. J. F. Nash, "Equilibrium Points in N-Person Games", Proceedings of the National Academy of Sciences, Vol. 36, pp. 48–49, 1950.
13. J. F. Nash, "Non-cooperative Games", Annals of Mathematics, Vol. 54, No. 2, pp. 286–295, 1951.
14. M. J. Osborne and A. Rubinstein, A Course in Game Theory, MIT Press, 1994.
15. C. H. Papadimitriou, "Algorithms, Games and the Internet", Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, pp. 749–753, 2001.
16. T. E. S. Raghavan, "Completely Mixed Strategies in Bimatrix Games", Journal of the London Mathematical Society, Vol. 2, No. 2, pp. 709–712, 1970.
A Unique Decomposition Theorem for Ordered Monoids with Applications in Process Theory (Extended Abstract)

Bas Luttik

Dept. of Theoretical Computer Science, Vrije Universiteit Amsterdam, De Boelelaan 1081a, NL-1081 HV Amsterdam, The Netherlands, [email protected], http://www.cs.vu.nl/~luttik
Abstract. We prove a unique decomposition theorem for a class of ordered commutative monoids. Then, we use our theorem to establish that every weakly normed process definable in ACPε with bounded communication can be expressed as the parallel composition of a multiset of weakly normed parallel prime processes in exactly one way.
1
Introduction
The Fundamental Theorem of Arithmetic states that every element of the commutative monoid of positive natural numbers under multiplication has a unique decomposition (i.e., can be expressed as a product of prime numbers uniquely determined up to the order of the primes). It has been an invaluable tool in number theory ever since the days of Euclid. In the realm of process theory, unique decomposability with respect to parallel composition is crucial in the proofs that bisimulation is decidable for normed BPP [5] and normed PA [8]. It also plays an important rôle in the analysis of axiom systems involving an operation for parallel composition [1,6,12]. Milner and Moller [10] were the first to establish the unique decomposition property for a commutative monoid of finite processes with a simple operation for parallel composition. In [11], Moller presents an alternative proof of this result which he attributes to Milner; we shall henceforth refer to it as Milner's technique. Moller explains that the reason for presenting Milner's technique is that it serves "as a model for the proof of the same result in more complicated languages which evade the simpler proof method" of [10]. He refines Milner's technique twice. First, he adds communication to the operational semantics of the parallel operator. Then, he turns from strong bisimulation semantics to weak bisimulation semantics. Christensen [4] shows how Milner's technique can be further refined so that also certain infinite processes can be dealt with. He proves unique decomposition theorems for the commutative monoids of weakly normed BPP and of weakly normed BPPτ expressions modulo strong bisimulation. Milner's technique hinges on some special properties of the operational semantics of parallel composition. The main contribution of this paper is to place these properties in a general algebraic context. Milner's technique employs a well-founded subrelation of the transition relation induced on processes by the
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 562–571, 2003. © Springer-Verlag Berlin Heidelberg 2003
operational semantics. We consider commutative monoids equipped with a well-founded partial order (rather than an arbitrary well-founded relation) to tie in with the theory of ordered monoids as put forward, e.g., in [3,7]. In Section 2 we propose a few simple conditions on ordered commutative monoids, and we prove that they imply the unique decomposition property (Theorem 13). Then, to prove that a commutative monoid has the unique decomposition property, it suffices to define a partial order and establish that it satisfies our conditions. From Section 3 onwards, we illustrate this technique, discussing unique decomposability for the process theory ACPε [13]. ACPε is more expressive than any of the process theories for which unique decomposition was investigated previously. Firstly, it distinguishes two forms of termination (successful and unsuccessful). Secondly, it has a more general communication mechanism (an arbitrary number of parallel components may participate in a single communication, and communication does not necessarily result in τ). These two features make the extension of Milner's technique to ACPε nontrivial; in fact, they both lead to counterexamples obstructing a general unique decomposition result (see Examples 16 and 19). In Section 4 we introduce for ACPε an appropriate notion of weak normedness that takes into account the distinction between successful and unsuccessful termination, and we propose a requirement on the communication mechanism. In Section 5 we prove that if the communication mechanism meets the requirement, then the commutative monoid of weakly normed ACPε expressions modulo bisimulation satisfies the abstract specification of Section 2, and hence admits a unique decomposition theorem. Whether or not a commutative monoid satisfies the conditions put forward in Section 2 is independent of the nature of its elements (be it natural numbers, bisimulation equivalence classes of process expressions, or objects of any other kind).
Thus, in particular, our unique decomposition theorem for ordered monoids is independent of a syntax for specifying processes. We think that it will turn out to be a convenient tool for establishing unique decomposability results in a wide range of process theories, and for a wide range of process semantics. For instance, we intend to investigate next whether our theorem can be applied to establish unique decomposition results for commutative monoids of processes definable in ACPε modulo weak- and branching bisimulation, and of processes definable in the π-calculus modulo observation equivalence.
2
Unique Decomposition in Commutative p.o. Monoids
A positively ordered monoid (a p.o. monoid) is a nonempty set M endowed with: (i) an associative binary operation ⊗ on M with an identity element ι ∈ M; the operation ⊗ stands for composition and ι represents the empty composition; (ii) a partial order ⪯ on M that is compatible with ⊗, i.e., x ⪯ y implies x ⊗ z ⪯ y ⊗ z and z ⊗ x ⪯ z ⊗ y for all x, y, z ∈ M, and for which the identity ι is the least element, i.e., ι ⪯ x for all x ∈ M. A p.o. monoid is commutative if its composition is commutative.
An example of a commutative p.o. monoid is the set N of natural numbers with addition (+) as binary operation, 0 as identity element and the less-than-or-equal relation (≤) as (total) order; we call it the additive p.o. monoid of natural numbers. Another example is the set N* of positive natural numbers with multiplication (·) as binary operation, 1 as identity element and the divisibility relation (|) as (partial) order; we call it the multiplicative p.o. monoid of positive natural numbers. In the remainder of this section we shall use N and N* to illustrate the theory of decomposition in commutative p.o. monoids that we are about to develop. However, they are not meant to motivate it; the motivating examples stem from process theory. In particular, note that N and N* are so-called divisibility monoids [3] in which x ⪯ y is equivalent to ∃z(x ⊗ z = y). The p.o. monoids arising from process theory generally do not have this property.

Definition 1. An element p of a monoid M is called prime if p ≠ ι and p = x ⊗ y implies x = ι or y = ι.

Example 2. The natural number 1 is the only prime element of N. The prime elements of N* are the prime numbers.

Let x1, . . . , xn be a (possibly empty) sequence of elements of a monoid M; we formally define its composition x1 ⊗ · · · ⊗ xn by the following recursion: (i) if n = 0, then x1 ⊗ · · · ⊗ xn = ι; and (ii) if n > 0, then x1 ⊗ · · · ⊗ xn = (x1 ⊗ · · · ⊗ xn−1) ⊗ xn. Occasionally, we shall write ⊗_{i=1}^n x_i instead of x1 ⊗ · · · ⊗ xn. Furthermore, we write x^n for the n-fold composition of x.

Definition 3. If x is an element of a monoid M and p1, . . . , pn is a sequence of prime elements of M such that x = p1 ⊗ · · · ⊗ pn, then we call the expression p1 ⊗ · · · ⊗ pn a decomposition of x in M. Two decompositions p1 ⊗ · · · ⊗ pm and q1 ⊗ · · · ⊗ qn of x are equivalent if there is a bijection σ : {1, . . . , m} → {1, . . . , n} such that p_i = q_{σ(i)} for all 1 ≤ i ≤ m; otherwise, they are distinct.
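In N* these notions are entirely concrete: primes are the prime numbers, a decomposition is a prime factorization, and equivalence of decompositions is equality of multisets of factors. A minimal sketch (the trial-division helper is ours, purely illustrative):

```python
# Decomposition in the multiplicative p.o. monoid N*: a decomposition of k is
# its multiset of prime factors; equivalence = multiset equality (illustrative).
from collections import Counter

def decomposition(k):
    """Prime factorization of k >= 1 by trial division, in ascending order."""
    factors, d = [], 2
    while d * d <= k:
        while k % d == 0:
            factors.append(d)
            k //= d
        d += 1
    if k > 1:
        factors.append(k)
    return factors

# 60 = 2 * 2 * 3 * 5, uniquely up to reordering:
assert Counter(decomposition(60)) == Counter([2, 2, 3, 5])
# The identity 1 has the empty decomposition; primes decompose as themselves:
assert decomposition(1) == []
assert decomposition(13) == [13]
```

The uniqueness asserted here is exactly the Fundamental Theorem of Arithmetic; the point of this section is that it follows from abstract order-theoretic conditions.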
The identity element ι has the composition of the empty sequence of prime elements as a decomposition, and every prime element has itself as a decomposition. We now proceed to discuss the existence and uniqueness of decompositions in commutative p.o. monoids. We shall present two conditions that together guarantee that every element of a commutative p.o. monoid has a unique decomposition.

Definition 4. Let M be a commutative p.o. monoid; by a stratification of M we understand a mapping | | : M → N from M into the additive p.o. monoid N of natural numbers that is a strict homomorphism, i.e., (i) |x ⊗ y| = |x| + |y|, and (ii) x ≺ y implies |x| < |y| (where ≺ and < are the strict relations corresponding to ⪯ and ≤, respectively).
A commutative p.o. monoid M together with a stratification | | : M → N we call a stratified p.o. monoid; the number |x| thus associated with every x ∈ M is called the norm of x. Observe that |x| = 0 iff x = ι (since |ι| + |ι| ≤ |ι ⊗ ι| = |ι| by the first condition in Definition 4, it follows that |ι| = 0, and if x ≠ ι, then ι ≺ x, whence 0 = |ι| < |x| by the second condition in Definition 4).

Example 5. The additive p.o. monoid N is stratified with the identity mapping id_N on N as stratification. The multiplicative p.o. monoid N* is stratified with | | : N* → N defined by |k| = max{n ≥ 0 : ∃k0 < k1 < · · · < kn (1 = k0 | k1 | · · · | kn = k)}.

Proposition 6. In a stratified commutative p.o. monoid every element has a decomposition.

Proof. Straightforward by induction on the norm.

The next two propositions are straightforward consequences of the definition of stratification; we need them later on.

Proposition 7. If M is a stratified commutative p.o. monoid, then M is strict: x ≺ y implies x ⊗ z ≺ y ⊗ z and z ⊗ x ≺ z ⊗ y for all x, y, z ∈ M.

Proposition 8. The order ⪯ of a stratified p.o. monoid M is well-founded: every nonempty subset of M has a ⪯-minimal element.

Definition 9. We call a p.o. monoid M precompositional if for all x, y, z ∈ M: x ⪯ y ⊗ z implies that there exist y′ ⪯ y and z′ ⪯ z such that x = y′ ⊗ z′.

Example 10. That N* is precompositional can be shown using the well-known property that if p is a prime number such that p | k · l, then p | k or p | l (see, e.g., [9, p. 11]).

If x ≺ y, then x is called a predecessor of y, and y a successor of x. If there is no z ∈ M such that x ≺ z ≺ y, then x is an immediate predecessor of y, and y an immediate successor of x. The following two lemmas establish a crucial relationship between the immediate predecessors of a composition and certain immediate predecessors of its components.

Lemma 11. Let M be a precompositional stratified commutative p.o. monoid, and let x, y and z be elements of M.
If x is a predecessor of y of maximal norm, then x ⊗ z is an immediate predecessor of y ⊗ z. Lemma 12. Suppose that x = x1 ⊗ . . . ⊗ xn and y are elements of a precompositional stratified commutative p.o. monoid M . If y is an immediate predecessor of x, then there exist i ∈ {1, . . . , n} and an immediate predecessor yi of xi such that y = x1 ⊗ · · · ⊗ xi−1 ⊗ yi ⊗ xi+1 ⊗ · · · ⊗ xn .
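Returning to Example 5, the stratification of N* can be checked mechanically: the longest divisibility chain from 1 to k has length equal to the number of prime factors of k counted with multiplicity, which makes condition (i) of Definition 4 evident. A minimal sketch (helper names are ours):

```python
# The stratification of N* from Example 5: |k| is the length of the longest
# divisibility chain 1 = k0 | k1 | ... | kn = k. We compare it with the number
# of prime factors counted with multiplicity (illustrative sketch).
from functools import lru_cache

@lru_cache(maxsize=None)
def norm(k):
    """Length of the longest divisibility chain from 1 up to k (brute force)."""
    if k == 1:
        return 0
    return 1 + max(norm(d) for d in range(1, k) if k % d == 0)

def omega(k):
    """Number of prime factors of k, counted with multiplicity."""
    count, d = 0, 2
    while d * d <= k:
        while k % d == 0:
            count, k = count + 1, k // d
        d += 1
    return count + (1 if k > 1 else 0)

for k in range(1, 60):
    assert norm(k) == omega(k)
# Condition (i) of Definition 4, the strict homomorphism property:
assert norm(6 * 10) == norm(6) + norm(10)
```

Condition (ii) also follows: a proper divisor has strictly fewer prime factors, hence strictly smaller norm.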
Theorem 13 (Unique Decomposition). In a stratified and precompositional commutative p.o. monoid every element has a unique decomposition.

Proof. Let M be a stratified and precompositional commutative p.o. monoid. By Proposition 6, every element of M has a decomposition. To prove uniqueness, suppose, to the contrary, that the subset of elements of M with two or more distinct decompositions is nonempty. Since ⪯ is well-founded by Proposition 8, this subset has a ⪯-minimal element a. That a has at least two distinct decompositions means that there must be a sequence p, p1, . . . , pn of distinct primes, and sequences k, k1, . . . , kn and l, l1, . . . , ln of natural numbers such that

(A) a = p^k ⊗ p1^{k1} ⊗ · · · ⊗ pn^{kn} and a = p^l ⊗ p1^{l1} ⊗ · · · ⊗ pn^{ln};
(B) k < l; and
(C) |p| < |pi| implies ki = li for all 1 ≤ i ≤ n.

That a is ⪯-minimal means that the predecessors of a, i.e., the elements of the initial segment I(a) = {x ∈ M : x ≺ a} of M determined by a, all have a unique decomposition. Let x be an element of I(a). We define #p(x), the multiplicity of p in x, as the number of occurrences of the prime p in the unique decomposition of x. The index of p in x, denoted by [x : p], is the maximum of the multiplicities of p in the weak predecessors of x, i.e., [x : p] = max{#p(y) : y ⪯ x}. We now use that a = p^k ⊗ p1^{k1} ⊗ · · · ⊗ pn^{kn} to give an upper bound for the multiplicity of p in an element x of I(a). Since M is precompositional there exist y1, . . . , yk ⪯ p and z_{i1}, . . . , z_{ik_i} ⪯ pi (1 ≤ i ≤ n) such that x = ⊗_{i=1}^{k} y_i ⊗ ⊗_{i=1}^{n} ⊗_{j=1}^{k_i} z_{ij}. From yi ⪯ p it follows that #p(yi) ≤ [p : p] = 1, and from z_{ij} ⪯ pi it follows that #p(z_{ij}) ≤ [pi : p], so for all x ∈ I(a)

$$\#_p(x) \;=\; \sum_{i=1}^{k} \#_p(y_i) \;+\; \sum_{i=1}^{n}\sum_{j=1}^{k_i} \#_p(z_{ij}) \;\le\; k + \sum_{i=1}^{n} k_i \cdot [p_i : p]. \qquad (1)$$
We shall now distinguish two cases, according to the contribution of the second term to the right-hand side of the above inequality, and show that either case leads inevitably to a contradiction with condition (B) above. First, suppose that Σ_{i=1}^{n} k_i · [p_i : p] > 0; then [p_j : p] > 0 for some 1 ≤ j ≤ n. Let x1, . . . , xn be such that x_i ⪯ p_i and #p(x_i) = [p_i : p] for all 1 ≤ i ≤ n, and x = p^l ⊗ x1^{l1} ⊗ · · · ⊗ xn^{ln}. Since #p(p_i) = 0, if #p(x_i) > 0 then x_i ≺ p_i. In particular, since #p(x_j) = [p_j : p] > 0, this means that x is an element of I(a) (use that a = p^l ⊗ p1^{l1} ⊗ · · · ⊗ pn^{ln} and apply Proposition 7), and hence, that #p(x) is defined, by

$$\#_p(x) \;=\; l + \sum_{i=1}^{n} l_i \cdot [p_i : p].$$
We combine this definition with the inequality in (1) to conclude that

$$l + \sum_{i=1}^{n} l_i \cdot [p_i : p] \;\le\; k + \sum_{i=1}^{n} k_i \cdot [p_i : p].$$
To arrive at a contradiction with condition (B), it therefore suffices to prove that k_i · [p_i : p] = l_i · [p_i : p] for all 1 ≤ i ≤ n. If [p_i : p] = 0, then this is clear at once. If [p_i : p] > 0, then, since #p(p_i) = 0, there exists x ≺ p_i such that #p(x) = [p_i : p] > 0. Every occurrence of p in the decomposition of x contributes |p| to the norm of x, so |p| ≤ |x| < |p_i|, from which it follows by condition (C) that k_i · [p_i : p] = l_i · [p_i : p]. This settles the case that Σ_{i=1}^{n} k_i · [p_i : p] > 0. We continue with the hypothesis that Σ_{i=1}^{n} k_i · [p_i : p] = 0. First, assume l_i > 0 for some 1 ≤ i ≤ n; then, by Proposition 7, p^l is a predecessor of a, but that implies l = #p(p^l) ≤ k, a contradiction with (B). In the case that remains, we may assume that l_i = 0 for all 1 ≤ i ≤ n, and consequently, since a = p^l cannot be prime, that l > 1. Clearly, p^{l−1} is a predecessor of a, so 0 < l − 1 = #p(p^{l−1}) ≤ k; it follows that k > 0. Now, let y be a predecessor of p of maximal norm; by Lemma 11, it gives rise to an immediate a-predecessor x = y ⊗ p^{k−1} ⊗ p1^{k1} ⊗ · · · ⊗ pn^{kn}. Then, since a = p^l, it follows by Lemma 12 that there exists an immediate predecessor z of p such that x = z ⊗ p^{l−1}. We conclude that k − 1 = #p(x) = l − 1, again a contradiction with condition (B).
3
ACPε
We fix two disjoint sets of constant symbols A and V; the elements of A we call actions; the elements of V we call process variables. With a ∈ A, X ∈ V and H ranging over finite subsets of A, the set P of process expressions is generated by

P ::= ε | δ | a | X | P · P | P + P | ∂_H(P) | P ∥ P | P|P | P ⌊⌊ P.

If X is a process variable and P is a process expression, then the expression X =def P is called a process equation defining X. A set of such expressions is called a process specification if it contains precisely one defining process equation for each X ∈ V. For the remainder of this paper we fix a guarded process specification S: every occurrence of a process variable in a right-hand side P of an equation in S occurs in a subexpression of P of the form a · Q with a ∈ A. We also presuppose a communication function, a commutative and associative partial mapping γ : A × A ⇀ A. It specifies which actions may communicate: if γ(a, b) is undefined, then the actions a and b cannot communicate, whereas if γ(a, b) = c then they can, and c stands for the event that they do. The transition system specification in Table 1 defines on the set P a unary predicate ↓ and binary relations −a→ (a ∈ A). A bisimulation is a symmetric binary relation R on P such that P R Q implies (i) if P↓, then Q↓; and (ii) if P −a→ P′, then there exists Q′ such that Q −a→ Q′ and P′ R Q′.
Table 1. The transition system specification for ACPε.

Termination rules:
  ε↓;  if P↓ and Q↓, then (P · Q)↓;  if P↓, then (P + Q)↓ and (Q + P)↓;
  if P↓ and Q↓, then (P ∥ Q)↓ and (Q ∥ P)↓;  if P↓, then ∂_H(P)↓.

Transition rules:
  a −a→ ε;
  if P −a→ P′, then P + Q −a→ P′ and Q + P −a→ P′;
  if P −a→ P′, then P · Q −a→ P′ · Q;
  if P↓ and Q −a→ Q′, then P · Q −a→ Q′;
  if P −a→ P′ and (X =def P) ∈ S, then X −a→ P′;
  if P −a→ P′, then P ∥ Q −a→ P′ ∥ Q and Q ∥ P −a→ Q ∥ P′;
  if P −b→ P′, Q −c→ Q′ and a = γ(b, c), then P ∥ Q −a→ P′ ∥ Q′;
  if P −b→ P′, Q −c→ Q′ and a = γ(b, c), then P|Q −a→ P′ ∥ Q′;
  if P −a→ P′, then P ⌊⌊ Q −a→ P′ ∥ Q;
  if P −a→ P′ and a ∉ H, then ∂_H(P) −a→ ∂_H(P′).
Process expressions P and Q are said to be bisimilar (notation: P ↔ Q) if there exists a bisimulation R such that P R Q. The relation ↔ is an equivalence relation; we write [P] for the equivalence class of process expressions bisimilar to P, and we denote by P/↔ the set of all such equivalence classes. Baeten and van Glabbeek [2] prove that ↔ has the substitution property with respect to ∥, and that P ∥ (Q ∥ R) ↔ (P ∥ Q) ∥ R, P ∥ ε ↔ ε ∥ P ↔ P and P ∥ Q ↔ Q ∥ P. Hence, we have the following proposition.

Proposition 14. The set P/↔ with ⊗ and ι defined by [P] ⊗ [Q] = [P ∥ Q] and ι = [ε] is a commutative monoid.
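On a finite transition system, bisimilarity can be computed by naive partition refinement: start by separating terminating from non-terminating states (clause (i) of the bisimulation definition), then repeatedly split blocks of states whose outgoing transitions reach different blocks (clause (ii)). A minimal sketch, not the paper's machinery; state names and the tiny example LTS are ours:

```python
# Naive partition refinement for strong bisimilarity on a finite LTS with a
# termination predicate (illustrative sketch, hypothetical state names).

def bisimulation_classes(states, term, trans):
    """term: set of terminating states; trans: state -> set of (action, target)."""
    blocks = [frozenset(s for s in states if s in term),
              frozenset(s for s in states if s not in term)]
    blocks = [b for b in blocks if b]
    changed = True
    while changed:
        block_of = {x: i for i, b in enumerate(blocks) for x in b}
        def signature(s):
            return frozenset((a, block_of[t]) for a, t in trans.get(s, set()))
        new_blocks = []
        for b in blocks:
            groups = {}
            for s in b:
                groups.setdefault(signature(s), set()).add(s)
            new_blocks.extend(frozenset(g) for g in groups.values())
        changed = len(new_blocks) != len(blocks)  # refinement only ever splits
        blocks = new_blocks
    return blocks

# a versus a + a.delta: the latter also has an a-transition into deadlock.
states = {'a', 'a+ad', 'eps', 'delta'}
term = {'eps'}
trans = {'a': {('a', 'eps')}, 'a+ad': {('a', 'eps'), ('a', 'delta')}}
classes = bisimulation_classes(states, term, trans)
block_of = {s: i for i, b in enumerate(classes) for s in b}
assert block_of['a'] != block_of['a+ad']   # a and a + a.delta are not bisimilar
```

The example pair is exactly the one used in Example 16 below: the deadlocking a-transition of a + a·δ cannot be matched by a.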
4
Weakly Normed ACPε with Bounded Communication
In this section we present three counterexamples obstructing a general unique decomposition theorem for the monoid P/↔ defined in the previous section. They will guide us in defining a submonoid of P/↔ which does admit a unique decomposition theorem, as we shall prove in the next section. The first counterexample already appears in [10]; it shows that perpetual processes need not have a decomposition.
Example 15. Let a be an action, let γ(a, a) be undefined and let X def= a · X. One can show that X ↔ P1 ‖ · · · ‖ Pn implies Pi ↔ X for some 1 ≤ i ≤ n. It follows that [X] has no decomposition in P/↔. For suppose that [X] = [P1] ⊗ · · · ⊗ [Pn]; then [Pi] = [X], whereas [X] is not a prime element of P/↔ (e.g., X ↔ a ‖ X). The second counterexample employs the distinction between successful and unsuccessful termination characteristic of ACP-like process theories. Example 16. Let a be an action; then [a], [a + a · δ] and [a · δ + ε] are prime elements of P/↔. Moreover, a ↮ a + a · δ (the transition a + a · δ −a→ δ cannot be
A Unique Decomposition Theorem for Ordered Monoids
simulated by a). However, it is easily verified that a ‖ (a · δ + ε) ↔ (a + a · δ) ‖ (a · δ + ε), so a decomposition in P/↔ need not be unique.
Let w ∈ A∗, say w = a1 · · · an; we write P −w→ P′ if there exist P0, . . . , Pn such that P = P0 −a1→ · · · −an→ Pn = P′. To exclude the problems mentioned in Examples 15 and 16 above we use the following definition. Definition 17. A process expression P is weakly normed if there exist w ∈ A∗ and a process expression P′ such that P −w→ P′ ↔ ε. The set of weakly normed process expressions is denoted by Pε. It is straightforward to show that bisimulation respects the property of being weakly normed, and that a parallel composition is weakly normed iff its parallel components are. Hence, we have the following proposition. Proposition 18. The set Pε/↔ is a submonoid of P/↔. Moreover, if [P ‖ Q] ∈ Pε/↔, then [P] ∈ Pε/↔ and [Q] ∈ Pε/↔. Christensen et al. [5] prove that every element of the commutative monoid of weakly normed BPP expressions modulo bisimulation has a unique decomposition. Presupposing a communication function γ that is everywhere undefined, the operational semantics for BPP expressions is as given in Table 1. So, in BPP there is no communication between parallel components. Christensen [4] extends this result to a unique decomposition theorem for the commutative monoid of weakly normed BPPτ expressions modulo bisimulation. His BPPτ is obtained by replacing the parallel operator of BPP by a parallel operator that allows a restricted form of handshaking communication. Our next example shows that the more general communication mechanism of ACPε gives rise to weakly normed process expressions without a decomposition.
Example 19. Let a be an action, suppose that a = γ(a, a) and X def= a · X + a. Then one can show that X ↔ P1 ‖ · · · ‖ Pn implies that Pi ↔ X for some 1 ≤ i ≤ n, from which it follows by a similar argument as in Example 15 that [X] has no decomposition in P/↔. The communication function in the above example allows an unbounded number of copies of the action a to participate in a single communication. To exclude this phenomenon, we use the following definition. Definition 20. A communication function γ is bounded if every action can be assigned a weight ≥ 1 in such a way that a = γ(b, c) implies that the weight of a is the sum of the weights of b and c.
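Definition 20 can be checked mechanically for a concrete communication function once a candidate weight assignment is supplied. A small sketch (the dictionary encoding and the helper name are our own illustrative assumptions):

```python
# Check the boundedness condition of Definition 20 for a given weight map:
# every weight is >= 1, and weight(gamma(b, c)) = weight(b) + weight(c)
# for every defined communication.

def is_bounded_witness(gamma, weight):
    """gamma: dict mapping (b, c) -> a for the defined communications;
    weight: dict mapping each action to a candidate weight."""
    if any(w < 1 for w in weight.values()):
        return False
    return all(weight[a] == weight[b] + weight[c] for (b, c), a in gamma.items())
```

For the γ of Example 19, where γ(a, a) = a, no witness exists: the condition would force weight(a) = 2 · weight(a), i.e. weight(a) = 0, contradicting weight ≥ 1.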
5
Unique Decomposition in P ε /↔
We now prove that every element of the commutative monoid Pε/↔ of weakly normed process expressions modulo bisimulation has a unique decomposition, provided that the communication function is bounded. We proceed by defining on Pε/↔ a partial order and a stratification |·| : Pε/↔ → N turning it into
a precompositional stratified commutative p.o. monoid. That every element of Pε/↔ has a unique decomposition then follows from the theorem of Section 2. Throughout this section we assume that the presupposed communication function γ is bounded, so that every action has a unique weight assigned to it (cf. Definition 20). We use it to define the weighted length ℓ(w) of w ∈ A∗ inductively as follows: if w is the empty sequence, then ℓ(w) = 0; and if w = w′a and a is an action of weight i, then ℓ(w) = ℓ(w′) + i. This definition takes into account that a communication stands for the simultaneous execution of multiple actions. It allows us to formulate the following crucial property of the operational semantics of ACPε. Lemma 21. If P, Q and R are process expressions such that P ‖ Q −w→ R, then there exist P′, Q′ ∈ P and u, v ∈ A∗ such that R = P′ ‖ Q′, P −u→ P′, Q −v→ Q′ and ℓ(u) + ℓ(v) = ℓ(w).
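On a finite transition system, the least weighted length of a trace from a process to a successfully terminating one (the quantity measured by the norm of Definition 22 below) can be computed by a least-cost search over weighted action steps. A sketch under the assumption that the state space is finite and encoded explicitly; states bisimilar to ε are approximated by terminating states with no outgoing transitions:

```python
import heapq

def norm(trans, term, p):
    """Least total action weight of a path from p to a state that terminates
    and has no further transitions; returns None if no such path exists
    (i.e. p is not weakly normed).  trans: state -> iterable of
    (weight, successor); term: set of states satisfying the predicate ↓."""
    dist = {p: 0}
    heap = [(0, p)]
    while heap:
        d, s = heapq.heappop(heap)
        if d > dist[s]:
            continue  # stale queue entry
        if s in term and not trans.get(s):
            return d  # reached a state behaving like ε
        for w, t in trans.get(s, ()):
            if d + w < dist.get(t, float("inf")):
                dist[t] = d + w
                heapq.heappush(heap, (d + w, t))
    return None
```

For a perpetual process such as the X of Example 15 the search never reaches a terminating state, matching the fact that X is not weakly normed.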
Definition 22. The norm |P| of a weakly normed process expression P is the least natural number n such that there exist w ∈ A∗ of weighted length n and a process expression P′ such that P −w→ P′ ↔ ε. Lemma 23. If P ↔ Q, then |P| = |Q| for all P, Q ∈ Pε. Lemma 24. |P ‖ Q| = |P| + |Q| for all P, Q ∈ Pε. We define on Pε binary relations →i (i ≥ 1) and → by
P →i Q ⇐⇒ there exists a ∈ A of weight i s.t. P −a→ Q and |P| = |Q| + i; P → Q ⇐⇒ P →i Q for some i ≥ 1. The reflexive-transitive closure →∗ of → is a partial order on Pε. Definition 25. We write [P] ⪯ [Q] iff there exist P′ ∈ [P] and Q′ ∈ [Q] such that Q′ →∗ P′. It is straightforward to verify that ⪯ is a partial order on Pε/↔. Furthermore, that ⪯ is compatible with ⊗ can be established by means of Lemma 24, and that ι is its least element essentially follows from weak normedness. Hence, we get the following proposition. Proposition 26. The set Pε/↔ is a commutative p.o. monoid. By Lemmas 23 and 24, the mapping |·| : (Pε/↔) → N defined by [P] → |P| is a strict homomorphism. Proposition 27. The mapping |·| : (Pε/↔) → N is a stratification of Pε/↔. Lemma 28. If P ‖ Q →∗ R, then there exist P′ and Q′ such that P →∗ P′, Q →∗ Q′ and R = P′ ‖ Q′. The following proposition is an easy consequence of the above lemma.
Proposition 29. The p.o. monoid Pε/↔ is precompositional. According to Propositions 26, 27 and 29, Pε/↔ is a stratified and precompositional commutative p.o. monoid, so by Theorem 13 we get the following result. Theorem 30. In the p.o. monoid Pε/↔ of weakly normed process expressions modulo bisimulation, every element has a unique decomposition, provided that the communication function is bounded.
Acknowledgment The author thanks Clemens Grabmayer, Jeroen Ketema, Vincent van Oostrom, Simona Orzan and the referees for their comments.
References
1. L. Aceto and M. Hennessy. Towards action-refinement in process algebras. Inform. and Comput., 103(2):204–269, 1993.
2. J. C. M. Baeten and R. J. van Glabbeek. Merge and termination in process algebra. In K. V. Nori, editor, Proc. of FST TCS 1987, LNCS 287, pages 153–172, 1987.
3. G. Birkhoff. Lattice theory, volume XXV of American Mathematical Society Colloquium Publications. American Mathematical Society, third edition, 1967.
4. S. Christensen. Decidability and Decomposition in Process Algebras. PhD thesis, University of Edinburgh, 1993.
5. S. Christensen, Y. Hirshfeld, and F. Moller. Decomposability, decidability and axiomatisability for bisimulation equivalence on basic parallel processes. In Proc. of LICS 1993, pages 386–396. IEEE Computer Society Press, 1993.
6. W. J. Fokkink and S. P. Luttik. An ω-complete equational specification of interleaving. In U. Montanari, J. D. P. Rolim, and E. Welzl, editors, Proc. of ICALP 2000, LNCS 1853, pages 729–743, 2000.
7. L. Fuchs. Partially Ordered Algebraic Systems, volume 28 of International Series of Monographs on Pure and Applied Mathematics. Pergamon Press, 1963.
8. Y. Hirshfeld and M. Jerrum. Bisimulation equivalence is decidable for normed process algebra. In J. Wiedermann, P. van Emde Boas, and M. Nielsen, editors, Proc. of ICALP 1999, LNCS 1644, pages 412–421, 1999.
9. T. W. Hungerford. Algebra, volume 73 of GTM. Springer, 1974.
10. R. Milner and F. Moller. Unique decomposition of processes. Theoret. Comput. Sci., 107:357–363, January 1993.
11. F. Moller. Axioms for Concurrency. PhD thesis, University of Edinburgh, 1989.
12. F. Moller. The importance of the left merge operator in process algebras. In M. S. Paterson, editor, Proc. of ICALP 1990, LNCS 443, pages 752–764, 1990.
13. J. L. M. Vrancken. The algebra of communicating processes with empty process. Theoret. Comput. Sci., 177:287–328, 1997.
Generic Algorithms for the Generation of Combinatorial Objects

Conrado Martínez and Xavier Molinero

Departament de Llenguatges i Sistemes Informàtics, Universitat Politècnica de Catalunya, E-08034 Barcelona, Spain
{conrado,molinero}@lsi.upc.es
Abstract. This paper briefly describes our generic approach to the exhaustive generation of unlabelled and labelled combinatorial classes. Our algorithms receive a size n and a finite description of a combinatorial class A using combinatorial operators such as union, product, set or sequence, in order to list all objects of size n in A. The algorithms work in constant amortized time per generated object and thus they are suitable for rapid prototyping or for inclusion in general libraries.
1
Introduction
Exhaustively generating all the objects of a given size is an important problem with numerous applications that has attracted the interest of combinatorialists and computer scientists for many years. There is a vast literature on the topic and many ingenious techniques and efficient algorithms have been devised for the generation of objects of relevant combinatorial classes (permutations, trees, sets, necklaces, words, etc.). Indeed, it is common to find introductory material in many textbooks on algorithms (see for instance [8]). Furthermore, several distinct natural (and useful) orderings have been considered for the generation of combinatorial classes, for example, lexicographic and Gray ordering. Many state-of-the-art algorithms for exhaustive generation can be found (and executed) in the Combinatorial Object Server (www.theory.csc.uvic.ca/~cos), where the interested reader can also find further references. The ultimate goal is to achieve algorithms with constant amortized time per generated object, that is, the cost of generating all N objects of size n takes time proportional to N. Many such algorithms are known, but there is still on-going and active research on this topic. In this work, we combine some well-known principles and a few novel ideas in a generic framework to design algorithms that solve the problem of exhaustive generation, given the size and a finite specification of the combinatorial class whose elements are to be listed. This kind of approach was pioneered by Flajolet et al. [2] for the random generation of combinatorial objects and later applied
This research was supported by the Future and Emergent Technologies programme of the EU under contract IST-1999-14186 (ALCOM-FT) and the Spanish “Ministerio de Ciencia y Tecnolog´ıa” programme TIC2002-00190 (AEDRI II).
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 572–581, 2003. © Springer-Verlag Berlin Heidelberg 2003
by the authors for the unranking problem [6] and for the generation of labelled objects [5]. Somewhat different, but with a similar spirit, is the general approach of Kemp [4] for the generation of words in lexicographic order. We show that all our algorithms work in constant amortized time and provide a general framework for the analysis of the performance of these algorithms in the form of a calculus or set of rules. Most existing algorithms exploit particular characteristics of the combinatorial class to be generated, thus achieving improved performance over naïve or brute-force methods. The main contribution of this work is to provide a few generic algorithms which solve the problem of iteration over the subset of objects of a given size, given the size and a finite specification of the combinatorial class. These finite specifications are built from basic ε- and atomic classes, and combinatorial operators like unions ('+'), Cartesian products ('×'), sequences ('S'), multisets ('M'), cycles ('C'), etc. Our algorithms, deprived of specific knowledge of the problem at hand, are likely to be a bit worse than their specific counterparts, but still have competitive performance, making them good candidates for rapid prototyping and for inclusion into general combinatorial libraries such as the combstruct package [1] for Maple and MuPAD-combinat for MuPAD (mupad-combinat.sourceforge.net). Typically, complex objects in a given class are composed of smaller units, called atoms. Atoms are objects of size 1 and the size of an object is the number of atoms it contains. For instance, a string is composed by the concatenation of symbols, where each of these is an atom, and the size of the string is its length or the number of symbols it is composed of. Similarly, a tree is built out of nodes – its atoms – and the size of the tree is its number of nodes. Objects of size 0 and 1 will be generically denoted by ε and Z, respectively.
Unlabelled objects are those whose atoms are indistinguishable. On the contrary, each of the n atoms of a labelled object of size n bears a distinct label drawn from the numbers 1 to n. For the rest of this paper, we will use calligraphic uppercase letters to denote classes (A, B, C, . . . ). Given a class A, An will denote the subset of objects of size n in A and an the number of such objects. We use the corresponding uppercase roman letter (A, B, C, . . . ) to denote the counting generating functions (ordinary GFs for unlabelled classes, exponential GFs for labelled classes). The n-th coefficient of A(z) is denoted [z^n]A(z); hence, an = [z^n]A(z) if A(z) is ordinary and an = n! · [z^n]A(z) if A(z) is exponential. As it will become apparent, our approach to the exhaustive generation problem requires an efficient algorithm for counting, that is, given a specification of a class and a size, compute the number of objects with the given size. Hence, we will only deal with so-called admissible combinatorial classes [10,9]. Those are constructed from admissible operators, operations over classes that yield new
1 The current implementation of combstruct offers a routine allstructs to generate all objects of a given size and a finite specification of the class; but it does the job by repeatedly generating objects at random until all of them have been generated.
2 Also, we will use these symbols to denote not only objects but the classes that contain just one object of size 0 and of size 1, respectively.
574
Conrado Mart´ınez and Xavier Molinero
classes, and such that the number of objects of a given size in the new class can be computed from the number of objects of that size or smaller sizes in the constituent classes. Tables 1 and 2 give a few examples of both labelled and unlabelled admissible classes; as such, our algorithms are able to generate all their objects, given the size and the specification of the class.

Table 1. Examples of labelled classes and their specifications

Cayley trees: A = Z ⋆ M(A)
Binary plane trees: B = Z + B ⋆ B
Hierarchies: C = Z + M(C, card ≥ 2)
Surjections: D = S(M(Z, card ≥ 1))
Functional graphs: E = M(C(A))
Any combinatorial object belonging to an admissible class can be suitably represented as a string of symbols or as an expression tree whose leaves correspond to the object's atoms and whose internal nodes are labelled by admissible operators. However, such a representation is not the most convenient for the exhaustive generation problem; our algorithms will act upon a new kind of objects, which we call iterators. An iterator contains a combinatorial object (represented as a tree-like structure), but also additional information which helps and speeds up the generation process. This additional information is also organized as a tree-like structure – which we call deep structure – and reflects that of the corresponding combinatorial object, but each node contains information about the class, the rank, the size, the labelling of the "subobject" rooted at the node in the case of labelled objects, etc. Furthermore, there are pointers between the object's representation and the deep structure to allow fast access and update of the object's representation.

Table 2. Examples of unlabelled classes and their specifications

Binary sequences: A = S(Z + Z)
Necklaces: B = C(M(Z, card ≥ 1))
Rooted unlabelled trees: C = Z × M(C)
Non-plane ternary trees: D = Z + M(D, card = 3)
Integer partitions without repetition: E = P(S(Z, card ≥ 1))
From the user’s point of view, we shall offer the following four routines: 1) a procedure to initialize the iterator (init iter), which given a finite description of a combinatorial class A and a size n, returns the iterator corresponding to the first object in An ; 2) a function next, which given an iterator modifies it so that it corresponds to the object following the previous one; 3) a function get obj to retrieve the combinatorial object from the iterator, in order to print or process it as needed; 4) a boolean function is last to check whether the
iterator corresponds to the past-the-end object: a fictitious sentinel which follows the last object in An. These will be typically used as follows:

it := init_iter(A, n);
while not is_last(it) do
    print(get_obj(it));
    it := next(it);
end

In the next section we will describe our algorithms and their performance for the generation of admissible unlabelled combinatorial classes. Section 3 briefly considers the generation of labelled objects, based on our previous work [5], thus integrating the generation of labelled and unlabelled classes into a single and elegant framework. Except for the case of unlabelled cycles (not described here), most common combinatorial constructions can be dealt with within this framework. In Section 4 we comment on our current work, extensions and future developments.
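The iterator interface above can be mimicked in Python by a generator that yields objects in the order the algorithms induce: for a union, all objects of A before those of B; for a product, pairs grouped by the size of the first component. This is a simplified sketch, not the paper's constant-amortized-time iterator with its deep structure; the tuple encoding of specifications and the single-character atoms are our own conventions:

```python
def generate(spec, n):
    """Yield all objects of size n of the class described by `spec`.
    A one-character string is an atomic class with one object of size 1;
    ("+", A, B), ("x", A, B) and ("S", A) encode unions, products and
    sequences (illustrative encoding only)."""
    if isinstance(spec, str):              # atomic class
        if n == 1:
            yield spec
        return
    op = spec[0]
    if op == "+":                          # all of A first, then all of B
        yield from generate(spec[1], n)
        yield from generate(spec[2], n)
    elif op == "x":                        # grouped by size k of the first component
        for k in range(n + 1):
            for first in generate(spec[1], k):
                for second in generate(spec[2], n - k):
                    yield (first, second)
    elif op == "S":                        # sequences of components of size >= 1
        if n == 0:
            yield ()
        for k in range(1, n + 1):
            for head in generate(spec[1], k):
                for tail in generate(("S", spec[1]), n - k):
                    yield (head,) + tail
```

For instance, `generate(("S", ("+", "a", "b")), n)` lists the 2^n binary sequences of length n, starting with the all-"a" sequence.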
2
Unlabelled Classes
Here, by an admissible class we mean that the class can be finitely specified using the ε class (the class with a single object of size 0), atomic classes (classes that contain a single object of size 1), and disjoint unions ('+'), products ('×'), sequences ('S'), multisets ('M') and powersets ('P') of admissible classes. We have also developed a generic algorithm for cycles ('C') of admissible unlabelled classes; however both the algorithm and its analysis use rather different ideas and techniques from the other operators and will not be explained here because of space limitations. There exist other interesting admissible operations, but we shall restrict our attention to those mentioned above. Even within this restricted framework, many important combinatorial classes can be specified. Furthermore, the techniques and results presented here can be easily extended to other admissible operators such as substitutions, sequences, multisets and powersets of restricted cardinality, etc.

2.1
The Algorithms
The problem of generating the ε class and atomic classes is trivial. We shall only consider the function next, as the other functions are more or less straightforward. We assume that we have a function count(A, n) which returns the number of objects of size n in the combinatorial class A [2]. The function next actually uses a recursive routine which receives a pointer p to some node of the deep structure; the initial recursive call is with p pointing to the root of the deep structure. However, we will use the same name for the recursive routine which actually performs the job. If the current object (of size n) belongs to a class A + B, we need only to check whether the current object is the last of A or not. If it is, then the next object will be the first object in B; otherwise, we generate the next object in the appropriate class (A if the current rank is smaller than or equal to count(A, n) − 1, B if the
current rank is greater than or equal to count(A, n) − 1). This actually means recursively applying the procedure to the unique subtree hanging from p. All the checks above can be easily done as the node pointed to by p in the deep structure contains the specification, the rank of the current object, its size, etc. On the other hand, if the current subobject of size n corresponds to a product, say A × B, we check if the second component (given by p.second) is the last object of size n − k of its class B, where k = p.first.size. If it is not, we recursively obtain the next object of the second component. If the second component is the last object of size n − k in B, but the first component (pointed to by p.first) is not the last object of size k in A, then the algorithm is recursively applied to the first component; we also reset the information in the current node and set the second component of the new object to be the first object in B of size n − k. If both first and second components were the last objects of sizes k and n − k in A and B, respectively, then a loop looks for the smallest k′ > k such that Ak′ × Bn−k′ is not empty. After that, the first objects in Ak′ and Bn−k′ are generated and the appropriate information is updated in the current node p. Multisets are dealt with in a similar spirit to products. The basis of the algorithm is the isomorphism ΘM(A) = ∆ΘA × M(A), where ∆A denotes the diagonal or stacking of the class A, that is, ∆A = {α | α ∈ A} + {(α, α) | α ∈ A} + {(α, α, α) | α ∈ A} + · · · , and ΘA is the pointing (marking) of the class A, that is, the class that we obtain by making k copies of each object of size k in A, but marking a different atom in each copy. If we mark an atom in a multiset we might think that we have marked the object that contains the atom, say the m-th copy of the object; on the right hand side we produce the marked object, make m copies of the marked object and attach a multiset.
A multiset γ consists of two parts: a first component α ∈ A of size k together with its number of occurrences, say ℓ, and a second component β which is a multiset of size n − ℓk. This second component contains objects in A whose size is less than or equal to k; in the latter case, their rank is strictly smaller than the rank of α (implying that they have been used as the first component previously). In order to get the next object, we check whether there exist multisets satisfying the conditions above, that is, whether there is some object following β. If not, we obtain the object following ⟨α, ℓ⟩ in ∆Aj. When the first component in the current object is the last object in ∆Aj, we loop until we find a suitable size j′ > j = ℓk, and obtain the respective first objects. The generation of the object following ⟨α, ℓ⟩ is also easy: we obtain the object following α in Ak if it exists; if not, we look for the smallest divisor k′ of j = ℓk which is larger than k and produce the first object in Ak′, attaching the appropriate number of occurrences ℓ′ = j/k′. Powersets are generated in the same vein. For a fixed first component α of size j, we produce all powersets of size n − j made up of objects of size smaller than or equal to j and whose rank is strictly smaller than the rank of α; if there are no more such powersets, we recursively apply the procedure to α. If α is the last object of size j then we look for the next available size j′ for the first component. The isomorphism here is given by ΘP(A) = ∆[odd]ΘA × P(A) − ∆[even]ΘA × P(A), where ∆[odd] and ∆[even] are like the diagonal operator, but for odd and even
numbers of copies, respectively. The proof of this isomorphism is a bit more involved, and exploits the principle of inclusion-exclusion to guarantee that no element is repeated. On a purely formal basis, we can introduce ∆̂ = ∆[odd] − ∆[even], so that we could say ΘP(A) = ∆̂ΘA × P(A). The operator ∆̂ allows for more convenient symbolic manipulations when computing the cost of the algorithms, but has no combinatorial meaning, though.

2.2
The Performance

Let ΛAn = Σ_{α∈An} cn(α), where cn(α) denotes the cost of applying next to the object α. Then µA,n = ΛAn/an is the amortized cost per generated object. We will not include in the cost the preprocessing time needed to compute the tables of counts nor the cost of constructing the first object (and associated information) of a given size in a given class. These can be precomputed just once (or computed the first time they are needed and stored into tables for later reuse) and their contribution to the overall performance will be neglected. Also, we will not include the time to parse the specification and transform it to standard form either, as this cost does not depend on n.

Lemma 1. Given an admissible unlabelled class A, let ΛA(z) denote the ordinary generating function of the cumulated costs {ΛAn}n≥0. Then

1. Λ∅ = Λε = ΛZ = 0,
2. Λ(A + B) = ΛA + ΛB + [[A]] + [[B]] − [[A + B]],
3. Λ(A × B) = ΛA · [[B]] + A · ΛB + [[A]] · [[B]] − [[A × B]],
4. ΛΘA = ΘΛA + Θ[[A]] − [[ΘA]],
5. Λ∆A = ∆ΛA + ∆[[A]] − [[∆A]],
6. Λ∆[t]A = ∆[t]ΛA + ∆[t][[A]] − [[∆[t]A]], t ∈ {odd, even},
where ΘA(z) ≡ z dA/dz and ΘA denotes the pointing or marking of the class A; ∆A(z) ≡ Σ_{k>0} A(z^k) and ∆A denotes the diagonal of the class A; ∆[odd]A(z) = Σ_{k>0} A(z^{2k−1}) and ∆[even]A(z) = Σ_{k>0} A(z^{2k}), where ∆[odd]A and ∆[even]A denote the odd and even diagonals of the class A, respectively; and [[A]] = Σ_{n≥0} [[an ≠ 0]] z^n, with [[P]] = 1 if the predicate P is true and 0 otherwise.

Proof. For classes that contain just one item, the cumulated cost is 0, as we do not count the cost of generating the first object. The rule for unions is straightforward, but we must take care to charge the cost corresponding to computing the next of the last element in A; this is accounted for by the terms [[A]] + [[B]] − [[A + B]]. For products, we generate pairs in Ak × Bn−k for k = 0, . . . , n. For a fixed object α of size k we generate all objects in Bn−k to form all pairs whose first component is α; since there are ak objects of size k and we have to consider all possible values of k, this contribution is given by A · ΛB. The other main contribution to the cost comes from the generation of all objects in A with sizes k from 0 to n (and such that there are objects in Bn−k to form at least a pair). This is given by the term ΛA · [[B]]. The remaining terms account for the
application of next to the last pair in Ak × Bn−k whenever there exists such a pair, but not for the first pair in A × B. The algorithm for the marking is also straightforward: list all elements in A of size n with the first atom marked, then list all of them again but with the second atom marked, and so on. The terms Θ[[A]] − [[ΘA]] account for the cost of passing from the first listing to the second, from the second to the third, etc. To generate all objects of size n in the diagonal of the class A, recall that we loop through all divisors d of n such that there are objects of size d. For each such d, we list all objects of size d and for each one we attach the number of copies (n/d) that make up the corresponding object in ∆A. Thus Λ∆An = Σ_{d divides n} (ΛAd + [[ad ≠ 0]]) − [[∆An ≠ ∅]]. The rules for the odd and even diagonals of A are similarly obtained.

From Lemma 1 we can easily obtain rules for sequences, sets and multisets.

Corollary 1. Let A be an admissible class such that ε ∉ A and let A be its counting generating function. Then

1. Let S(A) = 1/(1 − A). Then ΛS(A) = S(A) · (1 + (ΛA + [[A]] − 1)[[S(A)]]).

2. Let M(A) = exp(Σ_{k>0} A(z^k)/k). Then

ΛM(A) = M(A) · ∫₀^z ( ∆Θ(ΛA + [[A]]) · [[M(A)]] − Θ[[M(A)]] ) dz / (z · M(A)).

3. Let P(A) = exp(Σ_{k>0} (−1)^{k−1} A(z^k)/k). Then

ΛP(A) = P(A) · ∫₀^z ( ∆̂Θ(ΛA + [[A]]) · [[P(A)]] − Θ[[P(A)]] + [[ΘP(A)]] − [[∆[odd]ΘA × P(A)]] + [[∆[even]ΘA × P(A)]] ) dz / (z · P(A)).

Proof. It suffices to use the isomorphisms S(A) = ε + A × S(A), ΘM(A) = ∆ΘA × M(A) and ΘP(A) = ∆[odd]ΘA × P(A) − ∆[even]ΘA × P(A), and apply rules 1–6 in the statement of Lemma 1. In the case of multisets and powersets, the rules can be obtained applying Λ to both sides of the isomorphisms given above, inverting Θ and Λ with rule 4, and solving the resulting linear differential equations. In the case of powersets, the sought cost arises from the difference of costs; we have thus ΛΘP(A) = Λ(∆[odd]ΘA × P(A)) − Λ(∆[even]ΘA × P(A)).

We have the following theorem, which can be easily established either from the rules that we have just derived or directly by reasoning about the algorithms.

Theorem 1. For any unlabelled admissible class A which can be finitely specified using ε, Z, +, ×, S, M and P, we have µA,n = ΛAn/an = Θ(1).
Proof (Sketch). The proof is by structural induction on the specification of the class and on the size of the objects. We consider thus what happens for unions, products, sequences, etc., and assume that the statement is true for smaller sizes. Since we charge just one "time" unit for the update of one node in the deep structure and assume that the initialization and calls to the count routine are free, we actually have µA,n → 1 as n → ∞. In practice, µA,n is different for different classes if we take into account that the (constant) overhead associated with each operator varies. We conclude with a few simple examples of application of these rules.

1. K-shuffles. A K-shuffle is a sequence of a's and b's that contains exactly K b's. Let LK = S(a) × (b × LK−1) for K > 0 and L0 = S(a). It is not difficult to show that ΛLK ∼ z^K/(1 − z)^{K+1} near the singularity z = 1; hence the amortized cost µLK,n → 1 since there are exactly (n choose K) K-shuffles of size n. We get the same asymptotic performance if we use alternative specifications, e.g., LK = LK−1 × (b × S(a)).

2. Motzkin trees. For M = ε + Z × M + Z × M × M, one readily gets ΛM ∼ −(1/2)(3 − 2√2) √(1 − 6z + z²) + O((1 − 6z + z²)^{3/2}) near the dominant singularity at z = 3 − 2√2; hence ΛM ∼ M and µM,n = 1 + o(1).
3
Labelled Classes
By admissible labelled classes we mean those that can be finitely specified using the ε-class, atomic labelled classes, unions (+), labelled products (⋆), sequences (Seq), sets (Set) and cycles (Cycle) of admissible labelled classes. As in the previous section, there exist other admissible operators over labelled classes, but we shall restrict our attention to those mentioned above. Again, many important combinatorial classes can be specified within this framework and the ideas that we present here carry over to other admissible operators such as substitutions, sequences, sets and cycles of restricted cardinality, etc. For example, the class C of Cayley trees is admissible, as it can be specified by C = Z ⋆ Set(C), where Z denotes an atomic class. The class F of functional graphs is also admissible, since a functional graph is a set of cycles of Cayley trees; therefore, F = Set(Cycle(C)). The exhaustive generation of labelled combinatorial objects uses similar ideas to those sketched in Section 2; in fact, we face easier problems since sets and cycles can be specified by means of the so-called boxed product, denoted by □⋆ [3]. We recall that in a boxed product, we obtain a collection of labelled objects from a pair of given objects, much in the same manner as for the usual labelled product, but the smallest label must always correspond to an atom belonging to the first object in the pair. Boxed products are related to the pointing (see Subsection 2.1) of a class A by ΘA = ΘB ⋆ C ⇐⇒ A = B □⋆ C. The isomorphisms for sequences, sets and cycles in terms of the other constructors (union, product and boxed product) that we use in our algorithms are the following: 1) Seq(A) = ε + A ⋆ Seq(A), 2) Set(A) = ε + A □⋆ Set(A), and 3) Cycle(A) = A □⋆ Seq(A). Thus every admissible specification (a finite set of equations specifying admissible
580
Conrado Martínez and Xavier Molinero
classes, like in the example of functional graphs) can be transformed into an equivalent specification that involves only unions, products and boxed products. The algorithms for unions and products are very similar to those for unlabelled classes, and the algorithm for boxed products works much like the algorithm for products. In the case of labelled and boxed products, we change the partition or relabelling of the current object if possible; otherwise we recursively apply the next routine to the second component or the first component of the object. In order to "traverse" the C(n, k) possible partitions of the labels³ of a pair ⟨α, β⟩, where n is the size of the objects to be generated and k is the size of the first component of the current object, we use Nijenhuis and Wilf's routine for the next k-subset [7] (alternatively, we can use the algorithm by Kemp [4]). Also, as for unlabelled classes, we can set up a calculus for the complexity of our algorithms, with rules such as ΘΛ(A□ ⋆ B) = ΘΛA · [[B]] + ΘA · ΛB + Θ[[A]] · [[B]] − Θ[[A□ ⋆ B]] for boxed products. Here, [[A]] = Σ_{n≥0} [[a_n ≠ 0]] zⁿ/n! and ΛA is the exponential generating function of the total costs to generate all elements of each size. The rules for the other combinatorial constructions are similar in spirit and can be easily derived from the rules for unions, products and boxed products. We make here the same assumptions as in the analysis of the performance of the algorithms for unlabelled classes; moreover, we take into account the fact that the algorithm for the generation of k-subsets works in constant amortized time [7]. Then it is not difficult to show that this cost can be easily "absorbed" by terms like A · ΛB in the rule for products, and there is no need to include a term of the type c · A · B. Using the same techniques as in the proof of Theorem 1, it is not hard to establish an analogous result for labelled generation. ∎
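For illustration, a plain lexicographic successor for k-subsets suffices to traverse all C(n, k) label partitions; the sketch below is a simple stand-in (ours) for the constant-amortized-time Nijenhuis-Wilf routine cited above, whose actual implementation is different.

```python
def next_k_subset(s, n):
    """Lexicographic successor of the k-subset s (a sorted list of
    elements of {0, ..., n-1}), or None after the last subset.
    A simple stand-in for the Nijenhuis-Wilf routine cited in the text."""
    s = list(s)
    k = len(s)
    i = k - 1
    # find the rightmost element that can still be incremented
    while i >= 0 and s[i] == n - k + i:
        i -= 1
    if i < 0:
        return None
    s[i] += 1
    for j in range(i + 1, k):      # reset the tail to the smallest values
        s[j] = s[j - 1] + 1
    return s

# walk through all C(5,3) = 10 subsets, i.e. all label choices for
# the first component of a pair of total size 5 with k = 3
subsets = []
s = [0, 1, 2]
while s is not None:
    subsets.append(tuple(s))
    s = next_k_subset(s, 5)
print(len(subsets))  # 10
```

Each call touches only the suffix that changes, which is why routines of this kind can achieve constant amortized cost per subset.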
The detailed account of the complexity calculus for the generation of labelled objects and of the proof of the following theorem will be given in the full version of this extended abstract.
Theorem 2. For any admissible labelled class A which can be finitely specified using ε, Z, +, ⋆, Seq, Set and Cycle, we have µ_{A,n} = ΛA_n/a_n = Θ(1).
4 Current and Future Work
As we have mentioned earlier, we already have a constant amortized time algorithm to generate unlabelled cycles of A's. However, this algorithm for the generation of unlabelled cycles is based upon techniques quite different from the ones presented here, and it does not fit nicely into the framework sketched here (in sharp contrast with labelled cycles). We have implemented all the algorithms described here for the Maple system, on top of the basic routines provided by the combstruct package. Also,
³ There are only C(n−1, k−1) partitions of the labels in the case of boxed products.
Generic Algorithms for the Generation of Combinatorial Objects
581
there are plans for a port of these routines to the MuPAD-combinat package in the near future. Furthermore, we also have routines for the generation of labelled substitutions and for labelled sequences, sets and cycles when their cardinalities are restricted. We have conducted extensive experiments to assess the practical performance of our algorithms. These experiments show that the practical performance is in good agreement with the theoretical predictions (namely, the cost grows linearly with the total number N of generated objects, if N is sufficiently large; the slope of the plot is independent of the size of the objects being generated). Our current work is centered on the extension of the techniques presented here to other admissible operators. We are also trying to design an algorithm for unlabelled cycles that fits within the framework sketched here. If we obtained such an algorithm, it would immediately suggest an efficient answer for the unranking of unlabelled cycles, a question that, to the best of the authors' knowledge, still remains open. We are also working on alternative isomorphisms and orderings which could improve the efficiency of the generation algorithms (similar ideas yield significant improvements for the random generation and unranking of objects, see [2,6]).
References
1. Ph. Flajolet and B. Salvy. Computer algebra libraries for combinatorial structures. J. Symbolic Computation, 20:653–671, 1995.
2. Ph. Flajolet, P. Zimmerman, and B. Van Cutsem. A calculus for the random generation of combinatorial structures. Theoret. Comput. Sci., 132(1-2):1–35, 1994.
3. D.H. Greene. Labelled Formal Languages and Their Uses. PhD thesis, Computer Science Dept., Stanford University, 1983.
4. R. Kemp. Generating words lexicographically: An average-case analysis. Acta Informatica, 35(1):17–89, 1998.
5. C. Martínez and X. Molinero. Generic algorithms for the exhaustive generation of labelled objects. In Proc. Workshop on Random Generation of Combinatorial Structures and Bijective Combinatorics (GASCOM), pages 53–58, 2001.
6. C. Martínez and X. Molinero. A generic approach for the unranking of labelled combinatorial classes. Random Structures & Algorithms, 19(3–4):472–497, 2001.
7. A. Nijenhuis and H.S. Wilf. Combinatorial Algorithms. Academic Press, 1978.
8. E.M. Reingold, J. Nievergelt, and N. Deo. Combinatorial Algorithms: Theory and Practice. Prentice-Hall, Englewood Cliffs, NJ, 1977.
9. R. Sedgewick and Ph. Flajolet. An Introduction to the Analysis of Algorithms. Addison-Wesley, Reading, MA, 1996.
10. J.S. Vitter and Ph. Flajolet. Average-case analysis of algorithms and data structures. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science, chapter 9. North-Holland, 1990.
On the Complexity of Some Problems in Interval Arithmetic
K. Meer
Department of Mathematics and Computer Science, Syddansk Universitet, Campusvej 55, 5230 Odense M, Denmark
[email protected], Fax: 0045 6593 2691
Abstract. We study some problems in interval arithmetic treated in Kreinovich et al. [6]. First, we consider the best linear approximation of a quadratic interval function. This problem is known to be NP-hard in the Turing model; we analyze its complexity in the real number model and the analogous class NP_R. We give new upper complexity bounds by locating the decision version in DΣ_R² (a real analogue of Σ²) and solve a problem left open in [6].
1 Introduction
Problems in interval arithmetic model situations in which input data is only known within a certain accuracy. Starting from an exact description with input values a_i, i ∈ I (say a_i ∈ R or ∈ Q, I an index set), a corresponding formalization in terms of interval arithmetic would only supply the information that the a_i's belong to some given intervals [a̲_i, ā_i] ⊂ R. This framework provides a way to formalize and study problems related to the presence of uncertainties. The latter includes both data errors occurring during measurements and rounding errors arising during the execution of computer algorithms. Interval arithmetic thus can be seen as an approach to validating numerical calculations. The computational complexity of solving a problem in the interval setting might be significantly larger than that of solving the corresponding problem with accurate input data. In fact, many such results are known in interval arithmetic. As an example, consider the solvability problem for a linear equation system A · x = b. If A and b are given precisely, Gaussian elimination efficiently yields a solution (or proves its non-existence). This holds both for the bit measure and the algebraic cost measure, see below. However, if we only know that the entries of A and b belong to given intervals, then the complexity changes dramatically: deciding whether concrete choices for A and b exist within the given (rational) interval bounds such that the resulting system for these choices is solvable is NP-complete, see [6].
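The basic interval operations underlying this setting are easy to sketch. The following minimal Python class (ours, not a library API) propagates interval bounds through +, − and ×, and already exhibits the overestimation effects that make questions about all or some realizations of the data delicate.

```python
class Interval:
    """Closed interval [lo, hi]; a minimal sketch of the arithmetic
    used to propagate data uncertainty (not any library's API)."""
    def __init__(self, lo, hi):
        assert lo <= hi
        self.lo, self.hi = lo, hi
    def __add__(self, o):
        return Interval(self.lo + o.lo, self.hi + o.hi)
    def __sub__(self, o):
        return Interval(self.lo - o.hi, self.hi - o.lo)
    def __mul__(self, o):
        # the product interval is spanned by the four endpoint products
        ps = [self.lo * o.lo, self.lo * o.hi, self.hi * o.lo, self.hi * o.hi]
        return Interval(min(ps), max(ps))
    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

x = Interval(-1, 2)
print(x * x)   # [-2, 4]: the rule ignores that both factors are the same x
print(x - x)   # [-3, 3], not [0, 0]: the so-called dependency problem
```

Evaluating an expression interval-wise is cheap but only gives an enclosure; deciding sharp questions about the set of realizations is where the NP-hardness discussed above enters.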
Partially supported by the Future and Emerging Technologies programme of the EU under contract number IST-1999-14186 (ALCOM-FT) and by the Danish Natural Science Research Council SNF.
B. Rovan and P. Vojt´ aˇ s (Eds.): MFCS 2003, LNCS 2747, pp. 582–591, 2003. c Springer-Verlag Berlin Heidelberg 2003
From a logical point of view this increase in complexity (provided P ≠ NP) is due to the presence of additional quantifiers in the interval problem description. For example, in the above linear system problem the interval approach prefixes the linear equation problem with additional existential quantifiers ranging over the intervals given by the accuracy bounds and asking for the existence of the "right" coefficients. Since the new quantifiers range over real intervals, they in principle introduce a quantified formula in the first-order theory of the reals (considered as a real closed field; this of course only holds for algebraic problems). Even though this theory is decidable, the complexity of the currently known algorithms is tremendous. It is well known that already the existential theory over R is an NP_R-complete problem; here, NP_R denotes the real analogue of NP in the framework of real Turing (or Blum-Shub-Smale, for short: BSS) machines, see [1]. Therefore, it is natural to consider interval problems in that framework. In this paper we want to analyze whether, for interval problems known to be hard in the Turing setting, the shift from the Turing to the real number model implies NP_R-completeness or NP_R-hardness in the latter as well. We shall substantiate that in the real number model a finer complexity analysis can be done. More precisely, for some problems the interval formulation will likely not lead to NP_R-hardness, even though the restriction to rational inputs and the Turing model implies NP-hardness (or completeness, respectively). This is due to the fact that even though formally the new quantifiers range over the reals, in certain situations they can be replaced by Boolean quantifiers only, i.e., quantifiers ranging over {0, 1}. To clarify our approach we study the following problem treated in [6]: best approximation of quadratic interval functions by linear ones.
The problem is known to be NP-hard in the Turing model [7]; note, however, that membership in the polynomial hierarchy is not established in [7].
Definition 1. (a) Let B := [b̲_1, b̄_1] × . . . × [b̲_n, b̄_n] be a box in Rⁿ, b̲_i < b̄_i for 1 ≤ i ≤ n. An interval function f on B is a mapping which assigns to each point y ∈ B an interval f(y) := [f̲(y), f̄(y)] ⊆ R. If both functions f̲ and f̄ are linear or quadratic functions, i.e., if they are polynomials of degree 1 or 2, respectively, we call f a linear respectively a quadratic interval function.
(b) Given a box B as above, a linear interval function X := [X̲, X̄] and a quadratic interval function f := [f̲, f̄] on B, we say that X approximates f on B iff [f̲(y), f̄(y)] ⊆ [X̲(y), X̄(y)] for all y ∈ B.
Definition 2. (a) The problem BLAQIF (best linear approximation of a quadratic interval function) is defined as follows: given n ∈ N, a box B ⊆ Rⁿ, a quadratic interval function f(y) := [f̲(y), f̄(y)] on B and a bound M ∈ R, is there a linear approximation X = [X̲, X̄] of f on B such that max_{y∈B} X̄(y) − X̲(y) ≤ M?
(b) The computational version of BLAQIF asks to compute min_X max_{y∈B} X̄(y) − X̲(y) under the constraint that X = (X̲, X̄) is an approximation of f.
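For intuition, the containment required by Definition 1(b) can at least be checked numerically. The sketch below (function names and the particular 1-D instance are ours) samples a box, tests the containment, and evaluates max_{y∈B} X̄(y) − X̲(y); it is only a sanity check, not the NP-hard decision procedure studied in this paper.

```python
def is_approximation(X_lo, X_hi, f_lo, f_hi, B, steps=1000):
    """Check [f_lo(y), f_hi(y)] within [X_lo(y), X_hi(y)] on a 1-D box B
    by grid sampling; returns (contained?, max width of [X_lo, X_hi])."""
    a, b = B
    eps = 1e-9   # tolerance for floating-point ties
    pts = [a + (b - a) * i / steps for i in range(steps + 1)]
    ok = all(X_lo(y) <= f_lo(y) + eps and f_hi(y) <= X_hi(y) + eps for y in pts)
    width = max(X_hi(y) - X_lo(y) for y in pts)
    return ok, width

# f(y) = [y^2, y^2 + 1] on B = [0, 1]; candidate X = [y - 1/4, y + 1]:
# y - 1/4 <= y^2 (tangent at y = 1/2) and y^2 + 1 <= y + 1 on [0, 1]
ok, w = is_approximation(lambda y: y - 0.25, lambda y: y + 1.0,
                         lambda y: y * y, lambda y: y * y + 1.0, (0.0, 1.0))
print(ok, w)   # True 1.25, so this X witnesses any bound M >= 1.25
```

The hard part, of course, is the quantifier structure: a yes-instance requires some X whose containment holds for all y in B, which is exactly where the alternation discussed above comes from.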
Our main results can now be summarized as follows:
Main results: (i) In the real number model BLAQIF is not NP_R-complete under weak polynomial time reductions and likely (in a sense to be made precise) not NP_R-complete under (full) polynomial time reductions. (ii) In the Turing model BLAQIF can be located in Σ². For fixed input dimension n both the decision and the computational version can be solved in polynomial time.
Part (ii) complements the results in [7] by providing an upper complexity bound in the Turing setting. It also answers a question posed in [6].
2 Basic Notations; Structural Properties
We first recall the definition of some complexity classes important for our results. Then, we analyze the consequences for problems belonging to one of these classes with respect to completeness or hardness properties in the real number model.
2.1 Complexity Classes
Though there are different equivalent definitions for the classes we need, for our purposes those based on alternating quantifiers are most appropriate.
Definition 3. (a) A decision problem A over the alphabet {0, 1} is in class Σ^k, k ∈ N, iff there are a problem B ∈ P and polynomials p_1, . . . , p_k such that
x ∈ A ⇐⇒ Q_1 y_1 ∈ {0, 1}^{p_1(|x|)} . . . Q_k y_k ∈ {0, 1}^{p_k(|x|)} : (x, y_1, . . . , y_k) ∈ B,
where the variable blocks y_i range over {0, 1}^{p_i(|x|)} and the quantifiers Q_i ∈ {∃, ∀} alternate, starting with Q_1 = ∃ (and |x| denotes the bit size of x).
(b) A decision problem A over R^∞ := ∪_{i≥1} R^i is in class Σ_R^k, k ∈ N, in the real number model iff there are a problem B ∈ P_R and polynomials p_1, . . . , p_k such that
x ∈ A ⇐⇒ Q_1 y_1 ∈ R^{p_1(|x|_R)} . . . Q_k y_k ∈ R^{p_k(|x|_R)} : (x, y_1, . . . , y_k) ∈ B,
where the variable blocks y_i range over R^{p_i(|x|_R)} and the quantifiers Q_i ∈ {∃, ∀} alternate, starting with Q_1 = ∃ (and |x|_R denotes the algebraic size of x).
(c) If in b) we restrict the quantifiers to be Boolean ones, i.e., if the variable blocks range over {0, 1}* instead of R^∞, we obtain the digital classes DΣ_R^k.
Clearly, Σ¹ = NP, Σ_R¹ = NP_R and DΣ_R¹ = DNP_R, where the latter is the class digital NP_R of problems in NP_R that require a discrete search space for verification only.
2.2 The Real Number Complexity of Problems in DΣ_R²
This section is on the structural complexity of problems in DΣ_R². The main goal is to argue that problems in DΣ_R² likely do not bear the full complexity of Σ_R², and not even that of NP_R-hard problems. This shows that the complexity analysis of several NP-hard interval arithmetic problems can be considerably refined. We turn this into more precise statements as follows. We give an absolute statement with respect to so-called weak reductions (introduced by Koiran [5] for a weak version of the BSS model): no problem in DΣ_R² is NP_R-hard under weak reductions. Then, we give an analogous statement for (general) polynomial time reductions under a widely believed hypothesis concerning computations of resultant polynomials: no problem in DΣ_R² is NP_R-hard under polynomial time reductions unless there is a (non-uniform) polynomial time algorithm computing a multiple of the resultant polynomial on a Zariski-dense subset. Though some definitions are necessary to precisely state these results, the proofs are almost straightforward extensions of similar statements for DNP_R given in [2] and [8].
Definition 4 (Weak running time, [5]). (a) Let M be a real machine with (real) machine constants c := (c_1, . . . , c_s) and having a running time bounded by a function t of the (algebraic) input size. Any intermediate result computed by M on input x is a rational function of the form p(x, c_1, . . . , c_s)/q(x, c_1, . . . , c_s), where p and q are polynomials with integer coefficients over x and c. The weak cost of computing this intermediate result is given as the maximum among the degrees of p and q and the bit sizes of any of their coefficients. Other operations of M (like branches and copying) have weak cost 1. The weak running time of M on input x ∈ R^∞ is the sum of the weak costs of all intermediate results and branch-nodes along the computational path of M on x.
(b) We call a many-one reduction a weak polynomial time reduction if it can be computed in weak polynomial time by a BSS machine.
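Repeated squaring, mentioned below, is the standard example separating the two cost measures: after t BSS operations the intermediate result is the polynomial x^(2^t), whose degree, and hence whose weak cost, is 2^t. A tiny sketch of this bookkeeping (the accounting variables are ours):

```python
# Repeated squaring x -> x^2 -> x^4 -> ... takes t operations in the
# BSS model, but the intermediate result after step t is x^(2^t):
# its degree -- and thus the weak cost charged for it -- is 2^t.
degree = 1      # degree of the current intermediate result p(x, c)
weak_cost = 0   # accumulated weak running time
for t in range(1, 31):
    degree *= 2            # squaring doubles the degree
    weak_cost += degree    # weak cost charges the degree, not 1
print(degree)              # 2**30 = 1073741824 after only 30 operations
```

So a 30-step BSS computation can have weak running time above 2^31, which is exactly why operation sequences like repeated squaring become expensive in the weak model.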
The notion of NP_R-completeness under weak polynomial time reductions is then defined in a straightforward manner. Note that using the weak cost measure we still allow real number algorithms, but operation sequences like repeated squaring now get more expensive than in the BSS model, see [5]. The next definition introduces (a particular subcase of) the well-known resultant polynomials. Consider the problem of deciding whether a given system f := (f_1, . . . , f_n) ∈ R[x_1, . . . , x_n]^n of n homogeneous polynomials of degree 2 in n variables has a zero x ∈ Cⁿ \ {0}, i.e., f_i(x) = 0 ∀i. We denote by H the set of all such systems and by H_0 those being solvable in Cⁿ \ {0}. The claims implicitly stated in the following definition are well known, see, e.g., [11].
Definition 5. Let n ∈ N, N := (1/2) · n² · (n + 1). The resultant polynomial RES_n : R^N → R is a polynomial which takes as its indeterminates the coefficient vectors of homogeneous systems in H. It is the unique (up to sign) irreducible polynomial with integer coefficients that generates the variety of (coefficient vectors of) solvable instances H_0 of problems in H, i.e., RES_n(f) = 0 iff f ∈ H
has a zero x ∈ Cⁿ \ {0}. In this notation, RES_n(f) is interpreted as evaluating RES_n on the coefficient vector of f in R^N.
It is generally believed that no efficient algorithms for computing RES_n exist. This is, for example, substantiated by the close relation of this problem to other potentially hard computational problems like the computation of mixed volumes. For more details see [11] and the literature cited therein. Hardness results for certain resultant computations can be found in [9]; relations between the computation of resultants and the real P_R versus NP_R question are studied in [10].
Theorem 6. (a) No problem in DNP_R is NP_R-complete under weak polynomial time reductions. No problem in DΣ_R² is NP_R-hard under weak polynomial time reductions.
(b) Suppose there is no (non-uniform) polynomial time algorithm which for each n ∈ N computes a non-zero multiple of RES_n on a Zariski-dense subset of H_0. Then no problem in DNP_R is NP_R-complete and no problem in DΣ_R² is NP_R-hard under polynomial time reductions in the BSS model.
The proof is an extension of ideas developed in [2] and [8].
Remark 1. Note that in (b) we cannot expect an absolute statement of non-completeness like in (a) unless P_R ≠ NP_R is proven (for the weak model the relation weak-P_R ≠ weak-NP_R is shown in [2]). In the next sections the above theorem is used to substantiate the conjecture that interval problems which belong to either DNP_R or DΣ_R² do not share the full difficulty of complete problems in NP_R. Thus, in the real number model their complexity seems not to be the hardest possible among all (algebraic) interval problems belonging to the corresponding real complexity class.
3 Approximation of Interval Functions
The BLAQIF problem is closely related to a semi-infinite optimization problem. To see this, suppose for a while that we have found an optimal linear approximation
X̄(y) := x̄_0 + x̄_1 y_1 + . . . + x̄_n y_n, with X̄(y) − f̄(y) ≥ 0 ∀ y ∈ B, and
X̲(y) := x̲_0 + x̲_1 y_1 + . . . + x̲_n y_n, with f̲(y) − X̲(y) ≥ 0 ∀ y ∈ B.
As shown in [6] it is easy to calculate max_{y∈B} X̄(y) − X̲(y) once X̄, X̲ are known. The components y*_i of the optimal y* are determined by the signs of x̄_i − x̲_i according to y*_i := b̄_i if x̄_i ≥ x̲_i and y*_i := b̲_i if x̄_i < x̲_i. Knowing these signs we obtain a linear semi-infinite programming problem. For example, if we suppose x̄_i ≥ x̲_i ∀ 1 ≤ i ≤ n the problem turns into
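The sign rule can be checked directly: the width X̄(y) − X̲(y) is linear in y, so its maximum over the box sits at the vertex selected by the signs of x̄_i − x̲_i. A small Python sketch (example data and names are ours) compares the rule against brute force over all box vertices.

```python
from itertools import product

def max_width(x_hi, x_lo, B):
    """max over the box B of X_hi(y) - X_lo(y) for linear functions
    given as coefficient tuples (x_0, x_1, ..., x_n); uses the sign
    rule from the text: y*_i is the upper endpoint iff x_hi_i >= x_lo_i."""
    y_star = [bi_hi if xh >= xl else bi_lo
              for (bi_lo, bi_hi), xh, xl in zip(B, x_hi[1:], x_lo[1:])]
    val = lambda x, y: x[0] + sum(c * yi for c, yi in zip(x[1:], y))
    return val(x_hi, y_star) - val(x_lo, y_star)

B = [(-1.0, 2.0), (0.0, 3.0)]
x_hi = (1.0, 2.0, -1.0)     # X_hi(y) = 1 + 2 y1 - y2
x_lo = (0.0, -1.0, 0.5)     # X_lo(y) = -y1 + 0.5 y2
best = max_width(x_hi, x_lo, B)

# cross-check: a linear function attains its maximum over a box at a vertex
brute = max((x_hi[0] + x_hi[1]*y1 + x_hi[2]*y2) - (x_lo[0] + x_lo[1]*y1 + x_lo[2]*y2)
            for y1, y2 in product(*B))
print(best == brute, best)   # True 7.0
```

Here x̄_1 − x̲_1 = 3 ≥ 0 selects the upper endpoint y1 = 2, while x̄_2 − x̲_2 = −1.5 < 0 selects the lower endpoint y2 = 0, matching the rule quoted above.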
min  x̄_0 + Σ_{i=1}^{n} x̄_i · b̄_i − x̲_0 − Σ_{i=1}^{n} x̲_i · b̄_i
(LSI)  s.t.  x̄_0 + x̄^T · y − f̄(y) ≥ 0  ∀ y ∈ B
             f̲(y) − x̲_0 − x̲^T · y ≥ 0  ∀ y ∈ B
             x̄_i ≥ x̲_i  ∀ 1 ≤ i ≤ n,
where x̄ := (x̄_1, . . . , x̄_n), and similarly for x̲. This problem is linear on the upper variable level (i.e., the 2n + 2 many x-variables) and quadratic on the lower variable level (i.e., y). It is semi-infinite because there are infinitely many side-constraints for X̄, X̲, parametrized through y. Note that in general we do not know in advance which sign-conditions hold for the components of an optimal solution X̄, X̲. Later on, we shall guess the right conditions as the first part of our DΣ_R² algorithm and start from the resulting (LSI) problem.
General assumption: For the sake of simplicity, in the following we assume without loss of generality x̄_i ≥ x̲_i ∀ 1 ≤ i ≤ n and deal with the above (LSI).
It is easy to see that the decision version of BLAQIF belongs to Σ_R². This result, however, is not strong enough for what we want. It neither proves a similar statement in the Turing model (since we do not know how to bound the bit-sizes of the guessed reals) nor does it give any indication that BLAQIF in the real setting is likely to be an easier problem than complete ones for class Σ_R² (or even for class NP_R). In order to see how general quantifier elimination procedures over R can be avoided when solving the real BLAQIF problem, we have to study semi-infinite optimization problems a bit more deeply.
3.1 Optimality Conditions for (LSI)
A fundamental idea for studying (LSI) is to reduce the infinitely many constraints to finitely many in order to apply common optimization criteria. The following can be deduced from semi-infinite programming theory, see, e.g., [4].
Theorem 7. A feasible point (X̲, X̄) is optimal for (LSI) iff the following conditions are satisfied: there exist two sets {y^(i), i ∈ I} and {y^(j), j ∈ J}, each of at most n points in B, together with Lagrange parameters λ_i, i ∈ I, ν_j, j ∈ J and µ_k, 1 ≤ k ≤ n, such that
i) Σ_{i∈I} (1, y^(i), 0, 0)^T · λ_i + Σ_{j∈J} (0, 0, −1, −y^(j))^T · ν_j + (0, µ, 0, −µ)^T = (1, b̄, −1, −b̄)^T,
where µ := (µ_1, . . . , µ_n) and 0 ∈ Rⁿ;
ii) λ_i ≥ 0 ∀ i ∈ I, ν_j ≥ 0 ∀ j ∈ J, µ_k ≥ 0 ∀ 1 ≤ k ≤ n;
iii) either λ_i = 0 or the point y^(i) is optimal for the problem min_{y∈B} x̄_0 + x̄^T · y − f̄(y), and the optimal value is 0;
iv) either ν_j = 0 or the point y^(j) is optimal for the problem min_{y∈B} f̲(y) − x̲_0 − x̲^T · y, and the optimal value is 0;
v) µ_k · (x̄_k − x̲_k) = 0 ∀ 1 ≤ k ≤ n.
This theorem is most important for obtaining our results for the following reasons. First, it states that at least one point y^(i) and one point y^(j) satisfying conditions iii) and iv), respectively, exist; this follows from Σ_{i∈I} λ_i = 1 = Σ_{j∈J} ν_j.
Therefore, we can search for such points and are sure to find them if we guarantee the search to be exhaustive. Secondly, as global optima for the corresponding subproblems, y^(i) and y^(j) satisfy the following optimality conditions on the lower level of the semi-infinite problem.
Corollary 1. Using the setting of Theorem 7, let y^(i) be a point satisfying condition iii), where λ_i > 0. Let AC(y^(i)) := {k | y^(i)_k = b̲_k or y^(i)_k = b̄_k} be the set of active components of y^(i) in B. Then there exist Lagrange parameters η_j ≥ 0, j ∈ AC(y^(i)), such that x̄ − D_y f̄(y^(i)) = Σ_{j∈AC(y^(i))} η_j · (±e_j), where e_j is the j-th unit vector and the sign is +1 iff y^(i)_j = b̲_j and −1 iff y^(i)_j = b̄_j.
3.2 Linear Approximation of Quadratic Functions Is in DΣ_R²
The previous results on the relation between BLAQIF and (LSI) are used in this subsection to prove membership in DΣ_R² and in Σ², respectively, as follows. The overall goal is to find a solution (X̲, X̄) and check that it realizes the demanded bound M. It has to be shown how this can be done using binary (digital) quantifiers. To this end,
1) we guess the right set of signs for x̄_k − x̲_k, 1 ≤ k ≤ n, and produce the corresponding (LSI); without loss of generality we again assume all these signs to be 0 or 1.
2) Assuming (X̲, X̄) to be known, we guess certain discrete information which is then used to compute at least one point y^(i), i ∈ I, and one point y^(j), j ∈ J, satisfying Theorem 7. This is done using Corollary 1 and the ideas developed in [8].
3) From the corollary and the information obtained in 2) we deduce conditions that have to be fulfilled by an optimal solution (X̲, X̄). These conditions lead to a linear programming problem. By means of a DNP_R algorithm we obtain a candidate (X̲, X̄) for the optimum.
4) Finally, the candidate obtained in 3) is checked for optimality. This mainly requires checking the constraints, which now are quadratic programs in y. Using the results of [8] this problem belongs to class co-DNP_R = DΠ_R¹.
Together, we obtain a DΣ_R² algorithm.
Theorem 8. Let y^(i) ∈ S and y^(j) ∈ S be two points in the statement of Theorem 7 such that the corresponding Lagrange parameters λ_i and ν_j are positive.
Suppose that we know neither an optimal (X̲, X̄) nor y^(i), y^(j). Then, given the correct information about the signs of x̄_i − x̲_i for an optimal solution and about the active components of y^(i) and y^(j) in S (i.e., those components that correspond either to b̲_k or to b̄_k), we can compute an optimal solution (X̲, X̄) of (LSI) as (any) solution of a specific linear programming problem. Moreover, the latter linear programming problem can be constructed deterministically in polynomial time if the active components are known.
Corollary 2. There is a DΣ_R¹ algorithm which computes a set X of vectors in which an optimal solution of (LSI) can be found, i.e., a non-deterministic algorithm that guesses a vector in {0, 1}* of polynomial length in n and produces a candidate (X̲, X̄) for each guess such that at least one of the candidates produced is an optimal solution.
Proof. The active components of y^(i) and y^(j) can be coded by a bit-vector. Now use Theorem 8 together with the results in [8].
Proof of Theorem 8. As can be seen from the proof below, it is sufficient to argue for one of the points y^(i), y^(j) only, so let us consider y^(i). W.l.o.g. suppose the first s components to be active and to satisfy y^(i)_k = b̲_k for 1 ≤ k ≤ s. This actually is the most difficult case, because the values of the active components y^(i)_k = b̲_k do not correspond to the assumed inequalities x̄_k − x̲_k ≥ 0, which in the objective function result in the terms (x̄_k − x̲_k) · b̄_k (instead of (x̄_k − x̲_k) · b̲_k, which would correspond to y^(i)_k = b̲_k). However, the difference only results in an additional LP-problem which is of no concern in our analysis. We plug the active components into the constraint x̄_0 + x̄^T · y − f̄(y) ≥ 0 and obtain a quadratic minimization problem in the remaining components y_{s+1}, . . . , y_n:
min  x̄_0 + Σ_{k=1}^{s} x̄_k · b̲_k + Σ_{k=s+1}^{n} x̄_k · y_k − f̄(b̲_1, . . . , b̲_s, y_{s+1}, . . . , y_n)      (∗)
such that  b̲_k < y_k < b̄_k,  s + 1 ≤ k ≤ n.
If the guess was correct, we know that an interior solution for ỹ := (y_{s+1}, . . . , y_n) exists. Now define f̄(b̲_1, . . . , b̲_s, y_{s+1}, . . . , y_n) =: (1/2) ỹ^T · D · ỹ + h^T · ỹ + e, where D ∈ R^{(n−s)×(n−s)}, h ∈ R^{n−s}, e ∈ R. Then Corollary 1 together with a straightforward calculation gives:
i) an optimal (interior) solution ỹ lies in the kernel of D;
ii) an optimal (interior) solution ỹ satisfies (x̄_{s+1}, . . . , x̄_n)^T = D · ỹ + h. Thus, i) implies (x̄_{s+1}, . . . , x̄_n)^T = h and we can compute these components of the (LSI) solution directly;
iii) the optimal value of (∗) is 0; using ii) this results in x̄_0 + Σ_{k=1}^{s} x̄_k · b̲_k = e.
In a completely analogous fashion we obtain a similar condition for the part X̲ of a solution when studying an optimal y^(j) for min_{y∈B} f̲(y) − x̲^T · y − x̲_0. If without loss of generality the last n − ℓ components of a solution y^(j) are active, we get (x̲_1, . . . , x̲_ℓ)^T = h′ as well as x̲_0 + Σ_{k=ℓ+1}^{n} x̲_k · b̄_k = e′ for appropriate values h′, e′ that can easily be computed knowing the active components of y^(j).
Putting all the information together we have the following situation: knowing the active components of y^(i), y^(j) we can directly compute in polynomial time from the problem input those components of an optimal solution (X̲, X̄) that correspond to non-active y^(i)_k, y^(j)_k. The remaining ones can be obtained as an optimal solution of the linear program
min  x̄_0 + Σ_{k=1}^{s} x̄_k · b̄_k + Σ_{k=s+1}^{n} h_k · b̄_k − x̲_0 − Σ_{k=1}^{ℓ} h′_k · b̄_k − Σ_{k=ℓ+1}^{n} x̲_k · b̄_k
s.t.  x̄_0 + Σ_{k=1}^{s} x̄_k · b̲_k = e  and  x̲_0 + Σ_{k=ℓ+1}^{n} x̲_k · b̄_k = e′.
Following [8] such a solution can be computed non-deterministically in polynomial time using a binary vector as a guess. The theorem can be used to prove the main result of this section:
Theorem 9. The BLAQIF decision problem belongs to DΣ_R² in the real number model and to Σ² in the Turing model.
Proof. It is clear that there exists a best linear approximation for each instance. The first sequence of binary existential quantifiers is used to find the correct (LSI) version, i.e., to guess the correct signs for x̄_k − x̲_k in an optimal solution. We use the guess to construct the right objective function for the problem (as described before). According to Theorem 7 there exist two points y^(i), y^(j) as described in Theorem 8. Moreover, we can guess a binary vector of polynomial length in the algebraic input size, perform the algorithm described in the proof of Theorem 8 and compute a candidate (X̲, X̄) for an optimal solution of the (LSI) instance. The proof also guarantees that if a deterministic (though inefficient) algorithm worked through all possible guesses, at least one of them would yield an optimal solution, see Corollary 2. In the remaining part of the DΣ_R² algorithm we have to verify that the computed candidate (X̲, X̄) is indeed feasible and yields a bound ≤ M for the objective function. Whereas the latter is done by a simple evaluation, the former requires the computation of an optimal point for two quadratic programming problems with linear constraints. These problems have the lower level variables y as unknowns and are obtained by plugging X̄ and X̲ into the lower level equations. We have seen this problem to be in co-DNP_R. Note that if we want to get a globally minimal point we have to compare it with all other candidates. Thus, this part corresponds to checking the validity of a formula containing a sequence of O(n) many universal binary quantifiers. This implies BLAQIF ∈ DΣ_R².
In the Turing model the above structure of binary quantifiers still describes a Σ² procedure. The only point to check is that the intermediate computations can be performed in polynomial time with respect to the bit measure. This is
true for the arguments relying on [8] as well as for the proof of Theorem 8: no additional constants are introduced in these algorithms, and the construction of intermediate matrices and LP-subproblems is done by merely rearranging some of the input data.
Corollary 3. BLAQIF is not NP_R-hard under weak polynomial time reductions; it is not NP_R-hard under polynomial time reductions unless a non-zero multiple of RES_n can be computed non-uniformly in polynomial time on a Zariski-dense subset of H_0.
Theorem 9 also answers a question posed in [6], Chapter 19, concerning the complexity of the rational BLAQIF problem if the dimension n is fixed. Our result extends the one in [7].
Theorem 10. Let n ∈ N be fixed. The computational version of the BLAQIF problem for rational inputs and fixed dimension n is solvable in polynomial time in the Turing model.
We finally mention that results similar to Corollary 3 can be obtained for several versions of interval linear systems, see [6], which are known to be NP-complete in the Turing model. We postpone the discussion to the full version.
References
1. L. Blum, F. Cucker, M. Shub, S. Smale: Complexity and Real Computation. Springer, 1998.
2. F. Cucker, M. Shub, S. Smale: Complexity separations in Koiran's weak model. Theoretical Computer Science, 133, 3–14, 1994.
3. E. Grädel, K. Meer: Descriptive Complexity Theory over the Real Numbers. In: Lectures in Applied Mathematics, J. Renegar, M. Shub, S. Smale (eds.), 32, 381–403, 1996.
4. S.Å. Gustafson, K.O. Kortanek: Semi-infinite programming and applications. In: Mathematical Programming: The State of the Art, A. Bachem, M. Grötschel, B. Korte (eds.), Springer, 132–157, 1983.
5. P. Koiran: A weak version of the Blum-Shub-Smale model. In: 34th Annual IEEE Symposium on Foundations of Computer Science, 486–495, 1993.
6. V. Kreinovich, A.V. Lakeyev, J. Rohn, P. Kahl: Computational Complexity and Feasibility of Data Processing and Interval Computations. Kluwer, 1997.
7. M. Koshelev, L. Longpré, P. Taillibert: Optimal Enclosure of Quadratic Interval Functions. Reliable Computing 4, 351–360, 1998.
8. K. Meer: On the complexity of quadratic programming in real number models of computation. Theoretical Computer Science 133, 85–94, 1994.
9. D.A. Plaisted: New NP-hard and NP-complete polynomial and integer divisibility problems. Theoretical Computer Science 31, 125–138, 1984.
10. M. Shub: Some remarks on Bezout's theorem and complexity theory. In: M. Hirsch, J. Marsden, M. Shub (eds.), From Topology to Computation: Proceedings of the Smalefest, Springer, 443–455, 1993.
11. B. Sturmfels: Introduction to resultants. In: Applications of Computational Algebraic Geometry, D.A. Cox, B. Sturmfels (eds.), Proc. of Symposia in Applied Mathematics, Vol. 53, AMS, 25–39, 1998.
An Abduction-Based Method for Index Relaxation in Taxonomy-Based Sources Carlo Meghini1 , Yannis Tzitzikas1, , and Nicolas Spyratos2 1
Consiglio Nazionale delle Ricerche, Istituto della Scienza e delle Tecnologie della Informazione, Pisa, Italy
2 Laboratoire de Recherche en Informatique, Université de Paris-Sud, France
Abstract. The extraction of information from a source containing term-classified objects is plagued with uncertainty. In the present paper we deal with this uncertainty in a qualitative way. We view an information source as an agent, operating according to an open world philosophy. The agent knows some facts, but is aware that there could be other facts, compatible with the known ones, that might hold as well, although they are not captured for lack of knowledge. These facts are, indeed, possibilities. We view possibilities as explanations and resort to abduction in order to define precisely the possibilities that we want our system to be able to handle. We introduce an operation that extends a taxonomy-based source with possibilities, and then study the properties of this operation from a mathematical point of view.
1
Introduction
Taxonomies are probably the oldest conceptual modeling tool. Nevertheless, they remain a powerful tool, still used for indexing books in libraries by terms, as well as very large collections of heterogeneous objects (e.g. see [8]) and the Web (e.g. Yahoo!, Open Directory). The extraction of information from an information source (hereafter, IS) containing term-classified objects is plagued with uncertainty. On the one hand, the indexing of objects, that is the assignment of a set of terms to each object, presents many difficulties, whether it is performed manually by some expert or automatically by a computer programme. In the former case, subjectivity may play a negative role (e.g. see [10]); in the latter case, automatic classification methods may at best produce approximations. On the other hand, the query formulation process, being linguistic in nature, would require perfect attuning of the system and the user language, an assumption that simply does not hold in open settings such as the Web. A collection of textual documents accessed by users via natural language queries is clearly a kind of IS, where documents play the role of objects and words play the role of terms. In this context, the above mentioned uncertainty is
This work has been carried out while Dr. Tzitzikas was a visiting researcher at CNR-ISTI as an ERCIM fellow. Our thanks to ERCIM.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 592–601, 2003.
© Springer-Verlag Berlin Heidelberg 2003
typically dealt with in a quantitative way, i.e. by means of numerical methods: in a document index, each term is assigned a weight, expressing the extent to which the document is deemed to be about the term. The same treatment is applied to each user query, producing an index of the query which is a formal representation of the user information need of the same kind as that of each document. Document and query term indexes are then matched against each other in order to estimate the relevance of the document to a query (e.g. see [1]). In the present study, we take a different approach, and deal with uncertainty in a qualitative way. We view an IS as an agent, operating according to an open world philosophy. The agent knows some facts, but it does not interpret these facts as the only ones that hold; the agent is somewhat aware that there could be other facts, compatible with the known ones, that might hold as well, although they are not captured for lack of knowledge. These facts are, indeed, possibilities. One way of defining the notion of possibility precisely in logical terms is to equate it with the notion of explanation. That is, the set of terms associated to an object is viewed as a manifestation of a phenomenon, the indexing process, for which we wish to find an explanation, justifying why the index itself has come to be the way it is. In logic, the reasoning required to infer explanations from given theory and observations is known as abduction. We will therefore resort to abduction in order to define precisely the possibilities that we want our system to be able to handle. In particular, we will define an operation that extends an IS by adding to it a set of (term, object) pairs capturing the sought possibilities, and then study the properties of this operation from a mathematical point of view. The introduced operation can also be used for ordering query answers using a possibility-based measure of relevance. The paper is structured as follows.
Sections 2 and 3 provide the basis of our framework, introducing ISs and querying. Section 4 introduces extended ISs and Section 5 discusses query answering in such sources. Subsequently, Section 6 generalizes extended ISs and introduces iterative extensions of ISs. Finally, Section 7 concludes the paper. For reasons of space, proofs are just sketched.
2
Information Sources
An IS consists of two elements. The first one is a taxonomy, introduced next.
Definition 1: A taxonomy is a pair O = (T, K) where T is a finite set of symbols, called the terms of the taxonomy, and K is a finite set of conditionals on T, i.e. formulae of the form p → q where p and q are terms; K is called the knowledge base of the taxonomy. The knowledge graph of O is the directed graph GO = (T, L), such that (t, t′) ∈ L iff t → t′ is in K. □
The second element of an IS is a structure, in the logical sense of the term.
Definition 2: Given a taxonomy O = (T, K), a structure on O is a pair U = (Obj, I) where: Obj is a countable set of objects, called the domain of the structure, and I is a finite relation from T to Obj, that is I ⊆ T × Obj, called the interpretation of the structure. □
As customary, we will treat the relation I as a function from terms to sets of objects and, where t is a term in T, write I(t) to denote the extension of t, i.e. I(t) = {o ∈ Obj | (t, o) ∈ I}.
Definition 3: An information source (IS) S is a pair S = (O, U) where O is a taxonomy and U is a structure on O. □
It is not difficult to see the strict correspondence between the notion of IS and that of a restricted monadic predicate calculus: the taxonomy plays the role of the theory, by providing the predicate symbols (the terms) and a set of axioms (the knowledge base); the structure plays the basic semantical role, by providing a domain of interpretation and an extension for each term. These kinds of systems have also been studied in the context of description logics [3], where terms are called concepts and axioms are called terminological axioms. For the present study, we will mostly focus on the information relative to single objects, which takes the form of a propositional theory, introduced by the next Definition.
Definition 4: Given an IS S and an object o ∈ Obj, the index of o in S, indS(o), is the set of terms in whose extension o belongs according to the structure S, formally: indS(o) = {t ∈ T | (t, o) ∈ I}. The context of o in S, CS(o), is defined as: CS(o) = indS(o) ∪ K. □
For any object o, CS(o) consists of terms and simple conditionals that collectively form all the knowledge about o that S has. Viewing the terms as propositional variables makes object contexts propositional theories. This is the view that will be adopted in this study.
Example 1: Throughout the paper, we will use as an example the IS graphically illustrated in Figure 1, given by (the abbreviations introduced in Figure 1 are used for reasons of space): T = {⊤, C, SC, MPC, UD, R, M, UMC}, K = {C → ⊤, SC → C, MPC → C, UD → ⊤, R → SC, M → SC, UMC → MPC, UMC → UD}, and U is the structure given by: Obj = {1, 2} and I = {(SC, 1), (M, 2), (MPC, 2)}.
The index of object 2 in S, indS(2), is {M, MPC}, while the context of 2 in S is CS(2) = indS(2) ∪ K. Notice that the taxonomy of the example has a maximal element, ⊤, whose existence is not required in every taxonomy. □
Given a set of propositional variables P, a truth assignment for P is a function mapping P to the set of standard truth values, denoted by T and F, respectively [5]. A truth assignment V satisfies a sentence σ, V |= σ, if σ is true in V, according to the truth valuation rules of predicate calculus (PC). A set of sentences Σ logically implies the sentence α, Σ |= α, iff every truth assignment which satisfies every sentence in Σ also satisfies α. In the following, we will be interested in deciding whether a certain conditional is logically implied by a knowledge base.
Proposition 1: Given a taxonomy O = (T, K) and any two terms p, q in T, K |= p → q iff there is a path from p to q in GO. □
From a complexity point of view, the last Proposition reduces logical implication of a conditional to the well-known problem on graphs REACHABILITY, which has been shown to have time complexity equal to O(n^2), where n is the
An Abduction-Based Method for Index Relaxation
595
[Figure: the knowledge graph of the example source, over the terms Cameras (C), StillCameras (SC), Reflex (R), Miniatures (M), MovingPictureCams (MPC), UnderwaterDevices (UD) and UnderwaterMovingCams (UMC); object 1 is indexed under SC, and object 2 under M and MPC.]
Fig. 1. A source
number of nodes of the graph [7]. Consequently, for any two terms p, q in T, K |= p → q can be decided in time O(|T|^2).
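Proposition 1 makes logical implication directly computable by graph search. As a minimal illustration (Python; the function name implies, the encoding of K as a set of (p, q) pairs, and the token "TOP" for the maximal element ⊤ are our own choices, not the paper's), K |= p → q can be decided as follows:

```python
from collections import deque

def implies(K, p, q):
    """Decide K |= p -> q by breadth-first search in the knowledge graph
    G_O (Proposition 1): the implication holds iff q is reachable from p."""
    seen, frontier = {p}, deque([p])
    while frontier:
        u = frontier.popleft()
        if u == q:
            return True
        for a, b in K:            # each conditional a -> b is an edge (a, b)
            if a == u and b not in seen:
                seen.add(b)
                frontier.append(b)
    return False

# The knowledge base of Example 1, with "TOP" standing for ⊤:
K = {("C", "TOP"), ("SC", "C"), ("MPC", "C"), ("UD", "TOP"),
     ("R", "SC"), ("M", "SC"), ("UMC", "MPC"), ("UMC", "UD")}
print(implies(K, "R", "C"))    # True: R -> SC -> C
print(implies(K, "C", "R"))    # False: no path from C to R
```

Since the graph has |T| nodes and at most |K| edges, the search indeed runs within the O(|T|^2) bound stated above.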
3
Querying Information Sources
We next introduce the query language for extracting information from an IS in the traditional question-answering way.
Definition 5: Given a taxonomy O = (T, K), the query language for O, LO, is defined by the following grammar, where t is a term in T:
q ::= t | q ∧ q | q ∨ q | ¬q | (q) □
The answer to queries is defined in logical terms by taking a model-theoretic approach, compliant with the fact that the semantical notion of structure is used to model the extensional data of an IS. To this end, we next select, amongst the models of object contexts, the one realizing a closed-world reading of an IS, whose existence and uniqueness trivially follow from the next Definition.
Definition 6: Given an IS S, for every object o ∈ Obj, the truth model of o in S, Vo,S, is the truth assignment for T defined as follows, for each term t ∈ T: Vo,S(t) = T if CS(o) |= t, and Vo,S(t) = F otherwise.
Given a query ϕ in LO, the answer of ϕ in S is the set of objects whose truth model satisfies the query: ans(ϕ, S) = {o ∈ Obj | Vo,S |= ϕ}. □
In the Boolean model of information retrieval, a document is returned in response to a query if the index of the document satisfies the query. Thus, the above definition extends Boolean retrieval by considering also the knowledge base in the retrieval process.
Example 2: The answer to the query C in the IS introduced in Example 1, ans(C, S), consists of both object 1 (since {SC, SC → C} ⊆ CS(1), hence V1,S(C) = T) and object 2 (since {MPC, MPC → C} ⊆ CS(2), hence V2,S(C) = T). □
The next definition introduces the function αS, which, along with Proposition 1, provides a mechanism for the computation of answers.
Definition 7: Given an IS S, the solver of S, αS, is the total function from queries to sets of objects, αS : LO → P(Obj), defined as follows:
αS(t) = ∪{I(u) | K |= u → t}
αS(q ∧ q′) = αS(q) ∩ αS(q′), αS(q ∨ q′) = αS(q) ∪ αS(q′), and αS(¬q) = Obj \ αS(q). □
As intuition suggests, solvers capture sound and complete query answerers.
Proposition 2: For all ISs S and queries ϕ ∈ LO, ans(ϕ, S) = αS(ϕ). □
We shall also use I⁻ to denote the restriction of αS to T, i.e. I⁻ = αS|T.
Example 3: In the IS previously introduced, the term C can be reached in the knowledge graph from each of the following terms: C, SC, MPC, R, M, and UMC. Hence: ans(C, S) = αS(C) = I(C) ∪ I(SC) ∪ I(MPC) ∪ I(R) ∪ I(M) ∪ I(UMC) = {1, 2}. Likewise, it can be verified that ans(M, S) = {2} and ans(UMC, S) = ∅. □
In the worst case, answering a query requires (a) visiting the whole knowledge graph for each term of the query and (b) combining the sets of objects so obtained via the union, intersection and difference set operators. Since the time complexity of each such operation is polynomial in the size of the input, the time complexity of query answering is polynomial.
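Definition 7 translates almost literally into a recursive evaluator. The sketch below (Python; the names solver and implies and the nested-tuple encoding of queries are ours, and "TOP" again stands for ⊤) evaluates queries against the source of Example 1:

```python
from collections import deque

def implies(K, p, q):
    """K |= p -> q iff q is reachable from p in the knowledge graph."""
    seen, frontier = {p}, deque([p])
    while frontier:
        u = frontier.popleft()
        if u == q:
            return True
        for a, b in K:
            if a == u and b not in seen:
                seen.add(b)
                frontier.append(b)
    return False

def solver(q, T, K, I, Obj):
    """The solver α_S of Definition 7. A query is a term (a string) or a
    tuple ('and', q1, q2), ('or', q1, q2), ('not', q1)."""
    if isinstance(q, str):
        # α_S(t) = ∪ {I(u) | K |= u -> t}
        return set().union(*(I.get(u, set()) for u in T if implies(K, u, q)))
    if q[0] == "and":
        return solver(q[1], T, K, I, Obj) & solver(q[2], T, K, I, Obj)
    if q[0] == "or":
        return solver(q[1], T, K, I, Obj) | solver(q[2], T, K, I, Obj)
    return Obj - solver(q[1], T, K, I, Obj)   # 'not'

# The source of Example 1:
T = {"TOP", "C", "SC", "MPC", "UD", "R", "M", "UMC"}
K = {("C", "TOP"), ("SC", "C"), ("MPC", "C"), ("UD", "TOP"),
     ("R", "SC"), ("M", "SC"), ("UMC", "MPC"), ("UMC", "UD")}
I = {"SC": {1}, "M": {2}, "MPC": {2}}
Obj = {1, 2}
print(solver("C", T, K, I, Obj))     # {1, 2}, as in Example 3
print(solver("UMC", T, K, I, Obj))   # set(), as in Example 3
```

The four clauses of Definition 7 become the four branches of the function, so soundness and completeness (Proposition 2) carry over directly from the definition.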
4
Extended Information Sources
Let us suppose that a user has issued a query against an IS and that the answer does not contain objects that are relevant to the user information need. The user may not be willing to replace the current query with another one, for instance because of lack of knowledge of the available language or taxonomy. In this type of situation, both database and information retrieval (IR) systems offer practically no support. If the IS does indeed contain relevant objects, the reason for the user's disappointment is an indexing mismatch: the objects have been indexed in a way that is different from the way the user would expect. One way of handling the problem just described would be to consider the index of an IS not as the ultimate truth about how the world is and is not, but as a flexible repository of information, which may be interpreted in a more liberal or more conservative way, depending on the context. For instance, the above examples suggest that a more liberal view of the IS, in which the camera in question is indexed under the term M, could help the user in getting out of the impasse. One way of defining the discussed extension precisely in logical terms is to equate it with the notion of explanation. That is, we view the index of an object as a manifestation, or observation, of a phenomenon, the indexing process, for which we wish to find an explanation, justifying why the index itself has come to be as it is. In logic, the reasoning required to infer explanations from given theory and observations is known as abduction. The model of abduction that we adopt is the one presented in [4]. Let LV be the language of propositional logic over an alphabet V of propositional variables,
An Abduction-Based Method for Index Relaxation
597
with syntactic operators ∧, ∨, ¬, →, ⊤ (a constant for truth) and ⊥ (falsity). A propositional abduction problem is a tuple A = ⟨V, H, M, Th⟩, where V is a finite set of propositional variables, H ⊆ V is the set of hypotheses, M ⊆ V is the set of manifestations, and Th ⊆ LV is a consistent theory. S ⊆ H is a solution (or explanation) for A iff Th ∪ S is consistent and Th ∪ S |= M. Sol(A) denotes the set of the solutions to A. In the context of an IS S, the terms in the taxonomy of S play both the role of the propositional variables V and of the hypotheses H, as there is no reason to exclude a priori any term from an explanation; the knowledge base in the taxonomy of S plays the role of the theory Th; the role of manifestation, for a fixed object, is played by the index of the object. Consequently, we have the following
Definition 8: Given an IS S and object o ∈ Obj, the propositional abduction problem for o in S, AS(o), is the propositional abduction problem AS(o) = ⟨T, T, indS(o), K⟩. The solutions to AS(o) are given by: Sol(AS(o)) = {A ⊆ T | K ∪ A |= indS(o)}, where the consistency requirement on K ∪ A has been omitted since for no knowledge base K and set of terms A can K ∪ A be inconsistent. □
Usually, certain explanations are preferable to others, a fact that is formalized in [4] by defining a preference relation ⪯ over Sol(A). Letting a ≺ b stand for a ⪯ b and b ⋠ a, the set of preferred solutions is given by: Sol⪯(A) = {S ∈ Sol(A) | there is no S′ ∈ Sol(A) such that S′ ≺ S}.
In the present context, we require the preference relation to satisfy the following criteria, reflecting the application priorities in order of decreasing priority: (1) explanations including only terms in the manifestation are less preferable than explanations including also terms not in the manifestation; (2) explanations altering the behaviour of the IS to a minimal extent are to be preferred; (3) between two explanations that alter the behaviour of the IS equally, the simpler, that is the smaller, one is to be preferred. Without the first criterion, all minimal solutions would be found amongst the subsets of M, a clearly undesirable effect, at least as long as alternative explanations are possible. In order to formalize our intended preference relation, we start by defining perturbation.
Definition 9: Given an IS S, an object o ∈ Obj and a set of terms A ⊆ T, the perturbation of A on S with respect to o, p(S, o, A), is given by the number of additional terms in whose extension o belongs once the index of o is extended with the terms in A. Formally: p(S, o, A) = |{t ∈ T | (CS(o) ∪ A) |= t and CS(o) ⊭ t}|. □
As a consequence of the monotonicity of the PC, for all ISs S, objects o ∈ Obj and sets of terms A ⊆ T, p(S, o, A) ≥ 0. In particular, p(S, o, A) = 0 iff A ⊆ indS(o). We can now define the preference relation over solutions of the above stated abduction problem.
Definition 10: Given an IS S, an object o ∈ Obj and two solutions A and A′ to the problem AS(o), A ⪯ A′ if either of the following holds:
1. p(S, o, A′) = 0;
2. 0 < p(S, o, A) < p(S, o, A′);
3. 0 < p(S, o, A) = p(S, o, A′), and A ⊆ A′. □
In order to derive the set Sol⪯(AS(o)), we introduce the following notions.
Definition 11: Given an IS S and an object o ∈ Obj, the depth of Sol(AS(o)), do, is the maximum perturbation of the solutions to AS(o), that is: do = max{p(S, o, A) | A ∈ Sol(AS(o))}. Moreover, two solutions A and A′ are equivalent, A ≡ A′, iff they have the same perturbation, that is p(S, o, A) = p(S, o, A′). □
It can be readily verified that ≡ is an equivalence relation over Sol(AS(o)), determining the partition π≡ whose elements are the sets of solutions having the same perturbation. Letting Pi stand for the solutions having perturbation i, Pi = {A ∈ Sol(AS(o)) | p(S, o, A) = i}, it turns out that π≡ includes one element for each perturbation value between 0 and do, as the following Proposition states.
Proposition 3: For all ISs S and objects o ∈ Obj, π≡ = {Pi | 0 ≤ i ≤ do}.
In order to prove the Proposition, it must be shown that {Pi | 0 ≤ i ≤ do} is indeed a partition, that is: (1) Pi ≠ ∅ for each 0 ≤ i ≤ do; (2) Pi ∩ Pj = ∅ for 0 ≤ i, j ≤ do, i ≠ j; (3) ∪{Pi | 0 ≤ i ≤ do} = Sol(AS(o)). Items 2 and 3 above are easily established. Item 1 is trivial for do = 0. For do > 0, item 1 can be established by backward induction on i: the basis step, Pdo ≠ ∅, is true by definition. The inductive step, Pk ≠ ∅ for k > 0 implies Pk−1 ≠ ∅, can be proved by constructing a solution having perturbation k − 1 from a solution with perturbation k. Finally, it trivially follows that this partition is the one induced by the ≡ relation. □
We are now in the position of deriving Sol⪯(AS(o)).
Proposition 4: For all ISs S and objects o ∈ Obj, Sol⪯(AS(o)) = P0 if do = 0, and Sol⪯(AS(o)) = {A ∈ P1 | for no A′ ∈ P1, A′ ⊂ A} if do > 0.
This proposition is just a corollary of the previous one. Indeed, if do is 0, by Proposition 3, Sol(AS(o)) = P0 and by Definition 10, all elements in Sol(AS(o)) are minimal.
If, on the other hand, do is positive, then by criterion (1) of Definition 10, all solutions with non-zero perturbation are preferable to those in P0, and not vice versa; and by criterion (2) of Definition 10, all solutions with perturbation equal to 1 are preferable to the remaining ones, and not vice versa. Hence, for a positive do, minimal solutions are to be found in P1. Finally, by considering the containment criterion set by item (3) of Definition 10, the Proposition follows.
Example 4: Let us consider again the IS S introduced in Example 1, and the problem AS(1). The manifestation is given by {SC}. Letting B stand for the set {UMC, MPC, UD, ⊤, C}, it can be verified that: Sol(AS(1)) = P(T) \ P(B), as B includes all the terms in T not implying SC. Since do = 5, minimal solutions are to be found in the set P1. By considering all sets of terms in Sol(AS(1)), it
can be verified that: P1 = {{M} ∪ A | A ∈ P({SC, C, ⊤})} ∪ {{R} ∪ A | A ∈ P({SC, C, ⊤})} ∪ {{SC, UD} ∪ A | A ∈ P({⊤, C})} ∪ {{SC, MPC} ∪ A | A ∈ P({⊤, C})}. By applying the set containment criterion, we have: Sol⪯(AS(1)) = {{M}, {R}, {SC, UD}, {SC, MPC}}. Analogously, it can be verified that: Sol⪯(AS(2)) = {{M, MPC, UD}, {R, M, MPC}}. □
We now introduce the notion of extension of an IS. The idea is that an extended IS (EIS for short) adds to the original IS all and only the indexing information captured by the abduction process illustrated in the previous Section. In order to maximize the extension, all the minimal solutions are included in the EIS.
Definition 12: Given an IS S and an object o ∈ Obj, the abduced index of o, abindS(o), is given by: abindS(o) = ∪Sol⪯(AS(o)). The abduced interpretation of S, I+, is given by I+ = I ∪ {(t, o) ∈ T × Obj | t ∈ abindS(o)}. Finally, the extended IS, Se, is given by Se = (O, Ue) where Ue = (Obj, I+). □
Example 5: From the last Example, it follows that the extended S is given by Se = (O, Ue), Ue = (Obj, I+) where: abindS(1) = {SC, M, R, UD, MPC}, abindS(2) = {M, MPC, UD, R} and I+ = {(SC, 1), (M, 1), (R, 1), (UD, 1), (MPC, 1), (M, 2), (MPC, 2), (UD, 2), (R, 2)}. □
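For a taxonomy as small as that of Example 1, the preferred solutions can be checked by exhaustive search over all subsets of T. The Python sketch below (our names reach, closure, preferred_solutions; it also takes ⊆-minimal solutions in the degenerate case do = 0, which is the reading used later in Example 8, and "TOP" stands for ⊤) reproduces Sol⪯(AS(1)) of Example 4:

```python
from itertools import chain, combinations

def reach(K, p, q):
    """K |= p -> q via reachability in the knowledge graph (Proposition 1)."""
    seen, stack = {p}, [p]
    while stack:
        x = stack.pop()
        if x == q:
            return True
        for a, b in K:
            if a == x and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def closure(K, A, T):
    # {t in T | K ∪ A |= t}
    return {t for t in T if any(reach(K, u, t) for u in A)}

def preferred_solutions(T, K, ind):
    """Brute-force Sol(A_S(o)) and its preferred solutions (Proposition 4):
    the ⊆-minimal solutions of perturbation 1, or of perturbation 0
    when the depth d_o is 0."""
    subsets = [set(c) for c in chain.from_iterable(
        combinations(sorted(T), r) for r in range(len(T) + 1))]
    sols = [A for A in subsets if ind <= closure(K, A, T)]
    base = closure(K, ind, T)
    pert = lambda A: len(closure(K, ind | A, T) - base)
    pool = [A for A in sols if pert(A) == 1] or [A for A in sols if pert(A) == 0]
    return [A for A in pool if not any(B < A for B in pool)]  # ⊆-minimal

T = {"TOP", "C", "SC", "MPC", "UD", "R", "M", "UMC"}   # "TOP" stands for ⊤
K = {("C", "TOP"), ("SC", "C"), ("MPC", "C"), ("UD", "TOP"),
     ("R", "SC"), ("M", "SC"), ("UMC", "MPC"), ("UMC", "UD")}
print(sorted(map(sorted, preferred_solutions(T, K, {"SC"}))))
# the four preferred explanations of Example 4:
# [['M'], ['MPC', 'SC'], ['R'], ['SC', 'UD']]
```

Enumerating the powerset is exponential in |T|, so this is only meant to replay small examples, not as a practical algorithm.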
5
Querying Extended Information Sources
As anticipated in Section 4, EISs are meant to be used in order to obtain more results about an already stated query, without posing a new query to the underlying information system. The following Example illustrates the case in point.
Example 6: The answer to the query M in the extended IS derived in the last Example, ans(M, Se), consists of both object 1 (since M ∈ abindS(1), hence M ∈ CSe(1)) and object 2 (since (M, 2) ∈ I, hence (M, 2) ∈ I+). Notice that 1 is not returned when M is stated against S, i.e. ans(M, S) ⊂ ans(M, Se). Instead, ans(UMC, S) = ans(UMC, Se) = ∅. □
It turns out that queries stated against an EIS can be answered without actually computing the whole EIS. In order to derive an answering procedure for queries posed against an EIS, we introduce a recursive function on the IS query language LO, in the same style as the algorithm for querying ISs presented in Section 3.
Definition 13: Given an IS S, the extended solver of S, αSe, is the total function from queries to sets of objects, αSe : LO → P(Obj), defined as follows:
αSe(t) = ∩{αS(u) | t → u ∈ K and K ⊭ u → t}
αSe(q ∧ q′) = αSe(q) ∩ αSe(q′)
αSe(q ∨ q′) = αSe(q) ∪ αSe(q′)
αSe(¬q) = Obj \ αSe(q)
where αS is the solver of S.
□
Note that since ⊤ is the maximal element, the set {αS(u) | ⊤ → u ∈ K and K ⊭ u → ⊤} is empty. This means that αSe(⊤), i.e. ∩{αS(u) | ⊤ → u ∈ K and K ⊭ u → ⊤}, is actually the intersection of an empty family of subsets of Obj. However, according to the Zermelo axioms of set theory (see [2] for an overview), the intersection of an empty family of subsets of a universe equals the universe. In our case, the universe is the set of all objects known to the source, i.e. the set Obj, thus we conclude that αSe(⊤) = Obj. The same holds for each maximal element (if the taxonomy has more than one maximal element). □
Proposition 5: For all ISs S and queries ϕ ∈ LO, ans(ϕ, Se) = αSe(ϕ). □
Example 7: By applying the last Proposition, we have: ans(M, Se) = αSe(M) = αS(SC) = I(SC) ∪ I(R) ∪ I(M) = {1, 2}. □
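Definition 13, together with the empty-intersection convention just discussed, can be sketched as follows (Python; the names alpha and alpha_ext are ours, "TOP" stands for ⊤, and alpha is the plain solver of Definition 7 restricted to terms):

```python
def reach(K, p, q):
    """K |= p -> q iff q is reachable from p in the knowledge graph."""
    seen, stack = {p}, [p]
    while stack:
        x = stack.pop()
        if x == q:
            return True
        for a, b in K:
            if a == x and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def alpha(t, T, K, I):
    # plain solver on a term: α_S(t) = ∪ {I(u) | K |= u -> t}
    return set().union(*(I.get(u, set()) for u in T if reach(K, u, t)))

def alpha_ext(t, T, K, I, Obj):
    """Extended solver on a term (Definition 13): intersect α_S(u) over the
    broader terms u with t -> u in K and K not entailing u -> t; the
    intersection of an empty family is the whole domain Obj."""
    parents = [u for a, u in K if a == t and not reach(K, u, t)]
    out = set(Obj)
    for u in parents:
        out &= alpha(u, T, K, I)
    return out

T = {"TOP", "C", "SC", "MPC", "UD", "R", "M", "UMC"}
K = {("C", "TOP"), ("SC", "C"), ("MPC", "C"), ("UD", "TOP"),
     ("R", "SC"), ("M", "SC"), ("UMC", "MPC"), ("UMC", "UD")}
I = {"SC": {1}, "M": {2}, "MPC": {2}}
Obj = {1, 2}
print(alpha_ext("M", T, K, I, Obj))     # {1, 2} = α_S(SC), as in Example 7
print(alpha_ext("TOP", T, K, I, Obj))   # {1, 2}: no parents, so the whole domain
print(alpha_ext("UMC", T, K, I, Obj))   # set(), matching ans(UMC, S^e) = ∅
```

Starting out from Obj and intersecting makes the empty-family convention fall out for free: when a term has no strictly broader parent, the loop body never runs and the whole domain is returned.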
6
Iterative Extension of Information Sources
Intuitively, we would expect ·+ to be a function which, applied to an IS interpretation, produces a new interpretation that is equal to or larger than the original one, the former case corresponding to the situation in which the knowledge base of the IS does not make it possible to find any explanation for any object index. Technically, this amounts to saying that ·+ is a monotonic function, which is in fact the case. Then, by iterating the ·+ operator, we expect to move from an interpretation to a larger one, until an interpretation is reached which cannot be extended any more. Also this turns out to be true, and in order to show it, we will model the domain of the ·+ operator as a complete partial order, and use the notion of fixed point in order to capture interpretations that are no longer extensible.
Proposition 6: Given an IS S, the domain of S is the set D given by D = {I ∪ A | A ∈ P(T × Obj)}. Then, ·+ is a continuous function on the complete partial order (D, ⊆).
The proof that (D, ⊆) is a complete partial order is trivial. The continuity of ·+ follows from its monotonicity (also a simple fact to show) and the fact that in the considered complete partial order all chains are finite, hence the class of monotonic functions coincides with the class of continuous functions [6]. □
As a corollary of the previous Proposition and of the Knaster-Tarski fixed point theorem, we have:
Proposition 7: The function ·+ has a least fixed point that is the least upper bound of the chain {I, I+, (I+)+, . . .}. □
Example 8: Let R be the EIS derived in the last Example, i.e. R = Se, and let us consider the problem AR(1), for which the manifestation is given by the set abindS(1) above.
It can be verified that Sol(AR(1)) = P0 ∪ P1, where:
P0 = {{R, M, MPC, UD} ∪ A | A ∈ P({SC, C, ⊤})}
P1 = {{R, M, UMC} ∪ A | A ∈ P({SC, C, ⊤, MPC, UD})}
Therefore: Sol⪯(AR(1)) = {{R, M, UMC}}, from which we obtain: abindR(1) = {R, M, UMC}, which means that the index of object 1 in R has been extended with the term UMC. If we now set P = Re, and consider the problem AP(1), we find
Sol(AP(1)) = P0 = {{R, M, UMC} ∪ A | A ∈ P({SC, MPC, UD, C, ⊤})}. Consequently, Sol⪯(AP(1)) = {{R, M, UMC}} and abindP(1) ⊆ indP(1). Analogously, we have abindR(2) = indR(2) ∪ {UMC} and abindP(2) ⊆ indP(2). Thus, since ((I+)+)+ = (I+)+, (I+)+ is a fixed point, which means that P is no longer extensible. Notice that ∅ = ans(UMC, S) = ans(UMC, R) ⊂ ans(UMC, P) = {1, 2}. □
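The iteration of ·+ up to its least fixed point can be replayed on the running example by brute force. In the Python sketch below (our names; interpretations are represented as sets of (term, object) pairs, "TOP" stands for ⊤, and preferred solutions are taken ⊆-minimal also when the maximal perturbation is zero, matching the reading of Example 8), the loop stops exactly when I+ = I:

```python
from itertools import chain, combinations

def reach(K, p, q):
    seen, stack = {p}, [p]
    while stack:
        x = stack.pop()
        if x == q:
            return True
        for a, b in K:
            if a == x and b not in seen:
                seen.add(b)
                stack.append(b)
    return False

def closure(K, A, T):
    # {t in T | K ∪ A |= t}
    return {t for t in T if any(reach(K, u, t) for u in A)}

def abduced_index(T, K, ind):
    """∪ Sol≺(A_S(o)) by exhaustive search (Definitions 8-12)."""
    subsets = [set(c) for c in chain.from_iterable(
        combinations(sorted(T), r) for r in range(len(T) + 1))]
    sols = [A for A in subsets if ind <= closure(K, A, T)]
    base = closure(K, ind, T)
    pert = lambda A: len(closure(K, ind | A, T) - base)
    pool = [A for A in sols if pert(A) == 1] or [A for A in sols if pert(A) == 0]
    best = [A for A in pool if not any(B < A for B in pool)]  # ⊆-minimal
    return set().union(set(), *best)

def extend(T, K, I, Obj):
    # the operation I ↦ I⁺ of Definition 12 on (term, object) pairs
    return I | {(t, o) for o in Obj
                for t in abduced_index(T, K, {u for (u, x) in I if x == o})}

T = {"TOP", "C", "SC", "MPC", "UD", "R", "M", "UMC"}
K = {("C", "TOP"), ("SC", "C"), ("MPC", "C"), ("UD", "TOP"),
     ("R", "SC"), ("M", "SC"), ("UMC", "MPC"), ("UMC", "UD")}
Obj = {1, 2}
I = {("SC", 1), ("M", 2), ("MPC", 2)}
while True:                     # iterate ·⁺ until the least fixed point
    J = extend(T, K, I, Obj)
    if J == I:
        break
    I = J
print(("UMC", 1) in I, ("UMC", 2) in I)   # True True
```

On this input the loop stabilizes after two proper extensions, and both objects end up indexed under UMC, consistent with ans(UMC, P) = {1, 2} in Example 8.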
7
Conclusion and Future Work
To alleviate the problem of indexing uncertainty we have proposed a mechanism which allows relaxing the index of a source in a gradual manner. This mechanism is governed by the notion of explanation, logically captured by abduction. The proposed method can be implemented as an answer enlargement¹ process where the user is not required to give additional input, apart from expressing his/her desire for more objects. Another interesting remark is that the abduced extension operation can be applied not only to manually constructed taxonomies but also to taxonomies derived automatically on the basis of an inference service. For instance, it can be applied on sources indexed using taxonomies of compound terms which are defined algebraically [9]. The introduced framework can also be applied for ranking the objects of an answer according to an explanation-based measure of relevance. In particular, we can define the rank of an object o as follows: rank(o) = min{k | o ∈ αS(k)e(ϕ)}, where S(k) denotes the IS obtained from S by k applications of the extension operation.
References
1. R. Baeza-Yates and B. Ribeiro-Neto. "Modern Information Retrieval". ACM Press, Addison-Wesley, 1999.
2. George Boolos. "Logic, Logic and Logic". Harvard University Press, 1998.
3. F.M. Donini, M. Lenzerini, D. Nardi, and A. Schaerf. Reasoning in description logics. In G. Brewka, editor, Principles of Knowledge Representation, Studies in Logic, Language and Information, pages 193–238. CSLI Publications, 1996.
4. T. Eiter and G. Gottlob. The complexity of logic-based abduction. Journal of the ACM, 42(1):3–42, January 1995.
5. H.B. Enderton. A mathematical introduction to logic. Academic Press, N.Y., 1972.
6. P.A. Fejer and D.A. Simovici. Mathematical Foundations of Computer Science. Volume 1: Sets, Relations, and Induction. Springer-Verlag, 1991.
7. C.H. Papadimitriou. Computational complexity. Addison-Wesley, 1994.
8. Giovanni M. Sacco. "Dynamic Taxonomies: A Model for Large Information Bases". IEEE Transactions on Knowledge and Data Engineering, 12(3), May 2000.
9. Y. Tzitzikas, A. Analyti, N. Spyratos, and P. Constantopoulos. "An Algebra for Specifying Compound Terms for Faceted Taxonomies". In 13th European-Japanese Conf. on Information Modelling and Knowledge Bases, Kitakyushu, Japan, June 2003.
10. P. Zunde and M.E. Dexter. "Indexing Consistency and Quality". American Documentation, 20(3):259–267, July 1969.
¹ If the query contains negation then the answer can be reduced.
On Selection Functions that Do Not Preserve Normality
Wolfgang Merkle and Jan Reimann
Ruprecht-Karls-Universität Heidelberg, Mathematisches Institut, Im Neuenheimer Feld 294, D-69120 Heidelberg, Germany
{merkle,reimann}@math.uni-heidelberg.de
Abstract. The sequence selected from a sequence R(0)R(1) . . . by a language L is the subsequence of all bits R(n + 1) such that the prefix R(0) . . . R(n) is in L. By a result of Agafonoff [1], a sequence is normal if and only if any subsequence selected by a regular language is again normal. Kamae and Weiss [11] and others have raised the question of how complex a language must be such that selecting according to the language does not preserve normality. We show that there are such languages that are only slightly more complicated than regular ones, namely, normality is neither preserved by linear languages nor by deterministic one-counter languages. In fact, for both types of languages it is possible to select a constant sequence from a normal one.
1
Introduction
It is one of the fundamental beliefs about chance experiments that any infinite binary sequence obtained by independent tosses of a fair coin will, in the long run, produce any possible finite sequence with frequency 2^(-n), where n is the length of the finite sequence considered. Sequences of zeros and ones having this property are called normal. It is a basic result of probability theory that, with respect to the uniform Bernoulli measure, almost every sequence is normal. One may now pose the following problem: if we select from a normal sequence an infinite subsequence, under what selection mechanisms is the thereby obtained sequence again normal, i.e. which restrictions must and can one impose on the class of admissible selection rules to guarantee that normality is preserved? This problem originated in the work of von Mises (see for example [22]). His aim was to base a mathematical theory of probability on the primitive notion of a Kollektiv, an object having two distinguished properties. On the one hand, individual symbols possess an asymptotic frequency (as normal sequences do), which in turn allows one to assign probabilities. On the other hand, the limiting frequencies are preserved when a subsequence is selected from the original sequence. Of course, arbitrary selection rules, or place selection rules, as von Mises calls them, cannot be allowed in this context, since one might simply select all zeroes from a given sequence. Von Mises did not give a formal definition of an
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 602–611, 2003.
© Springer-Verlag Berlin Heidelberg 2003
admissible selection rule; however, he requires them to select a subsequence "independently of the result of the corresponding observation, i.e., before anything is known about this result." There have been various attempts to clarify and rigorously define what an admissible selection rule and hence a Kollektiv is. One approach allowed only rules that were in some sense effective, for instance, computable by a Turing machine. This effort was initiated by Church [9] and led to the study of effective stochastic sequences (see the survey by Uspensky, Semenov and Shen [20] for more on this). Knowing Champernowne's construction (see Champernowne's paper [8] and Section 2 below), normal numbers disqualify as stochastic sequences, as some of them are easy to describe by algorithms. On the other hand, from a purely measure-theoretic point of view, normal sequences seem to be good candidates for a Kollektiv, as they have the right limiting frequency of individual symbols. Furthermore, their dynamic behavior is as complex as possible: they are generic points in {0, 1}∞ with respect to a measure with highest possible entropy – the uniform (1/2, 1/2)-Bernoulli measure. (In Section 3 we will explain this further; also refer to Weiss [23].) So one might ask a question contrary to the problem set up by Church and others: which selection rules preserve normality, i.e. map normal sequences to normal ones? In particular, such rules will preserve the limiting frequency of zeroes and ones and hence satisfy von Mises' requirements for a Kollektiv. There are two kinds of selection rules that are commonly considered: oblivious ones, for which the decision of selecting a bit for the subsequence does not depend on the input sequence up to that bit (i.e., the places to be selected are fixed in advance), and those selection rules that depend on the input sequence. For oblivious selection rules, Kamae [10] found a necessary and sufficient condition for them to preserve normality.
For input-dependent rules, Agafonoff [1] obtained the result that if a sequence N is normal, then any infinite subsequence selected from N by a regular language L is again normal. Detailed proofs and further discussion can be found in Schnorr and Stimm [18] as well as in Kamae and Weiss [11]. (It is not hard to see that the reverse implication holds, too, as observed by Postnikova [17] and others [15,7]; hence the latter property of a sequence N is equivalent to N being normal.) It has been asked by Kamae and Weiss [11] whether Agafonoff's result can be extended to classes of languages that are more comprehensive than the class of regular languages, e.g., to the class of context-free languages (see also Li and Vitányi [13], p. 59, problem 1.9.7). In the sequel, we give a negative answer to this question for two classes of languages that are the least proper superclasses of the class of regular languages usually considered in the theory of formal languages. More precisely, Agafonoff's result can neither be extended to the class of linear languages nor to the class of languages that are recognized by a deterministic pushdown automaton with unary stack alphabet, known as deterministic one-counter languages. Recall that these two classes are incomparable and that the latter fact is witnessed, for example, by the languages used in the proofs of Propositions 10 and 11, i.e., the language of all words that contain as
Wolfgang Merkle and Jan Reimann
many 0's as 1's and the language of even-length palindromes. (For background on formal language theory we refer to the survey by Autebert, Berstel, and Boasson [5].) However, determining exactly the class of languages preserving normality remains an open problem.

The outline of the paper is as follows. In Section 2 we review the basic definitions related to normality and recap Champernowne's constructions of normal sequences. Section 3 discusses the two kinds of selection rules, oblivious and input-dependent ones. In Section 4, we show that normality is not preserved by selection rules defined by deterministic one-counter languages, while Section 5 is devoted to proving that normality is not preserved by linear languages.

Our notation is mostly standard; for unexplained terms and further details we refer to the textbooks and surveys cited in the bibliography [3,4,6,13,14,16]. Unless explicitly stated otherwise, sequences are always infinite and binary. A word is a finite sequence. For i = 0, 1, …, we write A(i) for bit i of a sequence A, hence A = A(0)A(1)…, and we proceed similarly for words. A word w is a prefix of a sequence A if A(i) = w(i) for i = 0, …, |w| − 1, where |w| is the length of w. The prefix of a sequence A of length m is denoted by A|m. The concatenation of two words v and w is denoted by vw. A word u is a subword of a word w if w = v_1 u v_2 for appropriate words v_1 and v_2.
2
Normal Sequences
For a start, we review the concept of a normal sequence and standard techniques for the construction of such sequences.

Definition 1. (i) For given words u and w, let occ_u(w) be the number of times that u appears as a subword of w, and let freq_u(w) = occ_u(w)/|w|.
(ii) A sequence N is normal if and only if for any word u

lim_{m→∞} freq_u(N|m) = 1/2^{|u|}.  (1)
Remark 2. A sequence N is normal if for any word u and any ε > 0, we have for all sufficiently large m,

freq_u(N|m) < 1/2^{|u|} + ε.  (2)

For a proof, it suffices to observe that for any given ε > 0 and for all sufficiently large m, inequality (2) holds with u replaced by any word v that has the same length as u, while the sum of the relative frequencies freq_v(N|m) over these 2^{|u|} words differs from 1 by less than ε; hence by (2), for all such m,

1 − ε ≤ Σ_{v : |v|=|u|} freq_v(N|m) ≤ freq_u(N|m) + (2^{|u|} − 1)·(1/2^{|u|} + ε),

and by rearranging terms we obtain

1/2^{|u|} − 2^{|u|}·ε < freq_u(N|m).
Together with (2) this implies that freq_u(N|m) converges to 2^{−|u|}, because ε > 0 has been chosen arbitrarily.

Definition 3. A set W of words is normal in the limit if and only if for any nonempty word u and any ε > 0, for all but finitely many words w in W,

1/2^{|u|} − ε < freq_u(w) < 1/2^{|u|} + ε.  (3)
Definition 4. For any n, let v_n = 0^n 0^{n−1}1 0^{n−2}10 … 1^n be the word that is obtained by concatenating all words of length n in lexicographic order.

Proposition 5. The set {v_1, v_2, …} is normal in the limit.

Proof. By an argument similar to the one given in Remark 2, it suffices to show that for any word u and any given ε > 0 we have for almost all words v_i,

freq_u(v_i) < 1/2^{|u|} + ε.  (4)

So fix u and ε > 0 and consider any index i such that |u|/i < ε. Recalling that v_i is the concatenation of all words of length i, call a subword of v_i undivided if it is actually a subword of one of these words of length i, and call all other subwords of v_i divided. It is easy to see that u can occur at most 2^i·|u| many times as a divided subword of v_i. Furthermore, a symmetry argument shows that among the at most |v_i| many undivided subwords of v_i of length |u|, each of the 2^{|u|} words of length |u| occurs exactly the same number of times. In summary, we have

occ_u(v_i) ≤ |v_i|/2^{|u|} + 2^i·|u| = |v_i|·(1/2^{|u|} + 2^i·|u|/|v_i|) < |v_i|·(1/2^{|u|} + ε),

where the last inequality follows by |v_i| = i·2^i and the choice of i. Equation (4) is then immediate by definition of freq_u(v_i).

Lemma 6. Let W be a set of words that is normal in the limit. Let w_1, w_2, … be a sequence of words in W such that

(i) for all w ∈ W, lim_{t→∞} |{i ≤ t : w_i = w}|/t = 0,
(ii) lim_{t→∞} |w_{t+1}|/|w_1 … w_t| = 0.

Then the sequence N = w_1 w_2 … is normal.

Remark 7. The sequence v_1 v_2 v_2 v_3 v_3 v_3 v_4 …, which consists of i copies of v_i concatenated in length-increasing order, is normal. This assertion is immediate by the definition of the sequence, Proposition 5, and Lemma 6.
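These definitions are easy to experiment with. The following Python sketch (our own illustration; the function names v, occ, and freq are ours, not from the paper) builds the words v_n of Definition 4 and measures subword frequencies as in Definition 1:

```python
from itertools import product

def v(n):
    # v_n: all 2^n words of length n, concatenated in lexicographic order
    return "".join("".join(bits) for bits in product("01", repeat=n))

def occ(u, w):
    # number of (possibly overlapping) occurrences of u as a subword of w
    return sum(1 for i in range(len(w) - len(u) + 1) if w[i:i + len(u)] == u)

def freq(u, w):
    return occ(u, w) / len(w)

# a long prefix of N1 = v1 v2 v2 v3 v3 v3 ... (i copies of v_i)
prefix = "".join(v(i) * i for i in range(1, 9))
```

On this prefix, freq("0", prefix) is exactly 1/2 and freq("01", prefix) is close to 1/4, as Proposition 5 and Lemma 6 predict.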
Due to lack of space, we omit the proof of Lemma 6. The arguments and techniques (also for the other results in this section) are essentially the same as the ones used by Champernowne [8], who considered normal sequences over the decimal alphabet {0, 1, …, 9} and proved that the decimal analogues of the sequences

N_1 = v_1 v_2 v_2 v_3 v_3 v_3 v_4 …  and  N_2 = v_1 v_2 v_3 v_4 …

are normal. In Remark 7, we have employed Lemma 6 and the fact that the set of all words v_i is normal in the limit in order to show that N_1 is normal. In order to demonstrate the normality of the decimal analogue of the sequence N_2, Champernowne [8, item (ii) on page 256] shows a fact about the decimal versions of the v_i that is stronger than just being normal in the limit, namely, for any word u and any constant k, we have for all sufficiently large i and all m ≤ |v_i|,

occ_u(v_i(0) … v_i(m − 1)) < m/2^{|u|} + |v_i|/k.  (5)

This result is then used in connection with a variant of Lemma 6 where, in place of assumption (ii), which asserts that the ratio of |w_{t+1}| to |w_1 … w_t| converges to 0, it is just required that this ratio is bounded.
3
Selecting Subsequences
The most simple selection rules are oblivious ones: they fix the places to be included in the subsequence in advance, independently of the input sequence.

Definition 8. An oblivious selection rule is a sequence S ∈ {0, 1}^∞. The subsequence B obtained by applying an oblivious rule S to a sequence A is just the subsequence formed by all the bits A(i) with S(i) = 1.

Kamae [10] gave a complete characterization of the class of oblivious selection rules that preserve normality. Let T be the shift map, transforming a sequence A = A(0)A(1)A(2)… into another sequence by cutting off the first bit, i.e., T(A) = A(1)A(2)A(3)…. Given a sequence A, let δ_A denote the Dirac measure induced by A, that is, for any class B of sequences, δ_A(B) = 1 if A ∈ B and δ_A(B) = 0 otherwise. Note that if a sequence A is normal, then any cluster point (in the weak-∗ topology) of the measures

(1/n) Σ_{i=0}^{n} δ_{T^i(A)}

is the uniform (1/2, 1/2)-Bernoulli measure, which has entropy 1 (see Weiss [23] for details on this). Kamae showed that an oblivious selection rule S preserves normality if and only if S is completely deterministic, that is, any cluster point of the measures

(1/n) Σ_{i=0}^{n} δ_{T^i(S)}
has entropy 0. (See the monograph by Weiss [23] for more on this topic.) The results in this paper are concerned with selection rules depending on the input sequence. Here, up to now, no exact classification of normality-preserving selection rules in the spirit of Kamae's result is known. First, we formally define how an input-dependent selection rule works.

Definition 9. Let A be a sequence and let L be a language. The sequence selected from A by L is the subsequence of A that contains exactly those bits A(i) of A such that the prefix A(0) … A(i − 1) is in L.

The basic result here is that regular languages preserve normality [1,17,18,11,15,7]. In the next two sections we are going to show that two well-known superclasses of the regular languages, minimal among the classes usually studied in formal language theory and known to be incomparable, do not preserve normality.
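Definition 9 translates directly into code. In the sketch below (our own illustration, not from the paper), the language L is given as a membership predicate on prefixes:

```python
def select(bits, in_L):
    """Return the subsequence of `bits` selected by the language whose
    membership predicate is `in_L`: bit A(i) is kept if and only if the
    prefix A(0)...A(i-1) belongs to L (Definition 9)."""
    out = []
    for i in range(len(bits)):
        if in_L(bits[:i]):
            out.append(bits[i])
    return "".join(out)
```

For example, with the regular language of words ending in 1, select("0110", lambda w: w.endswith("1")) returns "10": exactly the bits that immediately follow a 1.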
4
Normality Is Not Preserved by Deterministic One-Counter Languages
Proposition 10. There exist a normal sequence N and a deterministic one-counter language L such that the sequence selected from N by L is infinite and constant.

Proof. Recall that occ_r(w) is the number of occurrences of the symbol r in w; for any word w, let d(w) = occ_0(w) − occ_1(w). Let L be the language of all words that have as many 0's as 1's, i.e., L = {w ∈ {0, 1}* : d(w) = 0}. The language L is obviously a deterministic one-counter language, as it can be recognized by a deterministic push-down automaton with unary stack alphabet that, for the already scanned prefix v of the input, stores the sign and the absolute value of d(v) by its state and by the number of stack symbols, respectively. Recall from Section 2 that v_i is obtained by concatenating all words of length i and that, by Remark 7, the sequence

N = v_1 v_2 v_2 v_3 v_3 v_3 v_4 …
(6)
is normal. For the scope of this proof, call the subwords vi of N in (6) designated subwords. Furthermore, for all t, let zt be the prefix of N that consists of the first t designated subwords. Every prefix of the form zt of N is immediately followed by the (t + 1)th designated subword, where each designated subword starts with 0. Hence the proposition follows, if we can show that among all prefixes w of N , exactly the zt are in L, or equivalently, exactly the prefixes w that are equal to some zt satisfy d(w) = 0.
Fix any prefix w of N. Choose t maximal such that z_t is a prefix of w and pick v such that w = z_t v. By the choice of t, the word v is a proper prefix of the (t + 1)th designated subword, and v is equal to the empty string if and only if w is equal to some z_i. By additivity of d, we have

d(w) = d(z_t) + d(v) = d(v_{i_1}) + … + d(v_{i_t}) + d(v)  (7)

for appropriate values of the indices i_j. Then, in order to show that w is equal to some z_i if and only if d(w) = 0, it suffices to show that for all i,

(i) d(v_i) = 0, and (ii) d(u) > 0 for every nonempty proper prefix u of v_i.  (8)

We proceed by induction on i. For i = 0 there is nothing to prove, so assume i > 0. Let v_i^0 and v_i^1 be the first and the second half of v_i, respectively. For r = 0, 1, the string v_i^r is obtained from v_{i−1} by inserting 2^{i−1} times the symbol r, where d(v_{i−1}) = 0 by the induction hypothesis. Hence (i) follows because

d(v_i) = d(v_i^0) + d(v_i^1) = (d(v_{i−1}) + 2^{i−1}) + (d(v_{i−1}) − 2^{i−1}) = 0.

In order to show (ii), fix any nonempty proper prefix u of v_i. First assume that u is a proper prefix of v_i^0. Then u can be obtained from a nonempty proper prefix of v_{i−1} by inserting some 0's, hence we are done by the induction hypothesis. Next assume u = v_i^0 v for some proper prefix v of v_i^1. We have already argued that the induction hypothesis implies d(v_i^0) = 2^{i−1}. Furthermore, v can be obtained from a proper prefix v′ of v_{i−1} by inserting at most 2^{i−1} many 1's, where by the induction hypothesis we have d(v′) > 0. In summary, we have

d(u) = d(v_i^0) + d(v) ≥ 2^{i−1} + d(v′) − 2^{i−1} > 0,
which finishes the proof of the proposition.
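The argument can be checked empirically on a finite prefix of N. In the following sketch (our own code, under the construction above), the balanced-words rule selects exactly one bit at the start of each designated subword, and every selected bit is 0:

```python
from itertools import product

def v(n):
    # concatenation of all words of length n in lexicographic order
    return "".join("".join(b) for b in product("01", repeat=n))

def d(w):
    # surplus of 0's over 1's
    return w.count("0") - w.count("1")

# finite prefix of N = v1 v2 v2 v3 v3 v3 ... (i copies of v_i)
N = "".join(v(i) * i for i in range(1, 6))

# select the bits whose preceding prefix lies in L = {w : d(w) = 0}
selected = "".join(N[i] for i in range(len(N)) if d(N[:i]) == 0)
```

This prefix contains 1 + 2 + 3 + 4 + 5 = 15 designated subwords, so 15 bits are selected, one at the start of each, and all of them are 0, as Proposition 10 asserts.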
5
Normality Is Not Preserved by Linear Languages
Proposition 11. There is a normal sequence N and a linear language L such that the sequence selected from N by L is infinite and constant.

Proof. For any word w = w(0) … w(n − 1) of length n, let w^R = w(n − 1) … w(0) be the mirror word of w and let L = {ww^R : w is a word} be the language of palindromes of even length. The language L is linear because it can be generated by a grammar with start symbol S and rules S → 0S0 | 1S1 | λ.

The sequence N is defined in stages s = 0, 1, …, where during stage s we specify prefixes z̃_s and z_s of N. At stage 0, let z̃_0 and z_0 both be equal to the empty string. At any stage s > 0, obtain z̃_s by appending 2^{s−1} copies of v_s to z_{s−1}, and obtain z_s by appending to z̃_s its own mirror word z̃_s^R, i.e.,

z̃_s = z_{s−1} v_s … v_s  (2^{s−1} copies of v_s)  and  z_s = z̃_s z̃_s^R;  (9)

for example, we have

z̃_1 = v_1,
z_1 = v_1 v_1^R,
z̃_2 = v_1 v_1^R v_2 v_2,
z_2 = v_1 v_1^R v_2 v_2 v_2^R v_2^R v_1 v_1^R,
z̃_3 = v_1 v_1^R v_2 v_2 v_2^R v_2^R v_1 v_1^R v_3 v_3 v_3 v_3,
z_3 = v_1 v_1^R v_2 v_2 v_2^R v_2^R v_1 v_1^R v_3 v_3 v_3 v_3 v_3^R v_3^R v_3^R v_3^R v_1 v_1^R v_2 v_2 v_2^R v_2^R v_1 v_1^R.
We show next that the set of prefixes of N that are in L coincides with the set {z_s : s ≥ 0}. From the latter, it is then immediate that L selects from N an infinite subsequence that consists only of 0's, since any prefix z_s of N is followed by the word v_{s+1}, and all these words start with 0. By definition of the z_s, all words z_s are prefixes of N and are in L. In order to show that the z_s are the only prefixes of N contained in L, let u_s = 0 1^s 1^s 0. By induction on s, we show for all s > 2 that

(i) in z̃_s there occurs exactly one subword u_{s−1} and no subword u_s;
(ii) in z_s there occur exactly two subwords u_{s−1} and one subword u_s.

Inspection shows that both assertions are true in case s = 3. In the induction step, consider some s > 3. Assertion (i) follows by z̃_s = z_{s−1} v_s … v_s, the induction hypothesis on z_{s−1}, and because, by definition of v_s, the block of copies of v_s cannot overlap with a subword u_s. Assertion (ii) is then immediate by assertion (i), by z_s = z̃_s z̃_s^R, and because u_s is equal to u_s^R and 0 1^s is a suffix of z̃_s.

Now fix any prefix w of N and assume that w is in L, i.e., w is a palindrome of even length. Let s be maximal such that z_s is a prefix of w. We can assume s ≥ 3, because inspection reveals that w cannot be a prefix of z_3 unless w is equal to some z_i, in which case we are done. By (ii), the words z_s and z_{s+1} contain u_s as a subword exactly once and twice, respectively, hence w contains u_s as a subword at least once and at most twice. When mirroring the palindrome w onto itself, the first occurrence of the palindrome u_s in w must either be mapped to itself or, if present at all, to the second occurrence of u_s in w; in these cases w must be equal to z_s or z_{s+1}, respectively. Since w was chosen as an arbitrary prefix of N in L, this shows that the z_s are the only prefixes of N in L.

It remains to show that N is normal. Let W = {v_i : i ∈ N} ∪ {v_i^R : i ∈ N} and write the sequence N in the form

N = w_1 w_2 …  (10)
where the words wi correspond in the natural way to the words in the set W that occur in the inductive definition of N (e.g., w1 , w2 , and w3 are equal to v1 , v1 R , and v2 ). For the scope of this proof, we will call the subwords wi of N in (10) the designated subwords of N .
We conclude the proof by showing that the assumptions of Lemma 6 are satisfied. By Proposition 5, the set of all words of the form v_i is normal in the limit, and the same holds, by literally the same proof, for the set of all words v_i^R; the union of these two sets, i.e., the set W, is then also normal in the limit, because the class of sets that are normal in the limit is easily shown to be closed under union.

Next observe that in every prefix z_s of N each of the 2s words v_1, …, v_s and v_1^R, …, v_s^R occurs exactly 2^{s−1} many times; in particular, z_s contains at least s·2^s designated subwords and has length of at least 2^{s−1}·|v_s|. Now fix any t > 0 and let z = w_1 … w_t; let s be maximal such that z_s is a prefix of z. By the preceding discussion, we have for any w in W,

|{i ≤ t : w_i = w}|/t < 2^s/(s·2^s) = 1/s

and, furthermore,

|w_{t+1}|/|w_1 … w_t| < |v_{s+1}|/|z_s| ≤ |v_{s+1}|/(2^{s−1}·|v_s|) ≤ 1/2^{s−3}.

Since t was chosen arbitrarily and s goes to infinity when t does, this shows that assumptions (i) and (ii) in Lemma 6 are satisfied.
Acknowledgements. We are grateful to Klaus Ambos-Spies, Frank Stephan, and Paul Vitányi for helpful discussions.
References

1. V. N. Agafonoff. Normal sequences and finite automata. Soviet Mathematics Doklady, 9:324–325, 1968.
2. K. Ambos-Spies. Algorithmic randomness revisited. In B. McGuinness (ed.), Language, Logic and Formalization of Knowledge. Bibliotheca, 1998.
3. K. Ambos-Spies and A. Kučera. Randomness in computability theory. In P. Cholak et al. (eds.), Computability Theory: Current Trends and Open Problems, Contemporary Mathematics, 257:1–14. American Mathematical Society, 2000.
4. K. Ambos-Spies and E. Mayordomo. Resource-bounded balanced genericity, stochasticity and weak randomness. In Complexity, Logic, and Recursion Theory. Marcel Dekker, 1997.
5. J.-M. Autebert, J. Berstel, and L. Boasson. Context-free languages and pushdown automata. In G. Rozenberg and A. Salomaa (eds.), Handbook of Formal Languages. Springer, 1997.
6. J. L. Balcázar, J. Díaz, and J. Gabarró. Structural Complexity, Vol. I and II. Springer, 1995 and 1990.
7. A. Broglio and P. Liardet. Predictions with automata. In Symbolic Dynamics and Its Applications, Proc. AMS Conf. in honor of R. L. Adler, New Haven/CT (USA) 1991, Contemporary Mathematics, 135:111–124. American Mathematical Society, 1992.
8. D. G. Champernowne. The construction of decimals normal in the scale of ten. Journal of the London Mathematical Society, 8:254–260, 1933.
9. A. Church. On the concept of a random number. Bulletin of the AMS, 46:130–135, 1940.
10. T. Kamae. Subsequences of normal sequences. Israel Journal of Mathematics, 16:121–149, 1973.
11. T. Kamae and B. Weiss. Normal numbers and selection rules. Israel Journal of Mathematics, 21(2–3):101–110, 1975.
12. M. van Lambalgen. Random Sequences. Doctoral dissertation, University of Amsterdam, Amsterdam, 1987.
13. M. Li and P. Vitányi. An Introduction to Kolmogorov Complexity and Its Applications, second edition. Springer, 1997.
14. J. H. Lutz. The quantitative structure of exponential time. In L. A. Hemaspaandra and A. L. Selman (eds.), Complexity Theory Retrospective II. Springer, 1997.
15. M. G. O'Connor. An unpredictability approach to finite-state randomness. Journal of Computer and System Sciences, 37(3):324–336, 1988.
16. P. Odifreddi. Classical Recursion Theory, Vol. I. North-Holland, 1989.
17. L. P. Postnikova. On the connection between the concepts of collectives of Mises-Church and normal Bernoulli sequences of symbols. Theory of Probability and Its Applications, 6:211–213, 1961.
18. C. P. Schnorr and H. Stimm. Endliche Automaten und Zufallsfolgen. Acta Informatica, 1:345–359, 1972.
19. A. Kh. Shen'. On relations between different algorithmic definitions of randomness. Soviet Mathematics Doklady, 38:316–319, 1988.
20. V. A. Uspensky, A. L. Semenov, and A. Kh. Shen'. Can an individual sequence of zeros and ones be random? Russian Math. Surveys, 45:121–189, 1990.
21. J. Ville. Étude Critique de la Notion de Collectif. Gauthier-Villars, 1939.
22. R. von Mises. Probability, Statistics and Truth. Macmillan, 1957.
23. B. Weiss. Single Orbit Dynamics. CBMS Regional Conference Series in Mathematics. American Mathematical Society, 2000.
On Converting CNF to DNF

Peter Bro Miltersen¹ *, Jaikumar Radhakrishnan² **, and Ingo Wegener³ ***

¹ Department of Computer Science, University of Aarhus, Denmark, [email protected]
² School of Technology and Computer Science, Tata Institute of Fundamental Research, Mumbai 400005, India, [email protected]
³ FB Informatik LS2, University of Dortmund, 44221 Dortmund, Germany, [email protected]
Abstract. We study how big the blow-up in size can be when one switches between the CNF and DNF representations of boolean functions. For a function f : {0, 1}^n → {0, 1}, cnfsize(f) denotes the minimum number of clauses in a CNF for f; similarly, dnfsize(f) denotes the minimum number of terms in a DNF for f. For 0 ≤ m ≤ 2^{n−1}, let dnfsize(m, n) be the maximum dnfsize(f) for a function f : {0, 1}^n → {0, 1} with cnfsize(f) ≤ m. We show that there are constants c_1, c_2 ≥ 1 and ε > 0 such that for all large n and all m ∈ [ε^{−1}n, 2^{εn}], we have

2^{n − c_1·n/log(m/n)} ≤ dnfsize(m, n) ≤ 2^{n − c_2·n/log(m/n)}.

In particular, when m is the polynomial n^c, we get dnfsize(n^c, n) = 2^{n − θ(c^{−1}·n/log n)}.

1
Introduction
Boolean functions are often represented as disjunctions of terms (i.e. in DNF) or as conjunctions of clauses (i.e. in CNF). Which of these representations is preferable depends on the application. Some functions are represented more succinctly in DNF whereas others are represented more succinctly in CNF, and switching between these representations can involve an exponential increase in size. In this paper, we study how big this blow-up in size can be.

We recall some well-known concepts (for more details see Wegener [15]). The set of variables is denoted by X_n = {x_1, …, x_n}. Literals are variables and negated variables. Terms are conjunctions of literals. Clauses are disjunctions of literals. Every Boolean function f can be represented as a conjunction of clauses,

∧_{i=1}^{s} ∨_{ℓ∈C_i} ℓ,  (1)

as well as a disjunction of terms,
* Supported by BRICS, Basic Research in Computer Science, a centre of the Danish National Research Foundation.
** Work done while the author was visiting Aarhus.
*** Supported by DFG-grant We 1066/9.
B. Rovan and P. Vojt´ aˇ s (Eds.): MFCS 2003, LNCS 2747, pp. 612–621, 2003. c Springer-Verlag Berlin Heidelberg 2003
∨_{i=1}^{s} ∧_{ℓ∈T_i} ℓ,  (2)
where the T_i and C_i are sets of literals. The form (1) is usually referred to as conjunctive normal form (CNF) and the form (2) is usually referred to as disjunctive normal form (DNF), although it would be historically more correct to call them conjunctive and disjunctive forms and use normal only when the sets C_i and T_i have n literals on distinct variables. In particular, this would ensure that normal forms are unique. However, in the computer science literature such a distinction is not made, and we will use CNF and DNF while referring to expressions such as (1) or (2) even when no restriction is imposed on the sets C_i and T_i, and there is no guarantee of uniqueness.

The size of a CNF is the number of clauses (the parameter s in (1)), and cnfsize(f) is the minimum number of clauses in a CNF for f. Similarly, dnfsize(f) is the minimum number of terms in a DNF for f. We are interested in the maximal blow-up of size when switching from the CNF representation to the DNF representation (or vice versa). For 0 ≤ m ≤ 2^{n−1}, let dnfsize(m, n) be the maximum dnfsize(f) for a function f : {0, 1}^n → {0, 1} with cnfsize(f) ≤ m.

Since ∧ distributes over ∨, a CNF with m clauses each with k literals can be converted to a DNF with k^m terms each with at most m literals. If the clauses do not share any variable, this blow-up cannot be avoided. If the clauses don't share variables, we have km ≤ n, and the maximum dnfsize(f) that one can achieve by this method is 2^{n/2}. Can the blow-up be worse? In particular, we want to know the answer to the following question: For a function f : {0, 1}^n → {0, 1}, how large can dnfsize(f) be if cnfsize(f) is bounded by a fixed polynomial in n?

The problem is motivated by its fundamental nature: dnfsize(f) and cnfsize(f) are fundamental complexity measures. Practical circuit designs like programmable logic arrays (PLAs) are based on DNFs and CNFs.
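The distributivity argument just described can be sketched in code (our own illustration; literals are encoded as signed integers, so the clause [1, 2] stands for x_1 ∨ x_2):

```python
from itertools import product

def cnf_to_dnf(clauses):
    # Distribute AND over OR: choose one literal per clause.
    # Each choice yields a term; contradictory terms (containing both
    # x and not-x) and duplicate terms are discarded.
    terms = set()
    for choice in product(*clauses):
        term = frozenset(choice)
        if all(-lit not in term for lit in term):
            terms.add(term)
    return terms

# m = 2 clauses of k = 2 literals on disjoint variables: k^m = 4 terms
dnf = cnf_to_dnf([[1, 2], [3, 4]])
```

With n/2 disjoint two-literal clauses this produces 2^{n/2} terms, matching the bound discussed above; when clauses share variables, cancellation can make the result smaller.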
Lower bounds on unbounded fan-in circuits are based on the celebrated switching lemma of Håstad (1989), which is a statement about converting CNFs to DNFs where some variables are randomly replaced by constants. Hence, it seems that the exact relationship between CNFs and DNFs ought to be understood as completely as possible. Fortunately, CNFs and DNFs have simple combinatorial properties allowing the application of current combinatorial arguments to obtain such an understanding. In contrast, the results of Razborov and Rudich [12] show that this is not likely to be possible for complexity measures like circuit size and circuit depth.

Another motivation for considering the question is the study of SAT algorithms and heuristics with "mild" exponential behaviour, a study which has gained a lot of momentum in recent years (e.g., Monien and Speckenmeyer [9], Paturi et al. [10], Dantsin et al. [4], Schöning [13], Hofmeister et al. [7], and Dantsin et al. [5]). Despite many successes, the following fundamental question is still open: Is there an algorithm that decides SAT of a CNF with n variables and m clauses (without any restrictions on the length of clauses) in time m^{O(1)}·2^{cn} for some constant c < 1? The obvious brute force algorithm solves the
problem in time m^{O(1)}·2^n. One method for solving SAT is to convert the CNF to a DNF, perhaps using sophisticated heuristics to keep the final DNF and any intermediate results small (though presumably not optimally small, due to the hardness of such a task). Once converted to a DNF, satisfiability of the formula is trivial to decide. A CNF-DNF conversion method for solving SAT, phrased in a more general constraint satisfaction framework, was recently studied experimentally by Katajainen and Madsen [8]. Answering the question above limits the worst case complexity of any algorithm obtained within this framework.

The Monotone Case: Our final motivation for considering the question comes from the monotone version of the problem. Let dnfsize+(m, n) denote the maximum dnfsize(f) for a monotone function f : {0, 1}^n → {0, 1} with cnfsize(f) ≤ m. In this case (see, e.g., Wegener [15, Chapter 2, Theorem 4.2]), the number of prime clauses of f is equal to cnfsize(f) and the number of prime implicants of f is equal to dnfsize(f). Our problem can then be modelled on a hypergraph H_f whose edges are precisely the prime clauses of f. A vertex cover or hitting set for a hypergraph is a subset of vertices that intersects every edge of the hypergraph. The number of prime implicants of f is precisely the number of minimal vertex covers in H_f. The problem of determining dnfsize+(m, n) then immediately translates to the following problem on hypergraphs: What is the maximum number of distinct minimal vertex covers in a hypergraph on n vertices with m distinct edges? In particular, how many minimal vertex covers can a hypergraph with n^{O(1)} edges have?

Previous Work: Somewhat surprisingly, the exact question we consider does not seem to have been considered before, although some related research has been reported. As mentioned, Håstad's switching lemma can be considered as a result about approximating CNFs by DNFs.
The problem of converting polynomial-size CNFs and DNFs into representations by restricted branching programs for the purpose of hardware verification has been considered for a long time (see Wegener [16]). The best lower bounds for ordered binary decision diagrams (OBDDs) and read-once branching programs (BP1s) are due to Bollig and Wegener [3] and are of size 2^{Ω(n^{1/2})}, even for monotone functions representable as disjunctions of terms of length 2.

The Results in this Paper: In Section 2, we show functions where the blow-up when going from CNF to DNF is large:

for 2n ≤ m ≤ 2^{n−1}, dnfsize(m, n) ≥ 2^{n − 2n/log(m/n)};
for 4n ≤ m ≤ C(n, n/2), dnfsize+(m, n) ≥ 2^{n − n·(log log(m/n))/log(m/n) − log(m/n)},

where C(n, n/2) denotes the binomial coefficient. In particular, for m = n^{O(1)}, we have

dnfsize(m, n) = 2^{n − O(n/log n)} and dnfsize+(m, n) = 2^{n − O(n·log log n/log n)}.

In Section 3, we show that functions with small CNFs do not need very large DNFs: There is a constant c > 0 such that for all large n and all m ∈ [10^4·n, 2^{10^{−4}·n}],
dnfsize(m, n) ≤ 2^{n − c·n/log(m/n)}.

In particular, for m = n^{O(1)}, we have dnfsize(m, n) = 2^{n − Ω(n/log n)}. For the class of CNF-DNF conversion based SAT algorithms described above, our results imply that no algorithm within this framework has complexity m^{O(1)}·2^{cn} for some constant c < 1, though we cannot rule out an algorithm of this kind with complexity m^{O(1)}·2^{n − Ω(n/log n)}, which would still be a very interesting result.
2
Functions with a Large Blow-Up
In this section, we show functions with small cnfsize but large dnfsize. Our functions will be the conjunction of a small number of parity and majority functions. To estimate the cnfsize and the dnfsize of such functions, we will need the following lemma. Recall that a prime implicant t of a boolean function f is called an essential prime implicant if there is an input x such that t(x) = 1 but t′(x) = 0 for all other prime implicants t′ of f. We denote the number of essential prime implicants of f by ess(f).

Lemma 1. Let f(x) = ∧_{i=1}^{ℓ} g_i(x), where the g_i's depend on disjoint sets of variables and no g_i is identically 0. Then

cnfsize(f) = Σ_{i=1}^{ℓ} cnfsize(g_i) and dnfsize(f) ≥ ess(f) = ∏_{i=1}^{ℓ} ess(g_i).
Proof. First, consider cnfsize(f). This part is essentially Theorem 1 of Voigt and Wegener [14]. We recall their argument. Clearly, we can put together the CNFs of the g_i's and produce a CNF for f with size at most Σ_{i=1}^{ℓ} cnfsize(g_i). To show that cnfsize(f) ≥ Σ_{i=1}^{ℓ} cnfsize(g_i), let C be the set of clauses of the smallest CNF of f. We may assume that all clauses in C are prime clauses of f. Because the g_i's depend on disjoint variables, every prime clause of f is a prime clause of exactly one g_i. Thus we obtain a natural partition {C_1, C_2, …, C_ℓ} of C, where each clause in C_i is a prime clause of g_i. Consider a setting of the variables of the g_j (j ≠ i) that makes each such g_j take the value 1 (this is possible because no g_j is identically 0). Under this restriction, the function f reduces to g_i and all clauses outside C_i are set to 1. Thus g_i ≡ ∧_{c∈C_i} c, and |C_i| ≥ cnfsize(g_i). The first claim follows from this.

It is well known since Quine [11] (see also, e.g., Wegener [15, Chapter 2, Lemma 2.2]) that dnfsize(f) ≥ ess(f). Also, it is easy to see that any essential prime implicant of f is the conjunction of essential prime implicants of the g_i, and every conjunction of essential prime implicants of the g_i is an essential prime implicant of f. Our second claim follows from this.
We will apply the above lemma with the parity and majority functions as the g_i's. It is well known that the parity function on n variables, defined by

Par_n(x) = x_1 ⊕ ⋯ ⊕ x_n = Σ_{i=1}^{n} x_i (mod 2),
has cnfsize and dnfsize equal to 2^{n−1}. For monotone functions, it is known that the majority function on n variables, defined by

Maj_n(x) = 1 ⟺ Σ_{i=1}^{n} x_i ≥ n/2,

has cnfsize and dnfsize equal to C(n, n/2).
Definition 1. Let the set of n variables {x_1, x_2, …, x_n} be partitioned into ℓ = ⌈n/k⌉ sets S_1, …, S_ℓ, where |S_i| = k for i < ℓ. The functions f_{k,n}, h_{k,n} : {0, 1}^n → {0, 1} are defined as follows:

f_{k,n}(x) = ∧_{i=1}^{ℓ} ⊕_{j∈S_i} x_j  and  h_{k,n}(x) = ∧_{i=1}^{ℓ} Maj(x_j : j ∈ S_i).
Theorem 1. Suppose 1 ≤ k ≤ n. Then

cnfsize(f_{k,n}) ≤ (n/k)·2^{k−1} and dnfsize(f_{k,n}) = 2^{n − n/k};
cnfsize(h_{k,n}) ≤ (n/k)·C(k, k/2) and dnfsize(h_{k,n}) ≥ C(k, k/2)^{n/k}.

Proof. As noted above, cnfsize(Par_k) = 2^{k−1} and cnfsize(Maj_k) = C(k, k/2). Also, it is easy to verify that ess(Par_k) = 2^{k−1} and ess(Maj_k) = C(k, k/2). Our theorem follows easily from this using Lemma 1.
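The bound dnfsize(f_{k,n}) = 2^{n−n/k} can be made concrete for small parameters: flipping any single bit of a satisfying assignment of f_{k,n} falsifies the parity of its block, so every term in a DNF for f_{k,n} must fix all n variables and dnfsize(f_{k,n}) equals the number of satisfying assignments. A brute-force count (our own sketch) confirms that number:

```python
from itertools import product

def f(x, k):
    # f_{k,n}: conjunction over blocks of size k of the XOR of the block
    n = len(x)
    return all(sum(x[i:i + k]) % 2 == 1 for i in range(0, n, k))

def count_models(n, k):
    # number of satisfying assignments; should equal 2^(n - n/k)
    return sum(1 for x in product((0, 1), repeat=n) if f(x, k))
```

For instance, n = 4, k = 2 gives 2^{4−2} = 4 models, and n = 6, k = 3 gives 2^{6−2} = 16.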
Remark: One can determine the dnfsize of f_{k,n} and h_{k,n} directly using a general result of Voigt and Wegener [14], which states that dnfsize(g_1 ∧ g_2) = dnfsize(g_1) · dnfsize(g_2) whenever g_1 and g_2 are symmetric functions on disjoint sets of variables. This is not true for general functions g_1 and g_2 (see Voigt and Wegener [14]).

Corollary 1.
1. Let 2n ≤ m ≤ 2^{n−1}. There is a function f with cnfsize(f) ≤ m and dnfsize(f) ≥ 2^{n − 2n/log(m/n)}.
2. Let 4n ≤ m ≤ (n choose ⌈n/2⌉). Then, there is a monotone function h with cnfsize(h) ≤ m and dnfsize(h) ≥ 2^{n − n·(log log(m/n))/log(m/n) − log(m/n)}.
Proof. The first part follows from Theorem 1, by considering f_{k,n} for k = log_2(m/n). The second part follows from Theorem 1, by considering h_{k,n} with the same value of k. We use the inequality 2^k/k ≤ (k choose ⌈k/2⌉) ≤ 2^{k−1} (valid for k ≥ 2).
Let us understand what this result says for a range of parameters, assuming n is large.
On Converting CNF to DNF
Case m = cn: There is a function with linear cnfsize but exponential dnfsize. For ε > 0, by choosing c = Θ(2^{2/ε}), the dnfsize can be made at least 2^{(1−ε)n}.

Case m = n^c: We can make dnfsize(f) = 2^{n − O(c^{−1} n/log n)}. By choosing c large we obtain in the exponent an arbitrarily small constant for the (n/log n)-term.

Case m = 2^{o(n)}: We can make dnfsize(f) grow at least as fast as 2^{n−α(n)}, for each α(n) = ω(1).

Monotone functions: We obtain a monotone function whose cnfsize is at most a polynomial m = n^c, but whose dnfsize can be made as large as 2^{n − εn·(log log n)/log n}. Here, ε = O(c^{−1}).
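To make the first two cases concrete, the lower-bound exponent n − 2n/log_2(m/n) from Corollary 1 can be evaluated directly; the helper below is ours and purely illustrative.

```python
import math

def dnf_exponent(n, m):
    """Exponent in the Corollary 1 lower bound dnfsize(f) >= 2^(n - 2n/log2(m/n))."""
    return n - 2 * n / math.log2(m / n)

n = 10 ** 6
# m = c*n: as c grows, the exponent approaches n, i.e. dnfsize nearly 2^n
assert dnf_exponent(n, 2 ** 8 * n) == n - n / 4
# m = n^c: the loss term is Theta(n / log n) and shrinks as c grows
loss = n - dnf_exponent(n, n ** 3)   # equals 2n / log2(n^2) = n / log2(n)
assert abs(loss - n / math.log2(n)) < 1e-6
```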
3 Upper Bounds on the Blow-Up
In this section, we show the upper bound on dnfsize(m, n) claimed in the introduction. We will use restrictions to analyse CNFs. So, we first present the necessary background about restrictions, and then use it to derive our result.

3.1 Preliminaries
Definition 2 (Restriction). A restriction on a set of variables V is a function ρ : V → {0, 1, ⋆}. The variables in V assigned ⋆ by ρ are said to have been left free by ρ, and their set is denoted free(ρ); the remaining variables set(ρ) = V − free(ρ) are said to be set by ρ. Let S ⊆ V. We use R^V_S to denote the set of all restrictions ρ with set(ρ) = S. For a Boolean function f on variables V and a restriction ρ, we denote by f_ρ the function on the variables free(ρ) obtained from f by fixing each variable x ∈ set(ρ) at the value ρ(x).

The following easy observation lets us conclude that if the subfunctions obtained by applying restrictions have small dnfsize, then the original function also has small dnfsize.

Lemma 2. For all S ⊆ V and all Boolean functions f with variables V,

  dnfsize(f) ≤ Σ_{ρ∈R^V_S} dnfsize(f_ρ).
Proof. Let Φ_{f_ρ} denote the smallest DNF for f_ρ. For a restriction ρ ∈ R^V_S, let t(ρ) be the term consisting of literals from variables in S that is made 1 by ρ and 0 by all other restrictions in R^V_S. (No variable outside S appears in t(ρ). Every variable in S appears in t(ρ): the variable x appears unnegated if and only if ρ(x) = 1.) Then, Φ = ⋁_{ρ∈R^V_S} t(ρ) ∧ Φ_{f_ρ} gives us a DNF for f of the required size.
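A brute-force rendering of this construction can help fix ideas. The sketch below is our own: instead of a minimal Φ_{f_ρ} it uses the trivial DNF with one term per satisfying assignment of each subfunction, which still illustrates the t(ρ) ∧ Φ_{f_ρ} assembly.

```python
from itertools import product

def dnf_via_restrictions(f, variables, S):
    """Lemma 2 construction: for every restriction rho fixing S, emit the
    term t(rho) conjoined with a DNF for the subfunction f_rho.
    Terms are lists of (variable, value) literals."""
    free = [v for v in variables if v not in S]
    terms = []
    for svals in product([0, 1], repeat=len(S)):
        rho = dict(zip(S, svals))
        t_rho = list(rho.items())            # the term t(rho) on the set variables
        for fvals in product([0, 1], repeat=len(free)):
            assignment = {**rho, **dict(zip(free, fvals))}
            if f(assignment):                # one (non-minimal) term of Phi_{f_rho}
                terms.append(t_rho + list(zip(free, fvals)))
    return terms

# Example: parity of three variables, restricted on S = {"x1"}
par3 = lambda a: (a["x1"] + a["x2"] + a["x3"]) % 2 == 1
dnf = dnf_via_restrictions(par3, ["x1", "x2", "x3"], ["x1"])
assert len(dnf) == 4   # parity has 2^(n-1) = 4 satisfying assignments
```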
In light of this observation, to show that the dnfsize of some function f is small, it suffices to somehow obtain restrictions of f that have small dnfsize. Random restrictions are good for this. We will use random restrictions in two ways. If the clauses of a CNF have a small number of literals, then the switching
lemma of Håstad [6] and Beame [1], when combined with Lemma 2, immediately gives us a small DNF (see Lemma 4 below). We are, however, given a general CNF, not necessarily one with small clauses. Again, random restrictions come to our aid: with high probability large clauses are destroyed by random restrictions (see Lemma 5).

Definition 3 (Random Restriction). When we say that ρ is a random restriction on the variables in V leaving ℓ variables free, we mean that ρ is generated as follows: first, pick a set S of size |V| − ℓ at random with uniform distribution; next, pick ρ with uniform distribution from R^V_S.

We will need the following version of the switching lemma due to Beame [1].

Lemma 3 (Switching Lemma). Let f be a function on n variables with a CNF whose clauses have at most r literals. Let ρ be a random restriction leaving ℓ variables free. Then

  Pr[f_ρ does not have a decision tree of depth d] < (7rℓ/n)^d.

We can combine Lemma 2 and the switching lemma to obtain small DNFs for functions with CNFs with small clauses.

Lemma 4. Let 1 ≤ r ≤ n/100. Let f have a CNF on n variables where each clause has at most r literals. Then, dnfsize(f) ≤ 2^{n − (1/100)·(n/r)}.
Proof. Let V be the set of variables of f. Let ρ be a random restriction on V that leaves ℓ = (1/15)·(n/r) variables free. By the switching lemma, with probability more than 1 − 2^{−d}, f_ρ has a decision tree of depth at most d. We can fix S ⊆ V so that this event happens with this probability even when conditioned on set(ρ) = S, that is, when ρ is chosen at random with uniform distribution from R^V_S. If f_ρ has a decision tree of depth at most d, then it is easy to see that dnfsize(f_ρ) ≤ 2^d. In any case, dnfsize(f_ρ) ≤ 2^{ℓ−1}. Thus, by Lemma 2, we have

  dnfsize(f) ≤ Σ_{ρ∈R^V_S} dnfsize(f_ρ) ≤ 2^{n−ℓ} · 2^d + 2^{n−ℓ} · 2^{−d} · 2^{ℓ−1}.

Set d = ℓ/2. Then, dnfsize(f) ≤ 2^{n−ℓ/2+1} ≤ 2^{n − (1/100)·(n/r)}.
Lemma 5. Let V be a set of n variables, and K a set of literals on distinct variables. Let |K| = k. Let ρ be a random restriction that leaves n/2 variables free. Then,

  Pr_ρ[no literal in K is assigned 1] ≤ 2e^{−k/8}.
Proof. Let W be the set of variables that appear in K either in negated or non-negated form. Using estimates for the tail of the hypergeometric distribution [2], we first have

  Pr[|W ∩ set(ρ)| ≤ k/4] ≤ exp(−k/8).

Furthermore, Pr[no literal in K is assigned 1 | |W ∩ set(ρ)| ≥ k/4] ≤ 2^{−k/4}. Thus,

  Pr_ρ[no literal in K is assigned 1] ≤ e^{−k/8} + 2^{−k/4} < 2e^{−k/8}.
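The bound is easy to check empirically. The Monte Carlo sketch below is ours: it samples random restrictions leaving n/2 variables free and estimates the probability that none of k fixed positive literals is assigned 1.

```python
import math
import random

def estimate_no_literal_one(n, k, trials=20000, seed=0):
    """Estimate Pr[no literal of K is assigned 1] for a random restriction
    leaving n/2 variables free; K is taken as k positive literals."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        set_vars = set(rng.sample(range(n), n - n // 2))
        # a literal survives if its variable stays free or is set to 0
        if all(v not in set_vars or rng.random() < 0.5 for v in range(k)):
            hits += 1
    return hits / trials

n, k = 100, 40
assert estimate_no_literal_one(n, k) <= 2 * math.exp(-k / 8)  # Lemma 5 bound
```

For n = 100 and k = 40 the bound 2e^{−k/8} is about 0.013, while the true probability is orders of magnitude smaller, so the estimate comfortably respects the lemma.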
3.2 Small DNFs from Small CNFs
We now show that the blow-up obtained in the previous section (see Corollary 1) is essentially optimal.

Theorem 2. There is a constant c > 0, such that for all large n, and m ∈ [10^4 n, 2^{10^{−4} n}],

  dnfsize(m, n) ≤ 2^{n − c·n/log(m/n)}.

Proof. Let f be a Boolean function on a set V of n variables, and let Φ be a CNF for f with at most m clauses. We wish to show that f has a DNF of small size. By comparing the present bound with Lemma 4, we see that our job would be done if we could somehow ensure that the clauses in Φ have at most O(log(m/n)) literals. All we know, however, is that Φ has at most m clauses. In order to prepare Φ for an application of Lemma 4, we will attempt to destroy the large clauses of Φ by applying a random restriction. Let ρ be a random restriction on V that leaves n/2 variables free. We cannot claim immediately that all large clauses are likely to be destroyed by this restriction. Instead, we will use the structure of the surviving large clauses to get around them. The following predicate will play a crucial role in our proof.

E(ρ): There is a set S_0 ⊆ free(ρ) of size at most n/10 so that every clause of Φ that is not killed by ρ has at most r := 100 log(m/n) free variables outside S_0.

Claim. Pr_ρ[E(ρ)] ≥ 1 − 2^{−n/100}.

Before we justify this claim, let us see how we can exploit it to prove our theorem. Fix a choice of S ⊆ V such that Pr[E(ρ) | set(ρ) = S] ≥ 1 − 2^{−n/100}. Let F = V − S. We will concentrate only on ρ's with set(ρ) = S, that is, ρ's from the set R^V_S. We will build a small DNF for f by putting together the DNFs for the different f_ρ's. The key point is that whenever E(ρ) is true, we will be able to show that f_ρ has a small DNF.

E(ρ) is true: Consider the set S_0 ⊆ free(ρ) whose existence is promised in the definition of E(ρ). The definition of S_0 implies that for each σ ∈ R^F_{S_0}, all clauses of Φ_{σ◦ρ} have at most r literals. By Lemma 4, dnfsize(f_{σ◦ρ}) ≤ 2^{|F|−|S_0| − (|F|−|S_0|)/(100r)}, and by Lemma 2, we have

  dnfsize(f_ρ) ≤ Σ_{σ∈R^F_{S_0}} dnfsize(f_{σ◦ρ}) ≤ 2^{|S_0|} · 2^{|F|−|S_0| − (|F|−|S_0|)/(100r)} ≤ 2^{|F| − (|F|−|S_0|)/(100r)}.

E(ρ) is false: We have dnfsize(f_ρ) ≤ 2^{|F|−1}.

Using these bounds for dnfsize(f_ρ) for ρ ∈ R^V_S in Lemma 2, we obtain

  dnfsize(f) ≤ 2^{|S|} · 2^{|F| − (|F|−|S_0|)/(100r)} + 2^{|S|} · 2^{−n/100} · 2^{|F|−1} = 2^n (2^{−(|F|−|S_0|)/(100r)} + 2^{−n/100−1}).
The theorem follows from this because |F | − |S0 | = Ω(n) and r = O(log(m/n)). We still have to prove the claim.
Proof of Claim. Suppose E(ρ) is false. We will first show that there is a set of at most n/(10(r+1)) surviving clauses in Φ_ρ that together involve at least n/10 variables. The following sequential procedure will produce this set of clauses. Since E does not hold, there is some (surviving) clause c_1 of Φ_ρ with at least r+1 variables. Let T be the set of variables that appear in this clause. If |T| ≥ n/10, then we stop: {c_1} is the set we seek. If |T| < n/10, there must be another clause c_2 of Φ_ρ with r+1 variables outside T, for otherwise, we could take S_0 = T and E(ρ) would be true. Add to T all the variables in c_2. If |T| ≥ n/10, we stop with the set of clauses {c_1, c_2}; otherwise, arguing as before, there must be another clause c_3 of Φ_ρ with r+1 variables outside T. We continue in this manner, picking a new clause and adding at least r+1 elements to T each time, as long as |T| < n/10. Within n/(10(r+1)) steps we will have |T| ≥ n/10, at which point we stop.

For a set C of clauses of Φ, let K(C) be a set of literals obtained by picking one literal for each variable that appears in some clause in C. By the discussion above, for E(ρ) to be false, there must be some set C of clauses of Φ such that |C| ≤ n/(10(r+1)) := a, |K(C)| ≥ n/10, and no literal in K(C) is assigned 1 by ρ. Thus, using Lemma 5, we have

  Pr[¬E(ρ)] ≤ Σ_{C : |C|≤a, |K(C)|≥n/10} Pr_ρ[no literal in K(C) is assigned 1 by ρ] ≤ Σ_{j=1}^{a} (m choose j) · 2e^{−n/80} ≤ 2^{−n/100}.
To justify the last inequality, we used the assumption that n is large and m ∈ [10^4 n, 2^{10^{−4} n}]. We omit the detailed calculation. This completes the proof of the claim.
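The omitted calculation is routine. As one illustration, the script below (ours) evaluates log_2 of the bound Σ_{j≤a} (m choose j) · 2e^{−n/80} at an admissible parameter pair, crudely bounding the sum by a·(m choose a).

```python
import math

def log2_binom(m, j):
    return (math.lgamma(m + 1) - math.lgamma(j + 1)
            - math.lgamma(m - j + 1)) / math.log(2)

n = 10 ** 6
m = 10 ** 4 * n                       # smallest m admitted by Theorem 2
r = 100 * math.log2(m / n)
a = int(n / (10 * (r + 1)))           # a = n/(10(r+1))
# log2 of  sum_{j<=a} C(m,j) * 2e^{-n/80}, bounding the sum by a*C(m,a)
lhs = math.log2(a) + log2_binom(m, a) + 1 - (n / 80) * math.log2(math.e)
assert lhs < -n / 100                 # the probability is at most 2^{-n/100}
```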
4 Conclusion and Open Problems

We have shown lower and upper bounds for dnfsize(m, n) of the form 2^{n − c·n/log(m/n)}. The constants c in the lower and upper bounds are far apart, and it would be interesting to bring them closer, especially when m = An for some constant A. Our bounds are not tight for monotone functions. In particular, what is the largest possible blow-up in size when converting a polynomial-size monotone CNF to an equivalent optimal-size monotone DNF? Equivalently, what is the largest possible number of distinct minimal vertex covers for a hypergraph with n vertices and n^{O(1)} edges? We have given an upper bound of 2^{n−Ω(n/log n)} and a lower bound of 2^{n−O(n log log n/log n)}. Getting tight bounds seems challenging.
Acknowledgements We thank the referees for their comments.
References

1. Beame, P.: A switching lemma primer. Technical Report UW-CSE-95-07-01, Department of Computer Science and Engineering, University of Washington (November 1994). Available online at www.cs.washington.edu/homes/beame/.
2. Chvátal, V.: The tail of the hypergeometric distribution. Discrete Mathematics 25 (1979) 285–287.
3. Bollig, B. and Wegener, I.: A very simple function that requires exponential size read-once branching programs. Information Processing Letters 66 (1998) 53–57.
4. Dantsin, E., Goerdt, A., Hirsch, E.A., and Schöning, U.: Deterministic algorithms for k-SAT based on covering codes and local search. Proceedings of the 27th International Colloquium on Automata, Languages and Programming. Springer. LNCS 1853 (2000) 236–247.
5. Dantsin, E., Goerdt, A., Hirsch, E.A., Kannan, R., Kleinberg, J., Papadimitriou, C., Raghavan, P., and Schöning, U.: A deterministic (2 − 2/(k + 1))^n algorithm for k-SAT based on local search. Theoretical Computer Science, to appear.
6. Håstad, J.: Almost optimal lower bounds for small depth circuits. In: Micali, S. (Ed.): Randomness and Computation. Advances in Computing Research, 5 (1989) 143–170. JAI Press.
7. Hofmeister, T., Schöning, U., Schuler, R., and Watanabe, O.: A probabilistic 3-SAT algorithm further improved. Proceedings of STACS, LNCS 2285 (2002) 192–202.
8. Katajainen, J. and Madsen, J.N.: Performance tuning an algorithm for compressing relational tables. Proceedings of SWAT, LNCS 2368 (2002) 398–407.
9. Monien, B. and Speckenmeyer, E.: Solving satisfiability in less than 2^n steps. Discrete Applied Mathematics 10 (1985) 287–295.
10. Paturi, R., Pudlák, P., Saks, M.E., and Zane, F.: An improved exponential-time algorithm for k-SAT. Proceedings of the 39th IEEE Symposium on the Foundations of Computer Science (1998) 628–637.
11. Quine, W.V.O.: On cores and prime implicants of truth functions. American Mathematics Monthly 66 (1959) 755–760.
12. Razborov, A. and Rudich, S.: Natural proofs. Journal of Computer and System Sciences 55 (1997) 24–35.
13. Schöning, U.: A probabilistic algorithm for k-SAT based on limited local search and restart. Algorithmica 32 (2002) 615–623.
14. Voigt, B. and Wegener, I.: Minimal polynomials for the conjunctions of functions on disjoint variables can be very simple. Information and Computation 83 (1989) 65–79.
15. Wegener, I.: The Complexity of Boolean Functions. Wiley 1987. Freely available via http://ls2-www.cs.uni-dortmund.de/~wegener.
16. Wegener, I.: Branching Programs and Binary Decision Diagrams – Theory and Applications. SIAM Monographs on Discrete Mathematics and Applications 2000.
A Basis of Tiling Motifs for Generating Repeated Patterns and Its Complexity for Higher Quorum∗

N. Pisanti^1, M. Crochemore^{2,3},∗∗ R. Grossi^1, and M.-F. Sagot^{4,3},∗∗∗

1 Dipartimento di Informatica, Università di Pisa, Italy ({pisanti,grossi}@di.unipi.it)
2 Institut Gaspard-Monge, University of Marne-la-Vallée, France ([email protected])
3 INRIA Rhône-Alpes, France ([email protected])
4 King's College London, UK
Abstract. We investigate the problem of determining the basis of motifs (a form of repeated patterns with don’t cares) in an input string. We give new upper and lower bounds on the problem, introducing a new notion of basis that is provably smaller than (and contained in) previously defined ones. Our basis can be computed in less time and space, and is still able to generate the same set of motifs. We also prove that the number of motifs in all these bases grows exponentially with the quorum, the minimal number of times a motif must appear. We show that a polynomial-time algorithm exists only for fixed quorum.
1 Introduction
Identifying repeated patterns in strings is a computationally demanding task on the large data sets available in computational biology, data mining, textual document processing, system security, and other areas; for instance, see [6]. We consider patterns with don't cares in a given string s of n symbols drawn over an alphabet Σ. The don't care is a special symbol '◦' matching any symbol of Σ; for example, pattern T◦E matches both TTE and TEE inside s = COMMITTEE (note that a pattern cannot have a don't care at the beginning or at the end, as this is not considered informative). Contrary to string matching with don't cares, the pattern T◦E is not given in advance for searching s. Instead, the patterns with don't cares appearing in s are unknown and, as such, have to be discovered and extracted by processing s efficiently. In our example, T◦E and M◦◦T◦E are among the patterns appearing repeated in COMMITTEE. In this paper we focus

∗ The full version of this paper is available in [11] as technical report TR-03-02.
∗∗ Supported by CNRS action AlBio, NATO Sc. Prog. PST.CLG.977017, and Wellcome Trust Foundation.
∗∗∗ Supported by CNRS-INRIA-INRA-INSERM action BioInformatique and Wellcome Trust Foundation.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 622–631, 2003. © Springer-Verlag Berlin Heidelberg 2003
on finding the patterns called motifs, which appear at least q times in s for an input parameter q ≥ 2 called the quorum. Different formulations in the known literature address the problem of detecting motifs in several contexts, revealing its algorithmic relevance. Unfortunately, the complexity of the algorithms for motif discovery may easily become exponential due to the explosive growth of the motifs in strings, such as in the artificial string A···ATA···A (same number of As on both sides of T) generating many motifs with As intermixed with don't cares, and in other "real" strings over a small alphabet occurring in practice, e.g., DNA sequences. Some heuristics try to alleviate this drawback by reducing the number of interesting motifs to make feasible any further processing of them, but they cannot guarantee sub-exponential bounds in the worst case [7]. In this paper, we explore the algorithmic ideas behind motif discovery while getting some insight into their combinatorial complexity and their connections with string algorithmics. Given a motif x for a string s of length n, we denote the set of positions on s at which the occurrences of x start by Lx ⊆ [0..n−1], where |Lx| ≥ q holds for the given quorum q ≥ 2. We single out the maximal motifs x, informally characterized as satisfying |Lx| > |Ly| for any other motif y more specific than x, i.e., obtained from x by adding don't cares and alphabet letters or by replacing one or more don't cares with alphabet letters. In other words, x appears in y but x occurs in s more times than y does, which is considered informative for discovering the repetitions in s. For example, M◦◦T◦E is maximal in COMMITTEE for q = 2 while M◦◦◦◦E and T◦E are not maximal since M◦◦T◦E is more specific with the same number of occurrences. Maximality provides an intuitive notion of relevance, as each maximal motif x indirectly represents all non-maximal motifs z that are less specific than it.
Unfortunately, this property does not bound significantly the number of maximal motifs. For example, A···ATA···A contains an exponential number of them for q = 2 (see Section 2). A further requirement on the maximal motifs is the notion of irredundant motifs [7]. A maximal motif x is redundant if there exist maximal motifs y_1, ..., y_k ≠ x such that the set of occurrences of x satisfies Lx = Ly_1 ∪ ... ∪ Ly_k; it is irredundant otherwise. The set of occurrences of a redundant motif can be covered by other sets of occurrences, while that of an irredundant motif is not the union of the sets of occurrences of other maximal motifs. The basis of the irredundant motifs of string s with quorum q is the set of irredundant motifs in s. Informally speaking, a basis can generate all the motifs by simple rules and can be expressed mathematically in the algebraic sense of the term. According to Parida et al. [7], what makes the irredundant motifs interesting is that their number is always upper bounded by 3n independently of any chosen q ≥ 2; moreover, they can be found in O(n³ log n) time by this bound, notwithstanding the possibly exponential number of maximal motifs that are candidates for the basis.

Our results: We study the complexity of finding the basis of motifs with novel algorithms to represent all motifs succinctly. We show that, in the worst case, there is an infinite family of strings for which the basis contains Ω(n²) irredundant motifs for q = 2 (see Section 2). This contradicts the upper bound of 3n for any q ≥ 2 given in [7] as shown (in the Appendix of [11] we give a
counterexample to its charging scheme, which crucially relies on a lemma that is not valid). As a result, the bound of O(n³ log n) time in [7] for any q does not hold, since it relies on the upper bound of 3n, thus leaving open the problem of discovering a basis in polynomial time for any q. We also introduce a new definition, called the basis of the tiling motifs of string s with quorum q. The condition for tiling motifs is stronger than that of irredundancy. A maximal motif x is tiled if there exist maximal motifs y_1, ..., y_k ≠ x such that the set of occurrences of x satisfies Lx = (Ly_1 + d_1) ∪ ... ∪ (Ly_k + d_k) for some integers d_1, ..., d_k; it is tiling otherwise. Note that the motifs y_1, ..., y_k are not necessarily distinct and the union of their occurrences is taken after displacing them by d_1, ..., d_k, respectively. Since a redundant motif is also tiled with d_1 = ··· = d_k = 0, a tiling motif is surely irredundant. Hence the basis of the tiling motifs is included in the basis of irredundant motifs, while both of them are able to generate the same set of motifs with mechanical rules. Although the definition of tiling motifs is derived from that of irredundant ones, the difference is much more substantial than it may appear. The basis of tiling motifs is symmetric, namely, the tiling motifs of s̄ (the string s in reversed order) are the reversed tiling motifs of s, whereas the irredundant motifs for strings s and s̄ are apparently unrelated, unlike the entropy and other properties related to the repetitions in strings. Moreover, the number of tiling motifs can be provably upper bounded in the worst case by n − 1 for q = 2, and they occur in s for a total of 2n times at most, whereas we demonstrate that there can be Ω(n²) irredundant motifs. We give more details in Section 3, and we also discuss in the full paper [11] how to find the longest motifs with a limited number of don't cares.
Finally, in Section 4, we reveal an exponential dependency on the quorum q for the number of motifs, both for the basis of irredundant motifs and for the basis of tiling motifs, which was unnoticed in previous work. We prove that there is an infinite family of strings for which the basis contains at least (1/(2q)) · ((n−1)/2 − 1 choose q−1) tiling (hence, irredundant) motifs. Hence, no worst-case polynomial-time algorithm can exist for finding the basis with arbitrary values of q ≥ 2. Nonetheless, we can prove that the tiling motifs in our basis are less than (n−1 choose q−1) in number and occur in s a total of q·(n−1 choose q−1) times at most. For them there exists a pseudo-polynomial-time algorithm taking O(q² (n−1 choose q−1)²) time, which shows that the tiling motifs can be found in polynomial time if and only if the quorum q satisfies either q = O(1) or q = n − O(1) (the latter is hardly meaningful in practice). Experimenting with small strings exhibits a non-constant growth of the basis for increasing values of q up to O(log n), but larger values of q are possible in the worst case. More experimental analysis of the implementation can be found in [11]. Proofs of all results can also be found in [11].

Related work: As previously mentioned, the seminal idea of basis was introduced by Parida et al. [7]. The unpublished manuscript [1] adopted an identical definition of irredundant motifs in the first part. Very recently, Apostolico [4] observed that the O(n³)-time algorithm proposed in the second part of [1] contains an implicit definition different from that of the first part. Namely, in a redundant motif x, the list Lx can be "deduced" from the union of the others (see also [3]). Note that no formal specification of this alternative definition is, however, made explicit. Applications of the basis of repeated patterns (with just q = 2) to data compression are described in [2]. Tiling motifs can be employed in this context because of their linear number of occurrences in total. The idea of the basis was also explored by Pelfrêne et al. [8,9], who introduced the notion of primitive motifs. They gave two alternative definitions claimed to be equivalent, one definition reported in the two-page abstract accompanying the poster and the other in the poster itself. The basis defined in the poster is not symmetric and is a superset of the one presented in this paper. On the other hand, the definition of primitive motifs given in the two-page abstract is somehow equivalent to that given in this paper and introduced independently in our technical report [10]. Because of the lower bounds proved in this paper, the algorithm in [9] is exponential with respect to q. The problem of finding a polynomial-size basis for higher values of q remains unsolved.
2 Irredundant Motifs: The Basis and Its Size for q = 2
We consider strings that are finite sequences of letters drawn from an alphabet Σ, whose elements are also called solid characters. We introduce an additional letter (denoted by ◦ and called don't care) that does not belong to Σ and matches any letter. The length of a string t with don't cares, denoted by |t|, is the number of letters in t, and t[i] indicates the letter at position i in t for 0 ≤ i ≤ |t| − 1 (hence, t = t[0]t[1]···t[|t|−1], also noted t[0..|t|−1]). A pattern is a string in Σ ∪ Σ(Σ ∪ {◦})*Σ, that is, it starts and ends with a solid character. The pattern occurrences are related to the specificity relation ⪯. For individual characters σ_1, σ_2 ∈ Σ ∪ {◦}, we have σ_1 ⪯ σ_2 if σ_1 = ◦ or σ_1 = σ_2. Relation ⪯ extends to strings in (Σ ∪ {◦})* under the convention that each string t is implicitly surrounded by don't cares, namely, letter t[j] is ◦ when j < 0 or j ≥ |t|. In this way, v is more specific than u (shortly, u ⪯ v) if u[j] ⪯ v[j] for any integer j. We also say that u occurs at position ℓ in v if u[j] ⪯ v[ℓ + j] for 0 ≤ j ≤ |u| − 1. Equivalently, we say that u matches v[ℓ]···v[ℓ + |u| − 1]. For the input string s ∈ Σ* with n = |s|, we consider the occurrences of arbitrary patterns x in s. The location list Lx ⊆ [0..n−1] denotes the set of all the positions on s at which x occurs. For example, the location list of x = T◦E in s = COMMITTEE is Lx = {5, 6}.

Definition 1 (Motif). Given a parameter q ≥ 2 called quorum, we say that pattern x is a motif according to s and q if |Lx| ≥ q.

Given any location list Lx and any integer d, we adopt the notation Lx + d = {ℓ + d | ℓ ∈ Lx} for indicating the occurrences in Lx "displaced" by the offset d.

Definition 2 (Maximality). A motif x is maximal if no other motif y such that x occurs in y satisfies Ly = Lx + d for some integer d.

Making a maximal motif x more specific (thus obtaining y) reduces the number of its occurrences in s. Definition 2 is equivalent to that in [7] stating that x is
maximal if there exist no other motif y and no integer d ≥ 0 verifying Lx = Ly + d, such that x[j] ⪯ y[j + d] for 0 ≤ j ≤ |x| − 1.

Definition 3 (Irredundant Motif). A maximal motif x is irredundant if, for any maximal motifs y_1, y_2, ..., y_k such that Lx = ∪_{i=1}^{k} Ly_i, motif x must be one of the y_i's. Vice versa, if all the y_i's are different from x, pattern x is said to be covered by motifs y_1, y_2, ..., y_k.

The basis of irredundant motifs for string s is the set of all irredundant motifs in s, useful as a generator for all maximal motifs in s (see [7]). The size of the basis is the number of irredundant motifs contained in it. We now show the existence of an infinite family of strings s_k (k ≥ 5) for which there are Ω(n²) irredundant motifs in the basis already for quorum q = 2, where n = |s_k|. In this way, we disprove the upper bound of 3n, which is based on an incorrect lemma (see also [11]). Each string s_k is the suitable extension of t_k = A^k T A^k, where A^k denotes the letter A repeated k times (our argument works also for z^k w z^k, where |z| = |w| and z is a string not sharing any common character with w). String t_k has an exponential number of maximal motifs, including those having the form A{A, ◦}^{k−2}A with exactly two don't cares. To see why, each such motif x occurs four times in t_k: specifically, two occurrences of x match the first and the last k letters in t_k, while each distinct don't care in x matching the letter T in t_k contributes one of the two remaining occurrences. Extending x or replacing a don't care with a solid character reduces the number of these occurrences, so x is maximal. The idea of our proof is to obtain strings s_k by prefixing t_k with O(|t_k|) symbols to transform the above maximal motifs x into irredundant motifs for s_k. Since there are Θ(k²) of them, and n = |s_k| = O(|t_k|) = O(k), this leads to the result. In order to define s_k on the alphabet {A, T, u, v, w, x, y, z, a_1, a_2, ..., a_{k−2}}, we introduce a few notations. For a string u, let u^R denote its reversal, and let ev_k, od_k, u_k, v_k be

  if k is even:  ev_k = a_2 a_4 ··· a_{k−2},  od_k = a_1 a_3 ··· a_{k−3},
                 u_k = ev_k u (ev_k)^R vw ev_k,  v_k = od_k xy (od_k)^R z od_k;
  if k is odd:   ev_k = a_2 a_4 ··· a_{k−3},  od_k = a_1 a_3 ··· a_{k−2},
                 u_k = ev_k uv (ev_k)^R wx ev_k,  v_k = od_k y (od_k)^R z od_k.
The strings s_k are then defined by s_k = u_k v_k t_k for k ≥ 5.

Lemma 1. The length of u_k v_k is 3k, and that of s_k is n = 5k + 1.

Proposition 1. For 1 ≤ p ≤ k − 2, any motif of the form A^p ◦ A^{k−p−1} with one don't care cannot be maximal in s_k. Also, motif A^k cannot be maximal in s_k.

Proposition 2. Each motif of the form A{A, ◦}^{k−2}A with exactly two don't cares is irredundant in s_k.

Theorem 1. The basis for string s_k contains Ω(n²) irredundant motifs, where n = |s_k| and k ≥ 5.
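The length claims of Lemma 1 can be checked mechanically. The sketch below is ours and relies on our reading of the (garbled in this copy) definitions of ev_k and od_k above, so treat it as illustrative rather than authoritative; each list entry stands for one symbol of s_k.

```python
def build_sk(k):
    """s_k = u_k v_k t_k over the alphabet {A, T, u, v, w, x, y, z, a_1..a_{k-2}}."""
    a = {i: f"a{i}" for i in range(1, k - 1)}
    if k % 2 == 0:
        ev = [a[i] for i in range(2, k - 1, 2)]       # a2 a4 ... a_{k-2}
        od = [a[i] for i in range(1, k - 2, 2)]       # a1 a3 ... a_{k-3}
        u_k = ev + ["u"] + ev[::-1] + ["v", "w"] + ev
        v_k = od + ["x", "y"] + od[::-1] + ["z"] + od
    else:
        ev = [a[i] for i in range(2, k - 2, 2)]       # a2 a4 ... a_{k-3}
        od = [a[i] for i in range(1, k - 1, 2)]       # a1 a3 ... a_{k-2}
        u_k = ev + ["u", "v"] + ev[::-1] + ["w", "x"] + ev
        v_k = od + ["y"] + od[::-1] + ["z"] + od
    t_k = ["A"] * k + ["T"] + ["A"] * k
    assert len(u_k + v_k) == 3 * k                    # Lemma 1, first claim
    return u_k + v_k + t_k

for k in range(5, 12):
    assert len(build_sk(k)) == 5 * k + 1              # Lemma 1, n = 5k + 1
```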
3 Tiling Motifs: The Basis and Its Properties
In this section we introduce a natural notion of basis for generating all maximal motifs occurring in a string s of length n. Analogously to what was done for maximal motifs in Definition 2, we introduce displacements while defining tiling motifs for this purpose.

Definition 4 (Tiling Motif). A maximal motif x is tiling if, for any maximal motifs y_1, y_2, ..., y_k and for any integers d_1, d_2, ..., d_k such that Lx = ∪_{i=1}^{k}(Ly_i + d_i), motif x must be one of the y_i's. Vice versa, if all the y_i's are different from x, pattern x is said to be tiled by motifs y_1, y_2, ..., y_k.

The notion of tiling is more selective than that of irredundancy in general. For example, in the string s = FABCXFADCYZEADCEADC, motif x_1 = A◦C is irredundant but it is tiled by x_2 = FA◦C and x_3 = ADC according to Definition 4, since its location list, Lx_1 = {1, 6, 12, 16}, can be obtained from the union of Lx_2 = {0, 5} and Lx_3 = {6, 12, 16} with respective displacements d_2 = 1 and d_3 = 0. A fairly direct consequence of Definition 4 is that if x is tiled by y_1, y_2, ..., y_k with associated displacements d_1, d_2, ..., d_k, then x occurs at position d_i in each y_i for 1 ≤ i ≤ k (hence d_i ≥ 0). Note that the y_i's in Definition 4 are not necessarily distinct and that k > 1 for tiled motifs (this follows from the fact that Lx = Ly_1 + d_1 with x ≠ y_1 would contradict the maximality of both x and y_1). As a result, a maximal motif x occurring exactly q times in s is tiling, as it cannot be tiled by any other motifs (we need at least two of them, which is impossible). The basis of tiling motifs is the complete set of all tiling motifs for s, and the size of the basis is the number of these motifs. For example, the basis B for FABCXFADCYZEADCEADC contains FA◦C, EADC, and ADC as tiling motifs. Although Definition 4 is derived from that of irredundant motifs given in Definition 3, the difference is much more substantial than it may appear.
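The worked example can be replayed in a few lines; the helper below is our own illustration (the don't care is written ◦).

```python
DONT_CARE = "◦"

def location_list(x, s):
    """Lx: starting positions where every solid character of x matches s."""
    return {i for i in range(len(s) - len(x) + 1)
            if all(c == DONT_CARE or c == s[i + j] for j, c in enumerate(x))}

s = "FABCXFADCYZEADCEADC"
L1 = location_list("A◦C", s)    # the tiled motif x1
L2 = location_list("FA◦C", s)   # x2, used with displacement d2 = 1
L3 = location_list("ADC", s)    # x3, used with displacement d3 = 0
assert L1 == {1, 6, 12, 16} and L2 == {0, 5} and L3 == {6, 12, 16}
assert L1 == {p + 1 for p in L2} | L3   # Lx1 = (Lx2 + 1) ∪ Lx3
```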
The basis of tiling motifs relies on the fact that tiling motifs are considered as invariant by displacement, as for maximality. Consequently, our definition of basis is symmetric, that is, each tiling motif in the basis for the reverse string s̄ is the reverse of a tiling motif in the basis of s. This follows from the symmetry in Definition 4 and from the fact that maximality is also symmetric in Definition 2. It is a sine qua non condition for having a notion of basis invariant by the left-to-right or right-to-left order of the symbols in s (like the entropy of s), while this property does not hold for the irredundant motifs. The basis of tiling motifs has further interesting properties. Later in this section, we show that our basis is linear for quorum q = 2 (i.e., its size is at most n − 1) and that the total size of the location lists for the tiling motifs is less than 2n, describing how to find the basis in O(n² log n log |Σ|) time. In the full paper [11], we discuss some applications, such as generating all maximal motifs with the basis and finding motifs with a constraint on the number of don't cares. Given a string s of length n, let B denote its basis of tiling motifs for quorum q = 2. Although the number of maximal motifs may be exponential and the basis of irredundant motifs may be at least quadratic (see Section 2), we show that the size of B is always less than n. For this, we introduce an operator ⊕ between the symbols of Σ to define merges, which are at the heart of
628
N. Pisanti et al.
the properties of B. Given two letters σ1, σ2 ∈ Σ with σ1 ≠ σ2, the operator satisfies σ1 ⊕ σ2 = ◦ and σ1 ⊕ σ1 = σ1. The operator applies to any pair of strings x, y ∈ Σ∗, so that u = x ⊕ y satisfies u[j] = x[j] ⊕ y[j] for all integers j. A merge is the motif resulting from applying the operator ⊕ to s and to its suffix at position k. Definition 5 (Merge). For 1 ≤ k ≤ n − 1, let sk be the string whose character at position i is sk[i] = s[i] ⊕ s[i + k]. If sk contains at least one solid character, Merge k denotes the motif obtained by removing all the leading and trailing don't cares in sk (i.e., those appearing before the leftmost solid character and after the rightmost solid character). For example, the string FABCXFADCYZEADCEADC has Merge 4 = EADC, Merge 5 = FA◦C, Merge 6 = Merge 10 = ADC, and Merge 11 = Merge 15 = A◦C. The latter is the only merge that is not a tiling motif. Lemma 2. If Merge k exists, it must be a maximal motif. Lemma 3. For each tiling motif x in the basis B, there is at least one k for which Merge k = x. Theorem 2. Given a string s of length n and the quorum q = 2, let M be the set of Merge k, for 1 ≤ k ≤ n − 1 such that Merge k exists. The basis B of tiling motifs for s satisfies B ⊆ M, and therefore the size of B is at most n − 1. A simple consequence of Theorem 2 is a tight bound on the number of tiling motifs for periodic strings: if s = w^e for a string w repeated e > 1 times, then s has at most |w| tiling motifs. Corollary 1. The number of tiling motifs for s is at most p, the smallest period of s. The bound in Corollary 1 is not valid for irredundant motifs. For example, the string s = ATATATATA has period p = 2 and only one tiling motif, ATATATA, while its irredundant motifs are A, ATA, ATATA and ATATATA. We now describe how to compute the basis B for a string s when q = 2. A brute-force algorithm generating first all maximal motifs of s takes exponential time in the worst case.
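Definition 5 can be sketched directly (our own illustration; '.' plays the role of the don't care symbol ◦):

```python
def merge_k(s, k):
    """Merge_k per Definition 5: superpose s with its k-shifted copy
    using the ⊕ operator, then trim leading/trailing don't cares."""
    def op(a, b):
        return a if a == b else '.'     # sigma ⊕ sigma = sigma, else don't care
    sk = ''.join(op(s[i], s[i + k]) for i in range(len(s) - k))
    return sk.strip('.') or None        # None if no solid character survives

s = "FABCXFADCYZEADCEADC"
print(merge_k(s, 4))   # EADC
print(merge_k(s, 5))   # FA.C
print(merge_k(s, 6))   # ADC
print(merge_k(s, 11))  # A.C
```

These reproduce exactly the merges listed in the example above.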
Theorem 2 plays a crucial role in that we first compute the motifs in M and then discard those being tiled. Since B ⊆ M, what remains is exactly B. To appreciate this approach, it is worth noting that we are left with the problem of selecting B from the at most n − 1 maximal motifs in M, rather than selecting B among all the maximal motifs in s, which may be exponential in number. Our simple algorithm takes O(n² log n log |Σ|) time and is faster than previous (and more complicated) methods. Step 1. Compute the Multiset M of Merges. Letting sk[i] be the leftmost solid character of string sk in Definition 5, we define occ_x = {i, i + k} to be the positions of the two occurrences of x whose superposition generates x = Merge k. For k = 1, 2, ..., n − 1, we compute string sk in O(n − k) time. If sk contains some
A Basis of Tiling Motifs for Generating Repeated Patterns
629
solid characters, we compute x = Merge k and occ_x in the same time complexity. As a result, we compute the multiset M of merges in O(n²) time. Each merge x in M is identified by a triplet (i, i + k, |x|), from which we can recover the jth symbol of x in constant time by simple arithmetic operations and comparisons. Step 2. Transform the Multiset of Merges into the Set M of Distinct Merges. Since there can be two or more merges in the multiset that are identical and correspond to the same merge in M, we put together all identical merges by performing radix sorting on the triplets representing them. The total cost of this step is dominated by radix sorting, giving O(n² log |Σ|) time. As a byproduct, we produce the temporary location list Tx = ∪ occ_x′, the union taken over the merges x′ identical to x, for each distinct x ∈ M thus obtained. Lemma 4. Each motif x ∈ B satisfies Tx = Lx. Step 3. Select M∗ ⊆ M, where M∗ = {x ∈ M : Tx = Lx}. In order to build M∗, we employ the Fischer–Paterson algorithm based on convolution [5] for string matching with don't cares to compute the whole list of occurrences Lx for each merge x ∈ M. Its cost is O((|x| + n) log n log |Σ|) time for each merge x. Since |x| < n and there are at most n − 1 motifs x ∈ M, we obtain O(n² log n log |Σ|) time to construct all lists Lx. We can then compute M∗ by discarding the merges x ∈ M such that Tx ≠ Lx in additional O(n²) time. Lemma 5. The set M∗ satisfies the conditions B ⊆ M∗ and Σ_{x∈M∗} |Lx| < 2n. The property of M∗ in Lemma 5 is crucial in that Σ_{x∈M} |Lx| = Θ(n²) when many lists contain Θ(n) entries. For example, s = A^n has n − 1 distinct merges, each of the form x = A^i for 1 ≤ i ≤ n − 1, and so |Lx| = n − i + 1. This would be a sharp drawback in Step 4, when removing tiled motifs, as it may turn into a Θ(n³) algorithm. Using M∗ instead, we are guaranteed that Σ_{x∈M∗} |Lx| = O(n); we may still have some tiled motifs in M∗, but their total number of occurrences is O(n). Step 4. Discard the Tiled Motifs in M∗. We can now check for tiling motifs in O(n²) time.
Given two distinct motifs x, y ∈ M∗, we want to test whether Lx + d ⊆ Ly for some integer d and, in that case, to mark the entries in Ly that are also in Lx + d. At the end of this task, the lists having all entries marked are tiled (see Definition 4). By removing their corresponding motifs from M∗, we eventually obtain the basis B by Lemma 5. Since the meaningful values of d are the individual entries of Ly, we have only |Ly| possible values to check. For a given value of d, we avoid merging Lx and Ly in O(|Lx| + |Ly|) time to perform the test, as it would contribute a total of Θ(n³) time. Instead, we exploit the fact that each list has values ranging from 1 to n, and use a couple of bit-vectors of size n to perform the above check in O(|Lx| × |Ly|) time for all values of d. This gives O(Σ_y Σ_x |Lx| × |Ly|) = O(Σ_y |Ly| × Σ_x |Lx|) = O(n²) by Lemma 5. We therefore detail how to perform the above check with Lx and Ly in O(|Lx| × |Ly|) time. We use two bit-vectors V1 and V2, initially set to all zeros. Given y ∈ M∗, we set V1[i] = 1 if i ∈ Ly. For each x ∈ M∗ − {y} and
for each d ∈ Ly , we then perform the following test. If all j ∈ Lx + d satisfy V1 [j] = 1, we set V2 [j] = 1 for all such j. Otherwise, we take the next value of d, or the next motif if there are no more values of d, and we repeat the test. After examining all x ∈ M∗ − {y}, we check whether V1 [i] = V2 [i] for all i ∈ Ly . If so, y is tiled as its list is covered by possibly shifted location lists of other motifs. We then reset the ones in both vectors in O(|Ly |) time. Summing up Steps 1–4, the dominant cost is that of Step 3, leading to the following result. Theorem 3. Given an input string s of length n over the alphabet Σ, the basis of tiling motifs with quorum q = 2 can be computed in O(n2 log n log |Σ|) time. The total number of motifs in the basis is less than n, and the total number of their occurrences in s is less than 2n.
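Step 4 can be sketched as follows (our own Python rendering, not the paper's code; the shifts d are realized by aligning the first entry of Lx on each entry of Ly, a mild variation of the text's d ∈ Ly, and '.' stands for ◦ in motif names):

```python
def discard_tiled(lists):
    """Keep only motifs whose location list is not covered by shifted
    location lists of the other motifs (Definition 4).  `lists` maps a
    motif id to its location list; two bit-vectors of size n are used,
    mirroring V1 and V2 in the text."""
    n = max(max(L) for L in lists.values()) + 1
    basis = []
    for y, Ly in lists.items():
        V1 = [0] * n
        V2 = [0] * n
        for i in Ly:
            V1[i] = 1
        for x, Lx in lists.items():
            if x == y:
                continue
            for e in Ly:
                d = e - Lx[0]                       # align Lx[0] on entry e of Ly
                shifted = [j + d for j in Lx]
                if all(0 <= j < n and V1[j] for j in shifted):
                    for j in shifted:
                        V2[j] = 1                   # mark covered entries of Ly
        if not all(V2[i] for i in Ly):
            basis.append(y)                         # not fully covered: keep y
    return basis

# Merges of FABCXFADCYZEADCEADC with their location lists:
M = {"FA.C": [0, 5], "EADC": [11, 15], "ADC": [6, 12, 16], "A.C": [1, 6, 12, 16]}
print(discard_tiled(M))  # ['FA.C', 'EADC', 'ADC'] -- A.C is tiled away
```

On the running example this recovers exactly the basis B = {FA◦C, EADC, ADC} stated earlier.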
4 q > 2: Pseudo-Polynomial Bases for Higher Quorum
We now discuss the general case of quorum q ≥ 2 for finding the basis of a string of length n. Differently from previous work claiming a polynomial-time algorithm for any arbitrary value of q, we show in this section that no such polynomial-time algorithm can exist in the worst case, both for the basis of irredundant motifs and for the basis of tiling motifs. The size of these bases provably depends exponentially on suitable values of q ≥ 2; i.e., we give a lower bound of Ω(C((n−1)/2 − 1, q − 1)), where C(a, b) denotes the binomial coefficient. In practice, this size grows exponentially for increasing values of q up to O(log n), but larger values of q are theoretically possible in the worst case. Fixing q = (n − 1)/4 + 1 in our lower bound, we get a size of Ω(2^((n−1)/4)) motifs in the bases. On average, q = O(log_|Σ| n) by extending the argument after Theorem 3. Below, we show a further property of the basis of tiling motifs, giving an upper bound of C(n − 1, q − 1) on its size with a simple proof. Since we can find an algorithm taking time proportional to the square of that size, we can conclude that a polynomial-time algorithm for finding the basis of tiling motifs exists in the worst case if and only if the quorum q satisfies either q = O(1) or q = n − O(1) (the latter condition is hardly meaningful in practice). We now show the existence of a family of strings having at least C((n−1)/2 − 1, q − 1) tiling motifs for a quorum q. Since a tiling motif is also irredundant, this gives a lower bound for the irredundant motifs, to be combined with that of Section 2 (the latter lower bound still gives Ω(n²) for q ≥ 2). The strings used in the bound are this time tk = A^k T A^k (k ≥ 5) themselves, without the left extension of Section 2. The proof proceeds by exhibiting C(k − 1, q − 1) motifs that are maximal and have each exactly q occurrences, whence it follows immediately that they are tiling (indeed the remark made after Definition 4 holds for any q ≥ 2). Proposition 3.
For 2 ≤ q ≤ k and 1 ≤ p ≤ k − q + 1, any motif of the type A^p ◦ {A, ◦}^(k−p−1) ◦ A^p with exactly q don't cares is tiling (and so irredundant) in tk. Theorem 4. String tk has C((n−1)/2 − 1, q − 1) = Ω((1/2^q) · C(n − 1, q − 1)) tiling (and irredundant) motifs, where n = |tk| and k ≥ 2. We now prove that C(n − 1, q − 1) is, instead, an upper bound for the size of a basis of tiling motifs for a string s and quorum q ≥ 2. Let us denote as before such
a basis by B. To prove the upper bound, we use again the notion of a merge, except that it now involves q strings. The operator ⊕ between the elements of Σ is the same as before. Let k be an array of q − 1 positive values k1, ..., k(q−1) with 1 ≤ ki < kj ≤ n − 1 for all 1 ≤ i < j ≤ q − 1. A merge is the (non-empty) pattern that results from applying the operator ⊕ to the string s and to s itself q − 1 times, each time shifted by ki positions to the right for 1 ≤ i ≤ q − 1. Lemma 6. If Merge k exists for quorum q, it must be a maximal motif. Lemma 7. For each tiling motif x in the basis B with quorum q, there is at least one k for which Merge k = x. Theorem 5. Given a string s of length n and a quorum q, let M be the set of Merge k, over the C(n − 1, q − 1) possible choices of k for which Merge k exists. The basis B of tiling motifs satisfies B ⊆ M, and therefore |B| ≤ C(n − 1, q − 1). The tiling motifs in our basis appear in s a total of at most q · C(n − 1, q − 1) times. A generalization of the algorithm given in Section 3 gives a pseudo-polynomial time complexity of O(q² · C(n − 1, q − 1)²).
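The q-string merge can be sketched as follows (our own illustration; '.' stands for ◦, and the string is the lower-bound example t5 = A^5 T A^5 with quorum q = 3):

```python
from itertools import combinations

def merge(s, shifts):
    """Merge for quorum q: superpose s with its copies shifted by each
    k in `shifts` (an array of q - 1 distinct positive offsets), using
    the same ⊕ operator as before, then trim boundary don't cares."""
    m = len(s) - max(shifts)
    if m <= 0:
        return None
    def op(chars):
        return chars[0] if len(set(chars)) == 1 else '.'
    sk = ''.join(op([s[i]] + [s[i + k] for k in shifts]) for i in range(m))
    return sk.strip('.') or None

s = "AAAAATAAAAA"          # t_5 = A^5 T A^5, n = 11
merges = {merge(s, ks) for ks in combinations(range(1, len(s)), 2)}  # q = 3
merges.discard(None)
print(merge(s, (1, 2)))    # AAA...AAA
```

By Theorem 5 the set `merges` contains the whole basis, and its size is bounded by C(n−1, q−1) = C(10, 2) = 45 choices of shifts.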
References
1. A. Apostolico and L. Parida. Incremental paradigms of motif discovery. Unpublished manuscript, 2002.
2. A. Apostolico and L. Parida. Compression and the wheel of fortune. In IEEE Data Compression Conference (DCC 2003), pages 143–152, 2003.
3. A. Apostolico. Pattern discovery and the algorithmics of surprise. In NATO ASI on Artificial Intelligence and Heuristic Methods for Bioinformatics. IOS Press, 2003.
4. A. Apostolico. Personal communication, May 2003.
5. M. Fischer and M. Paterson. String matching and other products. In R. Karp, editor, SIAM–AMS Complexity of Computation, pages 113–125, 1974.
6. H. Mannila. Local and global methods in data mining: basic techniques and open problems. In P. et al., editor, International Colloquium on Automata, Languages, and Programming, volume 2380 of LNCS, pages 57–68. Springer-Verlag, 2002.
7. L. Parida, I. Rigoutsos, A. Floratos, D. Platt, and Y. Gao. Pattern discovery on character sets and real-valued data: linear bound on irredundant motifs and efficient polynomial time algorithm. In SIAM Symposium on Discrete Algorithms, 2000.
8. J. Pelfrêne, S. Abdeddaïm, and J. Alexandre. Un algorithme d'indexation de motifs approchés [An algorithm for indexing approximate motifs]. In Journées Ouvertes Biologie Informatique Mathématiques (JOBIM), pages 263–264, 2002.
9. J. Pelfrêne, S. Abdeddaïm, and J. Alexandre. Extracting approximate patterns. In Combinatorial Pattern Matching, 2003. To appear.
10. N. Pisanti, M. Crochemore, R. Grossi, and M.-F. Sagot. A basis for repeated motifs in pattern discovery and text mining. Technical Report IGM 2002-10, Institut Gaspard-Monge, University of Marne-la-Vallée, July 2002.
11. N. Pisanti, M. Crochemore, R. Grossi, and M.-F. Sagot. Bases of motifs for generating repeated patterns with don't cares. Technical Report TR-03-02, Dipartimento di Informatica, University of Pisa, January 2003.
On the Complexity of Some Equivalence Problems for Propositional Calculi

Steffen Reith
3Soft GmbH, Frauenweiherstr. 14, D-91058 Erlangen, Germany
[email protected]
Abstract. In the present paper¹ we study the complexity of Boolean equivalence problems (i.e., do two given propositional formulas have the same truth table?) and of Boolean isomorphism problems (i.e., does there exist a permutation of the variables of one propositional formula such that the truth table of the modified formula coincides with the truth table of the second formula?) for generalized propositional formulas and certain classes of Boolean circuits. Keywords: computational complexity, Boolean functions, Boolean isomorphism, Boolean equivalence, closed classes, dichotomy, Post, satisfiability problems.
1 Introduction
In 1921 E. L. Post gave a full characterization of all classes of Boolean functions which are closed under superposition (i.e., substitution of Boolean functions, permutation and identification of variables, and introduction of fictive variables). Based on his results (see [9]) we define, for a finite set B of Boolean functions, the so-called B-formulas and B-circuits, which are closely related to Post's closed classes of Boolean functions. To be more precise: every B-formula and B-circuit represents a Boolean function in the closure of B; hence B-formulas form generalized propositional calculi, since the classical formulas and circuits are mostly restricted to B = {∧, ∨, ¬}. The satisfiability problem of B-formulas was first studied by H. Lewis. In his paper [8] he was able to show that the satisfiability problem of B-formulas is NP-complete iff the Boolean function represented by x ∧ ¬y is in the closure of B, and is solvable in deterministic polynomial time otherwise. Theorems of this form are called dichotomy theorems, because they deal with problems which are either among the hardest in a given complexity class or easy to solve. One of the best-known, and the first, theorems of this kind was proven by Schaefer (see [12]), giving exhaustive results about the satisfiability of generalized propositional formulas in conjunctive normal form. The work [2] can be seen as a
¹ Work done in part while employed at Julius-Maximilians-Universität Würzburg. For a full version of this paper see: http://www.streit.cc/dl/
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 632–641, 2003. © Springer-Verlag Berlin Heidelberg 2003
counterpart of the present paper in Schaefer's framework, because there the same equivalence problems are studied for formulas in generalized conjunctive normal form. Besides asking for a satisfying assignment, other interesting problems in the theory of Boolean functions have been studied. Two of them are the properties of being equal or isomorphic (i.e., is there a permutation of the variables of one function such that the modified function is equal to the other function?). A list of early references can be found in [3], stressing the importance of this kind of problem. In the case of classical propositional formulas it is known that the Boolean equivalence problem for formulas and circuits is coNP-complete, whereas only very weak lower and upper bounds for the Boolean isomorphism problem are known. By a reduction from the tautology problem, which is coNP-complete, a lower bound for the isomorphism problem can easily be derived. An upper bound for the isomorphism problem is clearly Σ_2^p: the Σ_2^p-machine can existentially guess a permutation and universally check the resulting formula for equality by using its oracle. In [1] Agrawal and Thierauf show that the complement of the Boolean isomorphism problem for formulas and circuits has a one-round interactive proof in which the verifier has access to an NP oracle. Using this result they also show that if the Boolean isomorphism problem is Σ_2^p-complete, then the polynomial hierarchy collapses to Σ_3^p. Agrawal and Thierauf also give a better lower bound for the isomorphism problem of classical propositional formulas. More precisely, they have proven that UOCLIQUE ≤_m^p ISOF holds, where ISOF is the isomorphism problem of propositional formulas built out of ∧, ∨ and ¬, and UOCLIQUE is the problem of checking whether the biggest clique in a graph is unique.
It is known that UOCLIQUE is ≤_m^p-hard for 1-NP, a superclass of coNP, where 1-NP denotes the class of all problems whose solution can be found on exactly one path of a nondeterministic polynomial-time machine. In the present paper we focus on the complexity of checking whether two given B-formulas (B-circuits, resp.) represent the same Boolean function (the equivalence problem for B-formulas (B-circuits, resp.)) or whether they represent isomorphic Boolean functions (the isomorphism problem for B-formulas (B-circuits, resp.)). We give, where possible, tight upper and lower bounds for the isomorphism problem of B-formulas and B-circuits. In all other cases we show coNP-hardness for the isomorphism problem, which is as good as the trivial lower bound in the classical case. Note that the known upper bounds for the usual isomorphism problem hold for our B-formulas and B-circuits as well, since we work with special non-complete sets of Boolean functions as connectors. In the case of the equivalence problems we always give tight upper and lower bounds, showing that these problems are either in L, NL-complete, ⊕L-complete or coNP-complete, where the complexity class L (NL, resp.) is defined as the class of problems which can be solved by a deterministic (nondeterministic, resp.) logarithmically space-bounded Turing machine, and
by ⊕L we denote the class of decision problems solvable by an NL machine such that the input is accepted iff the number of accepting paths is odd. After presenting some notions and preliminary results on closed classes of Boolean functions in Section 2, we turn to the equivalence and isomorphism problems for B-circuits and B-formulas in Section 3. Finally, Section 4 concludes.
2 Preliminaries
Any function of the kind f : {0, 1}^k → {0, 1} will be called a (k-ary) Boolean function. The set of all Boolean functions will be denoted by BF. Now let B be a finite set of Boolean functions. In the following we give a description of B-circuits and B-formulas. A B-circuit is a directed acyclic graph where each node is labeled either with a variable xi or with a function out of B. The nodes of such a B-circuit are called gates, the edges are called wires. The number of wires pointing into a gate is called fan-in and the number of wires leaving a gate is named fan-out. Moreover, we order the wires pointing into a gate. If a wire leaves a gate u and points to gate v, we call u a predecessor-gate of v. Additionally, the gates labeled by a variable xi must have fan-in 0, and we call them input-gates. The gates labeled by a k-ary function f ∈ B must have fan-in k. Finally, we mark one particular gate o and call it the output-gate. Since a Boolean formula can be interpreted as a tree-like circuit, it is reasonable to define B-formulas as the subset of B-circuits C such that each gate of C has fan-out at most 1. Each B-circuit C(x1, ..., xn) computes a Boolean function f_C(x1, ..., xn). Given an n-bit input string a = a1 ... an, every gate in C computes a Boolean value as follows: the input-gate xi computes ai for 1 ≤ i ≤ n, and each non-input-gate v computes the value g(b1, ..., bm), where g ∈ B is m-ary and b1, ..., bm are the values computed by the predecessor-gates of v, ordered according to the order of the wires pointing into v. The value f_C(a1, ..., an) computed by C is defined as the value computed by the output-gate o. Let V = {x1, ..., xn} be a finite set of Boolean variables. An assignment w.r.t. V is a function I : {x1, ..., xn} → {0, 1}. If V is clear from the context we simply say assignment. If there is an obvious ordering of the variables, we also use (a1, ..., an) instead of {I(x1) := a1, ..., I(xn) := an}.
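The gate-by-gate evaluation just described can be sketched in Python; the dictionary encoding of a circuit (gates listed in topological order, `("var", name)` for input-gates) is our own assumption, not a format from the paper:

```python
def eval_circuit(gates, output, assignment):
    """Evaluate a B-circuit.  `gates` maps a gate id either to
    ("var", variable_name) for an input-gate, or to (f, predecessors)
    where f is a Boolean function and the predecessor list respects the
    wire order.  Gates are assumed to appear in topological order."""
    val = {}
    for g, spec in gates.items():
        if spec[0] == "var":
            val[g] = assignment[spec[1]]
        else:
            f, preds = spec
            val[g] = f(*(val[p] for p in preds))
    return val[output]

# An {and, or}-circuit for (x1 ∨ x2) ∧ x2; note g2 has fan-out 2:
gates = {
    "g1": ("var", "x1"),
    "g2": ("var", "x2"),
    "g3": (lambda a, b: a | b, ["g1", "g2"]),
    "g4": (lambda a, b: a & b, ["g3", "g2"]),
}
print(eval_circuit(gates, "g4", {"x1": 1, "x2": 0}))  # 0
```

A gate with fan-out greater than 1 (here g2) is exactly what distinguishes a general B-circuit from a B-formula.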
Let {xi1, ..., xim} ⊆ V. By I′ = I/{xi1, ..., xim} we denote the restricted assignment w.r.t. {xi1, ..., xim}, which is defined by I′(x) = I(x) for x ∈ {xi1, ..., xim}. In order for an assignment I w.r.t. V to be compatible with a B-circuit C (B-formula H, resp.) we must have V = Var(C) (V = Var(H), resp.). An assignment I satisfies a circuit C(x1, ..., xn) (formula H(x1, ..., xn), resp.) iff f_C(I(x1), ..., I(xn)) = 1 (f_H(I(x1), ..., I(xn)) = 1, resp.). For an assignment I which satisfies C (H, resp.) we write I |= C (I |= H, resp.). The number of satisfying assignments (non-satisfying assignments, resp.) of C is denoted by #_1(C) =def |{I | I |= C}| (#_0(C) =def |{I | I ⊭ C}|, resp.). A variable xi is called fictive iff f(a1, ..., ai−1, 0, ai+1, ..., an) = f(a1, ..., ai−1, 1, ai+1, ..., an) for all a1, ..., ai−1, ai+1, ..., an.
For simplicity we often use a formula instead of a Boolean function. For example, the functions id(x), and(x, y), or(x, y), not(x), xor(x, y) are represented by the formulas x, x ∧ y, x ∨ y, ¬x and x ⊕ y. Sometimes x̄ is used instead of ¬x. We will use 0 and 1 for the constant 0-ary Boolean functions. Finally, keep in mind that the term gate-type is replaced by function-symbol when we work with B-formulas. Now we identify the class of Boolean functions which can be computed by a B-circuit (B-formula, resp.). For this, let B be a set of Boolean functions. By [B] we denote the smallest set of Boolean functions which contains B ∪ {id} and is closed under superposition, i.e., under substitution (composition of functions), permutation and identification of variables, and introduction of fictive variables. We call a set F of Boolean functions a base for B if [F] = B, and F is called closed if [F] = F. A base B is called complete if [B] = BF. For an n-ary Boolean function f, its dual function dual(f) is defined by dual(f)(x1, ..., xn) =def ¬f(¬x1, ..., ¬xn). For a set B of Boolean functions we define dual(B) =def {dual(f) | f ∈ B}. Clearly dual(dual(B)) = B. Furthermore, we define dual(H) (dual(C), resp.) to be the dual(B)-formula (dual(B)-circuit, resp.) that emerges when we replace all function-symbols in H (gate-types in C, resp.) by the symbol of their dual function (by the gate-type of their dual function, resp.). Clearly f_dual(H) = dual(f_H) (f_dual(C) = dual(f_C), resp.). Emil Post gave in [9] a complete list of all classes of Boolean functions closed under superposition. Moreover, he showed that each closed class has a finite base. The following proposition gives some bases for the closed classes which play a role in this paper. Proposition 1 ([9,7,10]). Every closed class of Boolean functions has a finite base. In particular:

  BF:  {and, or, not}
  M:   {and, or, 0, 1}
  L:   {xor, 1}
  L2:  {x ⊕ y ⊕ z}
  S00: {x ∨ (y ∧ z)}
  S10: {x ∧ (y ∨ z)}
  D:   {(x ∧ ¬y) ∨ (x ∧ ¬z) ∨ (¬y ∧ ¬z)}
  D2:  {(x ∧ y) ∨ (x ∧ z) ∨ (y ∧ z)}
  E:   {and, 0, 1}
  E2:  {and}
  V:   {or, 0, 1}
  V2:  {or}
  N:   {not, 1}
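The dual operator defined above is easy to check on truth tables (our own illustration; functions are encoded as dictionaries from argument tuples to values):

```python
from itertools import product

def dual(f, n):
    """dual(f)(x1,...,xn) = ¬f(¬x1,...,¬xn), as a truth-table dict."""
    return {xs: 1 - f[tuple(1 - x for x in xs)]
            for xs in product((0, 1), repeat=n)}

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
OR  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}
print(dual(AND, 2) == OR)          # True: and and or are dual
print(dual(dual(OR, 2), 2) == OR)  # True: dual is an involution
```

The involution property is what yields dual(dual(B)) = B.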
Now we need some definitions: Definition 2. Let C(x1 , . . . , xn ) be a B-circuit and π : {1, . . . , n} → {1, . . . , n} be a permutation. By π(C(x1 , . . . , xn )) we denote the B-circuit which emerges when we replace the variables xi for 1 ≤ i ≤ n by xπ(i) in C(x1 , . . . , xn ) simultaneously. Next we will define two equivalence relations for B-circuits:
636
Steffen Reith
Definition 3. Let C1 and C2 be B-circuits. The Boolean equivalence and isomorphism relations for B-circuits are defined as follows: C1 ≡ C2 ⇔def f_C1(a1, ..., an) = f_C2(a1, ..., an) for all a1, ..., an ∈ {0, 1}. C1 ≅ C2 ⇔def there exists a permutation π : {1, ..., n} → {1, ..., n} such that π(C1) ≡ C2. These definitions are used analogously for B-formulas. Note that in both cases the sets of input variables of C1 and C2 are equal; this can easily be achieved by adding fictive variables when needed. Moreover, note that C1 ≡ C2 iff dual(C1) ≡ dual(C2). Now we are ready to define the Boolean equivalence problem and the Boolean isomorphism problem for B-circuits: Problem: EQ^C(B). Instance: B-circuits C1, C2. Question: Is C1 ≡ C2?
Problem: ISO^C(B). Instance: B-circuits C1, C2. Question: Is C1 ≅ C2?
Analogously we define the Boolean equivalence problem EQ^F(B) and the Boolean isomorphism problem ISO^F(B) for B-formulas. The next proposition shows that if we permute the variables of a given B-formula H, then the number of satisfying assignments #_1(H) (non-satisfying assignments #_0(H), resp.) remains equal. This proposition is used to show the coNP-hard cases. Proposition 4. Let H1(x1, ..., xn) and H2(x1, ..., xn) be B-formulas such that H1 ≅ H2. Then #_1(H1) = #_1(H2) and #_0(H1) = #_0(H2) hold. It is obvious that Proposition 4 works for B-circuits too. Note that the converse of Proposition 4 does not hold, as x ⊕ y ≇ ¬(x ⊕ y) and #_1(x ⊕ y) = #_1(¬(x ⊕ y)) = 2 shows. Now let E be an arbitrary property of a pair of B-circuits: E(B) =def {(C1, C2) | C1 and C2 are B-circuits such that f_C1 and f_C2 have property E}. This gives the following obvious proposition: Proposition 5. Let E be a property of two B-circuits, and let B and B′ be finite sets of Boolean functions. 1. If B ⊆ [B′], then E(B) ≤_m^log E(B′). 2. If [B] = [B′], then E(B) ≡_m^log E(B′).
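Proposition 4 and its non-converse can be checked by brute force on small arities (our own sketch; `isomorphic` tries all variable permutations, which is only feasible for tiny n):

```python
from itertools import permutations, product

def truth_table(f, n):
    return tuple(f(*xs) for xs in product((0, 1), repeat=n))

def isomorphic(f, g, n):
    """f ≅ g: some permutation of the variables makes the tables equal."""
    for pi in permutations(range(n)):
        permuted = lambda *xs: f(*(xs[pi[i]] for i in range(n)))
        if truth_table(permuted, n) == truth_table(g, n):
            return True
    return False

xor_ = lambda x, y: x ^ y
nxor = lambda x, y: 1 - (x ^ y)
# Equal satisfying-assignment counts, yet not isomorphic:
print(sum(truth_table(xor_, 2)), sum(truth_table(nxor, 2)))  # 2 2
print(isomorphic(xor_, nxor, 2))                             # False
```

Swapping variables is enough for genuinely isomorphic functions, e.g. x ∧ ¬y and ¬x ∧ y.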
This proposition clarifies that the complexity of EQ^F(B) and ISO^C(B) can be determined by studying the classes of Post's lattice. Now we can give an upper bound for the equivalence problem of B-formulas and B-circuits. Later we will see that this upper bound is tight for B-circuits and B-formulas in some cases. Moreover, we show that the equivalence problem and the isomorphism problem for B-circuits and dual(B)-circuits (B-formulas and dual(B)-formulas, resp.) are of equal complexity; hence we have a vertical symmetry axis in Post's lattice for the complexity of EQ^F(B), EQ^C(B), ISO^F(B) and ISO^C(B).
Proposition 6. Let B be a finite set of Boolean functions. Then the following four statements hold:
1. EQ^F(B) ≤_m^log EQ^C(B) and ISO^F(B) ≤_m^log ISO^C(B),
2. EQ^F(B) ∈ coNP and EQ^C(B) ∈ coNP,
3. EQ^F(B) ≡_m^log EQ^F(dual(B)) and EQ^C(B) ≡_m^log EQ^C(dual(B)),
4. ISO^F(B) ≡_m^log ISO^F(dual(B)) and ISO^C(B) ≡_m^log ISO^C(dual(B)).
The circuit-value problem for B-circuits is defined as follows: Problem: VAL^C(B). Instance: A B-circuit C(x1, ..., xn) and an assignment (a1, ..., an). Question: Is f_C(a1, ..., an) = 1? Similarly we define the formula-value problem VAL^F(B) for B-formulas. The following proposition gives some information about the circuit-value problem of B-circuits. A complete classification of the circuit-value problem for all closed classes can be found in [11,10], which generalizes a result of [4], where only two-input gate-types were studied. Proposition 7 ([11]). 1. Let B be a finite set of Boolean functions such that V2 ⊆ [B] ⊆ V or E2 ⊆ [B] ⊆ E; then VAL^C(B) is ≤_m^log-complete for NL. 2. Let B be a finite set of Boolean functions such that L2 ⊆ [B] ⊆ L; then VAL^C(B) is ≤_m^log-complete for ⊕L. Proposition 8 ([5,6]). 1. L^⊕L = ⊕L. 2. L^NL = NL.
To show, in certain cases, lower bounds for the equivalence and isomorphism problems of B-formulas and B-circuits, we use the following well-known graph-theoretic problems: Problem: Graph Accessibility Problem (GAP). Instance: A directed acyclic graph G whose vertices have outdegree 0 or 2, a start vertex s, and a target vertex t. Question: Is there a path in G which leads from s to t? Problem: Graph Odd Accessibility Problem (GOAP). Instance: A directed acyclic graph G whose vertices have outdegree 0 or 2, a start vertex s, and a target vertex t. Question: Is the number of paths in G which lead from s to t odd?
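Both problems admit short sketches (our own, only to pin down the definitions; the NL/⊕L machines are of course far more space-restricted). `goap` counts s–t paths by dynamic programming over a topological order and returns the parity:

```python
def gap(adj, s, t):
    """Graph Accessibility: is there a path from s to t in the DAG?"""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        if u == t:
            return True
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return False

def goap(adj, order, s, t):
    """Graph Odd Accessibility: parity of the number of s-t paths.
    `order` is a topological order of the DAG."""
    paths = {s: 1}
    for u in order:
        for v in adj.get(u, []):
            paths[v] = paths.get(v, 0) + paths.get(u, 0)
    return paths.get(t, 0) % 2 == 1

# Every vertex has outdegree 0 or 2, as the problem statements require:
adj = {"s": ["a", "b"], "a": ["t", "b"], "b": ["t", "c"], "c": [], "t": []}
print(gap(adj, "s", "t"))                               # True
print(goap(adj, ["s", "a", "b", "c", "t"], "s", "t"))   # True: three s-t paths
```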
3 Main Results
In this section we use Post's lattice (see [9,7]) to determine the complexity of EQ^C(B) and EQ^F(B) step by step. Similarly, we are able to give lower bounds for the isomorphism problems of B-circuits and B-formulas. The basic idea in the case [B] ⊆ N, where N is the class of k-ary negations with Boolean constants, is that there exists a unique path from the output-gate to some input-gate or to a gate labeled by a constant function. This is because every allowed Boolean function has at most one non-fictive variable. Lemma 9. Let B be a finite set of Boolean functions with [B] ⊆ N. Then EQ^C(B) ∈ L, EQ^F(B) ∈ L, ISO^C(B) ∈ L and ISO^F(B) ∈ L. If we restrict ourselves to or-functions or and-functions, the isomorphism and equivalence problems for such B-circuits are complete for NL. For the proof we use the NL-complete graph accessibility problem (GAP). In contrast, if we only use exclusive-or functions for our B-circuits, the equivalence and isomorphism problems are ⊕L-complete. Here the ⊕L-complete graph odd accessibility problem (GOAP) is used. The basic idea in all cases is to interpret a B-circuit as a directed acyclic graph, together with the observation that the equivalence and isomorphism problems can be solved using the suitable reachability problem. Theorem 10. Let B be a finite set of Boolean functions. If E2 ⊆ [B] ⊆ E or V2 ⊆ [B] ⊆ V, then EQ^C(B) and ISO^C(B) are ≤_m^log-complete for NL. If L2 ⊆ [B] ⊆ L, then EQ^C(B) and ISO^C(B) are ≤_m^log-complete for ⊕L. Proposition 7 shows that in the cases V2 ⊆ [B] ⊆ V and E2 ⊆ [B] ⊆ E the circuit-value problem is complete for NL, and in the case L2 ⊆ [B] ⊆ L that it is complete for ⊕L. The next lemma shows that this does not hold for B-formulas.
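The reachability intuition behind Theorem 10 for the case V2 ⊆ [B] ⊆ V can be sketched as follows: an {or}-circuit computes exactly the disjunction of the input variables reachable from its output gate, so two such circuits are equivalent iff these sets coincide (the dictionary circuit encoding here is our own assumption, not from the paper):

```python
def reachable_inputs(gates, output):
    """For an {or}-circuit the computed function is the OR of all input
    variables reachable from the output gate, so equivalence reduces to
    comparing these sets -- a reachability question, hence NL."""
    seen, stack, inputs = {output}, [output], set()
    while stack:
        g = stack.pop()
        spec = gates[g]
        if spec[0] == "var":
            inputs.add(spec[1])
        else:
            for p in spec[1]:
                if p not in seen:
                    seen.add(p)
                    stack.append(p)
    return inputs

# Two {or}-circuits over x1, x2: both compute x1 ∨ x2.
C1 = {"g1": ("var", "x1"), "g2": ("var", "x2"),
      "g3": ("or", ["g1", "g2"]), "g4": ("or", ["g3", "g3"])}
C2 = {"h1": ("var", "x2"), "h2": ("var", "x1"), "h3": ("or", ["h1", "h2"])}
print(reachable_inputs(C1, "g4") == reachable_inputs(C2, "h3"))  # True
```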
In contrast to the circuit-value problem, the formula-value problem VAL^F(B) is in L in all three of these cases, which gives a hint that the corresponding equivalence and isomorphism problems for B-formulas are easier than for B-circuits, since they can be solved with the help of a VAL^F(B) oracle. Lemma 11. Let B be a finite set of Boolean functions such that V2 ⊆ [B] ⊆ V, E2 ⊆ [B] ⊆ E or L2 ⊆ [B] ⊆ L; then VAL^F(B) ∈ L. Even when EQ^C(B) and ISO^C(B) are NL-complete or ⊕L-complete, the formula case is still easy to solve, as the theorem below shows. Theorem 12. Let B be a finite set of Boolean functions. If E2 ⊆ [B] ⊆ E, V2 ⊆ [B] ⊆ V or L2 ⊆ [B] ⊆ L, then EQ^F(B) ∈ L and ISO^F(B) ∈ L. The next lemma shows that it is possible to construct, for every monotone 3-DNF formula, an equivalent (B ∪ {0, 1})-formula in logarithmic space, if B ∪ {0, 1} is a base for M.
Lemma 13. Let k > 0 be fixed and B be a finite set of Boolean functions such that E(x, y, v, u) and V(x, y, v, u) are B-formulas fulfilling E(x, y, 0, 1) ≡ x ∧ y and V(x, y, 0, 1) ≡ x ∨ y. Then, for any monotone k-DNF (k-CNF, resp.) formula H(x1, ..., xn), there exists a B-formula H′(x1, ..., xn, u, v) such that H′(x1, ..., xn, 0, 1) ≡ H(x1, ..., xn). Moreover, H′ can be computed from H in logarithmic space. Now we use Lemma 13 to build two monotone formulas out of a 3-DNF formula, such that these two monotone formulas are equivalent iff the 3-DNF formula is a tautology. For this, let 3-TAUT be the coNP-complete set of 3-DNF formulas which are tautologies. Lemma 14. Let B be a finite set of Boolean functions such that {or, and} ⊆ [B]. Then there exist logspace-computable B-formulas H1 and H2, which can be computed out of H, such that H ∈ 3-TAUT iff H1 ≡ H2 iff H1 ≅ H2 iff #_1(H1) = #_1(H2). Moreover, the formulas H1 and H2 do not represent constant Boolean functions (i.e., H1 ≢ 0, H1 ≢ 1, H2 ≢ 0 and H2 ≢ 1). The property that the formulas H1 and H2 cannot represent a constant Boolean function plays an important role in the proof of Theorem 17. To show coNP-completeness, the basic idea of the next theorems works as follows: since we have and ∈ [B] and or ∈ [B ∪ {1}], we can almost apply Lemma 14. The only problem is that we have to "simulate" the constant 1. The idea here is to introduce a new variable u which plays the role of the missing constant 1. By connecting the formulas from Lemma 14 and u with ∧, we ensure that every satisfying assignment assigns 1 to u. Theorem 15. Let B be a finite set of Boolean functions such that and ∈ [B] and or ∈ [B ∪ {1}]. Then EQ^F(B) is ≤_m^log-complete for coNP and ISO^F(B) is ≤_m^log-hard for coNP. Corollary 16. Let B be a set of Boolean functions such that S10 ⊆ [B] or S00 ⊆ [B]. Then EQ^F(B) and EQ^C(B) are ≤_m^log-complete for coNP, and ISO^F(B) and ISO^C(B) are ≤_m^log-hard for coNP.
Now only two closed classes of Boolean functions are left: D and D2. In this case the construction of Theorem 15 cannot work, because neither and ∈ D2 nor or ∈ D2. Hence we have no possibility to use the and-function to force an additional variable to 1 in all satisfying assignments. This is not needed, however. Instead we use two new variables as a replacement for 0 and 1 and show that in any case we get either two formulas representing the same constant function or formulas which match Lemma 14. Theorem 17. Let B be a finite set of Boolean functions such that D2 ⊆ [B] ⊆ D. Then EQ^F(B) and EQ^C(B) are ≤^log_m-complete for coNP and ISO^F(B) and ISO^C(B) are ≤^log_m-hard for coNP. This leads us to the following classification theorems for the complexity of the equivalence and isomorphism problems of B-circuits and B-formulas:
Steffen Reith
Theorem 18. Let B be a finite set of Boolean functions. The complexity of EQ^C(B) and ISO^C(B) can be determined as follows: If B ⊆ N then EQ^C(B) ∈ L and ISO^C(B) ∈ L. If B ⊆ E or B ⊆ V then EQ^C(B) and ISO^C(B) are ≤^log_m-complete for NL. In the case that B ⊆ L, EQ^C(B) and ISO^C(B) are ≤^log_m-complete for ⊕L. In all other cases EQ^C(B) is ≤^log_m-complete for coNP and ISO^C(B) is ≤^log_m-hard for coNP. Theorem 19. Let B be a finite set of Boolean functions. The complexity of EQ^F(B) and of ISO^F(B) can be determined as follows: If B ⊆ V, B ⊆ E or B ⊆ L then EQ^F(B) ∈ L and ISO^F(B) ∈ L. In all other cases EQ^F(B) is ≤^log_m-complete for coNP and ISO^F(B) is ≤^log_m-hard for coNP. Another interesting problem arises in the context of the isomorphism problem for Boolean formulas. It is well known that the satisfiability problem for unrestricted Boolean formulas is complete for NP, whereas the satisfiability problem for monotone Boolean formulas is solvable in P. We showed that in both cases the isomorphism problem is coNP-hard, but it might be possible that a better upper bound can be found for the isomorphism problem of monotone formulas (M2 ⊆ [B] ⊆ M) than for the isomorphism problem of unrestricted formulas.
4 Conclusion
In the present work we determined the complexity of the equivalence problem for B-formulas and B-circuits and were able to give a complete characterization of the complexity w.r.t. all possible finite sets of Boolean functions. We showed that the equivalence problem for B-circuits is, depending on the set of Boolean functions used, coNP-complete, NL-complete, ⊕L-complete or in L. Interestingly, because of the succinctness of circuits, the equivalence problem for B-formulas is sometimes easier to solve. To be more precise, if EQ^C(B) is NL-complete or ⊕L-complete then EQ^F(B) is still solvable in deterministic logarithmic space. In all other cases the representation as formula or circuit has no influence, and the complexities of EQ^C(B) and EQ^F(B) coincide. In the case of the isomorphism problems we were not always able to prove completeness results. In these cases we showed hardness for coNP as a lower bound. Note that in these cases the trivial upper bound Σ^p_2 remains valid, so our results are as strong as the well-known trivial upper and lower bounds for the isomorphism problem of unrestricted Boolean formulas and circuits. In the easier case [B] ⊆ N we proved that the isomorphism problem for B-circuits is decidable in deterministic logarithmic space. For V2 ⊆ [B] ⊆ V and E2 ⊆ [B] ⊆ E (L2 ⊆ [B] ⊆ L, resp.) we showed the NL-completeness (⊕L-completeness, resp.) of the isomorphism problem for B-circuits. Similarly to the equivalence problem, the isomorphism problem for B-formulas is still solvable in deterministic logarithmic space if ISO^C(B) is NL-complete or ⊕L-complete. We use the same reduction for showing the coNP-hardness of EQ^F(B) and ISO^F(B); therefore it cannot be expected that this reduction is powerful enough
to show a better lower bound for the isomorphism-problem. Note that this reduction does not use the ability of permuting variables. Hence it seems possible that any reduction showing a better lower bound for the isomorphism-problem has to take a non-trivial permutation into account.
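Boolean isomorphism asks whether some permutation of the variables makes the two formulas equivalent; the trivial upper bound mentioned above comes from guessing the permutation and then checking equivalence. The brute-force version of that check can be sketched as follows (the function names and encoding are ours, not from the paper):

```python
from itertools import permutations, product

def isomorphic(f, g, n):
    """f ≅ g iff some permutation pi of the variables makes f(v∘pi) ≡ g(v)."""
    inputs = list(product([False, True], repeat=n))
    g_table = [g(v) for v in inputs]
    for pi in permutations(range(n)):
        # Apply pi to the inputs of f and compare against g's truth table.
        if all(f(tuple(v[pi[i]] for i in range(n))) == gv
               for v, gv in zip(inputs, g_table)):
            return True
    return False
```

The n! · 2^n running time reflects the guess-and-check structure; the remark above suggests that any reduction improving the coNP lower bound would have to exploit such non-trivial permutations.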
Acknowledgments. I am grateful to Heribert Vollmer for many helpful discussions and in particular to Sven Kosub for a first idea of Lemma 14.
Quantified Mu-Calculus for Control Synthesis

Stéphane Riedweg and Sophie Pinchinat
IRISA-INRIA, F-35042, Rennes, France
{sriedweg,pinchina}@irisa.fr
Fax: +33299847171

Abstract. We consider an extension of the mu-calculus as a general framework to describe and synthesize controllers. This extension is obtained by quantifying atomic propositions; we call the resulting logic quantified mu-calculus. We study its main theoretical properties and show its adequacy to control applications. The proposed framework is expressive: it offers a uniform way to describe parameters as varied as the kind of system (closed or open), the control objective, the type of interaction between the controller and the system, the optimality criteria (fairness, maximal permissiveness), etc. To our knowledge, none of the former approaches can capture such a wide range of concepts.
1 Introduction
To generalize the control synthesis theory of Ramadge and Wonham [1], many works use temporal logics as specifications [2–4]. All those approaches suffer from substantial limitations: there is no way to impose properties on the interaction between the system and its controller, nor to require optimality of controllers. The motivation of our work is to fill these gaps. We put forward an extension of the mu-calculus well suited to describe general control objectives and to synthesize finite-state controllers. The proposed framework is expressive: it offers a uniform way to describe parameters as varied as the kind of system (closed or open), the control objective, the type of interaction between the controller and the system, the optimality criteria (fairness, maximal permissiveness), etc. To our knowledge, none of the former approaches can capture such a wide range of concepts. As in [5–7], we extend a temporal logic (the mu-calculus) by quantifying atomic propositions. We call the resulting logic quantified mu-calculus. We study its main theoretical properties and show its adequacy to control applications. We start from alternating tree automata for the mu-calculus [8, 9] and extend their theory using the Simulation Theorem [10, 11, 8] and a projection of automata. The Simulation Theorem states that alternating automata and nondeterministic automata are equivalent. The projection is an adaptation of the construction of [12]. The meaning of the existential quantifier is defined by projecting automata onto sets of propositions. Decision procedures for model-checking and satisfiability can therefore be obtained. Both problems are non-elementary when we consider the full logic. We can however exhibit interesting fragments with lower complexity, still covering a wide class of control problems. B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 642–651, 2003. © Springer-Verlag Berlin Heidelberg 2003
The following explains the applications to control. We view supervision of systems as pruning the systems' computation trees. Consequently, a controller can be represented by a labeling c of the (uncontrolled) system's computation tree into {0, 1}, such that the (downwards closed) 1-labeled subtree is the behavior of the controlled system. For any proposition c, we define a transformation α∗c of mu-calculus formulas α such that some controller-induced restriction of S satisfies α if and only if α∗c holds of some c-labeling on the computation tree. Labeling allows us to consider the forbidden part of the controlled system, and we derive controllers for large classes of specifications, using a constructive model-checking. Beyond the capability to specify controllers which only cut controllable transitions, we can more interestingly specify (and synthesize) a maximally permissive controller for α; i.e. a controller c such that the c-controlled system satisfies α and no c'-controlled system with c ≺ c' satisfies α, where c ≺ c' is the mu-calculus formula expressing that the 1-labeled subtree defined by c is a proper subtree of the 1-labeled subtree defined by c'. A maximally permissive controller enforcing α can therefore be specified by the quantified mu-calculus formula: ∃c.(α∗c ∧ ∀c'.((c ≺ c') ⇒ ¬(α∗c'))). Controllers and maximally permissive controllers for open systems [2] may also be specified and synthesized. Such controllers are moreover required to be robust against the environment's policy. Also, the implementation considerations of [13] and decentralized controllers may be formulated in quantified mu-calculus. Not surprisingly, the expressive power of the mu-calculus enables us to deal with fairness. The rest of the paper is organized as follows: Section 2 presents the logic. Section 3 studies applications to control theory. Algorithms are developed in Section 4, based on the automata-theoretic semantics. Finally, control synthesis is illustrated in Section 5.
2 Quantified Mu-Calculus
We assume given a finite set of events A, a finite set of propositions AP, and an infinite set of variables Var = {X, Y, ...}. Definition 1. (Syntax of qLµ) The set of formulas of the quantified mu-calculus on Γ ⊆ AP, written qLµ(Γ), is defined by the grammar: ∃Λ.α | ¬α1 | α1 ∨ α2 | β, where Λ ⊆ AP, α ∈ qLµ(Γ ∪ Λ), α1 and α2 are formulas in qLµ(Γ), and β is a formula of the pure mu-calculus on Γ. The set of formulas of the pure mu-calculus on Γ ⊆ AP, written Lµ(Γ), is defined by the grammar:
⊤ | p | X | ¬β | ⟨a⟩β | β ∨ β' | µX.β(X)
644
St´ephane Riedweg and Sophie Pinchinat
where a ∈ A, p ∈ Γ, X ∈ Var, and β and β' are in Lµ(Γ). To give meaning to fixed-point formulas, X must occur under an even number of negation symbols ¬ in α(X), in each formula µX.α(X). Extending the terminology of the mu-calculus, we call sentences all quantified mu-calculus formulas without free variables. We write ⊥, [a]α, α ∧ β, νX.α(X), and ∀Λ.α respectively for ¬⊤, ¬⟨a⟩¬α, ¬(¬α ∨ ¬β), ¬µX.¬α(¬X) and ¬∃Λ.¬α. We also write a→, a↛, α ⇒ β, and ∃x.α respectively for ⟨a⟩⊤, [a]⊥, ¬α ∨ β, and ∃{x}.α. Since, in general, fixed-point operators and quantifiers do not commute, we allow no quantification inside fixed-point terms. The quantified mu-calculus qLµ, as a generalization of the mu-calculus, is also given an interpretation over deterministic transition structures called processes in [3]. Definition 2. A process on Γ ⊆ AP is a tuple S = <Γ, S, s0, t, L>, where S is the set of states, s0 ∈ S is the initial state, t : S × A → S is a partial function called the transition function, and L : S → P(Γ) maps states to subsets of propositions. We say that S is finite if S is finite and that it is complete if for all (a, s) ∈ A × S, t(s, a) is defined. Compound processes can be built up by synchronous product. Definition 3. The (synchronous) product of two processes S1 = <Γ1, S1, s0_1, t1, L1> and S2 = <Γ2, S2, s0_2, t2, L2> on disjoint sets Γ1 and Γ2 is the process S1 × S2 = <Γ, S1 × S2, (s0_1, s0_2), t, L> on Γ = Γ1 ∪ Γ2 such that (1) t((s1, s2), a) = (s1', s2') whenever t1(s1, a) = s1' and t2(s2, a) = s2', and (2) L(s1, s2) = L1(s1) ∪ L2(s2). In the sequel, we shall in particular take the product of a process on Γ with another (complete) process on a disjoint set of propositions Λ in order to obtain a similar process on Γ ∪ Λ. This is the way in which qLµ will be applied to solve the control problem (see Theorem 1, Section 3). Definition 4. A labeling process on Λ ⊆ AP is simply a complete process E on Λ.
Now, for any process S = <Γ, S, s0, t, L> with Γ disjoint from Λ, S × E is called a labeling of S (by E) on Λ. We let Lab_Λ denote the set of labeling processes on Λ. Definition 5. (Semantics of qLµ) The interpretation of the qLµ(Γ)-formulas is relative to a process S = <Γ, S, s0, t, L> and a valuation val : Var → P(S). This interpretation [[α]]^val_S (⊆ S) is defined by:

[[⊤]]^val_S = S,  [[p]]^val_S = {s ∈ S | p ∈ L(s)},  [[X]]^val_S = val(X),
[[¬α]]^val_S = S \ [[α]]^val_S,  [[α ∨ β]]^val_S = [[α]]^val_S ∪ [[β]]^val_S,
[[⟨a⟩α]]^val_S = {s ∈ S | ∃s' : t(s, a) = s' and s' ∈ [[α]]^val_S},
[[µX.α(X)]]^val_S = ∩{V ⊆ S | [[α]]^{val(V/X)}_S ⊆ V},
[[∃Λ.α]]^val_S = {s ∈ S | ∃E = <Λ, E, ε0, t', L'> ∈ Lab_Λ, (s, ε0) ∈ [[α]]^{val×E}_{S×E}}

where (val × E)(X) = val(X) × E for any X ∈ Var.
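On a finite process, Definition 5 can be turned directly into an evaluator: each clause becomes a case, and the least fixed point is computed by Knaster–Tarski iteration from the empty set. The sketch below covers only the pure mu-calculus fragment (the ∃Λ clause quantifies over all labeling processes and is handled via automata in Section 4) and uses an ad-hoc tuple encoding of formulas of our own devising:

```python
def sem(phi, S, t, L, val):
    """Evaluate a tuple-encoded pure mu-calculus formula on a finite process.
    S: set of states; t: dict (state, event) -> state (partial, deterministic);
    L: dict state -> set of propositions; val: dict variable -> set of states."""
    op = phi[0]
    if op == "top":
        return set(S)
    if op == "prop":
        return {s for s in S if phi[1] in L[s]}
    if op == "var":
        return set(val[phi[1]])
    if op == "not":
        return set(S) - sem(phi[1], S, t, L, val)
    if op == "or":
        return sem(phi[1], S, t, L, val) | sem(phi[2], S, t, L, val)
    if op == "dia":                       # <a>f: the a-successor exists and satisfies f
        a, f = phi[1], phi[2]
        goal = sem(f, S, t, L, val)
        return {s for s in S if t.get((s, a)) in goal}
    if op == "mu":                        # least fixed point, iterated from the empty set
        X, f = phi[1], phi[2]
        V = set()
        while True:
            W = sem(f, S, t, L, {**val, X: V})
            if W == V:
                return V
            V = W
    raise ValueError(f"unknown constructor {op!r}")
```

For instance, µX.(p ∨ ⟨a⟩X) evaluates to the set of states from which a state labeled p is reachable along a-transitions; the iteration converges in at most |S| rounds because each step is monotone.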
Notice that the valuation val does not influence the semantics of a sentence α ∈ qLµ; we then write S |= α whenever the initial state of S belongs to [[α]]_S. Clearly, bisimilar processes satisfy the same qLµ formulas.
3 Control Specifications
This section presents various examples of specifications for control objectives in qLµ. First, a transformation of formulas is defined, which is used to link qLµ model-checking with control problems, as shown by Theorem 1. Variants of the theorem are then exploited to capture requirements such as maximally permissive controllers, controllers for open systems, etc. Definition 6. For any sentence α ∈ qLµ(Γ) and for any x ∈ AP, the x-lift of α is the formula α∗x ∈ qLµ(Γ ∪ {x}), inductively defined by (by convention, ∗ has highest priority):
⊤∗x = ⊤, p∗x = p, X∗x = X, (¬α)∗x = ¬α∗x, (α ∨ β)∗x = α∗x ∨ β∗x, (⟨a⟩α)∗x = ⟨a⟩(x ∧ α∗x), (µX.α)∗x = µX.α∗x, (∃Λ.α)∗x = ∃Λ.α∗x. Definition 7. Given a process S = <Γ, S, s0, t, L> and some x ∈ Γ, the x-pruning of S is the process S(x) = <Γ, S, s0, t', L> such that, for all s ∈ S and a ∈ A, t'(s, a) = t(s, a) if x ∈ L(t(s, a)), and t'(s, a) is undefined otherwise. Lemma 1. For any process S on Γ, for any x ∈ Γ, and for any sentence α ∈ qLµ(Γ), we have: [[α]]_{S(x)} = [[α∗x]]_S. Proof. Straightforward by induction on α.
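Definition 6 is a purely syntactic recursion and is easy to implement. Assuming formulas are encoded as nested tuples such as ("prop", p), ("or", f, g), ("dia", a, f) and ("mu", X, f) (an encoding of our own devising, not from the paper), only the diamond case does real work, guarding each transition step by x:

```python
def lift(phi, x):
    """Compute the x-lift α*x of Definition 6 over a tuple-encoded formula."""
    op = phi[0]
    if op in ("top", "prop", "var"):
        return phi                                  # ⊤*x = ⊤, p*x = p, X*x = X
    if op == "not":
        return ("not", lift(phi[1], x))
    if op == "or":
        return ("or", lift(phi[1], x), lift(phi[2], x))
    if op == "dia":                                 # (<a>α)*x = <a>(x ∧ α*x)
        a, f = phi[1], phi[2]
        guarded = ("and", ("prop", x), lift(f, x))  # x ∧ α*x (∧ is a derived connective)
        return ("dia", a, guarded)
    if op == "mu":
        return ("mu", phi[1], lift(phi[2], x))
    if op == "exists":
        return ("exists", phi[1], lift(phi[2], x))
    raise ValueError(f"unknown constructor {op!r}")
```

Dualizing through the negation case, the derived box clause is ([a]α)∗x = [a](x ⇒ α∗x), which is exactly the shape appearing in the example formula of Section 5.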
Control synthesis [1, 3, 14, 4] is a generic problem that consists in enforcing some property α on a plant S by composing it with another process R, the controller of S for α. The goal is to synthesize R given α. We focus here on the joint specification of α and of constraints on R: they reflect the way it exerts the control. This capability relies on the following theorem. Theorem 1. Given a sentence α ∈ qLµ(Λ ∪ Γ), where Λ and Γ are disjoint, and a process S on Γ, the following assertions are equivalent: – There exists a controller on Λ of S for α. – S |= ∃c∃Λ.α∗c, where c is a fresh proposition. Proof. First, suppose that there exists a process R = <Λ, R, r0, t, L> such that S × R |= α. Given c ∈ AP \ (Λ ∪ Γ), we can easily define E ∈ Lab_{Λ∪{c}} such that R is E(c) without the label c. Now, (S × E)(c), or equivalently S × E(c), satisfies α, since R is E(c) without c and c does not occur in α. Using Lemma 1, we conclude that (S × E) |= α∗c. Suppose now that S |= ∃c∃Λ.α∗c. By Definition 5, there is a process E ∈ Lab_{{c}∪Λ} such that S × E |= α∗c. By Lemma 1, (S × E)(c) satisfies α. Then take R as E(c).
We now illustrate the use of qLµ to specify various control requirements. The formula ∃c.α∗c of Theorem 1 is enriched to integrate control rules. In the sequel, we let ⟨A⟩ = ∨_{a∈A} ⟨a⟩, [A] = ∧_{a∈A} [a], Reach_c(γ) = (µY.⟨A⟩Y ∨ γ)∗c, and Inv_c(γ) = (νY.[A]Y ∧ γ)∗c. Also, the x-lift is canonically extended to conjunctions of propositions.

Maximally Permissive Admissible Controller for α. When a system S has uncontrollable transitions, denoted by the set of labels Auc, an admissible controller for α should not disable any of them. Its existence may be expressed by formula (1). An admissible controller c for α is maximally permissive if no other admissible controller c' for α cuts strictly fewer transitions than c. Writing c ≺ c' for the mu-calculus formula Inv_c(c') ∧ Reach_{c'}(¬c), this requirement is expressed by formula (2).

S |= ∃c. Inv_c([Auc]c) ∧ α∗c   (1)
S |= ∃c. Inv_c([Auc]c) ∧ α∗c ∧ ∀c'. (Inv_{c'}([Auc]c') ∧ (c ≺ c')) ⇒ ¬α∗c'.   (2)

Maximally Permissive Open Controller for α. As studied in [2], an open system S takes the environment's policy into account: the alphabet A of transitions is a disjoint union of the alphabet Aco of controllable transitions and the alphabet Auc of uncontrollable transitions, permitted or not by the environment. The open controller must ensure α for any possible choice of the environment. This requirement is expressed by formula (3), where the proposition e represents the environment's policy. The ad-hoc solution of [2] cannot easily be extended to maximally permissive open controllers; that requirement is expressed by formula (4).

S |= ∃c. Inv_c([Auc]c) ∧ ∀e. (Inv_e([Aco]e)) ⇒ α∗(e ∧ c).   (3)
S |= ∃c. Inv_c([Auc]c) ∧ ∀e. (Inv_e([Aco]e)) ⇒ α∗(e ∧ c)
     ∧ ∀c'. (Inv_{c'}([Auc]c') ∧ (cut(c) ≺ cut(c'))) ⇒ ∃e'. Inv_{e'}([Aco]e') ∧ ¬α∗(e' ∧ c').   (4)

Implementable Controller for "Non-blocking".
Such a controller [13] is an admissible controller which, moreover, selects exactly one controllable transition at a time, and such that, in the resulting supervised system, a final state (given by the proposition Pf) is always reachable. Let Nblock = νZ.(µX.Pf ∨ ⟨A⟩X) ∧ [A]Z and let Impl = (∨_{a∈Aco} a→) ⇒ ∨_{a∈Aco} (⟨a⟩c ∧ [Aco \ {a}]¬c); a non-blocking implementable controller of a system S may be expressed by the formula: S |= ∃c. c ∧ (Nblock)∗c ∧ Inv_c([Auc]c) ∧ Inv_c(Impl). Decentralized Controllers for α. The existence of decentralized controllers R1 and R2 such that S × R1 × R2 |= α may be expressed as: S |= ∃c1∃c2.α∗(c1 ∧ c2).
4 Quantified Mu-Calculus and Automata
Automata-theoretic approaches provide the model theory of mu-calculus, and they offer decision algorithms for the satisfiability and the model-checking problems [15, 10, 16–18, 8]. Depending on the approach followed, different automata
have been considered, differing mainly in two orthogonal parameters: the more or less restricted kind of transitions, ranging from alternating automata to the subclass of non-deterministic automata, and the acceptance conditions, e.g. Rabin, Streett, Mostowski/parity. The class of tree automata for the mu-calculus which we shall adapt to qLµ is the class of alternating parity automata, or simple automata for short, considered in [3]. This adaptation is stated by Theorem 2 below, which constitutes the main result of this section; the remainder brings in the material needed for its proof. Theorem 2. (Main result) For any sentence α ∈ qLµ(Γ), there exists a simple automaton Aα on Γ such that, for any process S on Γ: S |= α iff S is accepted by Aα. Definition 8. (Simple Automata on Processes) A simple automaton on Γ is a tuple A = <Γ, Q, Q∃, Q∀, q0, δ : Q × P(Γ) → P(Moves(Q)), r> where Q is a finite set of states, partitioned into two subsets Q∃ and Q∀ of respectively existential and universal states, q0 ∈ Q is the initial state, r : Q → IN is the parity condition, and the transition function δ assigns to each state q and each subset of Γ a set of possible moves included in Moves(Q) = ((A ∪ {ε}) × Q) ∪ (A × {→, ↛}). Definition 9. (Nondeterministic Automata on Processes) A simple automaton is nondeterministic if for any set of labels Λ ⊆ Γ, δ(q, Λ) ⊆ {ε} × Q for any q ∈ Q∃, and δ(q, Λ) ⊆ Moves(Q) \ ({ε} × Q) for any q ∈ Q∀. Moreover, in the case where q ∈ Q∀, it is required that (a1, q1), (a2, q2) ∈ δ(q, Λ) ∩ (A × Q) and a1 = a2 entail q1 = q2. Finally, the initial state should be an existential state. A nondeterministic automaton is bipartite if for any Λ ⊆ Γ, δ(q, Λ) ⊆ {ε} × Q∀ for any q ∈ Q∃ and δ(q, Λ) ∩ (A × Q) ⊆ A × Q∃ for any q ∈ Q∀. Parity games provide automata semantics. A parity game is a graph with an initial vertex v0, with a partition (VI, VII) of the vertices, and with a partial mapping r from the vertices to a given finite set of integers.
A play from some vertex v proceeds as follows: if v ∈ VI, then player I chooses a successor vertex v', else player II chooses a successor vertex v', and so on ad infinitum unless one player cannot make any move. The play is winning for player I if it is finite and ends in a vertex of VII, or if it is infinite and the upper bound of the set of ranks r(v) of vertices v that are encountered infinitely often is even. A strategy for player I is a function σ assigning a successor vertex to every sequence of vertices ~v ending in a vertex of VI. A strategy σ is memoryless if σ(~v) = σ(~w) whenever the sequences ~v and ~w end in the same vertex. A strategy for player I is winning if every play following the strategy from the initial vertex is winning for player I. Winning strategies for player II are defined similarly. The fundamental result on parity games is the memoryless determinacy theorem, established in [10, 8]. Theorem 3. (Memoryless determinacy) For any parity game, one of the players has a (memoryless) winning strategy.
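Memoryless determinacy is what makes the acceptance games below algorithmically useful: it suffices to search for a positional strategy. For the special case of reachability objectives, the winning set and a memoryless strategy are computed by a simple backward fixpoint, the classical attractor construction; full parity objectives need more machinery, but this conveys the flavor. The game encoding is our own, and we adopt the convention above that a finite play ending in a player-II vertex is won by player I:

```python
def reach_attractor(VI, VII, E, target):
    """Winning set and a memoryless strategy for player I whose objective is
    to reach `target` (or to see player II get stuck).
    VI, VII: vertex sets of players I and II; E: dict vertex -> successor list.
    Returns (win, strategy); the strategy moves strictly closer to the target."""
    level = {v: 0 for v in target}    # attractor rank = forced distance to target
    strategy = {}
    changed = True
    while changed:
        changed = False
        for v in (VI | VII) - set(level):
            succ = E.get(v, [])
            if v in VI:
                good = [w for w in succ if w in level]
                if good:                          # player I has a winning move
                    best = min(good, key=level.get)
                    level[v] = level[best] + 1
                    strategy[v] = best            # step one rank closer to target
                    changed = True
            elif all(w in level for w in succ):   # II: every move loses, or II is stuck
                level[v] = 1 + max((level[w] for w in succ), default=-1)
                changed = True
    return set(level), strategy
```

Choosing a successor of minimal rank is what makes the strategy sound: a fixed "any winning successor" choice could cycle forever without ever reaching the target.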
Definition 10. Given a simple automaton A = <Γ, Q, Q∃, Q∀, q0, δ, r> and a process S = <Γ, S, s0, t, L>, we define the parity game G(A, S), where the vertices of player I are in Q∃ × S ∪ {⊤} and the vertices of player II are in Q∀ × S ∪ {⊥}; the initial vertex v0 is (q0, s0), and the other vertices and transitions are defined inductively as follows. Vertices ⊤ and ⊥ have no successor. For any vertex (q, s) and for all a ∈ A: – there is an ε-edge to a successor vertex (q', s) if (ε, q') ∈ δ(q, L(s)), – there is an a-edge to a successor vertex (q', s') if (a, q') ∈ δ(q, L(s)) and t(s, a) = s', – there is an a-edge to the successor vertex ⊤ if (a, →) ∈ δ(q, L(s)) and t(s, a) is defined, or (a, ↛) ∈ δ(q, L(s)) and t(s, a) is not defined, – there is an a-edge to the successor vertex ⊥ if (a, →) ∈ δ(q, L(s)) and t(s, a) is not defined, or (a, ↛) ∈ δ(q, L(s)) and t(s, a) is defined. The automaton A accepts S (noted S |= A) if there is a winning strategy for player I in G(A, S). Like automata on infinite trees [10, 11], simple automata on processes are equivalent to bipartite non-deterministic automata. This fundamental result, due to [8, 3], is called the Simulation Theorem: Theorem 4. (Simulation Theorem for processes) Every simple automaton on processes is equivalent to a bipartite nondeterministic automaton. Since a constructive proof of Theorem 2 for α ∈ Lµ may be found in [8, 18, 9], in order to extend it to qLµ we consider projections of automata: the projection is the semantic counterpart of existential quantification in qLµ. The projections presented here are similar to the projections of nondeterministic tree automata presented in [12, 19]: projected automata are obtained by forgetting a subset of propositions in the condition of the transitions. Definition 11. (Projection) Let Γ ⊆ Γ' and let A = <Γ', Q, Q∃, Q∀, q0, δ, r> be a bipartite nondeterministic automaton.
The projection of A on Γ is the bipartite nondeterministic automaton A↓Γ = <Γ, Q∃ ∪ (Q∀ × P(Λ)), Q∃, Q∀ × P(Λ), q0, δ↓Γ, r↓Γ>, where Λ = Γ' \ Γ and, for all l ⊆ Λ and all l' ⊆ Γ: 1. ∀q ∈ Q∃: δ↓Γ(q, l') = {(ε, (q', l)) | (ε, q') ∈ δ(q, l' ∪ l)}, 2. ∀q ∈ Q∀: δ↓Γ((q, l), l') = δ(q, l' ∪ l), 3. ∀q ∈ Q∃: r↓Γ(q) = r(q) and ∀q ∈ Q∀: r↓Γ((q, l)) = r(q). Theorem 5. (Projection) Let A = <Γ', Q, Q∃, Q∀, q0, δ, r> be a bipartite nondeterministic automaton. For any process S = <Γ, S, s0, t, L> on Γ ⊆ Γ', S |= A↓Γ if and only if there exists a labeling process E on Λ = Γ' \ Γ such that S × E |= A. Proof. First, suppose S |= A↓Γ. Let σ be a winning memoryless strategy for player I in the game G = G(A↓Γ, S) (Theorem 3) and let VII ⊆ Q∀ × P(Λ) × S be the set of nodes of player II in G without ⊥. Let E ∈ Lab_Λ be an arbitrary
completion of the process <Λ, VII, σ(q0, s0), t', L'>, where for any (q, l, s) in VII, L'(q, l, s) = l, and for all a ∈ A, t'((q, l, s), a) = σ(s', q'), for the (unique) (s', q') such that there is an a-arc from ((q, l), s) to (q', s') in G. Then, it can be shown that we define a winning strategy σ' for player I in the game G(A, S × E) by σ'((s, σ(s, q)), q) = σ(s, q). Conversely, suppose S × E |= A for some E ∈ Lab_Λ. It suffices to show that any memoryless winning strategy for player I in G(A, S × E) defines a memoryless winning strategy for player I in the game G(A↓Γ, S). We now prove Theorem 2, by induction on the structure of α ∈ qLµ(Γ). We only treat the case where α is quantified, since α ∈ Lµ is dealt with following [8] and [9]. Without loss of generality, we can assume α of the form qΛα', where q ∈ {∃, ∀}. For q = ∃, let A be the bipartite nondeterministic automaton equivalent to Aα' (Theorem 4). Now take Aα = A↓Γ and conclude by Theorem 5. The case where q = ∀ is obtained by complementation: since parity games are determined (Theorem 3), we complement A∃Λ¬α' (as in [11]) to obtain Aα. Theorem 2 gives an effective construction of finite controllers on finite processes: given a finite process S and a sentence ∃c.α ∈ qLµ expressing a control problem, we construct the automaton A(∃c.α). If we find a memoryless winning strategy in the finite game G(A(∃c.α), S), Theorem 5 gives a finite controller; otherwise, there is no solution. We can show that the complexity of this problem is (k+1)-EXPTIME-complete, where k is the number of alternations of existential and universal quantifiers in α. The result of [2] is retrieved: synthesizing controllers for open systems is 2-EXPTIME-complete for mu-calculus control objectives.
5 Controller Synthesis
This section illustrates the constructions on a simple example. The plant S (to be controlled) is drawn next page. Both states s0 and s1 are labeled with the empty set of propositions, thus S is a process on Γ = ∅. The control objective is the formula α = νY.⟨b⟩Y ∧ ⟨a⟩(µX.[a]X). There is a controller of S for α iff S |= ∃c.α∗c, but also iff S |= ∃c.c ∧ α∗c. Let φ be the formula c ∧ α∗c ≡ c ∧ νY.⟨b⟩(c ∧ Y) ∧ ⟨a⟩(c ∧ µX.[a](c ⇒ X)). The bipartite nondeterministic automaton Aφ is shown in Figure 1, where the following graphical conventions are used: circled states are existential states, while states enclosed in squares are universal states; the transitions between states are represented by edges labeled in {a, b, ε} × P({c}); the other transitions are represented by labeled edges from states to the special box containing the symbol →. The rank function maps q0 to 2 and q2 to 1. The projected automaton Aφ↓∅ is shown in Figure 2, using similar conventions. Note that all transitions are labeled in {a, b, ε} × {∅}, since Aφ↓∅ is an automaton on Γ = ∅, but all universal states are now labeled in Q × P({c}), as a result of the projection. Now, S |= ∃c.φ iff Aφ↓∅ accepts S, and this condition is equivalent to the existence of a winning strategy for player I in the finite parity game G(Aφ↓∅, S) of Figure 3. Clearly, player I has a unique memoryless winning strategy σ, which maps the vertex (q2, s0) to (q2, ∅, s0). The labeling process E on {c} derived from σ is shown in Figure 4. Four states and
transitions between them are first computed, yielding an incomplete process on {c}. A last state c is then added so as to obtain a complete process. The dashed transitions (and all dead transitions) are finally suppressed to yield the synthesized controller.
6 Conclusion
The logical formalism we have developed allows us to synthesize controllers for a large class of control objectives. All the constraints, such as maximally permissive or admissible controllers for open systems, are formulated as objectives. As it is, the class of controllers is left free, and we cannot, for example, deal with partial observation. The recent work of [3] offers two constructions that we can use to interpret the quantified mu-calculus relative to some fixed classes of labeling processes. The first construction, the quotient of automata, forces the labeling processes to be in some mu-calculus (definable) class. It can be seen as a generalization of the automata projection, and used instead. The quantified mu-calculus could hence be extended by constraining each quantifier to range
over some mu-calculus class. Nevertheless, the class of controllers under partial observation being undefinable in the mu-calculus, we need to consider the second construction: the quotient of automata over a process exhibits (when it exists) a controller under partial observation inside some mu-calculus class. The outermost quantification of a sentence is then made relative to some class of partial observation. Therefore, we can seek a controller under partial observation for open systems, but we cannot synthesize a maximally permissive controller among the controllers under partial observation.
References

1. Ramadge, P.J., Wonham, W.M.: The control of discrete event systems. Proceedings of the IEEE; Special Issue on Dynamics of Discrete Event Systems 77 (1989)
2. Kupferman, O., Madhusudan, P., Thiagarajan, P., Vardi, M.: Open systems in reactive environments: Control and synthesis. CONCUR 2000, LNCS 1877.
3. Arnold, A., Vincent, A., Walukiewicz, I.: Games for synthesis of controllers with partial observation. To appear in TCS (2003)
4. Vincent, A.: Synthèse de contrôleurs et stratégies gagnantes dans les jeux de parité. MSR 2001
5. Sistla, A., Vardi, M., Wolper, P.: The complementation problem for Büchi automata with applications to temporal logic. TCS 49 (1987)
6. Kupferman, O.: Augmenting branching temporal logics with existential quantification over atomic propositions. Journal of Logic and Computation 9 (1999)
7. Patthak, A.C., Bhattacharya, I., Dasgupta, A., Dasgupta, P., Chakrabarti, P.P.: Quantified computation tree logic. IPL 82 (2002)
8. Arnold, A., Niwinski, D.: Rudiments of mu-calculus. North-Holland (2001)
9. Walukiewicz, I.: Automata and logic. In: Notes from EEF Summer School '01. (2001)
10. Emerson, E.A., Jutla, C.S.: Tree automata, mu-calculus and determinacy. FOCS 1991. IEEE Computer Society Press (1991)
11. Muller, D.E., Schupp, P.E.: Simulating alternating tree automata by nondeterministic automata: New results and new proofs of the theorems of Rabin, McNaughton and Safra. TCS 141 (1995)
12. Rabin, M.O.: Decidability of second-order theories and automata on infinite trees. Trans. Amer. Math. Soc. 141 (1969)
13. Dietrich, P., Malik, R., Wonham, W., Brandin, B.: Implementation considerations in supervisory control. In: Synthesis and Control of Discrete Event Systems. Kluwer Academic Publishers (2002)
14. Bergeron, A.: A unified approach to control problems in discrete event processes. Theoretical Informatics and Applications 27 (1993)
15. Emerson, E.A., Sistla, A.P.: Deciding full branching time logic. Information and Control 61 (1984)
16. Emerson, E.A., Jutla, C.S., Sistla, A.P.: On model-checking for fragments of mu-calculus. CAV 1993, LNCS 697.
17. Streett, R.S., Emerson, E.A.: The propositional mu-calculus is elementary. ICALP 1984, LNCS 172.
18. Kupferman, O., Vardi, M.Y., Wolper, P.: An automata-theoretic approach to branching-time model checking. Journal of the ACM 47 (2000)
19. Thomas, W.: Automata on infinite objects. In Leeuwen, J.v., ed.: Handbook of TCS, vol. B. Elsevier Science Publishers (1990)
On Probabilistic Quantified Satisfiability Games

Marcin Rychlik
Institute of Informatics, Warsaw University
Banacha 2, 02-097 Warsaw, Poland
[email protected]
Abstract. We study the complexity of a new probabilistic variant of the Quantified Satisfiability (QSAT) problem. Let a sentence ∃v1∀v2 . . . ∃vn−1∀vn φ be given. In the classical game associated with the QSAT problem, the players ∃ and ∀ alternately choose Boolean values of the variables v1, . . . , vn. In our game one (or both) of the players can instead determine the probability that vi is true. We call such a player a probabilistic player, as opposed to a classical player. The payoff (of ∃) is the probability that the formula φ is true. We study the complexity of deciding whether ∃ (probabilistic or classical) has a strategy to achieve a payoff of at least c playing against ∀ (probabilistic or classical). We completely answer the question for the threshold c = 1, showing that the case when ∀ is probabilistic is easier to decide (Σ₂ᴾ-complete) than the remaining cases (PSPACE-complete). For thresholds c < 1 we have a number of partial results. We establish PSPACE-hardness of the question whether ∃ can win when only one of the players is probabilistic, and Σ₂ᴾ-hardness when both players are probabilistic. We also show that the set of thresholds c for which a related problem is in PSPACE is dense in [0, 1]. Finally, we study the set of reals c ∈ [0, 1] that can be game values of our games; it turns out to include all binary rationals, but also some irrational numbers.
1
Introduction
In this paper we study a certain probabilistic variant of the Quantified Satisfiability (QSAT) problem. Games with coin tosses (see e.g. [9],[3],[2],[5]) and games where players use randomized strategies (see e.g. [6],[11],[4]) have been widely considered in complexity theory. Many papers allow the players to choose probability distributions (mixed strategies [11],[4],[7],[8] or behavior strategies [6],[7]), but there the choices are made by the players just once per game, either independently or with just one alternation. A crucial difference between those works and ours is that in our framework the probabilities are chosen by the players in turn, each depending on the probabilities chosen so far. To our knowledge, such a situation has not been considered before. Quantified Satisfiability was studied in [1]. It can be considered as a game between two players, call them ∃ and ∀. Fix some Boolean formula φ(x1, . . . , xn).
Supported by Polish KBN grant No. 7 T11C 027 20
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 652–661, 2003. © Springer-Verlag Berlin Heidelberg 2003
The two players move alternately, with ∃ moving first. If i is odd then ∃ fixes the value of xi, whereas if i is even ∀ fixes the value of xi. ∃ tries to make the expression φ true, while ∀ tries to make it false. Then ∃ has a winning strategy iff ∃x1∀x2∃x3 . . . φ(x1, . . . , xn) is true. If we assume that ∀ is uninterested in winning and plays at random, then the game becomes a Game Against Nature, studied in [9] (see also [10]). The decisions of Nature are probabilistic in manner: Nature chooses xi = 0 or xi = 1, each with probability 1/2. In this case a winning strategy for ∃ is a strategy that makes the probability of success greater than 1/2. Both in the case of the game Quantified Satisfiability and in the case of the Game Against Nature the following problem is PSPACE-complete [9]: given φ, decide whether there exists a winning strategy for ∃. There is a difference between Games Against Nature and our probabilistic variant of the Quantified Satisfiability game. In Games Against Nature the players use deterministic (pure) strategies: at a particular node of the game, the player ∃ (playing against Nature) is required to make a strategic move, say to choose the side of a coin. Nature, in turn, is required to toss a coin, but the probabilities associated with the coin tosses are fixed in advance and not chosen by Nature. Hence coin tosses correspond to "chance moves" in standard game-theoretic terminology. In our game, the biases of the coins are chosen strategically, in turn, by both players. Once the biases of all the coins are determined, the coins are tossed. Thus the values of x1, . . . , xn are determined, and ∃ wins iff φ(x1, . . . , xn) is true. More specifically, we consider two types of players. A probabilistic player, instead of determining the value of xi, chooses the probability pi of xi being 1, where we assume that the events {xi = 1} and {xj = 1} are independent for i ≠ j.
A player using a classical strategy, i.e. choosing the values 0 or 1, can be viewed as a probabilistic player as well, restricted to pi = 0 or pi = 1. The chosen probabilities p1, p2, . . . , pn determine the probability P(φ) of the event that φ is true. Now ∃ tries to make P(φ) as large as possible, whereas ∀ tries to minimize P(φ), so P(φ) can be regarded as the payoff of ∃. Notice that in the classical Quantified Satisfiability game the payoff P(φ) can only be 0 or 1. The following computational problem arises: given a formula φ, decide whether ∃ can make P(φ) greater than a fixed threshold c ∈ [0, 1). We study this problem and related ones in this paper. We prove that the problem whether ∃ can make P(φ) = 1 is Σ₂ᴾ-complete (see e.g. [10]) when ∀ is probabilistic, and that this question is PSPACE-complete when ∀ is classical. We show that it is PSPACE-hard to tell whether a probabilistic ∃ can enforce P(φ) ≥ c when the opponent ∀ is classical. Similarly, it is PSPACE-hard to tell whether a classical ∃ can make P(φ) > c when ∀ is probabilistic. In both cases we assume that the thresholds are fixed. We also present a Poly(|φ|, |log₂ ε|)-space algorithm which, given φ and ε > 0, returns a value that is ε-close to the maximal value of P(φ) attainable by ∃. We prove that for ⋈ ∈ {>, ≥} and for all types of players ∃ and ∀ (classical or probabilistic) the following set is
654
Marcin Rychlik
dense in [0, 1]: the set of constants c ∈ [0, 1] such that the language of Boolean formulas φ such that ∃ can make P(φ) ⋈ c is in PSPACE. For the proofs we refer the reader to [12].
2
Variants of the Problem of Quantified Satisfiability
Let V be a countable set of variables. Recall the definition of the set of Boolean formulas:

$$\Phi ::= 0 \;|\; 1 \;|\; V \;|\; {\sim}\Phi \;|\; (\Phi \vee \Phi) \;|\; (\Phi \wedge \Phi)\,.$$
Fix φ(v1, . . . , vn) ∈ Φ. Let xi ∈ {0, 1}, 1 ≤ i ≤ n. Then the meaning of φ(x1, . . . , xn) is the logical value of φ after replacing the variables v1, . . . , vn in φ by x1, . . . , xn respectively. Now let X1, . . . , Xn be pairwise independent random variables with range {0, 1}. Naturally, φ(X1, . . . , Xn) can be understood as the random variable with range {0, 1} such that P(φ(X1, . . . , Xn) = 1), also written P(φ(X1, . . . , Xn)) for short, equals the probability of the event that (X1, . . . , Xn) satisfies φ:

$$P(\varphi(X_1,\dots,X_n)=1)\;=\sum_{\substack{(x_1,\dots,x_n)\in\{0,1\}^n\\ \varphi(x_1,\dots,x_n)=1}}\;\prod_{i=1}^{n}P(X_i=x_i)\,.\qquad(1)$$
Note that P(φ(X1, . . . , Xn) = 1) is the expected value of φ(X1, . . . , Xn). In the sequel, P_{p1,...,pn}(φ) stands for P(φ(X1, . . . , Xn) = 1), where the Xi are arbitrary pairwise independent random variables satisfying P(Xi = 1) = pi, 1 ≤ i ≤ n. For all p1, . . . , pn ∈ [0, 1],

$$P_{p_1,\dots,p_n}(\varphi)\;=\sum_{\substack{(x_1,\dots,x_n)\in\{0,1\}^n\\ \varphi(x_1,\dots,x_n)=1}}\;\prod_{i=1}^{n}p_i^{\,x_i}\qquad(2)$$

where $p_i^{\,x_i} = p_i$ if $x_i = 1$ and $p_i^{\,x_i} = 1 - p_i$ if $x_i = 0$.

For the rest of this paper we shall assume that the range of the random variables we consider is {0, 1} and that differently named random variables are pairwise independent. For instance, X1 and X2 denote two pairwise independent random variables with range {0, 1}. We shall write φ(X1, . . . , Xn) as an abbreviation for P(φ(X1, . . . , Xn) = 1) = 1. Consider the following statement: "There is a random variable X such that for every random variable Y we have P(X ↔ Y) ≥ 1/2" (here we write φ1 ↔ φ2 as an abbreviation for ((φ1 ∨ ∼φ2) ∧ (∼φ1 ∨ φ2))). It is a true statement: consider the random variable X with P(X = 1) = 1/2. This statement can be rewritten as

$$\exists X\,\forall Y\;\; P(X \leftrightarrow Y) \ge \tfrac{1}{2}\,.$$
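Returning to equation (2): for small formulas it can be checked by direct enumeration of the satisfying assignments. Below is a minimal sketch in Python; the helper name `prob` and the encoding of φ as a Boolean-valued function are our own, not the paper's.

```python
from itertools import product

def prob(phi, ps):
    """P_{p1,...,pn}(phi) as in eq. (2): sum, over satisfying
    assignments, of the product of p_i (if x_i = 1) or 1 - p_i."""
    total = 0.0
    for xs in product((0, 1), repeat=len(ps)):
        if phi(*xs):
            w = 1.0
            for p, x in zip(ps, xs):
                w *= p if x else 1.0 - p
            total += w
    return total

# For phi = v1 OR v2 with p1 = p2 = 1/2, three of the four
# assignments satisfy phi, each of weight 1/4.
print(prob(lambda a, b: a or b, [0.5, 0.5]))  # 0.75
```

The enumeration takes time exponential in n, which is consistent with the remark below that an explicit expression for P(φ) can be exponentially large.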
We used uppercase letters X and Y to emphasize that they represent random variables. Sometimes we would like to state also: "There is a random variable X such that for every y ∈ {0, 1} we have P(X ↔ y) ≥ 1/2." This can be viewed as the previous statement with Y restricted to two random variables: one with P(Y = 1) = 1 and one with P(Y = 0) = 1. We will denote it by

$$\exists X\,\forall y\;\; P(X \leftrightarrow y) \ge \tfrac{1}{2}\,.$$
Here and subsequently, ∃X means that there is a random variable X, and ∃x means that there is a random variable x restricted to the two possibilities P(x = 1) = 1 or P(x = 0) = 1; similarly in the case of the quantifier ∀. We extend this notation to longer prefixes in the obvious way. Note that ∃x1∀y1∃x2 . . . φ has its usual meaning. Consider a formula of the form

$$Q_1 y_1\, Q_2 y_2\, Q_3 y_3 \dots Q_n y_n\;\; P(\varphi(y_1, y_2, y_3, \dots, y_n)) \bowtie c \qquad(3)$$

where ⋈ ∈ {≥, ≤, >, <}, c ∈ [0, 1), yi ∈ {xi, Xi}, Qi ∈ {∃, ∀}, 1 ≤ i ≤ n. We interpret formula (3) as a game between ∃ and ∀: the player Qi chooses yi in turn, for i = 1, . . . , n, and ∃ wins if P(φ(y1, y2, y3, . . . , yn)) ⋈ c holds after all the yi are chosen. ∃ has a winning strategy if he can make P(φ(y1, . . . , yn)) ⋈ c, and ∀ has a winning strategy if he can prevent P(φ(y1, . . . , yn)) ⋈ c. Obviously ∃ has a winning strategy iff formula (3) is true, and ∀ has a winning strategy iff the following formula is true:

$$\bar{Q}_1 y_1\, \bar{Q}_2 y_2 \dots \bar{Q}_n y_n\;\; \neg\big(P(\varphi(y_1, \dots, y_n)) \bowtie c\big)$$

where Q̄i is ∃ if Qi = ∀, and Q̄i is ∀ if Qi = ∃, 1 ≤ i ≤ n. In the case of ⋈ being '≥' or '>', ∃ tries to make P(φ(y1, . . . , yn)) as big as possible, and then it is natural to call P(φ(y1, . . . , yn)) the payoff of ∃. If yi = Xi for every yi chosen by ∃, then we call ∃ a probabilistic player and say that he uses a probabilistic strategy. If yi = xi for every yi chosen by ∃, then we call ∃ a classical player and say that he uses a classical strategy. We use similar terminology for the player ∀. For the rules of the game described in the introduction we can consider the following problem.

Problem 1. Fix c ∈ [0, 1). Given a Boolean formula φ, decide whether

$$\exists X_1\, \forall X_2\, \exists X_3 \dots Q_n X_n\;\; P(\varphi(X_1, X_2, X_3, \dots, X_n)) > c \qquad(4)$$

where the n-th quantifier Qn is ∃ if n is odd, and ∀ if n is even. In the case of a threshold c given by a finitely representable rational number, decidability of Problem 1 and of similar ones follows from Tarski's theorem on the decidability of the first-order theory of the field of real numbers. For example, we can rewrite the formula ∃X∀Y P(X ↔ Y) ≥ 1/2 as the following sentence of the theory of reals:

$$\exists p_X\, (0 \le p_X \le 1) \wedge \forall p_Y\, \big[(0 \le p_Y \le 1) \Rightarrow p_X p_Y + (1 - p_X)(1 - p_Y) \ge \tfrac{1}{2}\big]\,.$$
In general, an expression representing P(φ(X1, X2, X3, . . . , Xn)) can be of exponential size with respect to the size of φ. The following problem is PSPACE-complete [1].

Problem 2 (Quantified Satisfiability). Given a formula φ, decide whether ∃x1∀x2∃x3 . . . Qnxn φ.

One might conjecture that ∃X1∀X2∃X3 . . . QnXn φ(X1, X2, X3, . . . , Xn) is equivalent to ∃x1∀x2 . . . Qnxn φ(x1, x2, x3, . . . , xn), but this is not true, as the following example shows.

Example 1. Let φ = v1 ↔ v2. Then ∀x1∃x2 φ(x1, x2) is true, but it is not true that ∀X1∃X2 φ(X1, X2), because if P(X1 = 1) = 1/2 then

$$P(\varphi(X_1, X_2)) = P(X_1 = 0)P(X_2 = 0) + P(X_1 = 1)P(X_2 = 1) = \tfrac{1}{2}P(X_2 = 0) + \tfrac{1}{2}P(X_2 = 1) = \tfrac{1}{2} < 1$$

whatever X2 is chosen.
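Example 1 can be replayed numerically: once ∀ fixes P(X₁ = 1) = 1/2, the payoff is 1/2 whatever distribution ∃ picks for X₂. A small self-contained check of our own, with φ encoded as a Python function:

```python
from itertools import product

def prob(phi, ps):
    # eq. (2): sum over satisfying assignments of products of p_i / (1 - p_i)
    total = 0.0
    for xs in product((0, 1), repeat=len(ps)):
        if phi(*xs):
            w = 1.0
            for p, x in zip(ps, xs):
                w *= p if x else 1.0 - p
            total += w
    return total

iff = lambda a, b: a == b  # v1 <-> v2
for p2 in (0.0, 0.25, 0.5, 0.9, 1.0):
    assert abs(prob(iff, [0.5, p2]) - 0.5) < 1e-12
```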
The next example shows that for some Boolean formulas φ the quantified formula ∃x1∀x2 . . . Qnxn φ is true whereas ∃X1∀X2 . . . QnXn P(φ(X1, . . . , Xn)) ≥ c is true only when c is negligible.

Example 2. Let $\varphi = \bigwedge_{i=1}^{n}(v_{2i-1} \leftrightarrow v_{2i})$. Then ∀x1∃x2 . . . ∃x2n φ(x1, . . . , x2n) is true, but ∀X1∃X2 . . . ∃X2n P(φ(X1, . . . , X2n)) ≥ c is not true unless c ≤ 1/2ⁿ. If we set P(X_{2i−1} = 1) = 1/2 for all 1 ≤ i ≤ n, then P(X_{2i−1} ↔ X_{2i}) = 1/2 for all 1 ≤ i ≤ n (see the previous example) and in consequence $P(\varphi(X_1,\dots,X_{2n})) = \prod_{i=1}^{n} P(X_{2i-1} \leftrightarrow X_{2i}) = \tfrac{1}{2^n}$, no matter how ∃ chooses X2, . . . , X2n. We used the fact that for arbitrary Boolean formulas φ1(v1, . . . , vn) and φ2(w1, . . . , wm), P(φ1(X1, . . . , Xn) ∧ φ2(Y1, . . . , Ym)) = P(φ1(X1, . . . , Xn)) · P(φ2(Y1, . . . , Ym)) when the Xi and Yi are pairwise independent random variables.
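The product identity used above is easy to confirm on disjoint blocks of variables: with the odd-indexed probabilities pinned to 1/2, the conjunction of n biconditionals has probability exactly 2⁻ⁿ regardless of the even-indexed choices. A small numerical check of our own for n = 2:

```python
from itertools import product

def prob(phi, ps):
    # eq. (2): brute-force sum over satisfying assignments
    total = 0.0
    for xs in product((0, 1), repeat=len(ps)):
        if phi(*xs):
            w = 1.0
            for p, x in zip(ps, xs):
                w *= p if x else 1.0 - p
            total += w
    return total

# phi = (v1 <-> v2) AND (v3 <-> v4); odd positions get probability 1/2
phi = lambda a, b, c, d: (a == b) and (c == d)
for p2, p4 in [(0.0, 1.0), (0.3, 0.8), (0.5, 0.5)]:
    assert abs(prob(phi, [0.5, p2, 0.5, p4]) - 0.25) < 1e-12
```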
The example above may seem to suggest that if a player has no winning strategy, then the best he can do is to always choose probability 1/2. The following example illustrates that this need not be the case.

Example 3. Consider a formula φ(v1, v2, v3, v4) such that φ(x1, x2, x3, x4) is true if and only if

(x1, x2, x3, x4) ∈ {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 1, 1, 0), (1, 1, 1, 0), (1, 0, 0, 1), (0, 1, 0, 1), (1, 1, 0, 1)}.
One can check that ∃x1∀x2∃x3∀x4 φ(x1, x2, x3, x4) is not true. The value F(p) defined by

$$F(p) = \begin{cases} p & \text{if } 0 \le p \le \tfrac{1}{2} \\[4pt] \dfrac{-1+3p-p^2-p^3+p^2\sqrt{-1+2p+p^2}}{2p\sqrt{-1+2p+p^2}} & \text{if } \tfrac{1}{2} \le p \le 1 \end{cases}$$

is the maximal value of P(φ) available to ∃ when ∃ chooses P(X1 = 1) = p in the first move. The computation is explained in [12] (see also Example 4). The value of p maximizing F(p) is

$$p^* = \tfrac{1}{6}\sqrt[3]{53 - 6\sqrt{78}} + \tfrac{1}{6}\sqrt[3]{53 + 6\sqrt{78}} - \tfrac{1}{6} \approx 0.657298 \ne \tfrac{1}{2}\,.$$

If ∃ chose P(X1 = 1) = 1/2 instead, then he could attain
at most F(1/2) = 1/2, which is less than F(p*) ≈ 0.553906. It is worth noting that both p* and F(p*) are irrational¹. It may come as a slight surprise that the problem of deciding whether ∃X1∀X2∃X3 . . . QnXn φ is not as hard as QSAT, unless PSPACE collapses to the second level of the polynomial hierarchy (see also the Summary below):

Theorem 1.

$$\exists x_1 \exists x_3 \dots \exists x_\iota\, \forall x_2 \forall x_4 \dots \forall x_\kappa\; \varphi \;\Leftrightarrow\; \exists x_1 \forall X_2 \exists x_3 \dots Q_n \chi_n\; \varphi \;\Leftrightarrow\; \exists x_1 \forall X_2 \exists x_3 \dots Q_n \chi_n\; P(\varphi) > 1 - \tfrac{1}{2^{n/2}}$$
$$\Leftrightarrow\; \exists X_1 \forall X_2 \exists X_3 \dots Q_n X_n\; \varphi \;\Leftrightarrow\; \exists X_1 \forall X_2 \exists X_3 \dots Q_n X_n\; P(\varphi) > 1 - \tfrac{1}{2^{n/2}}$$

where χn is xn if n is odd and Xn if n is even, and ι = n, κ = n − 1 if n is odd, and ι = n − 1, κ = n if n is even.

The following theorem shows that if the player ∀ is classical, then a probabilistic strategy does not add power to ∃ when the threshold c is set to 1.

Theorem 2.

$$\exists x_1 \forall x_2 \exists x_3 \dots Q_n x_n\; \varphi \;\Leftrightarrow\; \exists X_1 \forall x_2 \exists X_3 \dots Q_n \kappa_n\; \varphi \;\Leftrightarrow\; \exists X_1 \forall x_2 \exists X_3 \dots Q_n \kappa_n\; P(\varphi) > 1 - \tfrac{1}{2^{n/2}}$$

where κn is Xn if n is odd and xn if n is even.
¹ We used the command FullSimplify[x ∈ Rationals] in Mathematica ver. 4.0.1.0 (Wolfram Research, Inc.) to get this result.
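The claim of Example 3 can be sanity-checked numerically by nesting a grid search over the continuous choices; since P(φ) is linear in p₄, the innermost minimum may be taken over the endpoints {0, 1}. The grid-search helper below is our own sketch, not the paper's polynomial-space algorithm:

```python
from itertools import product

# Satisfying assignments of phi from Example 3
SAT = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 1, 1, 0),
       (1, 1, 1, 0), (1, 0, 0, 1), (0, 1, 0, 1), (1, 1, 0, 1)}

def prob(ps):
    # eq. (2) specialized to the 4-variable formula phi
    total = 0.0
    for xs in product((0, 1), repeat=4):
        if xs in SAT:
            w = 1.0
            for p, x in zip(ps, xs):
                w *= p if x else 1.0 - p
            total += w
    return total

GRID = [i / 100 for i in range(101)]

def F(p1):
    """Approximate min_{p2} max_{p3} min_{p4} P_{p1,p2,p3,p4}(phi)."""
    return min(max(min(prob((p1, p2, p3, p4)) for p4 in (0.0, 1.0))
                   for p3 in GRID)
               for p2 in GRID)
```

On this grid, F(0.3) comes out close to 0.3 and F(0.657298) close to 0.554, matching the values stated in Example 3 up to the grid resolution.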
3
Game Value
Definition. Let φ(v1, . . . , vn) ∈ Φ.

$$c_\varphi = \max_{p_1\in[0,1]}\ \min_{p_2\in[0,1]}\ \cdots\ \mathrm{Opt}_{p_n\in[0,1]}\ P_{p_1,\dots,p_n}(\varphi) \qquad(5)$$

$$\overline{c}_\varphi = \max_{p_1\in[0,1]}\ \min_{p_2\in\{0,1\}}\ \cdots\ \mathrm{Opt}_{p_n\in\Delta_n}\ P_{p_1,\dots,p_n}(\varphi) \qquad(6)$$

$$\underline{c}_\varphi = \max_{p_1\in\{0,1\}}\ \min_{p_2\in[0,1]}\ \cdots\ \mathrm{Opt}_{p_n\in\Lambda_n}\ P_{p_1,\dots,p_n}(\varphi) \qquad(7)$$

where Opt, Δn, Λn are max, [0, 1], {0, 1} respectively if n is odd, and min, {0, 1}, [0, 1] if n is even. Let
$$\Gamma = \{c_\varphi : \varphi \in \Phi\},\quad \overline{\Gamma} = \{\overline{c}_\varphi : \varphi \in \Phi\},\quad \underline{\Gamma} = \{\underline{c}_\varphi : \varphi \in \Phi\}.$$

The values on the right-hand sides of formulas (5), (6) and (7), call them the game values, are well defined because the sets [0, 1] and {0, 1} are compact and for 1 < i ≤ n the following maps are continuous with respect to p1, . . . , pi (here Opt_j denotes max if j is odd and min if j is even):

$$(p_1,\dots,p_i) \;\mapsto\; \mathrm{Opt}_{i+1,\,p_{i+1}\in[0,1]}\cdots\,\mathrm{Opt}_{n,\,p_n\in[0,1]}\; P_{p_1,\dots,p_n}(\varphi)$$
$$(p_1,\dots,p_i) \;\mapsto\; \mathrm{Opt}_{i+1,\,p_{i+1}\in\Delta_{i+1}}\cdots\,\mathrm{Opt}_{n,\,p_n\in\Delta_n}\; P_{p_1,\dots,p_n}(\varphi)$$
$$(p_1,\dots,p_i) \;\mapsto\; \mathrm{Opt}_{i+1,\,p_{i+1}\in\Lambda_{i+1}}\cdots\,\mathrm{Opt}_{n,\,p_n\in\Lambda_n}\; P_{p_1,\dots,p_n}(\varphi)\,.$$

P_{p1,...,pn}(φ) itself is continuous (the case i = n) because it is a multilinear map (recall (2)). The continuity of the maps in the case i < n can be proved inductively using the following lemma.

Lemma 1. Assume f : S × T → ℝ is a continuous map and S, T are compact spaces. Then F defined by F(s) = max_{t∈T} f(s, t) is also continuous.
The values $c_\varphi$, $\overline{c}_\varphi$, $\underline{c}_\varphi$ defined by (5), (6) and (7) are the maximal attainable payoffs of ∃ in the corresponding games. To see this, observe that if f(p) is the payoff of the player corresponding to a choice p ∈ P, where P is the compact set of all possible choices, then F = max_{p∈P} f(p) is the maximal attainable payoff of the player, provided f is a continuous map.

Example 4. Let φ be as in Example 3. Then

$$c_\varphi = \max_{p_1\in[0,1]}\,\min_{p_2\in[0,1]}\,\max_{p_3\in[0,1]}\,\min_{p_4\in[0,1]}\, P_{p_1,p_2,p_3,p_4}(\varphi) = F(p^*)$$

$$\overline{c}_\varphi = \max_{p_1\in[0,1]}\,\min_{p_2\in\{0,1\}}\,\max_{p_3\in[0,1]}\,\min_{p_4\in\{0,1\}}\, P_{p_1,p_2,p_3,p_4}(\varphi) = \tfrac{1}{2}\big(\sqrt{5} - 1\big) \approx 0.618034$$

where F and p* are defined in Example 3.
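Example 4's second value can likewise be approximated numerically: restricting ∀ to p₂, p₄ ∈ {0, 1} while ∃ searches p₁, p₃ over a grid yields a value close to (√5 − 1)/2. This grid search is our own sanity check, not the paper's computation:

```python
from itertools import product

# Satisfying assignments of phi from Example 3
SAT = {(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 1, 1, 0),
       (1, 1, 1, 0), (1, 0, 0, 1), (0, 1, 0, 1), (1, 1, 0, 1)}

def prob(ps):
    total = 0.0
    for xs in product((0, 1), repeat=4):
        if xs in SAT:
            w = 1.0
            for p, x in zip(ps, xs):
                w *= p if x else 1.0 - p
            total += w
    return total

GRID = [i / 100 for i in range(101)]

# Classical opponent: p2 and p4 range over {0, 1} only
c_bar = max(min(max(min(prob((p1, p2, p3, p4)) for p4 in (0.0, 1.0))
                    for p3 in GRID)
                for p2 in (0.0, 1.0))
            for p1 in GRID)
```

On this grid, `c_bar` comes out near 0.618, the golden-ratio value stated above, and in particular above the fully probabilistic value F(p*) ≈ 0.554, consistent with the inequality $\underline{c}_\varphi \le c_\varphi \le \overline{c}_\varphi$.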
One can easily check that for every formula φ the following equations hold, relating the game values for φ and ∼φ:

$$1 = c_{\varphi(v_1,\dots,v_n)} + c_{\sim\varphi(v_0,v_1,\dots,v_n)} = \overline{c}_{\varphi(v_1,\dots,v_n)} + \underline{c}_{\sim\varphi(v_0,v_1,\dots,v_n)} = \overline{c}_{\sim\varphi(v_0,v_1,\dots,v_n)} + \underline{c}_{\varphi(v_1,\dots,v_n)} \qquad(8)$$

where we used a dummy variable v0, not occurring in φ, to force x1 or X1 (according to the type of the game) to be chosen by ∀. Observe that by (8) we have $\underline{\Gamma} = \{1 - \gamma : \gamma \in \overline{\Gamma}\}$. We also have the inequalities $\underline{c}_\varphi \le c_\varphi \le \overline{c}_\varphi$.

Theorem 3. For every $c \in \overline{\Gamma} \setminus \{0\}$ the following problem is PSPACE-hard: given φ, decide whether ∃X1∀x2∃X3 . . . Qnκn P(φ) ≥ c.

Theorem 4. For every $c \in \underline{\Gamma} \setminus \{1\}$ the following problem is PSPACE-hard: given φ, decide whether ∃x1∀X2∃x3 . . . Qnχn P(φ) > c.

Theorems 1, 2, 3, 4 are summarized below. We rephrase them in game-theoretic terms; that is, the problem concerning ∃X1∀x2∃X3 . . . Qnκn P(φ) > c is considered as the problem of ∃ using a probabilistic strategy against ∀ using a classical strategy, and similarly for the other cases.

Summary of the complexity results. Assume φ is given and c is an arbitrary fixed number c ∈ [0, 1), until otherwise stated. We put three questions: can ∃ make (i) P(φ) = 1, (ii) P(φ) > c, (iii) P(φ) ≥ c? Our complexity results (when one or both players are probabilistic) depend on the types of strategies that the players use. (Of course, if both players are classical, the results are obvious consequences of the PSPACE-completeness of QSAT.) P(φ) = 1
∃ \ ∀         | Probabilistic   | Classical
Classical     | Σ₂ᴾ-complete    | PSPACE-complete
Probabilistic | Σ₂ᴾ-complete    | PSPACE-complete

P(φ) > c

∃ \ ∀         | Probabilistic   | Classical
Classical     | PSPACE-hard**   | PSPACE-complete
Probabilistic | Σ₂ᴾ-hard*       | PSPACE-hard*

P(φ) ≥ c

∃ \ ∀         | Probabilistic   | Classical
Classical     | ?               | PSPACE-complete
Probabilistic | ?               | PSPACE-hard***

* when c is part of the input
** when $c \in \underline{\Gamma} \setminus \{1\}$
*** when $c \in \overline{\Gamma} \setminus \{0\}$
The next theorem yields partial information concerning the shape of the sets $\Gamma$, $\overline{\Gamma}$ and $\underline{\Gamma}$. A number b is a binary rational if $b = \sum_{i=1}^{n} b_i \frac{1}{2^i}$ for some n and some b1, . . . , bn ∈ {0, 1}. Let Υ be the set of all binary rationals in [0, 1].

Theorem 5. $\Upsilon \subsetneq \Gamma$, $\Upsilon \subsetneq \overline{\Gamma}$ and $\Upsilon \subsetneq \underline{\Gamma}$.

Corollary. The sets $\Gamma$, $\overline{\Gamma}$ and $\underline{\Gamma}$ are dense subsets of the interval [0, 1].

We say that λ′ is ε-close to λ if |λ − λ′| ≤ ε.

Theorem 6. Let Δi = [0, 1] or Δi = {0, 1} for every 1 ≤ i ≤ n. Given φ(x1, . . . , xn) and ε > 0, we can compute in O(log₂|φ| + n log₂ n + n|log₂ ε|) space a number λ′ that is ε-close to $\lambda = \max_{p_1\in\Delta_1}\,\min_{p_2\in\Delta_2}\,\max_{p_3\in\Delta_3}\cdots\,\mathrm{Opt}_{p_n\in\Delta_n}\, P_{p_1,\dots,p_n}(\varphi)$.
In particular, we can compute approximations of the game values $c_\varphi$, $\overline{c}_\varphi$, $\underline{c}_\varphi$ within the bound just mentioned. One may ask whether Theorem 6 can be used to solve Problem 1 in polynomial space, at least for some c. Lemma 2 enables us to give an affirmative answer to this question.

Lemma 2. Let D ⊆ Σ* be a language over a finite alphabet Σ, |Σ| ≥ 2, and let P be a map P : D → [0, 1]. Assume that for a given d ∈ D we can compute in space O(Poly(|d|, |log ε|)) a value P(d, ε) that is ε-close to P(d). Let ⋈ ∈ {≥, >}. Then the set {c ∈ [0, 1] : the language {d ∈ D | P(d) ⋈ c} is in PSPACE} is a dense subset of [0, 1].

As a corollary we get:

Theorem 7. Let ⋈ ∈ {≥, >}. The sets

{c ∈ [0, 1] : the language {φ ∈ Φ | $c_\varphi$ ⋈ c} is in PSPACE}
{c ∈ [0, 1] : the language {φ ∈ Φ | $\overline{c}_\varphi$ ⋈ c} is in PSPACE}
{c ∈ [0, 1] : the language {φ ∈ Φ | $\underline{c}_\varphi$ ⋈ c} is in PSPACE}

are dense subsets of [0, 1].
4
Conclusion
We have completely answered the question of the complexity of the problem whether ∃ has a strategy to achieve payoff 1, for all combinations of types of players. (For both players classical this is the classical QSAT problem.) We have shown PSPACE-hardness of the question whether a classical ∃ can make the payoff greater than a fixed c when ∀ uses a probabilistic strategy. In the case of a probabilistic ∃ and a classical ∀ we need c to be part of the input to
get PSPACE-hardness. We have a PSPACE-hardness result in the case of fixed c when we ask whether ∃ can make the payoff greater than or equal to c. We have given a Σ₂ᴾ lower bound for the question "P(φ) > c?" in the case of both players being probabilistic and c belonging to the input. We have also indicated that for every problem mentioned it is possible to find a dense subset of thresholds for which the problem is in PSPACE. Still, many problems remain open. It would be nice to have a PSPACE-completeness result for the question "P(φ) > c?" or "P(φ) ≥ c?" for some fixed c (c = 1/2, for instance) and for all combinations of types of players. Also, the complexity of the problem of computing approximations of game values (or exact values, if possible) remains to be studied. This is the subject of ongoing research.
Acknowledgement. The author wishes to express his thanks to Prof. Damian Niwiński for many stimulating conversations.
References
1. A. Chandra, D. Kozen, and L. Stockmeyer, Alternation, Journal of the ACM, 28 (1981), pp. 114-133
2. A. Condon, Computational Models of Games, ACM Distinguished Dissertation, MIT Press, Cambridge, 1989
3. A. Condon and R. Ladner, Probabilistic Game Automata, Proceedings of the 1st Structure in Complexity Theory Conference, Lecture Notes in Computer Science, vol. 223, Springer, Berlin, 1986, pp. 144-162
4. J. Feigenbaum, D. Koller, P. Shor, A Game-Theoretic Classification of Interactive Complexity Classes, Proceedings of the 10th Annual IEEE Conference on Structure in Complexity Theory (STRUCTURES), Minneapolis, Minnesota, June 1995, pp. 227-237
5. S. Goldwasser, M. Sipser, Private coins versus public coins in interactive proof systems, Randomness and Computation, S. Micali, editor, vol. 5 of Advances in Computing Research, JAI Press, Greenwich, 1989, pp. 73-90
6. D. Koller, N. Megiddo, The Complexity of Two-Person Zero-Sum Games in Extensive Form, Games and Economic Behavior, 4:528-552, 1992
7. D. Koller, N. Megiddo, B. von Stengel, Fast Algorithms for Finding Randomized Strategies in Game Trees, Proceedings of the 26th Symposium on Theory of Computing, ACM, New York, 1994, pp. 750-759
8. R. Lipton, N. Young, Simple strategies for large zero-sum games with applications to complexity theory, Contributions to the Theory of Games II, H. Kuhn, A. Tucker, editors, Princeton University Press, Princeton, 1953, pp. 193-216
9. C. Papadimitriou, Games Against Nature, Journal of Computer and System Sciences, 31 (1985), pp. 288-301
10. C. Papadimitriou, Computational Complexity, Addison-Wesley Pub. Co., 1994
11. C. Papadimitriou, M. Yannakakis, On Complexity as Bounded Rationality, Proceedings of the 26th Symposium on Theory of Computing, ACM, New York, 1994, pp. 726-733
12. M. Rychlik, On Probabilistic Quantified Satisfiability Games, available at http://www.mimuw.edu.pl/~mrychlik/papers
A Completeness Property of Wilke's Tree Algebras

Saeed Salehi
Turku Center for Computer Science
Lemminkäisenkatu 14 A
FIN-20520 Turku
[email protected]
Abstract. Wilke's tree algebra formalism for characterizing families of tree languages is based on six operations involving letters, binary trees and binary contexts. In this paper a completeness property of these operations is studied. It is claimed that all functions involving letters, binary trees and binary contexts which preserve all syntactic tree algebra congruence relations of tree languages are generated by Wilke's functions, provided the alphabet contains at least seven letters. The long proof is omitted due to the page limit. Instead, a corresponding theorem for term algebras, which yields a special case of the above-mentioned theorem, is proved: in every term algebra whose signature contains at least seven constant symbols, all congruence preserving functions are term functions.
1
Introduction
A new formalism for characterizing families of tree languages was introduced by Wilke [13], which can be regarded as a combination of the universal algebraic framework of Steinby [11] and Almeida [1], in the case of binary trees, based on syntactic algebras, and the syntactic monoid/semigroup framework of Thomas [12] and of Nivat and Podelski [8],[9]. It is based on three-sorted algebras, whose signature Σ consists of six operation symbols involving the sorts Alphabet, Tree and Context. Binary trees over an alphabet are represented by terms over Σ, namely as Σ-terms of sort Tree. A tree algebra is a Σ-algebra satisfying certain identities which identify (some) pairs of Σ-terms representing the same tree. The syntactic tree algebra congruence relation of a tree language is defined in a natural way (see Definition 1 below). The Tree-sort component of the syntactic tree algebra of a tree language is the syntactic algebra of the language in the sense of [11], while its Context-component is the semigroup part of the syntactic monoid of the tree language, as in [12]. A tree language is regular iff its syntactic tree algebra is finite ([13], Proposition 2). A special sub-class of regular tree languages, that of k-frontier testable tree languages, is characterized in [13] by a set of identities satisfied by the corresponding syntactic tree algebra. For characterizing this sub-class, the three-sorted tree algebra framework appears to be more suitable, since "frontier testable tree languages cannot be characterized by syntactic semigroups and there is no known finite characterization of frontier testability (for an arbitrary k) in the universal algebra framework" [7].

B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 662–670, 2003. © Springer-Verlag Berlin Heidelberg 2003

This paper concerns Wilke's functions (Definition 2), by which the tree algebra formalism for characterizing families of tree languages is established ([13]). We claim that Wilke's functions generate all congruence preserving operations on the term algebra of trees when the alphabet contains at least seven labels. For the sake of brevity, we do not treat tree languages and Wilke's functions in the many-sorted algebra framework as done in [13]; our approach is rather a continuation of the traditional framework, as in e.g. [11]. A more comprehensive general study of tree algebras and Wilke's formalism (independent from this work) has been initiated by Steinby and Salehi [10].
2
Tree Algebraic Functions
For an alphabet A, let Σᴬ be the signature which contains a constant symbol c_a and a binary function symbol f_a for every a ∈ A, that is, Σᴬ = {c_a | a ∈ A} ∪ {f_a | a ∈ A}. The set of binary trees over A, denoted by T_A, is defined inductively by:
– c_a ∈ T_A for every a ∈ A; and
– f_a(t1, t2) ∈ T_A whenever t1, t2 ∈ T_A and a ∈ A.
Fix a new symbol ξ which does not appear in A. Binary contexts over A are binary trees over A ∪ {ξ} in which ξ appears exactly once. The set of binary contexts over A, denoted by C_A, can be defined inductively by:
– ξ, f_a(t, ξ), f_a(ξ, t) ∈ C_A for every a ∈ A and every t ∈ T_A; and
– f_a(t, p), f_a(p, t) ∈ C_A for every a ∈ A, every t ∈ T_A, and every p ∈ C_A.
For p, q ∈ C_A and t ∈ T_A, p(q) ∈ C_A and p(t) ∈ T_A are obtained from p by replacing the single occurrence of ξ by q or by t, respectively.

Definition 1. For a tree language L ⊆ T_A we define the syntactic tree algebra congruence relation of L, denoted by ≈ᴸ = (≈ᴸ_A, ≈ᴸ_T, ≈ᴸ_C), as follows:
1. For any a, b ∈ A, a ≈ᴸ_A b ≡ ∀p ∈ C_A {p(c_a) ∈ L ↔ p(c_b) ∈ L} & ∀p ∈ C_A ∀t1, t2 ∈ T_A {p(f_a(t1, t2)) ∈ L ↔ p(f_b(t1, t2)) ∈ L}.
2. For any t, s ∈ T_A, t ≈ᴸ_T s ≡ ∀p ∈ C_A {p(t) ∈ L ↔ p(s) ∈ L}.
3. For any p, q ∈ C_A, p ≈ᴸ_C q ≡ ∀r ∈ C_A ∀t ∈ T_A {r(p(t)) ∈ L ↔ r(q(t)) ∈ L}.

Remark 1. Our definition of the syntactic tree algebra congruence relation of a tree language is that of [13], but we have corrected a mistake in Wilke's definition of ≈ᴸ_A; it is easy to see that the original definition (page 72 of [13]) does not yield a congruence relation. Another difference is that ξ is not a context in [13].
Definition 2. ([13], page 88) For an alphabet A, Wilke's functions over A are defined by:

ιᴬ : A → T_A,          ιᴬ(a) = c_a
κᴬ : A × T_A² → T_A,   κᴬ(a, t1, t2) = f_a(t1, t2)
λᴬ : A × T_A → C_A,    λᴬ(a, t) = f_a(ξ, t)
ρᴬ : A × T_A → C_A,    ρᴬ(a, t) = f_a(t, ξ)
σᴬ : C_A² → C_A,       σᴬ(p1, p2) = p1(p2)
ηᴬ : C_A × T_A → T_A,  ηᴬ(p, t) = p(t)
Recall that the projection functions πⁿⱼ : B1 × · · · × Bn → Bj (for sets B1, · · · , Bn) are defined by πⁿⱼ(b1, · · · , bn) = bj. For b ∈ Bj, the constant function from B1 × · · · × Bn to Bj determined by b is defined by (b1, · · · , bn) ↦ b.

Definition 3. For an alphabet A, a function F : Aⁿ × T_Aᵐ × C_Aᵏ → X where X ∈ {A, T_A, C_A} is called tree-algebraic over A if it is a composition of Wilke's functions over A, constant functions and projection functions.

Example 1. Let A = {a, b}. The function F : A × T_A × C_A → C_A defined by
F(x, t, p) = f_a( f_x(f_b(c_a, c_a), ξ), p(f_b(t, c_x)) ),
for x ∈ A, t ∈ T_A and p ∈ C_A, is tree-algebraic over A. Indeed,
F(x, t, p) = σᴬ( λᴬ(a, ηᴬ(p, κᴬ(b, t, ιᴬ(x)))), ρᴬ(x, f_b(c_a, c_a)) ).

Definition 4. A function F : Aⁿ × T_Aᵐ × C_Aᵏ → X where X ∈ {A, T_A, C_A} is called congruence preserving over A if for every tree language L ⊆ T_A and for all a1, b1, · · · , an, bn ∈ A, t1, s1, · · · , tm, sm ∈ T_A, p1, q1, · · · , pk, qk ∈ C_A,
if a1 ≈ᴸ_A b1, · · · , an ≈ᴸ_A bn, t1 ≈ᴸ_T s1, · · · , tm ≈ᴸ_T sm, and p1 ≈ᴸ_C q1, · · · , pk ≈ᴸ_C qk,
then F(a1, · · · , an, t1, · · · , tm, p1, · · · , pk) ≈ᴸ_x F(b1, · · · , bn, s1, · · · , sm, q1, · · · , qk),
where x is A, T, or C, if X = A, X = T_A, or X = C_A, respectively.

Remark 2. In universal algebra, the functions which preserve the congruence relations of an algebra are called congruence preserving functions. On the other hand, it is known that every congruence relation on an algebra is the intersection of some syntactic congruence relations (see Remark 2.12 of [1] or Lemma 6.2 of [11]). So a function preserves all congruence relations of an algebra iff it preserves the syntactic congruence relations of all subsets of the algebra. This justifies the notion of congruence preserving function in our Definition 4, even though we require only that the function preserves all the syntactic tree algebra congruence relations of tree languages.
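Wilke's six functions are easy to realize concretely on trees represented as nested tuples. The sketch below is our own encoding (the hole ξ is a sentinel value, and the Python names `iota`, `kappa`, `lam`, `rho`, `apply_ctx` are ours); it can be used to replay the decomposition given in Example 1 above:

```python
# Trees: ('c', a) is the leaf c_a; ('f', a, l, r) is f_a(l, r).
# Contexts reuse the same shapes, with XI marking the unique hole.
XI = ('xi',)

def iota(a):          return ('c', a)          # iota^A(a) = c_a
def kappa(a, t1, t2): return ('f', a, t1, t2)  # kappa^A(a, t1, t2) = f_a(t1, t2)
def lam(a, t):        return ('f', a, XI, t)   # lambda^A(a, t) = f_a(xi, t)
def rho(a, t):        return ('f', a, t, XI)   # rho^A(a, t) = f_a(t, xi)

def apply_ctx(p, x):
    """sigma^A / eta^A: replace the unique hole of context p by x."""
    if p == XI:
        return x
    if p[0] == 'c':
        return p
    _, a, l, r = p
    return ('f', a, apply_ctx(l, x), apply_ctx(r, x))

sigma = eta = apply_ctx

# Example 1 instantiated at x = 'b', t = c_b, p = f_a(xi, c_a):
x, t, p = 'b', iota('b'), lam('a', iota('a'))
composed = sigma(lam('a', eta(p, kappa('b', t, iota(x)))),
                 rho(x, kappa('b', iota('a'), iota('a'))))
# Direct reading: f_a( f_x(f_b(c_a, c_a), xi), p(f_b(t, c_x)) )
direct = ('f', 'a',
          ('f', x, ('f', 'b', ('c', 'a'), ('c', 'a')), XI),
          apply_ctx(p, ('f', 'b', t, ('c', x))))
assert composed == direct
```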
Example 2. For A = {a, b}, the root function root : T_A → A, which maps a tree to its root label, is not congruence preserving. Let L = {f_a(c_b, c_b)}. Then f_a(c_a, c_a) ≈ᴸ_T f_b(c_a, c_a), but since f_a(c_b, c_b) ∈ L and f_b(c_b, c_b) ∉ L, we have root(f_a(c_a, c_a)) = a ≉ᴸ_A b = root(f_b(c_a, c_a)).

Lemma 1. All tree-algebraic functions are congruence preserving.

The easy proof is omitted. We claim the converse for alphabets containing at least seven labels:

Theorem 1. For an alphabet A which contains at least seven labels, every congruence preserving function over A is tree-algebraic over A.

Remark 3. The condition |A| ≥ 7 in Theorem 1 may seem odd at first glance, but the theorem does not hold for |A| = 2: let A = {a, b} and define F : A → T_A by F(a) = f_a(c_b, c_b), F(b) = f_b(c_a, c_a). It can easily be seen that F is congruence preserving but not tree-algebraic over A. It is not clear at the moment whether Theorem 1 holds for 3 ≤ |A| ≤ 6.

The long detailed proof of Theorem 1 will not be given in this paper because of space limitations. Instead, in the next section, a corresponding theorem for term algebras, which immediately yields Theorem 1 for congruence preserving functions of the form F : T_Aᵐ → T_A, is proved.
3 Congruence Preserving Functions in Term Algebras
Our notation mainly follows [2], [3], [5], [6], and [11]. A ranked alphabet is a finite nonempty set of symbols, each of which has a unique non-negative arity (or rank). The set of m-ary symbols in a ranked alphabet Σ is denoted by Σm (for each m ≥ 0). TΣ(X) is the set of Σ-terms with variables in X; for empty X it is denoted simply by TΣ. Note that (TΣ(X), Σ) is a Σ-algebra, and (TΣ, Σ) is called the term algebra over Σ. For L ⊆ TΣ, let ≈L be the syntactic congruence relation of L ([11]), i.e., the greatest congruence on the term algebra TΣ saturating L. Let Σ denote a signature with the property that Σ ≠ Σ0. Throughout, X is always a set of variables. Definition 5. A function F : (TΣ)^n → TΣ is congruence preserving if for every congruence relation Θ over TΣ and all t1, ..., tn, s1, ..., sn ∈ TΣ, if t1 Θ s1, ..., tn Θ sn, then F(t1, ..., tn) Θ F(s1, ..., sn). Remark 4. A congruence preserving function F : (TΣ)^n → TΣ induces a well-defined function FΘ : (TΣ/Θ)^n → TΣ/Θ on any quotient algebra, for any congruence Θ on TΣ, defined by FΘ([t1]Θ, ..., [tn]Θ) = [F(t1, ..., tn)]Θ.
Saeed Salehi
For terms u1, ..., un ∈ TΣ(X) and t ∈ TΣ(X ∪ {x1, ..., xn}) with x1, ..., xn ∉ X, the term t[x1/u1, ..., xn/un]¹ ∈ TΣ(X) results from t by replacing all occurrences of xi by ui, for all i ≤ n. The function (TΣ)^n → TΣ(X) defined by (u1, ..., un) → t[x1/u1, ..., xn/un] for all u1, ..., un ∈ TΣ is called the term function² defined by t. The rest of the paper is devoted to the proof of the following theorem: Theorem 2. If |Σ0| ≥ 7, then every congruence preserving F : (TΣ)^n → TΣ, for every n ∈ IN, is a term function (i.e., there is a term t ∈ TΣ({x1, ..., xn}), where x1, ..., xn are variables, such that F(u1, ..., un) = t[x1/u1, ..., xn/un] for all u1, ..., un ∈ TΣ). Remark 5. Theorem 2 does not hold for |Σ0| = 1: Let Σ = Σ0 ∪ Σ1 be a signature with Σ1 = {α} and Σ0 = {ζ0}. The term algebra (TΣ, Σ) is isomorphic to (IN, 0, S), where 0 is the constant zero and S is the successor function. Let F : IN → IN be defined by F(n) = 2n. It is easy to see that F is congruence preserving: for every congruence relation Θ, if n Θ m then Sn Θ Sm, and by repeating the same argument n times we get S^n(n) Θ S^n(m), i.e., 2n Θ n + m. Similarly S^m(n) Θ S^m(m), so n + m Θ 2m; hence 2n Θ 2m, that is, F(n) Θ F(m). But F is not a term function, since all term functions are of the form n → S^k(n) = k + n for a fixed k ∈ IN. It is not clear at the moment whether Theorem 2 holds for 2 ≤ |Σ0| ≤ 6. Remark 6. Finite algebras having the property that all congruence preserving functions are term functions are called hemi-primal in universal algebra (see e.g. [3]). Our assumption Σ ≠ Σ0 in Theorem 2 implies that TΣ is infinite. Remark 7.
Theorem 2 yields Theorem 1 for congruence preserving functions of the form F : TA^n → TA, since (TA, Σ_A) is the term algebra over the signature Σ_A, and every term function of it can be represented by ιA and κA (recall that ca = ιA(a) and fa(t1, t2) = κA(a, t1, t2), for every a ∈ A and t1, t2 ∈ TA). Proof of Theorem 2. Definition 6. – An interpretation of X in TΣ is a function ε : X → TΣ. Its unique extension to a Σ-homomorphism TΣ(X) → TΣ is denoted by ε*. – Any congruence relation Θ on TΣ is extended to a congruence relation Θ* on TΣ(X), defined by the following relation for any p, q ∈ TΣ(X): p Θ* q if for every interpretation ε : X → TΣ, ε*(p) Θ ε*(q) holds. – A function G : TΣ → TΣ(X) is congruence preserving if for every congruence relation Θ on TΣ and t, s ∈ TΣ, if t Θ s, then G(t) Θ* G(s). The classical proof of the following lemma is not presented here.
¹ Denoted by t[u1, ..., un] in [4]. ² It is also called the tree substitution operation, see e.g. [4].
Lemma 2. The term function TΣ → TΣ(X), u → t[x/u], defined by any term t ∈ TΣ(X ∪ {x}) (x ∉ X), is congruence preserving. Definition 7. Let t be a term in TΣ(X) and C ⊆ TΣ(X); then t is called independent from C if it is not a subterm of any member of C and no member of C is a subterm of t. For a term rewriting system R and a term u, let ∆*_R(u) be the set of R-descendants of {u} (cf. [6]), and for a set of terms C, let ∆*_R(C) = ∪_{u∈C} ∆*_R(u). A useful property of the notion of independence is the following: Lemma 3. Let u ∈ TΣ(X) be independent from C ⊆ TΣ(X) and let R be the single-rule (ground-)term rewriting system {w → u}, where w is any term in TΣ(X). Then L = ∆*_R(C) is closed under the rewriting rule u → w, and also u ≈L w. Moreover, every member of L results from a member of C by replacing some w subterms of it by u. Proof. Straightforward, once we note that any application of the rule w → u to a member of C does not create a new subterm of the form w, and all u's appearing (as subterms) in the members of L are obtained by applying the (ground-term) rewriting rule w → u.
Proposition 1. For any C ⊂ TΣ(X) such that |C| < |Σ0|, there is a term in TΣ which is independent from C. Proof. For each c ∈ Σ0 choose a tc ∈ TΣ that is higher (has greater height) than all terms in C and contains no constant symbol other than c. Then no tc is a subterm of any member of C. On the other hand, no term in C can appear as a subterm in more than one of the terms tc (for c ∈ Σ0). Since there are more tc's (one for each c ∈ Σ0) than elements of C, by the pigeonhole principle there must exist a tc that is independent from C.
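The pigeonhole argument in this proof is constructive. Under an assumed signature of our own choosing (seven constants k0, ..., k6 and one binary symbol f, with terms encoded as tuples), one can build the seven candidate terms tc and search them directly:

```python
# Seven constants k0..k6 and one binary symbol f (an assumed signature with |Sigma_0| >= 7).
CONSTS = [f"k{i}" for i in range(7)]

def height(t):
    return 0 if isinstance(t, str) else 1 + max(height(t[1]), height(t[2]))

def subterms(t):
    yield t
    if not isinstance(t, str):
        yield from subterms(t[1])
        yield from subterms(t[2])

def independent(t, C):
    """t is independent from C: no subterm relation in either direction."""
    return all(t not in subterms(s) and s not in set(subterms(t)) for s in C)

def find_independent(C):
    """Proposition 1, constructively: |C| < 7 guarantees an independent term."""
    h = max((height(s) for s in C), default=0)
    for c in CONSTS:
        # t_c: a left comb of height h + 1 built from the single constant c
        t = c
        for _ in range(h + 1):
            t = ("f", t, c)
        if independent(t, C):
            return t
    raise AssertionError("pigeonhole guarantees success for |C| < 7")

C = [("f", "k0", "k1"), ("f", ("f", "k2", "k2"), "k0")]
t = find_independent(C)
assert independent(t, C)
```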
Lemma 4. Let G : TΣ → TΣ(X) be congruence preserving, ε : X → TΣ be an interpretation, and u, v ∈ TΣ. If v is independent from {u, ε*(G(u))}, then ε*(G(v)) ∈ ∆*_{u→v}(ε*(G(u))). Moreover, ε*(G(v)) results from ε*(G(u)) by replacing some u subterms by v. Proof. Let L = ∆*_{u→v}(ε*(G(u))). By Lemma 3, u ≈L v. The function G is congruence preserving, so ε*(G(u)) ≈L ε*(G(v)), and since ε*(G(u)) ∈ L, also ε*(G(v)) ∈ L. The second claim follows from the independence of v from
{u, ε∗ (G(u))}. Recall that for a position p of the term t, t|p is the subterm of t at the position p (cf. [2]).
Lemma 5. Suppose |Σ0| ≥ 7, and let G : TΣ → TΣ(X) be congruence preserving. If v is independent from {u, G(u)}, for u, v ∈ TΣ, then G(v) results from G(u) by replacing some of its u subterms by v. Proof. By Proposition 1, there are w, w1, w2 such that w is independent from {u, G(u), v, G(v)}, w1 is independent from {w, u, G(u), v, G(v)}, and w2 is independent from {w, w1, u, G(u), v, G(v)}. Define the interpretation ε : X → TΣ by setting ε(x) = w for all x ∈ X. By the choice of w, v is independent from {u, ε*(G(u))}. So we can apply Lemma 4 to infer that ε*(G(v)) results from ε*(G(u)) by replacing some u subterms by v. Note that G(v) is obtained by substituting all w's in ε*(G(v)) by members of X; the same is true of G(u) and ε*(G(u)). The positions of ε*(G(v)) in which w appears are exactly the positions of ε*(G(u)) in which w appears (by the choice of w). So the positions of G(v) in which a member of X appears are exactly the positions of G(u) in which a member of X appears. We claim that identical members of X appear in those identical positions of G(u) and G(v): if not, there are x1, x2 ∈ X such that G(v)|p = x1 and G(u)|p = x2 for some position p of G(u) (and of G(v)). Define the interpretation δ : X → TΣ by δ(x1) = w1, δ(x2) = w2, and δ(x) = w for all x ≠ x1, x2. Then δ*(G(v))|p = w1 and δ*(G(u))|p = w2. On the other hand, by Lemma 4, δ*(G(v)) results from δ*(G(u)) by replacing some u subterms by v. By the choice of w1 and w2, such a replacement cannot affect the appearance of w1 or w2, and hence the subterms of δ*(G(v)) and δ*(G(u)) at the position p must be identical, a contradiction. This proves the claim, which implies that G(v) results from G(u) by replacing some u subterms by v.
Lemma 6. Suppose |Σ0 | ≥ 7, and let G : TΣ → TΣ (X) be congruence preserving. Then for any u, v ∈ TΣ , G(v) results from G(u) by replacing some u subterms by v. Proof. By Proposition 1, there is a w ∈ TΣ independent from {u, G(u), v, G(v)}. By Lemma 5, G(w) is obtained from G(u) by replacing some u subterms by w, and also results from G(v) by replacing some v subterms by w. By the choice of w, all w’s appearing in G(w) have been obtained either by replacing u by w in G(u) or by replacing v by w in G(v). Since the only difference between G(v) and G(w) is in the positions of G(w) where w appears, and the same is true for the difference between G(u) and G(w), then G(v) can be obtained from G(u) by replacing some u subterms of it, the same u subterms which have been replaced by w to get G(w), by v.
Lemma 7. If |Σ0| ≥ 7, then every congruence preserving function G : TΣ → TΣ(X) is a term function (i.e., there is a term t ∈ TΣ(X ∪ {x}), where x ∉ X, such that G(u) = t[x/u] for all u ∈ TΣ).
Proof. Fix a u0 ∈ TΣ and choose a v ∈ TΣ such that v is independent from {u0, G(u0)}. (By Proposition 1 such a v exists.) Then by Lemma 6, G(v) results from G(u0) by replacing some u0 subterms by v. Let y be a new variable (y ∉ X) and let t ∈ TΣ(X ∪ {y}) result from G(u0) by putting y in exactly the positions in which u0's are replaced by v's to get G(v). So G(u0) = t[y/u0] and G(v) = t[y/v]; moreover, all v's in G(v) are obtained from t by substituting all y's by v. We show that G(u) = t[y/u] holds for every u ∈ TΣ: Take a u ∈ TΣ. By Proposition 1, there is a w independent from the set {u0, G(u0), v, G(v), u, G(u)}. By Lemma 6, G(w) results from G(v) by replacing some v subterms by w. We claim that all v's in G(v) are replaced by w's to get G(w). If not, then v must be a subterm of G(w). From the fact (Lemma 6) that G(u0) results from G(w) by replacing some w subterms by u0 (and the choice of w), we can infer that v is a subterm of G(u0), which contradicts the choice of v. So the claim is proved, and we can write G(w) = t[y/w]; moreover, all w's in G(w) are obtained from t by substituting y by w. Again by Lemma 6, G(u) results from G(w) by replacing some w subterms by u. We claim that all w's appearing in G(w) are replaced by u to get G(u), since otherwise w would be a subterm of G(u), in contradiction with the choice of w. This shows that G(u) = t[y/u].
Theorem 2. If |Σ0| ≥ 7, then every congruence preserving F : (TΣ)^n → TΣ, for every n ∈ IN, is a term function. Proof. We proceed by induction on n: For n = 1 it is Lemma 7 with X = ∅. For the induction step, let F : (TΣ)^{n+1} → TΣ be a congruence preserving function. For any u ∈ TΣ define Fu : (TΣ)^n → TΣ by Fu(u1, ..., un) = F(u1, ..., un, u). By the induction hypothesis every Fu is a term function, i.e., there is an s ∈ TΣ({x1, ..., xn}) such that Fu(u1, ..., un) = s[x1/u1, ..., xn/un] for all u1, ..., un ∈ TΣ. Denote the corresponding term for u by tu (it is straightforward to see that such a term s is unique for every u). The mapping TΣ → TΣ({x1, ..., xn}) defined by u → tu is also congruence preserving. Hence, by Lemma 7, it is a term function, so there is a t ∈ TΣ({x1, ..., xn, xn+1}) such that tu = t[xn+1/u]. Hence F(u1, ..., un, un+1) = F_{un+1}(u1, ..., un) = t_{un+1}[x1/u1, ..., xn/un] = t[xn+1/un+1][x1/u1, ..., xn/un]. So F(u1, ..., un, un+1) = t[x1/u1, ..., xn/un, xn+1/un+1], i.e., F is a term function.
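The final equality in this proof relies on the ui being ground terms, so that substituting xn+1 first and the remaining variables afterwards agrees with one simultaneous substitution. A quick mechanical check, in a tuple encoding of our own (strings are variables or constants, tuples are function applications):

```python
def subst(t, env):
    """Simultaneous substitution of variables in term t."""
    if isinstance(t, str):
        return env.get(t, t)               # variable or constant
    return (t[0],) + tuple(subst(a, env) for a in t[1:])

t = ("f", "x1", ("f", "x2", "x3"))         # a term over variables x1, x2, x3
u1, u2, u3 = "k", ("f", "k", "k"), "k"     # ground terms (no variables)

# t[x3/u3][x1/u1, x2/u2]  ==  t[x1/u1, x2/u2, x3/u3]  when the u_i are ground
sequential = subst(subst(t, {"x3": u3}), {"x1": u1, "x2": u2})
simultaneous = subst(t, {"x1": u1, "x2": u2, "x3": u3})
assert sequential == simultaneous
```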
Acknowledgement. I am very grateful to Professor Magnus Steinby for reading drafts of this paper and for his fruitful ideas, comments and support.
References 1. Almeida J., "On pseudovarieties, varieties of languages, filters of congruences, pseudoidentities and related topics", Algebra Universalis, Vol. 27 (1990) pp. 333-350. 2. Bachmair L., "Canonical equational proofs", Progress in Theoretical Computer Science, Birkhäuser, Boston MA, 1991.
3. Denecke K. & Wismath S. L., "Universal algebra and applications in theoretical computer science", Chapman & Hall/CRC, Boca Raton FL, 2002. 4. Fülöp Z. & Vágvölgyi S., "Minimal equational representations of recognizable tree languages", Acta Informatica, Vol. 34, No. 1 (1997) pp. 59-84. 5. Gécseg F. & Steinby M., "Tree languages", in: Rozenberg G. & Salomaa A. (eds.) Handbook of Formal Languages, Vol. 3, Springer, Berlin (1997) pp. 1-68. 6. Jantzen M., "Confluent string rewriting", EATCS Monographs on Theoretical Computer Science 14, Springer-Verlag, Berlin, 1988. 7. Salomaa K., Review of [13] in AMS MathSciNet, MR 97f:68134. 8. Nivat M. & Podelski A., "Tree monoids and recognizability of sets of finite trees", Resolution of Equations in Algebraic Structures, Vol. 1, Academic Press, Boston MA (1989) pp. 351-367. 9. Podelski A., "A monoid approach to tree languages", in: Nivat M. & Podelski A. (eds.) Tree Automata and Languages, Elsevier, Amsterdam (1992) pp. 41-56. 10. Salehi S. & Steinby M., "Tree algebras and regular tree languages", in preparation. 11. Steinby M., "A theory of tree language varieties", in: Nivat M. & Podelski A. (eds.) Tree Automata and Languages, Elsevier, Amsterdam (1992) pp. 57-81. 12. Thomas W., "Logical aspects in the study of tree languages", Ninth Colloquium on Trees in Algebra and Programming (Proc. CAAP'84), Cambridge University Press (1984) pp. 31-51. 13. Wilke T., "An algebraic characterization of frontier testable tree languages", Theoretical Computer Science, Vol. 154, No. 1 (1996) pp. 85-106.
Symbolic Topological Sorting with OBDDs (Extended Abstract) Philipp Woelfel FB Informatik, LS2, Univ. Dortmund, 44221 Dortmund, Germany [email protected]
Abstract. We present a symbolic OBDD algorithm for topological sorting which requires O(log^2 N) OBDD operations. Then we analyze its true runtime for the directed grid graph and show an upper bound of O(log^4 N). This is the first true runtime analysis of a symbolic OBDD algorithm for a fundamental graph problem, and it demonstrates that one can hope that the algorithm behaves well for sufficiently structured inputs.
1 Introduction
Algorithms on graphs form one of the best studied areas in computer science. Usually, a graph G = (V, E) is given by an adjacency list or by an adjacency matrix. Such an explicit representation of a graph requires space Θ(|V| + |E|) or Θ(|V|^2), and for many graph problems efficient algorithms are known. However, there are several application areas where typical problem instances have such a large size that a linear or even polynomial runtime is not feasible, or where even the explicit representation of the problem instance itself may not fit into memory anymore. In order to deal with very large graphs, symbolic (or implicit) graph algorithms have been devised, where the vertex and edge sets of the involved graphs are stored symbolically, i.e., in terms of their characteristic functions. The characteristic functions are usually represented by so-called Binary Decision Diagrams (BDDs) or, more specifically, by Ordered Binary Decision Diagrams (OBDDs) — see Section 2 for definitions. Such approaches have been successfully applied in the areas of model checking, circuit verification and finite state machine verification (see e.g. [2,3,4]). These applications can be viewed as particular cases of symbolic graph problems, which raises the question whether it is also possible to devise symbolic graph algorithms with a good behavior for fundamental graph theoretical problems. One approach in this direction was undertaken by Hachtel and Somenzi [5], who introduced a symbolic OBDD algorithm for the maximum flow problem in 0-1 networks. The promising experimental studies demonstrated that the algorithm is able to handle graphs with over 10^36 edges and that it is competitive with traditional algorithms on dense random graphs. The paper lacks, however, a theoretical analysis of its performance with
Supported in part by DFG grant We 1066/10-1
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 671–680, 2003. © Springer-Verlag Berlin Heidelberg 2003
respect to runtime. Recently, Sawitzki [11,9] has analyzed the number of OBDD operations (i.e., the number of required synthesis operations of characteristic functions) required by the flow algorithm of Hachtel and Somenzi and has proposed an improved algorithm. But note that there is only a very weak relation between the number of OBDD operations and the true runtime of a symbolic OBDD algorithm. The time required for one synthesis step is mainly influenced by the sizes of the involved OBDDs, which may range from linear to exponential (in the number of variables of the represented characteristic functions). In fact, we are not aware of any analysis of a symbolic OBDD algorithm with respect to its true runtime. (However, there is a recent true runtime analysis for a related type of decision diagrams, called Binary Moment Diagrams, showing that certain multiplier circuits can be verified in polynomial time [7].) The results and techniques presented here aim to be a first step in filling this gap. First, we present a new OBDD algorithm for topologically sorting the N vertices of a directed acyclic graph, which requires only O(log^2 N) OBDD operations on OBDDs for functions with at most 4 log N variables. Hence, if all OBDDs obtained during the execution of the algorithm have subexponential size, the total runtime is sublinear in the number of vertices of the graph. Then we analyze its true runtime for the directed grid graph and show an upper bound of O(log^4 N). This demonstrates that one can in fact hope that such a fundamental graph algorithm behaves well for sufficiently structured inputs. For the analysis, we generalize the notion of threshold functions to multivariate threshold functions. We investigate the OBDD size of multivariate threshold (and modulo) functions and obtain strong results about the effect of OBDD operations such as quantification on these functions.
Clearly, our analysis is a "good-case" analysis which is only valid for one particular input instance. We hope, though, that the techniques presented here are a good starting point for developing a framework which allows one to analyze symbolic algorithms for fundamental graph problems on larger classes of input instances. In fact, Sawitzki [10] has already successfully applied our framework to analyze the true runtime of his 0-1 network flow algorithm on the grid network.
2 OBDDs and Implicit Graph Representation
In the following, let Bn denote the class of boolean functions {0,1}^n → {0,1}, and let Xn = {x1, ..., xn} be a set of boolean variables. Let f ∈ Bn be a function defined over the variables in Xn. The subfunction of f where k variables xi1, ..., xik are fixed to constants c1, ..., ck ∈ {0,1} is denoted by f|xi1=c1,...,xik=ck. A variable ordering π on Xn is a permutation of the indices {1, ..., n}, leading to the ordered list xπ(1), ..., xπ(n) of the variables. A π-OBDD on Xn for a variable ordering π is a directed acyclic graph with one root, two sinks labeled with 0 and 1, respectively, and the following properties: each inner node is labeled by a variable from Xn and has two outgoing edges, one of them labeled by 0, the other by 1. If an edge leads from a node labeled by xi to a node labeled by xj,
then π^{-1}(i) < π^{-1}(j). This means that any directed path passes the nodes in an order respecting the variable ordering π. A π-OBDD is said to represent a boolean function f ∈ Bn if for any a = (a1, ..., an) ∈ {0,1}^n, the path starting at the root and leaving any xi-node over the edge labeled by the value of ai ends at a sink with label f(a). The size of a π-OBDD G is the number of its nodes and is denoted by |G|. The π-OBDD of minimal size for a given function f and a fixed variable ordering π is unique up to isomorphism. A π-OBDD is called reduced if it is the minimal π-OBDD. It is well known that the size of any reduced π-OBDD for a function f ∈ Bn is bounded by O(2^n/n) (see [1] for the upper bound with the best constants known). Let f and g be functions in Bn and let Gf and Gg be π-OBDDs representing f and g, respectively, for an arbitrary variable ordering π. In the following, we summarize the operations on OBDDs to which we will refer in this text. For a more detailed discussion of OBDDs and their operations we refer to the monograph [12].
– Evaluation: Given x ∈ {0,1}^n, compute f(x). This can trivially be done in time O(n).
– Minimization: Compute the reduced π-OBDD for f. This is possible in time O(|Gf|).
– Binary synthesis: Given a boolean operation ⊗ ∈ B2, compute a reduced π-OBDD Gh representing the function h = f ⊗ g. This can be done in time O(|G*h|), where G*h is the graph consisting of all nodes in the product graph of Gf and Gg reachable from the root. The size of Gh is at most O(|G*h|) = O(|Gf| · |Gg|).
– Replacement by constants: Given a sequence of variables xi1, ..., xik ∈ Xn and a sequence of constants c1, ..., ck, compute a reduced π-OBDD Gh for the subfunction h := f|xi1=c1,...,xik=ck ∈ Bn−k. This is possible in time O(|Gf|), and the reduced π-OBDD Gh is of smaller size than Gf.
– Quantification: Given a variable xi ∈ Xn and a quantifier Q ∈ {∃, ∀}, compute a reduced π-OBDD for the function h ∈ Bn−1 with h := (Qxi)f, where (∃xi)f := f|xi=0 ∨ f|xi=1 and (∀xi)f := f|xi=0 ∧ f|xi=1. The time for computing this π-OBDD is determined by the time for determining the π-OBDDs for f|xi=0 and f|xi=1 and the time required for the binary synthesis of the two. Hence, it is bounded by O(|Gf|^2).
– SAT enumeration: Enumerate all inputs x ∈ f^{-1}(1). Using simple DFS techniques, this can be done in optimal time O(|Gf| + n·|f^{-1}(1)|).
We can use OBDDs for an implicit graph representation by letting them represent the characteristic functions of the vertex and edge sets. For practical reasons, though, we assume throughout this text that the vertex set is V = {0,1}^n for some n ∈ N, so that a representation of V is not needed. It is easy to accommodate the algorithm for other vertex sets. In order to encode integers using binary notation, we define |x| = 2^{n−1}·x_{n−1} + ··· + 2^0·x_0 for x ∈ {0,1}^n.
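The operations above can be illustrated with a deliberately minimal OBDD implementation. The sketch below is our own (fixed identity variable ordering, hash-consed nodes, memoized synthesis); production packages add complement edges, computed tables and garbage collection:

```python
from functools import lru_cache

# Terminals are the ints 0 and 1; inner nodes are interned triples (var, low, high).
_unique = {}

def mk(v, lo, hi):
    """Reduced node constructor: skip redundant tests, share isomorphic subgraphs."""
    if lo == hi:
        return lo
    return _unique.setdefault((v, lo, hi), (v, lo, hi))

def var(i):
    return mk(i, 0, 1)

def top(u):
    return u[0] if isinstance(u, tuple) else float("inf")

@lru_cache(maxsize=None)
def apply_op(op, u, v):
    """Binary synthesis; runtime is bounded by the product of the node counts."""
    if not isinstance(u, tuple) and not isinstance(v, tuple):
        return op(u, v)
    i = min(top(u), top(v))
    u0, u1 = (u[1], u[2]) if top(u) == i else (u, u)
    v0, v1 = (v[1], v[2]) if top(v) == i else (v, v)
    return mk(i, apply_op(op, u0, v0), apply_op(op, u1, v1))

def restrict(u, i, c):
    """Replacement by a constant: fix variable i to c."""
    if not isinstance(u, tuple) or u[0] > i:
        return u
    if u[0] == i:
        return u[2] if c else u[1]
    return mk(u[0], restrict(u[1], i, c), restrict(u[2], i, c))

def exists(u, i):
    """Quantification: (exists x_i) u  =  u|x_i=0  OR  u|x_i=1."""
    return apply_op(OR, restrict(u, i, 0), restrict(u, i, 1))

AND = lambda a, b: a & b
OR = lambda a, b: a | b

f = apply_op(AND, var(0), var(1))      # x0 AND x1
assert exists(f, 0) == var(1)          # (exists x0) x0 AND x1  =  x1
```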
3 The Topological Sorting Algorithm
Let G = (V, E), V = {0,1}^n, be a directed acyclic graph represented by a π-OBDD as described in the previous section. The edge relation E defines in a natural way a partial order ⪯ on V, where v ⪯ w if and only if there exists a path from v to w. In the explicit case a topological sorting algorithm would enumerate all vertices in such a way that if u is enumerated before v, then v ⪯ u does not hold. In the implicit case we hope for runtimes in the order of o(|V|), in which the enumeration of all vertices is not possible. Hence, a goal might be to obtain a complete order ≺ which inherits the properties of ⪯ (i.e., v ⪯ u and v ≠ u implies v ≺ u). Unless ⪯ is a complete order, ≺ is not uniquely defined by ⪯, and thus we assume that an arbitrary complete order ⊑ on the vertex set V is given (this may be fixed in advance for the algorithm or may be given as an additional parameter), which determines the order of the elements that are incomparable with respect to ⪯ (i.e., those with neither u ⪯ v nor v ⪯ u). An alternative is to compute an OBDD which allows one to enumerate the elements in their topological order by simple SAT enumeration operations. For any two vertices u, v we denote by ∆(u, v) the length of the longest path leading from u to v. (The length of a path is the number of its edges.) If no such path exists, then ∆(u, v) := −∞. Note that ∆(v, v) = 0, since the graph is acyclic. Furthermore, let ∆(v) := max{∆(u, v) | u ∈ V}. We call ∆(v) the length of the longest path to the vertex v. Let now DIST ∈ B_{2n} be defined to be 1 for an input (d, v) ∈ {0,1}^n × {0,1}^n if and only if ∆(v) = |d|. Clearly, |du| < |dv| implies that v ⪯ u does not hold, where du, dv are the unique values with DIST(du, u) = 1 and DIST(dv, v) = 1. Hence, if we have a π-OBDD G_DIST for the function DIST, we can use it to enumerate the vertices in an order respecting ⪯ by computing the π-OBDDs for DIST|d=a for |a| = 0, 1, ... and enumerating their satisfying inputs using the SAT enumeration procedure.
We will see below how the OBDD GDIST can in addition be used to obtain a complete order respecting . In order to compute the function DIST, we use a method which is similar to that of computing the transitive closure by matrix squaring. For i ∈ {1, . . . , n} and u, v ∈ V let Ti (u, v) be the boolean function with function value 1 if and only if there exists a simple path from u to v which has length exactly 2i . We can compute OBDDs for all Ti as follows. T0 (u, v) = E(u, v)
and Ti+1 (u, v) = ∃w : Ti (u, w) ∧ Ti (w, v).
(S1)
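In explicit form, (S1) is iterated relational composition: squaring a relation of exact path length 2^i gives the relation of exact path length 2^{i+1} (in a DAG every path is simple, so no cycle can shortcut the count). A small sanity check with our own set-of-pairs encoding:

```python
def compose(R, S):
    """Relational composition: (u, v) such that (u, w) in R and (w, v) in S for some w."""
    by_src = {}
    for w, v in S:
        by_src.setdefault(w, []).append(v)
    return {(u, v) for u, w in R for v in by_src.get(w, [])}

# A small DAG: the path 0 -> 1 -> 2 -> 3 -> 4
E = {(i, i + 1) for i in range(4)}

T = [E]                          # T[i] holds pairs joined by a path of length exactly 2^i
for i in range(2):
    T.append(compose(T[i], T[i]))

assert T[1] == {(0, 2), (1, 3), (2, 4)}   # paths of length 2
assert T[2] == {(0, 4)}                   # paths of length 4
```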
Now we define the function DISTj ∈ B_{2n−j} for 0 ≤ j ≤ n. It takes as input an (n−j)-bit value d* = d_{n−1} ... d_j and a vertex v (for j = n, d* is the empty string). The function value DISTj(d*, v) is defined as DISTj(d*, v) = 1 ⇔ 2^j·|d*| ≤ ∆(v) < 2^j·(|d*| + 1). (∗)
I.e., DISTj (d∗ , v) is true if the bits dn−1 . . . dj are exactly the n − j most significant bits of the binary representation of the integer ∆(v). Clearly, DIST =
DIST0. The functions DISTj can be computed by DISTn(v) := 1 and, for j = n−1, ..., 0, DISTj(d_{n−1} ... d_j, v) = DISTj+1(d_{n−1} ... d_{j+1}, v) ∧ (d_j ⇔ ∃u (T_j(u, v) ∧ DISTj+1(d_{n−1} ... d_{j+1}, u))). (S2) It is easy to verify that the boolean functions DISTj do in fact fulfill property (∗) (the proof can be found in the full version of this paper). Once we have computed the function DIST, we can use it together with an arbitrarily given complete order ⊑ to compute a complete order ≺ by letting u ≺ v ⇔ ∃du, dv : DIST(du, u) ∧ DIST(dv, v) ∧
(|du| < |dv| ∨ (|du| = |dv| ∧ u ⊑ v)). (S3)
It can easily be checked that ≺ defines a complete order on V respecting ⪯. Thus, the following theorem follows from simply counting the number of OBDD operations.
Theorem 1. Let V = {0,1}^n and let G = (V, E) be an acyclic directed graph represented by OBDDs. Applying the OBDD operations as described in (S1)–(S3) yields an OBDD for a relation ≺ which defines a complete order on V such that v ≺ w for all v, w ∈ V with (v, w) ∈ E. The number of OBDD operations is O(n^2), where each OBDD represents a function on at most 4n variables. Note that the algorithm can easily be adapted to an arbitrary vertex set V ⊆ {0,1}^n given by an OBDD for the relation V. This is done by simply executing the algorithm from above for the edge relation E′(u, v) = E(u, v) ∧ V(u) ∧ V(v). While the complete order ≺ returned by such a modified algorithm is defined on {0,1}^n × {0,1}^n, its restriction to V × V is obviously a correct complete order. Since any (not necessarily reduced) OBDD in n variables has O(2^n) nodes, the theorem shows that the true worst-case runtime of our algorithm is O(|V|^4 log^2 |V|). Clearly, this is much worse than the O(|V| + |E|) upper bound obtained by a well-known explicit algorithm. On the other hand, if all OBDDs obtained during the execution of the algorithm have a subexponential size (with respect to n), then its runtime is sublinear with respect to the number of vertices. In the following sections we show that it is justifiable to hope that this is the case for very structured input graphs.
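Steps (S1)–(S3) can be simulated with explicit sets of pairs in place of OBDDs. The sketch below is a hypothetical, non-symbolic rendering on a small DAG (the helper names `delta` and `delta_bits` are ours, not the paper's): it checks the bit-by-bit computation of ∆(v) against a direct longest-path recursion and verifies that the resulting complete order is topological.

```python
n = 3                                      # vertices are 0 .. 2^n - 1
E = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)}
V = range(2 ** n)

def compose(R, S):
    return {(u, v) for (u, w) in R for (w2, v) in S if w == w2}

# (S1): T[i] = pairs joined by a path of length exactly 2^i
T = [E]
for i in range(n - 1):
    T.append(compose(T[i], T[i]))

def delta(v):
    """Reference: direct longest-path recursion (fine for a tiny DAG)."""
    preds = [u for (u, w) in E if w == v]
    return 0 if not preds else 1 + max(delta(u) for u in preds)

def delta_bits(v):
    """Compute Delta(v) bit by bit, MSB first, in the spirit of (S2)."""
    d = 0
    for j in range(n - 1, -1, -1):
        # bit j is 1 iff some u whose Delta matches the already-fixed
        # higher bits reaches v by a path of length exactly 2^j
        if any((u, v) in T[j] and delta(u) >> (j + 1) == d for u in V):
            d = (d << 1) | 1
        else:
            d <<= 1
    return d

assert all(delta_bits(v) == delta(v) for v in V)

# (S3): order by Delta, ties broken by a given complete order (here: <)
order = sorted(V, key=lambda v: (delta(v), v))
pos = {v: i for i, v in enumerate(order)}
assert all(pos[u] < pos[v] for (u, v) in E)
```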
4 Runtime Analysis for the Grid Graph
We analyze the behavior of the topological sorting algorithm for a 2^n × 2^n grid, where all edges are directed from left to right and from bottom to top. The directed grid graph consists of the vertex set V = {0,1}^n × {0,1}^n and edge set E, where ((x, y), (x′, y′)) ∈ E if and only if either |x| = |x′| and |y′| − |y| = 1, or |y| = |y′| and |x′| − |x| = 1.
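Since every edge of the directed grid increases |x| + |y| by exactly one, the longest path to (x, y) has length |x| + |y|. A brute-force check on a small grid (with helper names of our own, not from the paper):

```python
from functools import lru_cache

n = 3                        # a 2^n x 2^n directed grid
N = 2 ** n

def edges(x, y):
    """Out-neighbours of (x, y): one step right or one step up."""
    if x + 1 < N: yield (x + 1, y)
    if y + 1 < N: yield (x, y + 1)

@lru_cache(maxsize=None)
def delta(v):
    """Longest-path length to v = (x, y), by recursion over predecessors."""
    preds = [(px, py) for px in range(N) for py in range(N)
             if v in set(edges(px, py))]
    return 0 if not preds else 1 + max(delta(p) for p in preds)

# every edge increases x + y by exactly one, so Delta((x, y)) = x + y
assert all(delta((x, y)) == x + y for x in range(N) for y in range(N))
```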
In the analysis to follow, we assume an interleaved variable ordering, that is, a variable ordering where, e.g., for a function depending on two vertices u, v, the variable v_i precedes the corresponding variable u_i. Note that in practice, heuristics such as sifting algorithms [8] are used to optimize the variable orderings during the execution of an algorithm, and it can be expected that a good variable ordering is found this way. The idea for proving that the topological sorting algorithm is very efficient for the grid graph is that all functions represented by OBDDs after each step of the algorithm belong to a class of functions which have small OBDD representations. The functions we consider are compositions of certain threshold and modulo functions, which we define and investigate in the following. We denote by X_{k,n} the set of variables x^i_j with 1 ≤ i ≤ k and 0 ≤ j < n. By x^i we denote the vector of n variables (x^i_{n−1}, ..., x^i_0). Definition 1. 1. A boolean function f ∈ B_{kn} defined on the variable set X_{k,n} is called a k-variate threshold function if there exist a threshold T ∈ Z and weights w_1, ..., w_k ∈ Z such that f(x^1, ..., x^k) = 1 if and only if w_1·|x^1| + ··· + w_k·|x^k| ≥ T. The maximum absolute weight of f is defined as w(f) := max{|w_1|, ..., |w_k|}. The set of k-variate threshold functions with maximum absolute weight w defined on the set of variables X_{k,n} is denoted by T^w_{k,n}. 2. A boolean function g ∈ B_{kn} defined on the variable set X_{k,n} is called a k-variate modulo M function if there exist a constant C ∈ Z and w_1, ..., w_k ∈ Z such that g(x^1, ..., x^k) = 1 if and only if w_1·|x^1| + ··· + w_k·|x^k| ≡ C (mod M). The set of k-variate modulo M functions defined on the set of variables X_{k,n} is denoted by M^M_{k,n}. Definition 2. Let f ∈ B_n and let C be a class of functions defined on the variable set X_n. We say that f can be decomposed into m functions in C if there exist a formula F on m variables and f_1, ..., f_m ∈ C such that f = F(f_1, ..., f_m).
The set of functions decomposable into m functions in C is denoted by D[C, m]. For any k ∈ N we denote by D_k the set of function sequences (f_n)_{n∈N} such that ∃m ∈ N ∀n ∈ N : f_n ∈ D[T^1_{k,n}, m].
The main idea in our proof is based on two observations. Firstly, any function decomposable into a constant number of threshold and modulo functions has small OBDD size. Secondly, all intermediate OBDDs obtained during the execution of the topological sorting algorithm on the directed grid graph represent functions which are decomposable into threshold and modulo functions. Let π_{k,n} be the variable ordering in which the variables in X_{k,n} appear in the order x^1_0, x^2_0, ..., x^k_0, x^1_1, ..., x^k_1, ..., x^k_{n−1}. I.e., a π_{k,n}-OBDD tests all bits of the input integers in an interleaved order with increasing significance of the bits. The following result is a generalization of Proposition 4 in [6]. The proof will be given in the full version of this extended abstract.
Lemma 1. Let f_1, ..., f_m ∈ T^w_{k,n} ∪ M^M_{k,n} be given by reduced π_{k,n}-OBDDs for f_i, 1 ≤ i ≤ m. Further, let f = F(f_1, ..., f_m) for a formula F of size s, and let L = L(k, m, M) = max{4kw + 5, M}. The minimal π_{k,n}-OBDD for f has at most L^{s+1}·kn nodes and can be computed in time and space O((kn)^2·s·L^{s+1} + 1).
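Lemma 1's node bound can be probed empirically for a single threshold function f(x, y) = [|x| + |y| ≥ T] (so k = 2, w = 1, s = 1, and L = 4kw + 5 = 13). The helper below is our own construction, not the paper's proof: it over-approximates the reduced OBDD size by counting distinct subfunctions level by level along the interleaved LSB-first ordering x_0, y_0, x_1, y_1, ..., and checks the count against the bound L^{s+1}·kn.

```python
from itertools import product

def obdd_size_upper(n, T, k=2):
    """Over-count the reduced OBDD size of f(x, y) = [|x| + |y| >= T]
    along the interleaved LSB-first ordering x_0, y_0, x_1, y_1, ..."""
    nvars = k * n
    # weight of the i-th variable in the interleaved order
    weight = [2 ** (i // k) for i in range(nvars)]
    size = 0
    for level in range(nvars):
        subs = set()
        for prefix in product((0, 1), repeat=level):
            partial = sum(b * w for b, w in zip(prefix, weight))
            # the subfunction is determined by its truth table on the rest
            table = tuple(
                partial + sum(b * w for b, w in zip(rest, weight[level:])) >= T
                for rest in product((0, 1), repeat=nvars - level)
            )
            subs.add(table)
        size += len(subs)             # distinct subfunctions at this level
    return size

n, T = 4, 7
L = 4 * 2 * 1 + 5                     # Lemma 1 with k = 2, w = 1: L = 13
assert obdd_size_upper(n, T) <= L ** 2 * 2 * n
```

In practice the observed counts stay far below the bound, which is what the analysis in this section exploits.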
Now we show, for functions decomposable into threshold functions (and no modulo functions), that the quantification over one of the variable blocks x^i_0, ..., x^i_{n−1}, 1 ≤ i ≤ k, can be done efficiently. Theorem 2. Let (f_n)_{n∈N} be such that there exist w, m ∈ N with f_n ∈ D[T^w_{k,n}, m] for all n ∈ N, and let Q ∈ {∃, ∀}. If f_n is given as a π_{k,n}-OBDD, then for any 1 ≤ ℓ ≤ k a minimal π_{k,n}-OBDD for (Qx^ℓ)f_n can be computed in time n^3·k^{O(1)}. We need the following lemma, which states that the result of quantifying over one variable block of a function decomposable into threshold functions is a function which is decomposable into threshold and modulo functions. The proof has to be omitted due to space restrictions. Let lcm denote the least common multiple.
Lemma 2. Let f ∈ D[T^w_{k,n}, m], Q ∈ {∃, ∀} and ℓ ∈ {1, . . . , k}. Then (Qx^ℓ)f ∈ D[T^{2w·w*}_{k−1,n} ∪ M^{w*}_{k−1,n}, m*], where w* ≤ lcm{1, 2, . . . , w} and m* = O(2^m w* m^2). In particular, for any fixed k ∈ N and a sequence of functions (f_n)_{n∈N} ∈ D_k we have (Qx^i)f_n ∈ D[T^{w*}_{k−1,n} ∪ M^{w*}_{k−1,n}, m*], where m* = O(1).

Proof (of Theorem 2). Fix w, m ∈ N such that f_n ∈ D[T^w_{k,n}, m] for all n ∈ N. W.l.o.g. we assume ℓ = 1, and for the sake of readability we write x instead of x^1 and f instead of f_n. We only prove the theorem for the case Q = ∀; the proof for Q = ∃ works analogously. We can write (∀x)f as (∀x_{n−1} ∀x_{n−2} . . . ∀x_0)f(x, x^2, . . . , x^k). If we apply the OBDD quantification operations to the bits x_0, . . . , x_{n−1} in this order, then after the i-th quantification (0 ≤ i ≤ n) the resulting OBDD G_i represents the function g_i = (∀x_{i−1} . . . ∀x_0)f in B_{kn−i}. Since each of the n quantification operations can be done in time O(|G_i|^2), the total time required is bounded by Σ_{i=0}^{n−1} |G_i|^2. Hence, it suffices to show that G_i has a size of at most O(n k^{O(1)}) for all 0 ≤ i ≤ n − 1. Note that g_i does not depend on the variables x_0, . . . , x_{i−1}. What we do in the following is to introduce n dummy variables z_0, . . . , z_{n−1} and to show that g_i can be written as ((∀z_0, . . . , z_{n−1})g_i*)|_{x_0=0,...,x_{i−1}=0}, where g_i* is a function in
D[T^w_{k+1,n}, m + 1]. Hence, g_i is obtained from the function (∀z_0, . . . , z_{n−1})g_i* by restricting some variables to constants. By Lemma 2, this function is decomposable into a constant number of threshold and modulo functions, and therefore its OBDD size is bounded sufficiently. Note that the variables z_0, . . . , z_{n−1} are merely artificial helper variables, and that none of the functions we “really” deal with (i.e., which are represented by OBDDs) depend on these variables. Let f = F(f_1, . . . , f_m) for a formula F and f_1, . . . , f_m ∈ T^w_{k,n}. Since m = O(1), we may assume w.l.o.g. that the size s of F is a constant, too. We introduce
678
Philipp Woelfel
n new variables, which we denote by z_0, . . . , z_{n−1}. Then we replace the variables x_j with the variables z_j for 0 ≤ j ≤ i − 1. This way we obtain

g_i = (∀x_{i−1} . . . x_0) f(x_{n−1} . . . x_i x_{i−1} . . . x_0, x^2, . . . , x^k)
    = (∀z_{i−1} . . . z_0) f(x_{n−1} . . . x_i z_{i−1} . . . z_0, x^2, . . . , x^k)
    = (∀z_{n−1} . . . z_0) (|z| ≥ 2^i ∨ f(x_{n−1} . . . x_i z_{i−1} . . . z_0, x^2, . . . , x^k))    (1)
Now consider an arbitrary threshold function f_j for some 1 ≤ j ≤ m, i.e., f_j(x, x^2, . . . , x^k) = 1 if and only if w_1|x| + w_2|x^2| + · · · + w_k|x^k| ≥ T. Let f_j* ∈ B_{(k+1)n} be the function with f_j*(z, x, x^2, . . . , x^k) = 1 ⇔ w_1|z| + w_1|x| + w_2|x^2| + · · · + w_k|x^k| ≥ T, and let f* = F(f_1*, . . . , f_m*). Obviously, f* ∈ D[T^w_{k+1,n}, m]. If |z| < 2^i, then |x_{n−1} . . . x_i z_{i−1} . . . z_0| is the same as |x_{n−1} . . . x_i 0 . . . 0| + |z|. Hence, it is easy to conclude from (1) that

g_i = (∀z_{n−1} . . . z_0) (|z| ≥ 2^i ∨ f*(z, x_{n−1} . . . x_i 0 . . . 0, x^2, . . . , x^k))
    = (∀z_{n−1} . . . z_0) (|z| ≥ 2^i ∨ f*|_{x_{i−1}=···=x_0=0}(z, x^1, x^2, . . . , x^k)).
Now let

g_i*(z, x^1, . . . , x^k) = (|z| ≥ 2^i) ∨ f*(z, x^1, x^2, . . . , x^k).

Then g_i* ∈ D[T^w_{k+1,n}, m + 1] and g_i = ((∀z)g_i*)|_{x_0=0,...,x_{i−1}=0}. Since g_i* ∈ D[T^w_{k+1,n}, m + 1] and k, w, and m are constants, we can conclude from Lemma 2 that (∀z)g_i* ∈ D[T^{w′}_{k,n} ∪ M^{M′}_{k,n}, m′] for some constants w′, M′, and m′.
Thus, by Lemma 1 the π_{k,n}-OBDD size of (∀z)g_i* is bounded by O(n k^{O(1)}). But as we have shown above, the π_{k,n}-OBDD for g_i can be obtained from the π_{k,n}-OBDD for (∀z)g_i* by simply replacing some variables with the constant 0. Hence, the resulting minimal π_{k,n}-OBDD for g_i can only be smaller than that for (∀z)g_i*, and thus its size is also bounded by O(n k^{O(1)}).

Remark 1. All the upper bounds in Lemma 1 and Theorem 2 proven for functions decomposable into threshold and modulo functions hold equivalently for their subfunctions f|_{α_1...α_i}, where α_1 . . . α_i is a restriction of arbitrary variables except those being quantified in the case of Theorem 2.

The following corollary summarizes the results stated above in a more convenient way. It follows from the statements in Lemma 1, Theorem 2, Lemma 2, and Remark 1.

Corollary 1. Fix a constant k ∈ N and let i, j ∈ {1, . . . , k} and Q, Q′ ∈ {∃, ∀}. Further, let (g_n)_{n∈N} ∈ D_k and f_n = g_n|_α, where α is an assignment of constants to arbitrary variables except those in {x^i_0, . . . , x^i_{n−1}}. If g_n is either given by a reduced π_{k,n}-OBDD or by the reduced π_{k,n}-OBDDs for the threshold functions into which it is decomposable, then the reduced π_{k,n}-OBDDs for (Qx^i)g_n, (Qx^i)f_n, and (Qx^i Q′x^j)g_n can be computed in time O(n^3).
We can now apply these results in order to analyze the true run time of the topological sorting algorithm for the grid graph. Whenever we talk in the following about an OBDD for some function sequence in D_k, we assume that the variable ordering is π_{k,n}. We have to specify the complete order ⪯ for the operations in (S3). A very natural order ⪯ is the lexicographical order, i.e., (x^1, y^1) ⪯ (x^2, y^2) if and only if |x^1| < |x^2| ∨ (|x^1| = |x^2| ∧ |y^1| ≤ |y^2|). Recall the steps (S1)–(S3) of the topological sorting algorithm from Section 3. We start the analysis with the edge relation E. By the definition of the grid graph, ((x^1, y^1), (x^2, y^2)) ∈ E if and only if (|x^2| − |x^1| = 0 ∧ |y^2| − |y^1| = 1) ∨ (|y^2| − |y^1| = 0 ∧ |x^2| − |x^1| = 1). Clearly, this function is in D_4. Now we look at the functions T_i obtained by (S1). Recall that T_i(u, v) is defined to be 1 if and only if there exists a path from u to v which has length exactly 2^i. Note also that in the directed grid graph all paths from vertex u to vertex v have the same length. Hence, for the directed grid graph, T_i((x^1, y^1), (x^2, y^2)) = 1 if and only if

|y^2| ≥ |y^1|  ∧  |x^2| ≥ |x^1|  ∧  |x^2| − |x^1| + |y^2| − |y^1| = 2^i.
Clearly, this function is in D_4, and thus according to Corollary 1, T_{i+1} can be computed from T_i in time O(n^3). (Note also that the quantification over one vertex in the grid graph is a quantification over two integers.) Hence, computing T_1, . . . , T_n requires time O(n^4) in total. Next, we analyze the construction of the OBDDs for the functions DIST_j in (S2). Recall that for any vertex v and any d* = d_{n−1} . . . d_j, the function DIST_j(d*, v) is true if and only if d* describes the n − j most significant bits of the binary representation of ∆(v). Let f_j ∈ B_{3n}, 0 ≤ j ≤ n, be defined by f_j(d, x, y) = 1 if and only if |d| ≤ |x| + |y| < |d| + 2^j. Hence, f_j is the conjunction of two functions in T^1_{3,n}. Furthermore, it is easy to see that DIST_j(d_{n−1} . . . d_j, (x, y)) = f_j|_{d_{j−1}=···=d_0=0}(d, x, y). Therefore, DIST_j is obtained from a function in D_3 by replacing some variables with the constant 0. Note also that DIST = DIST_0 is in fact in D_3. Moreover, due to the analysis of T_j above, it becomes obvious that T_j(u, v) ∧ DIST_{j+1}(d_{n−1} . . . d_{j+1}, u) is a function in D_5, where some variables are replaced with the constant 0. Hence, according to Corollary 1, the OBDD for g_j := ∃u : T_j(u, v) ∧ DIST_{j+1}(d_{n−1} . . . d_{j+1}, u) can be computed in time O(n^3). The function g_j is obtained from a function in D[T^8_{3,n} ∪ M^2_{3,n}, O(1)] by replacing some variables with the constant 0 (see Lemma 2). Now it is easy to see that the final two synthesis operations of (S2) required to compute DIST_j run in time O(n^3). (Apply Lemma 1 and Remark 1, and note that the function d_j ∈ B_1 can be viewed as a subfunction of f ∈ D_1 with f(d_{n−1} . . . d_0) = 1 if and only if |d_{n−1} . . . d_0| = 2^j.) Altogether, the total time required for computing DIST_{n−1}, . . . , DIST_0 = DIST is O(n^4). Finally, we have to investigate the computation of the complete order ⪯ using the operations in (S3).
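Since all paths between two grid vertices are monotone and have the same length, the closed form for T_i above can be cross-checked against an explicit path search on a small grid. The Python sketch below is purely illustrative; the grid size and vertex encoding are our own assumptions, not from the paper.

```python
def closed_form_Ti(u, v, i):
    # T_i(u, v) = 1 iff |y2| >= |y1|, |x2| >= |x1|, and
    # (|x2| - |x1|) + (|y2| - |y1|) = 2^i
    (x1, y1), (x2, y2) = u, v
    return y2 >= y1 and x2 >= x1 and (x2 - x1) + (y2 - y1) == 2 ** i

def has_path(u, v, length, size):
    # explicit search on the directed size x size grid graph, in which
    # every edge increases exactly one coordinate by 1
    if length == 0:
        return u == v
    x, y = u
    return any(has_path(w, v, length - 1, size)
               for w in ((x + 1, y), (x, y + 1))
               if w[0] < size and w[1] < size)
```

On a 5 × 5 grid the two predicates agree for i = 0, 1, 2, which is exactly the identity used to place T_i in D_4.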
Recall that DIST ∈ D_3. Hence, if one takes the definition of ⪯ into account, the complete term in (S3) before the first quantification describes
a function h in D_4. According to Corollary 1, the function h′ = (∃d^v ∃d^u)h can be computed in time and space O(n^3). Summing up the time bounds for all OBDD operations, we have obtained the following result.

Theorem 3. The OBDD algorithm for topological sorting takes time O(n^4) on the directed 2^n × 2^n grid graph for an appropriate variable ordering π_{k,n} and the complete order ⪯ as defined above.
5
Conclusion
Since the results about the threshold and modulo functions are quite general, we hope that they may be applicable to the analysis of other symbolic OBDD algorithms as well. It would be nice to extend the techniques in such a way that not only single input instances but also small graph classes can be handled. An interesting example would be grids from which some arbitrary or randomly chosen edges have been removed.
Acknowledgments I thank Daniel Sawitzki and Ingo Wegener for helpful comments and discussions.
References
1. Y. Breitbart, H. B. Hunt III, and D. J. Rosenkrantz. On the size of binary decision diagrams representing Boolean functions. Theor. Comp. Sci., 145:45–69, 1995.
2. J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Symbolic model checking: 10^20 states and beyond. Inform. and Comp., 98:142–170, 1992.
3. H. Cho, G. Hachtel, S.-W. Jeong, B. Plessier, E. Schwarz, and F. Somenzi. ATPG aspects of FSM verification. In IEEE Int. Conf. on CAD, pp. 134–137, 1990.
4. H. Cho, S.-W. Jeong, F. Somenzi, and C. Pixley. Synchronizing sequences and symbolic traversal techniques in test generation. Journal of Electronic Testing: Theory and Applications, 4:19–31, 1993.
5. G. D. Hachtel and F. Somenzi. A symbolic algorithm for maximum flow in 0-1 networks. Formal Methods in System Design, pp. 207–219, 1997.
6. S. Jukna. The graph of integer multiplication is hard for read-k-times networks. Technical Report 95-10, Universität Trier, 1995.
7. M. Keim, R. Drechsler, B. Becker, M. Martin, and P. Molitor. Polynomial formal verification of multipliers. Formal Methods in System Design, 22:39–58, 2003.
8. R. Rudell. Dynamic variable ordering for ordered binary decision diagrams. In IEEE Int. Conf. on CAD, pp. 42–47, 1993.
9. D. Sawitzki. Implicit flow maximization by iterative squaring. Manuscript. http://ls2-www.cs.uni-dortmund.de/˜sawitzki.
10. D. Sawitzki. Implicit flow maximization on grid networks. Manuscript. http://ls2-www.cs.uni-dortmund.de/˜sawitzki.
11. D. Sawitzki. Implizite Algorithmen für Graphprobleme. Diploma thesis, Univ. Dortmund, 2002.
12. I. Wegener. Branching Programs and Binary Decision Diagrams – Theory and Applications. SIAM, 2000.
Ershov’s Hierarchy of Real Numbers

Xizhong Zheng¹, Robert Rettinger², and Romain Gengler¹

¹ BTU Cottbus, 03044 Cottbus, Germany
[email protected]
² FernUniversität Hagen, 58084 Hagen, Germany
Abstract. Analogous to Ershov’s hierarchy for ∆02 -subsets of natural numbers we discuss the similar hierarchy for recursively approximable real numbers. Namely, with respect to different representations of real numbers, we define k-computability and f -computability for natural numbers k and functions f . We will show that these notions are not equivalent for representations based on Cauchy sequences, Dedekind cuts and binary expansions.
1
Introduction
In classical mathematics, real numbers are represented typically by Dedekind cuts, Cauchy sequences of rational numbers, and binary or decimal expansions. The effectivization of these representations leads to equivalent definitions of computable real numbers. This notion was first explored by Alan Turing in his famous paper [14], where also the Turing machine is introduced. According to Turing, the computable numbers may be described briefly as the real numbers whose expressions as a decimal are calculable by finite means (page 230, [14]). In other words, a real number x ∈ [0; 1]¹ is called computable if there is a computable function f : N → {0, 1, · · · , 9} such that x = Σ_{i∈N} f(i) · 10^{−i}. Robinson [9] has observed that computable real numbers can be equivalently defined via Dedekind cuts and Cauchy sequences.

Theorem 1 (Robinson [9], Myhill [6] and Rice [8]). For any real number x ∈ [0; 1], the following are equivalent.
1. x is computable;
2. The Dedekind cut L_x := {r ∈ Q : r < x} of x is a recursive set;
3. There is a recursive set A ⊆ N such that x = x_A := Σ_{i∈A} 2^{−i};
4. There is a computable sequence (x_s) of rational numbers which converges to x effectively in the sense that

(∀s, t ∈ N)(t ≥ s ⟹ |x_s − x_t| ≤ 2^{−s}).    (1)
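For a concrete instance of items 3 and 4, take the recursive set A = {2, 4, 6, . . .}, so that x_A = Σ_{i∈A} 2^{−i} = 1/3; the partial sums of this series form a computable sequence which converges effectively. A small Python sketch (the choice of A is ours, for illustration only):

```python
from fractions import Fraction

def in_A(i):
    # the recursive set A = {2, 4, 6, ...}; membership is decidable
    return i > 0 and i % 2 == 0

def x(s):
    # partial sum up to index s; since the tail is at most
    # sum_{i > s} 2^{-i} = 2^{-s}, we get |x(s) - x(t)| <= 2^{-s}
    # for all t >= s, i.e. the effective convergence condition (1)
    return sum(Fraction(1, 2 ** i) for i in range(s + 1) if in_A(i))
```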
¹ In this paper we consider only the real numbers of the unit interval [0; 1]. For other real numbers y, there are an n ∈ N and an x ∈ [0; 1] such that y := x ± n; y and x are regarded as being of the same computability.
B. Rovan and P. Vojtáš (Eds.): MFCS 2003, LNCS 2747, pp. 681–690, 2003. © Springer-Verlag Berlin Heidelberg 2003
Because of Specker’s example of an increasing computable sequence of rational numbers with a non-computable limit in [13], the extra condition (1) of effective convergence is essential for the computability of x. As observed by Specker [13], Theorem 1 does not hold if the effectivization to the primitive recursive instead of the computable level is considered. Let R_1 be the class of all limits of primitive recursive sequences of rational numbers which converge primitive recursively, R_2 the class of all real numbers of primitive recursive binary expansions, and let R_3 include all real numbers of primitive recursive Dedekind cuts. It is shown in [13] that R_3 ⊊ R_2 ⊊ R_1. For polynomial time computability of real numbers, Ko [5] shows this dependence on representations of real numbers too. Let P_C be the class of limits of all polynomial time computable sequences of dyadic rational numbers which converge effectively, P_D contain all real numbers of polynomial time computable Dedekind cuts, and P_B be the class of real numbers whose binary expansions are polynomial time computable (with the input n written in unary notation). Ko [5] shows that P_D = P_B ⊊ P_C, and that P_C is a real closed field while P_D is not closed under addition and subtraction. In [5], the set D := ∪_{n∈N} D_n of dyadic rational numbers, for D_n := {m · 2^{−n} : m ∈ N}, is used as base set instead of Q. For the complexity discussion D seems more natural and easier to use, but for computability it makes no essential difference, and we use both D and Q in this paper.

In this paper, we investigate similar classes where we weaken the notion of computability in several quite natural ways instead of strengthening it. A typical approach to explore the non-computable objects is to classify them into equivalence classes or so-called degrees by various reductions (see e.g. [12]). This can be easily implemented for real numbers by mapping each set A ⊆ N to a real number x_A := Σ_{i∈A} 2^{−i} and then defining the Turing reduction x_A ≤_T x_B by A ≤_T B.
This definition is robust, as shown in [2]. The benefit of this approach is that the techniques and results from well developed recursion theory can be applied straightforwardly. For example, Ho [4] shows that a real number x is Turing reducible to 0′, the degree of the halting problem K, iff there is a computable sequence of rational numbers which converges to x. This is a counterpart of Shoenfield’s Limit Lemma ([10]) in recursion theory, which says that A ≤_T K iff A is the limit of a computable sequence of subsets of natural numbers. However, the classification of real numbers by Turing reductions seems not fine enough, and it does not relate very closely to the analytical properties of real numbers. In this paper we will give another classification of real numbers which is analogous to Ershov’s hierarchy ([3]) for subsets of natural numbers. Notice that, if A ⊆ N is recursive, then there is an algorithm which tells us whether a natural number n belongs to A or not. In this case, corrections are not allowed. However, if we allow the algorithm to change its mind about the membership of n in A from negative to positive, but at most once, then the corresponding set A is an r.e. set. In other words, the algorithm may claim n ∉ A at some stage and correct its claim to n ∈ A at a later stage. In general, given a function h : N → N, if the algorithm is allowed to change the answer to the question “n ∈ A?” at most h(n) times for any n ∈ N, then the corresponding
set A is called h-r.e. according to Ershov [3]. In particular, for a constant function h(n) ≡ k, the h-r.e. sets are called k-r.e.; for recursive functions h, the h-r.e. sets are called ω-r.e. This introduces a classification of the ∆^0_2 subsets of N (the so-called Ershov hierarchy). Obviously, we can transfer this hierarchy to real numbers via their binary expansions straightforwardly. More precisely, we call x_A h-binary computable if A is h-r.e. Similarly, after extending Ershov’s hierarchy to subsets of rational numbers, we can call x h-Dedekind computable if the Dedekind cut of x is an h-r.e. set. For the Cauchy representation of real numbers, a classification similar to Ershov’s can be introduced too. In this case, we count the number of the “big jumps” of the sequence instead of the number of the “mind-changes”. According to Theorem 1.4, x is computable if there is a computable sequence (x_s) of rational numbers which converges to x and makes no big jumps in the sense of (1). However, if up to h(n) (non-nested) “big jumps” are allowed, then x is called h-Cauchy computable. Thus, three kinds of h-computability of real numbers can be naturally introduced. In this paper, we will investigate these notions and compare them with other known notions of weak computability of real numbers discussed in [15]. We will find that Cauchy computability is the most natural notion, although several interesting results about binary and Dedekind computability are obtained in this paper as well.
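The mind-change model just described is easy to make concrete: a procedure emits successive yes/no guesses for “n ∈ A?”, starting from “no” (since A_0 = ∅), and h(n) bounds how often the guess may flip. A toy Python sketch (our own illustration, not from the paper):

```python
def mind_changes(guesses):
    # guesses: the successive answers of an algorithm to "n in A?",
    # starting implicitly from "no" because A_0 is the empty set
    changes, current = 0, False
    for g in guesses:
        if g != current:
            changes += 1
            current = g
    return changes
```

A recursive set needs no corrections; an r.e. set flips at most once (from “no” to “yes”), i.e. it is 1-r.e.; a k-r.e. set allows up to k flips per element.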
2
Basic Definitions
In this section, we first recall some notions of weak computability of real numbers and Ershov’s hierarchy. Then we give the precise definitions of binary, Dedekind and Cauchy computability. As mentioned in the previous section, a real number x is computable if there is a computable sequence (x_s) of rational numbers which converges to x effectively in the sense of (1). The limit of an increasing or decreasing computable sequence of rational numbers is called left computable or right computable, respectively. Left and right computable real numbers are called semi-computable. If x is a difference of two left computable real numbers, then x is called weakly computable. According to Ambos-Spies, Weihrauch and Zheng [1], x is weakly computable iff there is a computable sequence (x_s) of rational numbers which converges to x weakly effectively in the sense that Σ_{s∈N} |x_s − x_{s+1}| ≤ c for a constant c. More generally, if x is simply the limit of a computable sequence of rational numbers, then x is called recursively approximable. The classes of computable, left computable, right computable, semi-computable, weakly computable and recursively approximable real numbers are denoted by EC, LC, RC, SC, WC and RA, respectively. For any finite set A := {x_1 < x_2 < · · · < x_k} of natural numbers, the natural number i := 2^{x_1} + 2^{x_2} + · · · + 2^{x_k} is called the canonical index of A. The set with canonical index i is denoted by D_i. A sequence (A_s) of finite subsets of N is called computable if there is a computable function g : N → N such that A_s = D_{g(s)} for any s ∈ N. Similarly, we can introduce canonical indices for subsets of dyadic rational numbers. Let σ : N → D be a one-to-one coding of
the dyadic numbers. For any finite set A ⊆ D, its canonical index is defined as the canonical index of the set A_σ := σ^{−1}(A) := {n ∈ N : σ(n) ∈ A}. In this paper, the subset A ⊆ D of canonical index n is denoted by V_n. A sequence (A_s) of finite subsets of dyadic numbers is called computable if there is a recursive function h such that A_s = V_{h(s)} for all s ∈ N.

Definition 1 (Ershov [3]). For any function h : N → N, a set A ⊆ N is called h-recursively enumerable (h-r.e. for short) if there is a computable sequence (A_s) of finite subsets A_s ⊆ N such that
1. A_0 = ∅ and A = ∪_{i=0}^{∞} ∩_{j=i}^{∞} A_j;
2. (∀n ∈ N)(|{s : n ∈ A_s ∆ A_{s+1}}| ≤ h(n)), where A∆B := (A \ B) ∪ (B \ A) is the symmetric difference of A and B.
In this case, the sequence (A_s) is called an effective h-enumeration of A. For k ∈ N, a set A is called k-r.e. if it is h-r.e. for the constant function h(n) ≡ k, and A is ω-r.e. if it is h-r.e. for some recursive function h. For convenience, recursive sets are called 0-r.e.

Theorem 2 (Hierarchy Theorem, Ershov [3]). Let f, g : N → N be recursive functions. If (∃^∞ n ∈ N)(f(n) < g(n)), then there is a g-r.e. set which is not f-r.e.

Thus, there is an ω-r.e. set which is not k-r.e. for any k ∈ N; there is a (k + 1)-r.e. set which is not k-r.e. (for every k ∈ N); and there is also a ∆^0_2-set which is not ω-r.e. The definition of h-r.e., k-r.e. and ω-r.e. subsets of natural numbers can be transferred straightforwardly to subsets of dyadic rational numbers. Of course, h should be a function of type h : D → N in this case. This should be clear from context and is usually not indicated explicitly later on. Thus, we can easily introduce corresponding hierarchies for real numbers by means of binary or Dedekind representations of real numbers. However, if the real numbers are represented by sequences of rational numbers, we should count the number of their jumps of certain sizes. More precisely, we have the following definition.

Definition 2. Let n be a natural number and (x_s) a sequence of real numbers which converges to x.
1. An n-jump of (x_s) is a pair (i, j) with n < i < j and 2^{−n} ≤ |x_i − x_j| < 2^{−n+1}.
2. The n-divergence of (x_s) is the maximal number of non-nested n-jump pairs of (x_s), i.e., the maximal natural number m such that there is a chain n < i_1 < j_1 ≤ i_2 < j_2 ≤ · · · ≤ i_m < j_m with 2^{−n} ≤ |x_{i_t} − x_{j_t}| < 2^{−n+1} for t = 1, 2, · · · , m.
3. For h : N → N, if the n-divergence of (x_s) is bounded by h(n) for any n ∈ N, then we say that (x_s) converges to x h-effectively.

Definition 3. Let x ∈ [0; 1] be a real number and h : N → N a function.
1. x is h-binary computable (h-bEC for short) if there is an h-r.e. set A ⊆ N such that x = x_A;
2. x is h-Cauchy computable (h-cEC for short) if there is a computable sequence (x_s) of rational numbers which converges to x h-effectively;
3. x is h-Dedekind computable (h-dEC for short) if the left Dedekind cut L_x := {r ∈ Q : r < x} is an h-r.e. set;
4. For δ ∈ {b, c, d}, x is called k-δEC if x is h-δEC for the constant function h(n) ≡ k, and x is called ω-δEC if it is h-δEC for a recursive function h.
The classes of all k-δEC, ω-δEC and h-δEC real numbers are denoted by k-δEC, ω-δEC and h-δEC, respectively, for δ ∈ {b, c, d}. Besides, let ∗-δEC := ∪_{n∈N} n-δEC. The following proposition follows directly from the definition.

Proposition 1. For δ ∈ {b, c, d} and f, g : N → N, the following hold.
1. 0-δEC = EC.
2. k-δEC ⊆ (k + 1)-δEC ⊆ ∗-δEC ⊆ ω-δEC, for any k ∈ N.
3. If f(n) ≤ g(n) holds for almost all n ∈ N, then f-δEC ⊆ g-δEC.
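For a finite prefix of a sequence, the n-divergence of Definition 2 can be computed by greedy interval scheduling over the candidate n-jump pairs. The following Python sketch is our own illustration (it assumes the finite prefix already exhibits all relevant jumps):

```python
def n_divergence(xs, n):
    # candidate n-jumps: pairs (i, j) with n < i < j and
    # 2^-n <= |x_i - x_j| < 2^-(n-1)
    pairs = [(i, j)
             for i in range(n + 1, len(xs))
             for j in range(i + 1, len(xs))
             if 2 ** -n <= abs(xs[i] - xs[j]) < 2 ** -(n - 1)]
    # a longest non-nested chain i_1 < j_1 <= i_2 < j_2 <= ... is found
    # greedily by earliest right endpoint, as in interval scheduling
    pairs.sort(key=lambda p: p[1])
    count, last = 0, 0
    for i, j in pairs:
        if i >= last:
            count += 1
            last = j
    return count
```

The greedy choice of the earliest right endpoint is optimal here for the same reason as in classic activity selection: two pairs are compatible exactly when the second starts no earlier than the first ends.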
3
Binary Computability
In this section we discuss binary computability. From Theorem 2, it follows immediately that g-bEC \ f-bEC ≠ ∅ if (∃^∞ n ∈ N)(f(n) < g(n)). Thus, we have the following hierarchy theorem for binary computability.

Proposition 2. k-bEC ⊊ (k + 1)-bEC ⊊ ∗-bEC ⊊ ω-bEC, for any k ∈ N.

Now we compare binary computability with semi-computability. It turns out that SC is incomparable with ∗-bEC but included properly in ω-bEC.

Theorem 3. 1. SC ⊊ ω-bEC. 2. SC ⊈ ∗-bEC. 3. 2-bEC ⊈ SC.

Proof. 1. As pointed out by Soare ([11], page 217), if the real number x_A is left computable, then the set A is 2^{n+1}-r.e. Combining this with Theorem 2, SC ⊊ ω-bEC follows immediately.
2. We construct a set A ⊆ N in stages such that x_A is left computable and, for all i, j ∈ N, the following requirements are satisfied:

R_{i,j} : (D_{ϕ_i(s)})_s is an effective j-enumeration ⟹ A ≠ lim_{s→∞} D_{ϕ_i(s)},
where (ϕ_i) is an effective enumeration of all computable partial functions ϕ :⊆ N → N. This implies that A is not ∗-r.e. To satisfy R_e for e := ⟨i, j⟩, we choose an n_e > j. We put n_e into A as long as n_e is not in D_{ϕ_i(s)}. If n_e enters D_{ϕ_i(s)} for some s, then we take n_e out of A. n_e may be put into A again if n_e leaves D_{ϕ_i(t)} for some t > s, and so on. Obviously, we need to change the membership of n_e in A at most j times, and the strategy succeeds eventually. To make x_A left computable, we reserve an interval [m_e; n_e] of natural numbers with n_e − m_e > j
exclusively for R_e and put a new element from this interval into A whenever n_e is taken out of A.
3. Ambos-Spies, Weihrauch and Zheng (Theorem 4.8 of [1]) show that, for Turing incomparable r.e. sets A, B ⊆ N, x_{A⊕B̄} is not semi-computable, where B̄ is the complement of B and A ⊕ B̄ := {2n : n ∈ A} ∪ {2n + 1 : n ∈ B̄}. On the other hand, for any r.e. sets A, B, the join A ⊕ B̄ = (2A ∪ (2N + 1)) \ (2B + 1) is a 2-r.e. set and hence x_{A⊕B̄} is 2-bEC.

Theorem 4. WC ⊈ ω-bEC and ω-bEC ⊈ WC.

Proof. In [16] Zheng shows that there are r.e. sets A, B ⊆ N such that the set C ⊆ N defined by x_C := x_A − x_B is not of ω-r.e. Turing degree. This means that x_C is weakly computable but not ω-bEC. That is, WC ⊈ ω-bEC. The part ω-bEC ⊈ WC follows immediately from a result of [1] that, if x_{A⊕∅′} is weakly computable, then A is a 2^{3n}-r.e. set. By Ershov’s Hierarchy Theorem 2, we can choose an ω-r.e. set A which is not 2^{3n}-r.e. Then B := A ⊕ ∅′ is obviously also an ω-r.e. set and hence x_B is ω-bEC. But x_B is not weakly computable because A is not 2^{3n}-r.e.
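The 2-r.e. bound on the join used in the proof of Theorem 3.3 can be simulated stage by stage: even positions follow an enumeration of A (at most one change), while odd positions start inside the set and leave once the corresponding element enters B (at most two changes). A Python sketch with toy stage-wise enumerations of our own choosing:

```python
def join_approximations(A_stages, B_stages, size):
    # A_stages[s], B_stages[s]: finite stage-s approximations of r.e. sets
    # A and B.  Approximates the join of A and the complement of B on
    # [0, size) and counts the mind changes per element, starting from
    # the empty set as in Ershov's definition.
    changes, prev = {}, set()
    for As, Bs in zip(A_stages, B_stages):
        cur = ({2 * n for n in As}
               | {2 * n + 1 for n in range(size) if n not in Bs})
        for n in cur ^ prev:            # symmetric difference = mind changes
            changes[n] = changes.get(n, 0) + 1
        prev = cur
    return prev, changes
```

Every element changes its membership at most twice, witnessing that the join is 2-r.e.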
4
Dedekind Computability
We investigate Dedekind computability in this section. Again, the classes ω-dEC and WC are incomparable. But, different from the case of binary computability, the hierarchy theorem does not hold any more. Between ω-binary and ω-Dedekind computability we have the following result.

Theorem 5. ω-bEC ⊆ ω-dEC.

Proof. Let x_A ∈ ω-bEC and let (A_s) be an effective h-enumeration of A for a recursive function h. We define a computable sequence (E_s) of finite subsets of dyadic numbers by E_s := {r ∈ D_s : r ≤ x_{A_s}}, where D_s is the set of all dyadic rational numbers of precision s. It is easy to see that E := lim_s E_s exists and is in fact the left Dedekind cut of the real number x_A. On the other hand, (E_s) is an effective g-enumeration of E, where g(n) := Σ_{i≤n} h(i). Thus, x_A is a g-dEC and hence an ω-dEC real number.

The next result shows that the class ∗-dEC collapses to SC, and hence the hierarchy theorem does not hold.

Lemma 1. 1. 1-dEC = LC and SC ⊆ 2-dEC.
2. ∗-dEC = SC.

Proof. 1. This follows directly from the definition.
2. By item 1, it suffices to prove that ∗-dEC ⊆ SC. For any x ∈ ∗-dEC, let k := min{n : x ∈ n-dEC}. Then the Dedekind cut L_x of x is a k-r.e. but not (k − 1)-r.e. set. Let (A_s) be an effective k-enumeration of L_x. Then there are infinitely many r ∈ D such that |{s ∈ N : r ∈ A_{s+1} ∆ A_s}| = k, where
A∆B := (A \ B) ∪ (B \ A). Let O_k := {r ∈ D : |{s ∈ N : r ∈ A_{s+1} ∆ A_s}| = k}. Obviously, O_k is an r.e. set. If k > 0 and k is even, then x < r for any r ∈ O_k, and we can choose a decreasing computable sequence (r_s) from O_k such that lim r_s = x. Otherwise, there is a rational number y such that x < y < r for all r ∈ O_k. In this case, we can construct an effective (k − 1)-enumeration of L_x by allowing any r > y to enter L_x at most k/2 − 1 times. This contradicts the hypothesis. Thus x is a right computable real number. Similarly, if k is odd, then x is left computable.

Theorem 6. WC ⊈ ω-dEC.

Proof. We construct recursive enumerations (A_s) and (B_s) of r.e. sets A and B, respectively, and define C_s by x_{C_s} = x_{A_s} − x_{B_s}. Let C := lim_{s→∞} C_s = ∪_{s∈N} ∩_{t≥s} C_t. Then x_C is a weakly computable real number. To guarantee that x_C is not ω-dEC, it suffices to satisfy the following requirements for all i, j ∈ N:

R_{i,j} : (ϕ_i and ψ_j are total functions, (V_{ϕ_i(s)})_{s∈N} is an effective ψ_j-enumeration, and E_i := lim_{s→∞} V_{ϕ_i(s)} is a Dedekind cut) ⟹ sup(E_i) ≠ x_C,

where (ϕ_i) and (ψ_j) are recursive enumerations of partial computable functions ϕ_i :⊆ N → N and ψ_j :⊆ D → N, respectively. This can be achieved by a finite injury priority construction.

Corollary 1. The class ω-dEC is incomparable with the class WC, and hence the class ∗-dEC is a proper subset of ω-dEC.

Corollary 2. The class ω-dEC is not closed under addition and subtraction.

Proof. By Lemma 1.2, we have SC ⊆ ω-dEC. If ω-dEC were closed under addition and subtraction, then WC ⊆ ω-dEC would hold, because WC is the closure of SC under addition and subtraction. This contradicts Theorem 6.
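The stage-wise cuts E_s = {r ∈ D_s : r ≤ x_{A_s}} from the proof of Theorem 5 are easy to simulate for a toy enumeration; the Python sketch below (names and the example set are ours) builds one stage:

```python
from fractions import Fraction

def cut_stage(A_s, s):
    # E_s = { r in D_s : r <= x_{A_s} }, where D_s is the set of dyadic
    # rationals of precision s in [0; 1] and x_A = sum_{i in A} 2^{-i}
    x = sum(Fraction(1, 2 ** i) for i in A_s)
    return {Fraction(m, 2 ** s)
            for m in range(2 ** s + 1) if Fraction(m, 2 ** s) <= x}
```

Counting, per dyadic rational, how often its membership flips across the stages recovers the g-enumeration bound claimed in the proof.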
5
Cauchy Computability
We discuss Cauchy computability in this section. We will show that the classes k-cEC and ∗-cEC are incomparable with the classes LC and SC, and that the class ∗-cEC is not closed under addition. However, the hierarchy theorem holds. From the definition of ω-Cauchy computability, it is easy to see that x is ω-Cauchy computable iff there are a recursive function h and a computable sequence (x_s) of rational numbers converging to x such that, for any n ∈ N, there are at most h(n) non-nested pairs (i, j) of indices with |x_i − x_j| ≥ 2^{−n}. Thus, the class ω-cEC is in fact the class DBC (divergence bounded computable real numbers) discussed in [7], and hence it is in fact the image of the class of all left computable real numbers under total computable real functions. We summarize some known results about the class ω-cEC in the next theorem, where CTF denotes the class of all computable real functions f : [0; 1] → [0; 1].
Theorem 7 (Rettinger, Zheng, Gengler and von Braunmühl [7]).
1. The class ω-cEC is a field;
2. ω-cEC = CTF(LC) := {f(y) : f ∈ CTF & y ∈ LC}; and
3. WC ⊊ ω-cEC ⊊ RA.

Now let us look at the relationship among the classes 1-cEC, ∗-cEC and the classes SC and WC.

Theorem 8. 1-cEC ⊈ SC ⊈ ∗-cEC ⊊ WC.

Proof. For the first noninclusion, consider the number x_{A⊕B̄} for two Turing incomparable r.e. sets A, B ⊆ N. By Theorem 4.8 of [1], it is not semi-computable, but it is 1-cEC. For the second noninclusion, we can construct, by a priority construction, a left computable real number which is not k-cEC for any k ∈ N. To prove ∗-cEC ⊆ WC, let (x_s) be a computable sequence of rational numbers which converges k-effectively to a ∗-cEC real number x for some k ∈ N. For any n ∈ N, let S_n := {s ∈ N : 2^{−n} ≤ |x_s − x_{s+1}| < 2^{−n+1}}. Then

Σ_{s∈N} |x_s − x_{s+1}| = Σ_{n∈N} (Σ_{s∈S_n, s≤n} |x_s − x_{s+1}| + Σ_{s∈S_n, s>n} |x_s − x_{s+1}|) ≤ Σ_{n∈N} ((n + 1) · 2^{−n+1} + k · 2^{−n+1}) ≤ 8 + 4k.

That is, x is a weakly computable real number. Therefore, ∗-cEC ⊆ WC. By the assertion SC ⊈ ∗-cEC above, this inclusion is also proper.

Theorem 9. For any recursive functions f, g with ∃^∞ n (f(n) < g(n)), there is a g-cEC real number which is not f-cEC, i.e., g-cEC \ f-cEC ≠ ∅.

Proof. We construct a computable sequence (x_s) of rational numbers which satisfies, for any e ∈ N, the following requirements:

N :  (x_s) converges g-effectively to x, and
R_e : if (ϕ_e(s))_s converges f-effectively, then x ≠ lim_s ϕ_e(s),
where (ϕ_e) is an effective enumeration of all computable partial functions ϕ_e :⊆ N → Q. To satisfy a single requirement R_e, choose a rational interval I_e of length 2^{−n_e} for some n_e ∈ N such that f(n_e) < g(n_e). Divide it equally into four subintervals I^i, for i < 4, of length 2^{−(n_e+2)}. Define x_s as the middle point of the interval I^1 as long as the sequence (ϕ_e(s))_s does not enter the interval I^1. Otherwise, if ϕ_e(s) enters I^1 for some s, then let x_s be the middle point of I^3. Later, if ϕ_e(t) enters I^3 for some t > s, then let x_t be the middle point of I^1 again, and so on. If (ϕ_e(s))_s converges f-effectively, then (x_s) needs at most f(n_e) + 1 ≤ g(n_e) jumps to guarantee that lim x_s ≠ lim_s ϕ_e(s). Thus, the requirement N is satisfied too. To satisfy all the requirements simultaneously, we will construct an increasing sequence (n_e) of natural numbers such that f(n_e) < g(n_e) and n_e + 2 ≤ n_{e+1} for all e ∈ N, and two sequences (I_e) and (J_e) of rational intervals I_e := [a_e; b_e] and J_e := [c_e; d_e] which satisfy the following conditions:

a_e < b_e < c_e < d_e  &  b_e − a_e = d_e − c_e = 2^{−(n_e+1)}  &  c_e − b_e = 2^{−n_e},    (2)
Ershov’s Hierarchy of Real Numbers
689
and I_{e+1} ∪ J_{e+1} ⊂ I_e for all e ∈ N. The intervals I_e and J_e are reserved for the requirement R_e. That is, we construct a computable sequence (x_s) of rational numbers such that x_s is properly chosen from I_e or J_e in order to guarantee lim_s x_s ≠ lim_s ϕ_e(s).

In general, the sequences (n_e), (I_e) and (J_e) are not computable, but they can be effectively approximated. Namely, at stage s, we construct the finite approximation sequences (n_{e,s})_{e≤k(s)}, (I_{e,s})_{e≤k(s)} and (J_{e,s})_{e≤k(s)}, where k(s) ∈ N satisfies lim_s k(s) = ∞. At any stage s, we choose a rational number x_s such that x_s ∈ I_{e,s} for all e ≤ k(s). If ϕ_{e,s}(t) enters the interval I_{e,s} too, for some t, then we exchange I_{e,s} and J_{e,s}; in this case, we denote this t by t_{e,s}. For any i > e, the intervals I_i and J_i are then cancelled and redefined with a new n_{i,t} > n_{i,s} at some stage t > s. For the same n_e, the intervals I_e and J_e can be exchanged at most f(n_e) times if (ϕ_e(s))_s converges f-effectively. Therefore, a finite injury priority construction can be applied.

Corollary 3. For any k ∈ N, we have k-cEC ⊊ (k+1)-cEC.

Theorem 10. There are x, y ∈ 1-cEC such that x − y ∉ ∗-cEC. Therefore, k-cEC and ∗-cEC are not closed under addition and subtraction for any k > 0.

Proof. We construct two computable increasing sequences (x_s) and (y_s) of rational numbers which converge 1-effectively to x and y, respectively, such that z := x − y satisfies all the following requirements:

R_{i,j}:
(ϕ_i(s))_s converges j-effectively to u_i  =⇒  u_i ≠ z,
where (ϕ_i) is an effective enumeration of all partial computable functions ϕ_i :⊆ N → Q. To satisfy R_e (e := ⟨i, j⟩), we choose two natural numbers n_e and m_e such that m_e = 2j + n_e + 2, and a rational interval I := [a_e^0; a_e^8] of length 2^{-m_e+2}. The interval I is divided equally into eight subintervals I^k := [a_e^k; a_e^{k+1}] for k < 8. At the beginning, let x_0 := a_e^2 and y_0 := 0, and hence z_0 := x_0 − y_0 = a_e^2 ∈ J := I^2, where J serves as a witness interval of R_e such that any element z ∈ J satisfies R_e. If, at some stage s_0 > 0, ϕ_i(t_0) enters the interval J for some t_0, then we define x_{s_0} := x_0 + 2^{-(n_e+1)} + 3 · 2^{-(m_e+1)}, y_{s_0} := y_0 + 2^{-(n_e+1)} and J := I^5. Accordingly we have z_{s_0} := x_{s_0} − y_{s_0} = z_0 + 3 · 2^{-(m_e+1)} and hence z_{s_0} ∈ J. If, at a later stage s_1 > s_0, ϕ_i(t_1) enters the interval J := I^5 for some t_1 > t_0, then we define x_{s_1} := x_{s_0} + 2^{-(n_e+2)}, y_{s_1} := y_{s_0} + 2^{-(n_e+2)} + 3 · 2^{-(m_e+1)} and J := I^2. In this case, we have z_{s_1} := x_{s_1} − y_{s_1} = z_0 and hence z_{s_1} ∈ J. This can happen at most j times if (ϕ_i(s))_s converges j-effectively. Thus we have lim_s z_s ≠ lim_s ϕ_i(s) and R_e is satisfied. To satisfy all the requirements, we apply a finite injury priority construction.
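The bookkeeping in this construction is delicate, so the following Python sketch replays one round of the strategy for a single requirement with exact rational arithmetic. The variable names and the concrete parameters n_e = 3, j = 2 are ours, not from the paper; the sketch assumes the reading in which the extra term 3·2^{-(m_e+1)} is added to x on the first exchange and to y on the second, so that both sequences only increase while z = x − y oscillates between the subintervals I^2 and I^5.

```python
from fractions import Fraction

def pow2(k):
    # 2^k as an exact rational; k may be negative
    return Fraction(2) ** k

# Illustrative parameters (our choice, not from the paper)
n_e, j = 3, 2
m_e = 2 * j + n_e + 2                     # m_e = 2j + n_e + 2

# Endpoints a_e^0, ..., a_e^8 of the interval I, spaced 2^{-(m_e+1)},
# so I has length 8 * 2^{-(m_e+1)} = 2^{-m_e+2} as in the proof.
a = [k * pow2(-(m_e + 1)) for k in range(9)]

x, y = a[2], Fraction(0)                  # z_0 = a_e^2, in J := I^2
z = x - y
assert a[2] <= z <= a[3]                  # z in I^2

# Opponent enters I^2: x absorbs the extra 3*2^{-(m_e+1)}, z moves to I^5.
x += pow2(-(n_e + 1)) + 3 * pow2(-(m_e + 1))
y += pow2(-(n_e + 1))
z = x - y
assert a[5] <= z <= a[6]                  # z now in I^5

# Opponent enters I^5: y absorbs the extra term, z returns to I^2,
# although both x and y kept increasing.
x += pow2(-(n_e + 2))
y += pow2(-(n_e + 2)) + 3 * pow2(-(m_e + 1))
z = x - y
assert a[2] <= z <= a[3]                  # back in I^2
print("oscillation verified, z =", z)
```

Using Fraction keeps the endpoint comparisons exact; with floating point, the boundary tests at a_e^2 and a_e^5 could fail by rounding.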
References

1. K. Ambos-Spies, K. Weihrauch, and X. Zheng. Weakly computable real numbers. Journal of Complexity, 16(4):676–690, 2000.
2. A. J. Dunlop and M. B. Pour-El. The degree of unsolvability of a real number. In J. Blanck, V. Brattka, and P. Hertling, editors, Computability and Complexity in Analysis (CCA 2000, Swansea, UK, September 2000), volume 2064 of LNCS, pages 16–29. Springer, Berlin, 2001.
690
Xizhong Zheng, Robert Rettinger, and Romain Gengler
3. Y. L. Ershov. A certain hierarchy of sets I, II, III (Russian). Algebra i Logika, 7(1):47–73, 1968; 7(4):15–47, 1968; 9:34–51, 1970.
4. C.-K. Ho. Relatively recursive reals and real functions. Theoretical Computer Science, 210:99–120, 1999.
5. K.-I. Ko. Complexity Theory of Real Functions. Progress in Theoretical Computer Science. Birkhäuser, Boston, 1991.
6. J. Myhill. Criteria of constructibility for real numbers. The Journal of Symbolic Logic, 18(1):7–10, 1953.
7. R. Rettinger, X. Zheng, R. Gengler, and B. von Braunmühl. Weakly computable real numbers and total computable real functions. In Proceedings of COCOON 2001, Guilin, China, August 20–23, 2001, volume 2108 of LNCS, pages 586–595. Springer, 2001.
8. H. G. Rice. Recursive real numbers. Proc. Amer. Math. Soc., 5:784–791, 1954.
9. R. M. Robinson. Review of "Péter, R., Rekursive Funktionen". The Journal of Symbolic Logic, 16:280–282, 1951.
10. J. R. Shoenfield. On degrees of unsolvability. Ann. of Math. (2), 69:644–653, 1959.
11. R. Soare. Cohesive sets and recursively enumerable Dedekind cuts. Pacific J. Math., 31:215–231, 1969.
12. R. I. Soare. Recursively Enumerable Sets and Degrees: A Study of Computable Functions and Computably Generated Sets. Perspectives in Mathematical Logic. Springer-Verlag, Berlin, 1987.
13. E. Specker. Nicht konstruktiv beweisbare Sätze der Analysis. The Journal of Symbolic Logic, 14(3):145–158, 1949.
14. A. M. Turing. On computable numbers, with an application to the "Entscheidungsproblem". Proceedings of the London Mathematical Society, 42(2):230–265, 1936.
15. X. Zheng. Recursive approximability of real numbers. Mathematical Logic Quarterly, 48(Suppl. 1):131–156, 2002.
16. X. Zheng. On the Turing degrees of weakly computable real numbers. Journal of Logic and Computation, 13(2):159–172, 2003.
Author Index

Àlvarez, C. 142 Amano, Kazuyuki 152 Ambos-Spies, Klaus 162 Anantharaman, Siva 169 Ausiello, G. 179 Baba, Kensuke 189 Banderier, Cyril 198 Bannai, Hideo 208 Bazgan, C. 179 Beier, René 198 Benkoczi, Robert 218 Bhattacharya, Binay 218 Blanchard, F. 228 Blesa, M. 142 Bodlaender, Hans L. 239 Böhler, Elmar 249 Bonsma, Paul S. 259 Boreale, Michele 269, 279 Brosenne, Henrik 290 Brueggemann, Tobias 259 Bucciarelli, Antonio 300 Buhrman, Harry 1 Buscemi, Maria Grazia 269 Carton, Olivier 308 Černá, Ivana 318 Cervelle, J. 228 Chen, Hubie 328, 338 Chen, Zhi-Zhong 348 Chrobak, Marek 218 Crochemore, M. 622 Dalmau, Victor 358 Dang, Zhe 480 Delhommé, Christian 378 Demange, M. 179 Díaz, J. 142 Duval, Jean-Pierre 388 Egecioglu, Omer 480 Epstein, Leah 398, 408 Feldmann, R. 21 Fellows, Michael R. 239 Fernández, A. 142 Ford, Daniel K. 358
Formenti, E. 228 Friedl, Katalin 419 Gadducci, Fabio 279 Gairing, M. 21 Gastin, Paul 429, 439 Gengler, Romain 681 Geser, Alfons 449 Glaßer, Christian 249 Gorrieri, Roberto 46 Gramlich, Gregor 460 Grossi, R. 622 Hagiwara, Masayuki 490 Hannay, Jo 68 Hliněný, Petr 470 Hofbauer, Dieter 449 Homeister, Matthias 290 Ibarra, Oscar H. 480 Inenaga, Shunsuke 208 Ishii, Toshimasa 490 Katsumata, Shin-ya 68 Knapik, Teodor 378 Kolpakov, Roman 388 Kouno, Mitsuharu 348 Krysta, Piotr 500 Kucherov, Gregory 388 Kumar, K. Narayan 429 Kutyłowski, Mirosław 511 Larmore, Lawrence L. 218 Lasota, Sławomir 521 Lecroq, Thierry 388 Lefebvre, Arnaud 388 Leporati, Alberto 92 Letkiewicz, Daniel 511 Löding, Christof 531 Loyer, Yann 541 Lücking, Thomas 21, 551 Luttik, Bas 562 Magniez, Frédéric 419 Marco, Gianluca De 368 Martinelli, Fabio 46 Martínez, Conrado 572 Maruoka, Akira 152
Mauri, Giancarlo 92 Mavronicolas, Marios 551 Meer, K. 582 Meghini, Carlo 592 Mehlhorn, Kurt 198 Meister, Daniel 249 Merkle, Wolfgang 602 Miltersen, Peter Bro 612 Molinero, Xavier 572 Monien, Burkhard 21, 551 Mukund, Madhavan 429 Narendran, Paliath 169 Oddoux, Denis 439
Paschos, V. Th. 179 Pelánek, Radek 318 Pelc, Andrzej 368 Pinchinat, Sophie 642 Pisanti, N. 622 Radhakrishnan, Jaikumar 612 Reimann, Jan 602 Reith, Steffen 632 Rettinger, Robert 681 Riedweg, Stéphane 642 Rode, Manuel 21, 551 Rohde, Philipp 531 Röhrig, Hein 1 Rusinowitch, Michael 169 Rychlik, Marcin 652 Rytter, Wojciech 218 Sagot, M.-F. 622 Salehi, Saeed 662 Salibra, Antonino 300 Sanders, Peter 500 Sannella, Donald 68 Santha, Miklos 419 Saxena, Gaurav 480 Sen, Pranab 419 Serna, M. 142 Shinohara, Ayumi 189, 208 Spirakis, Paul 551 Spyratos, Nicolas 592 Straccia, Umberto 541 Takeda, Masayuki 189, 208 Tassa, Tamir 408 Thilikos, Dimitrios M. 239 Thomas, D. Gnanaraj 378 Thomas, Wolfgang 113 Tsuruta, Satoshi 189 Tzitzikas, Yannis 592 Vöcking, Berthold 500 Vrťo, Imrich 551
Waack, Stephan 290 Waldmann, Johannes 449 Wegener, Ingo 125, 612 Woeginger, Gerhard J. 259 Woelfel, Philipp 671 Zheng, Xizhong 681