Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis and J. van Leeuwen
1380
Cláudio L. Lucchesi Arnaldo V. Moura (Eds.)
LATIN'98: Theoretical Informatics Third Latin American Symposium Campinas, Brazil, April 20-24, 1998 Proceedings
Springer
Series Editors Gerhard Goos, Karlsruhe University, Germany Juris Hartmanis, Cornell University, NY, USA Jan van Leeuwen, Utrecht University, The Netherlands
Volume Editors Cláudio L. Lucchesi Arnaldo V. Moura University of Campinas, Institute of Computing C.P. 6176, 13083-970 Campinas, SP, Brazil E-mail: {lucchesi/arnaldo}@dcc.unicamp.br
Cataloging-in-Publication data applied for
Die Deutsche Bibliothek - CIP-Einheitsaufnahme Theoretical informatics : proceedings / LATIN '98, Third Latin American Symposium, Campinas, Brazil, April 20 - 24, 1998. Cláudio L. Lucchesi ; Arnaldo V. Moura (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Budapest ; Hong Kong ; London ; Milan ; Paris ; Santa Clara ; Singapore ; Tokyo : Springer, 1998 (Lecture notes in computer science ; Vol. 1380) ISBN 3-540-64275-7
CR Subject Classification (1991): F.1-3, E.3, G.1-2, I.1.1-2, I.3.5
ISSN 0302-9743 ISBN 3-540-64275-7 Springer-Verlag Berlin Heidelberg New York
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1998 Printed in Germany
Typesetting: Camera-ready by author SPIN 10631984
06/3142 - 5 4 3 2 1 0
Printed on acid-free paper
Preface
This volume is the proceedings of the International Conference LATIN'98, Latin American Theoretical INformatics, held in Campinas, Brazil, April 20-24, 1998. This event is the third of a series started with LATIN'92, organized in São Paulo, Brazil, in April 1992, and continued with LATIN'95, organized in Valparaíso, Chile, in April 1995. The aim of the conference is to provide a high-level forum for theoretical computer science research in Latin America, and to promote a strong and healthy interaction with the international scientific community.

The LATIN conferences focus on theory of computing, but in Latin America it is quite common to group under this umbrella fields which are sometimes classified otherwise, such as graph theory and combinatorics on words. After a lengthy and passionate discussion among the Program Committee members, papers on these subjects were considered. We hope that this policy will become a tradition in future editions of the conference.

We received 53 submissions, from 104 authors in some 15 different countries. The 33 papers in this volume include 5 articles by invited speakers and 28 papers selected by the Program Committee based on some 160 reports filed by Committee members and other referees.

We would like to thank all individuals and organizations who cooperated with this event. In particular, the continued commitment of Springer-Verlag to publish the proceedings in its Lecture Notes in Computer Science series has strongly contributed to the success of this conference. We would also like to thank Imre Simon and Adriano Nagelschmidt Rodrigues, who provided an intranet site for the Program Committee at the University of São Paulo, which proved to be essential for the multiple discussions among the PC members.

Undoubtedly LATIN is the main Latin American event in theoretical computer science. We feel confident that it is also gradually becoming a tradition in the computer science community.

April 1998
Cláudio L. Lucchesi, Program Chair
Arnaldo V. Moura, Program Vice-Chair
The Conference

Local Arrangements
The local arrangements for the conference were handled by the Institute of Computing of the University of Campinas (IC-UNICAMP), in Campinas, Brazil.

Organizing Committee
Ricardo de Oliveira Anido, Ariadne M. B. Rizzoni de Carvalho, Ricardo Dahab, Anamaria Gomide, Tomasz Kowaltowski, Cláudio L. Lucchesi (Chair), Arnaldo V. Moura (Co-Chair), Cândido F. Xavier de Mendonça Neto, Jorge Stolfi

Financial Support
FAPESP (State of São Paulo Research Funding Agency), CNPq (Brazilian Council for Scientific and Technological Development), CAPES Foundation (Brazilian Ministry of Education), FAEP Foundation (University of Campinas), and Institute of Computing (University of Campinas).

Cooperation
EATCS (European Association for Theoretical Computer Science), SBC (Brazilian Computing Society), SCCC (Chilean Computer Science Society), SIGACT-ACM (Association for Computing Machinery), SMCC (Mexican Computer Science Society), and UMALCA (Mathematical Union of Latin America and Caribe).
Invited Speakers
Noga Alon (Tel Aviv Univ., Israel)
Richard Beigel (Lehigh Univ., USA)
Gilles Brassard (Univ. de Montréal, Canada)
Herbert Edelsbrunner (Univ. of Illinois, Urbana-Champaign, USA)
Juan A. Garay (IBM, Yorktown Heights, USA)

Program Committee
Ricardo Baeza-Yates (Univ. de Chile, Chile)
Valmir C. Barbosa (UFRJ, Brazil)
Richard Beigel (Lehigh Univ., USA)
Christian Choffrut (Univ. Paris VII, LITP, France)
Vašek Chvátal (Rutgers Univ., USA)
Volker Diekert (Univ. of Stuttgart, Germany)
Peter Eades (Univ. of Newcastle, Australia)
Herbert Edelsbrunner (Univ. of Illinois, Urbana-Champaign, USA)
Juan A. Garay (IBM, Yorktown Heights, USA)
Oscar Garrido (Univ. of Karlstad, Sweden)
Eric Goles (Univ. de Chile, Chile)
Jozef Gruska (Univ. of Brno, Czech Republic)
Katia Silva Guimarães (UFPE, Brazil)
Yoshiharu Kohayakawa (USP, Brazil)
Cláudio L. Lucchesi (UNICAMP, Brazil) (Chair)
Arnaldo V. Moura (UNICAMP, Brazil) (Co-Chair)
Gene Myers (Univ. of Arizona, USA)
Bruce Reed (CNRS, Paris VI, France)
Alexander Schrijver (CWI, Netherlands)
Peter Shor (AT&T, USA)
Imre Simon (USP, Brazil)
János Simon (Univ. of Chicago, USA)
Jayme Luiz Szwarcfiter (UFRJ, Brazil)
Eli Upfal (Weizmann Inst., Israel and IBM, Almaden, USA)
Jorge Urrutia (Univ. of Ottawa, Canada)
Nivio Ziviani (UFMG, Brazil)
Referees
Amihood Amir, Sandra A. de Amo, David Applegate, Arnaldo de Albuquerque Araújo, Ms Drummond Araújo, Amotz Bar-Noy, Saulo R. M. Barros, Mauro R. F. Benevides, Paulo Borba, Ljiljana Branković, Luboš Brim, Andrei Broder, Edson N. Cáceres, Rosa M. L. R. Carmo, Ivana Černá, Don Coppersmith, Ricardo Dahab, Javier Esparza, Antonio Elias Fabris, Martin Farach, QinWen Feng, Cristina Gomes Fernandes, Carlos E. Ferreira, Celina M. H. de Figueiredo, Luiz Henrique de Figueiredo, Marcelo Finger, Alan Frieze, Bill Gasarch, Leucio Guerra, Irene Guessarian, Edward Hermann Haeusler, Matthias Jantzen, Esther Jennings, David Johnson, Ricardo Ueda Karpischek, Alica Kelemenová, Sulamita Klein, Bernd Kreuter, Alain Lascoux, Pierre Lescanne, Stefan Lewandowski, Sérgio Lifschitz, Antonio Alfredo Loureiro, Nelson Maculan, Arnaldo Mandel, Célia Picinin de Mello, Lenka Motyčková, Edleno S. de Moura, Ian Munro, Anca Muscholl, Gonzalo Navarro, Valeria de Paiva, Tarcisio Pequeno, Roque Marinho Persiano, Holger Petersen, Rossella Petreschi, Yuval Rabani, Tal Rabin, Ed Reingold, Augusto Sampaio, João Carlos Setubal, Said Sidki, Flavio Soares Correa da Silva, Siang Wun Song, Dan Spielman, John Stembridge, Jean-Marc Steyaert, Jorge Stolfi, Routo Terada, Jacques Wainer, Yoshiko Wakabayashi, Jerzy Wojciechowski
Table of Contents
Algorithms, Complexity

Analysis of Rabin's Polynomial Irreducibility Test
  Daniel Panario, Alfredo Viola
A Chip Search Problem on Binary Numbers
  Peter Damaschke
Uniform Service Systems with k Servers
  Esteban Feuerstein
Faster Non-linear Parametric Search with Applications to Optimization and Dynamic Geometry
  David Fernández-Baca

Automata, Transition Systems, Combinatorics on Words

Super-State Automata and Rational Trees
  Frédérique Bassino, Marie-Pierre Béal, Dominique Perrin
An Eilenberg Theorem for Words on Countable Ordinals
  Nicolas Bedon, Olivier Carton
Maximal Groups in Free Burnside Semigroups
  Alair Pereira do Lago
Positive Varieties and Infinite Words
  Jean-Éric Pin
Unfolding Parametric Automata
  Marcos Veloso Peixoto, Laurent Fribourg
Fundamental Structures in Well-Structured Infinite Transition Systems
  Alain Finkel, Philippe Schnoebelen

Computational Geometry, Graph Drawing

Shape Reconstruction with Delaunay Complex (Invited Paper)
  Herbert Edelsbrunner
Bases for Non-homogeneous Polynomial C^k Splines on the Sphere
  Anamaria Gomide, Jorge Stolfi
The Splitting Number of the 4-Cube
  Luerbio Faria, Celina Miraglia Herrera de Figueiredo, Candido Ferreira Xavier de Mendonça Neto
Short and Smooth Polygonal Paths
  James Abello, Emden Gansner

Cryptography

Quantum Cryptanalysis of Hash and Claw-Free Functions (Invited Paper)
  Gilles Brassard, Peter Høyer, Alain Tapp
Batch Verification with Applications to Cryptography and Checking (Invited Paper)
  Mihir Bellare, Juan A. Garay, Tal Rabin
Strength of Two Data Encryption Standard Implementations under Timing Attacks
  Alejandro Hevia, Marcos Kiwi

Graph Theory, Algorithms on Graphs

Spectral Techniques in Graph Algorithms (Invited Paper)
  Noga Alon
Colouring Graphs whose Chromatic Number Is Almost Their Maximum Degree
  Michael Molloy, Bruce Reed
Circuit Covers in Series-Parallel Mixed Graphs
  Orlando Lee, Yoshiko Wakabayashi
A Linear Time Algorithm to Recognize Clustered Planar Graphs and Its Parallelization
  Elias Dahlhaus
A New Characterization for Parity Graphs and a Coloring Problem with Costs
  Klaus Jansen
On the Clique Operator
  Marisa Gutierrez, João Meidanis

Packet Routing

Dynamic Packet Routing on Arrays with Bounded Buffers
  Andrei Z. Broder, Alan M. Frieze, Eli Upfal
On-Line Matching Routing on Trees
  Alan Roberts, Antonios Symvonis

Parallel Algorithms

Analyzing Glauber Dynamics by Comparison of Markov Chains
  Dana Randall, Prasad Tetali
The CREW PRAM Complexity of Modular Inversion
  Joachim von zur Gathen, Igor Shparlinski
Communication-Efficient Parallel Multiway and Approximate Minimum Cut Computation
  Friedhelm Meyer auf der Heide, Gabriel Terán Martínez

Pattern Matching, Browsing

The Geometry of Browsing (Invited Paper)
  Richard Beigel, Egemen Tanin
Fast Two-Dimensional Approximate Pattern Matching
  Ricardo Baeza-Yates, Gonzalo Navarro
Improved Approximate Pattern Matching on Hypertext
  Gonzalo Navarro
Solving Equations in Strings: On Makanin's Algorithm
  Claudio Gutiérrez
Spelling Approximate Repeated or Common Motifs Using a Suffix Tree
  Marie-France Sagot

Author Index
Analysis of Rabin's Polynomial Irreducibility Test

Daniel Panario (1) and Alfredo Viola (2)

(1) Department of Computer Science, Univ. of Toronto, Toronto, Canada M5S-3G4, daniel@cs.toronto.edu
(2) Pedeciba Informatica, Casilla de Correo 161 0, Distrito 6, Montevideo, Uruguay, [email protected]
Abstract. We give a precise average-case analysis of Rabin's algorithm for testing the irreducibility of polynomials over finite fields. The main technical contribution of the paper is the study of the probability that a random polynomial of degree n contains an irreducible factor of degree dividing several maximal divisors of the degree n. We provide upper and lower bounds for this probability. Our method generalizes to other algorithms that deal with similar divisor conditions. In particular, we analyze the average-case behavior of Rabin's variants presented by von zur Gathen & Shoup and by Gao & Panario.
1 Introduction

Let $\mathbb{F}_q$ be the finite field with $q$ elements, for $q$ a prime power, and let $f \in \mathbb{F}_q[x]$ be an irreducible polynomial of degree $n$. In this case, the ring of polynomials modulo $f$, $\mathbb{F}_q[x]/(f)$, is a finite field with $q^n$ elements. The theorem of existence and uniqueness of finite fields ensures that $\mathbb{F}_{q^n} \cong \mathbb{F}_q[x]/(f)$. This isomorphism allows the construction of arithmetic in extension fields via polynomial operations. We are only required to find irreducible polynomials of any degree $n$ over any finite field $\mathbb{F}_q$.

This paper deals with a probabilistic algorithm for finding irreducible polynomials due to Rabin [15]. The central idea is to use trial and error, i.e., to take polynomials at random and test them for irreducibility. Remarkably enough, this idea was already noted by Galois ([6], p. 119). Let $I_n$ be the number of irreducible polynomials of degree $n$ over a finite field $\mathbb{F}_q$. Gauss proved for the case of finite prime fields that, for $\mu$ the Möbius function, $I_n = \frac{1}{n}\sum_{k \mid n} \mu(k)\, q^{n/k}$. This appeared in his posthumous book ([9], p. 611-612). It follows that

$$I_n = \frac{q^n}{n} + O\!\left(\frac{q^{n/2}}{n}\right). \qquad (1)$$

This estimate shows that a fraction very close to $1/n$ of the polynomials of degree $n$ over any finite field $\mathbb{F}_q$ is irreducible. Thus, on average, we find one irreducible polynomial of degree $n$ after about $n$ tries. This justifies the trial and error method.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 1-10, 1998. © Springer-Verlag Berlin Heidelberg 1998
We note that simple and explicit lower and upper bounds for $I_n$ are also known (see [11], p. 142, Ex. 3.26 & 3.27):

$$\frac{q^n}{n} - \frac{q\,(q^{n/2} - 1)}{(q-1)\,n} \;\le\; I_n \;\le\; \frac{q^n - q}{n}. \qquad (2)$$
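The counting formula and the bounds (2) are easy to check numerically. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, the Möbius function is computed by naive trial division, and the bounds are evaluated in floating point and only for $n \ge 2$ (for $n = 1$ the upper bound in (2) would give 0, although there are $q$ linear polynomials).

```python
def mobius(k):
    """Moebius function mu(k), by trial division (adequate for small k)."""
    result, d = 1, 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:          # a squared prime divides the input
                return 0
            result = -result
        d += 1
    if k > 1:                       # one remaining prime factor
        result = -result
    return result


def num_irreducible(q, n):
    """I_n = (1/n) * sum over k | n of mu(k) * q^(n/k)."""
    total = sum(mobius(k) * q ** (n // k)
                for k in range(1, n + 1) if n % k == 0)
    return total // n


def bounds_on_irreducibles(q, n):
    """Lower and upper bounds of (2) for n >= 2, as floats."""
    lower = q ** n / n - q * (q ** (n / 2) - 1) / ((q - 1) * n)
    upper = (q ** n - q) / n
    return lower, upper


if __name__ == "__main__":
    for n in range(2, 9):
        low, high = bounds_on_irreducibles(2, n)
        print(n, num_irreducible(2, n), round(low, 3), round(high, 3))
```

For small $q$ and $n$ the exact count sits between the two bounds and is close to $q^n/n$, in line with (1).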
It remains to choose an irreducibility test. Let $f \in \mathbb{F}_q[x]$, $\deg f = n$, be a polynomial to be tested for irreducibility. Assume that $n = \prod_{i=1}^{k} p_i^{\alpha_i}$ with $p_1, \ldots, p_k$ the distinct prime divisors of $n$, and denote $n_i = n/p_i$, for $1 \le i \le k$. Rabin's test is based on the following result: $f$ is irreducible if and only if $\gcd(f, x^{q^{n_i}} - x) = 1$ for all $1 \le i \le k$, and $x^{q^n} - x \equiv 0 \bmod f$.

Most of the analyses of algorithms for polynomials over finite fields are based on worst-case behavior. Very little work has been done on the average-case analysis of these problems, and most of it relies on techniques based on generating functions and asymptotic analysis. This paper is another step in this direction.

In Sect. 2, we revisit Rabin's irreducibility algorithm. In Sect. 3, we give the main contributions of this paper. We provide a precise average-case analysis of Rabin's algorithm. This analysis involves the study of the number of polynomials of degree $n$ that have irreducible factors of degree dividing a maximal divisor of $n$. The main technical contribution of the paper is the study of the probability that a polynomial contains an irreducible factor of degree dividing some of $n_1, \ldots, n_j$, for $1 \le j \le k$. We provide tight upper and lower bounds for this probability. The average-case analysis is expressed as an asymptotic form in $n$, the degree of the polynomial to be tested for irreducibility. We fix the finite field $\mathbb{F}_q$, and consider $p_1, \ldots, p_k$ fixed with $\alpha_1, \ldots, \alpha_k$ varying. Our method is also valid when analyzing other algorithms with the same divisor conditions. In Sect. 4, we study the average-case analysis of some variants of Rabin's algorithm found in [8] and [7]. Finally, in Sect. 5, we remark on the algorithms discussed in previous sections.

We assume that arithmetic in $\mathbb{F}_q$ is given. The cost measure of an algorithm will be the number of operations in $\mathbb{F}_q$. The algorithms in this paper use basic polynomial operations like products and gcds.
We distinguish two approaches for the polynomial arithmetic: the school method, and the fast method based on the Fast Fourier Transform (FFT). Let $M(n) = n \log n \log\log n$ when considering fast methods, and $M(n) = n^2$ otherwise. The cost of multiplying two polynomials of degree at most $n$ can be taken as $\tau_1 M(n)$, for a constant $\tau_1$ (for fast arithmetic, see [17], [16], [3]). For a constant $\tau_2$, a division with remainder can be computed with $\tau_2 M(n)$ and $\tau_2 M(n) \log n$ operations in $\mathbb{F}_q$ using classical and fast methods, respectively. The cost of a gcd between two polynomials of degree at most $n$ can be taken as $\tau_3 M(n)$ and $\tau_3 M(n) \log n$ operations in $\mathbb{F}_q$, with $\tau_3$ a constant ([1], §8.9). Finally, we need the computation of $h^q \bmod f$ for polynomials $h$ and $f$ of degree at most $n$. This exponentiation can be done by means of the classical repeated squaring method (see [1 ], p. 441-442). In this case, the number of products needed is $C_q = \lfloor \log_2 q \rfloor + \nu(q) - 1$, with $\nu(q)$ the number of ones in the binary representation of $q$. Therefore, the cost of computing $h^q \bmod f$ by this method is $\tau_1 C_q M(n)$ operations in $\mathbb{F}_q$.
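The product count of repeated squaring can be checked directly. The sketch below is illustrative only (the function names are hypothetical): it evaluates the closed form $\lfloor \log_2 q \rfloor$ plus the number of ones in the binary representation of $q$, minus 1, and compares it with the number of multiplications a left-to-right square-and-multiply loop actually performs for the exponent $q$.

```python
def squaring_products(q):
    """Closed form: floor(log2 q) + (number of ones in binary q) - 1."""
    return (q.bit_length() - 1) + bin(q).count("1") - 1


def count_products(q):
    """Count the multiplications of left-to-right square-and-multiply."""
    count = 0
    for bit in bin(q)[3:]:      # bits of q after the leading one
        count += 1              # one squaring per remaining bit
        if bit == "1":
            count += 1          # one extra product by the base
    return count


if __name__ == "__main__":
    for q in (2, 3, 7, 8, 101):
        print(q, squaring_products(q), count_products(q))
```

Both functions agree for every positive exponent, since the loop performs one squaring per bit after the leading one and one extra product per additional set bit.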
2 Rabin's Irreducibility Test

The main goal of this paper is to provide a complete analysis of Rabin's polynomial irreducibility test and several associated variants. In this section, we revisit Rabin's test. Its correctness is based on the following theorem due to Rabin ([15], p. 275, Lemma 1), which leads to an immediate algorithm.

Theorem 1. Let $p_1, \ldots, p_k$ be all the prime divisors of $n$, and denote $n_i = n/p_i$, for $1 \le i \le k$. A polynomial $f \in \mathbb{F}_q[x]$ of degree $n$ is irreducible in $\mathbb{F}_q[x]$ if and only if $\gcd(f, x^{q^{n_i}} - x \bmod f) = 1$ for $1 \le i \le k$, and $f$ divides $x^{q^n} - x$.
Algorithm: Rabin irreducibility test
Input: A monic polynomial $f \in \mathbb{F}_q[x]$ of degree $n$, and $p_1, \ldots, p_k$ all the distinct prime divisors of $n$.
Output: Either "f is irreducible" or "f is reducible".

      for i := 1 to k do n_i := n/p_i;
      for i := 1 to k do
[1]       g := gcd(f, x^(q^(n_i)) − x mod f);
          if g ≠ 1, then "f is reducible" and STOP;
      endfor;
      g := x^(q^n) − x mod f;
      if g = 0, then "f is irreducible" else "f is reducible".
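The algorithm can be tried out on small cases. The following is a minimal sketch under stated assumptions, not the authors' implementation: it only handles prime $q$ (so that arithmetic in the field is arithmetic modulo $q$), it uses school-method polynomial arithmetic, polynomials are plain Python lists of coefficients from low to high degree, and all function names are hypothetical. It needs Python 3.8 or later for `pow(a, -1, q)`.

```python
def trim(a):
    """Drop trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, f, q):
    """Remainder of a modulo f over F_q (q prime, f nonzero and trimmed)."""
    a = trim([c % q for c in a])
    inv = pow(f[-1], -1, q)
    while len(a) >= len(f):
        c = a[-1] * inv % q
        shift = len(a) - len(f)
        for i, fc in enumerate(f):
            a[shift + i] = (a[shift + i] - c * fc) % q
        trim(a)                       # the leading coefficient is now zero
    return a


def poly_sub(a, b, q):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([(x - y) % q for x, y in zip(a, b)])


def poly_mulmod(a, b, f, q):
    if not a or not b:
        return []
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    return poly_rem(prod, f, q)


def poly_powmod(a, e, f, q):
    """a^e mod f by repeated squaring."""
    result, base = poly_rem([1], f, q), poly_rem(a, f, q)
    while e:
        if e & 1:
            result = poly_mulmod(result, base, f, q)
        base = poly_mulmod(base, base, f, q)
        e >>= 1
    return result


def poly_gcd(a, b, q):
    """A gcd of a and b (not normalized to be monic)."""
    a, b = trim(list(a)), trim(list(b))
    while b:
        a, b = b, poly_rem(a, b, q)
    return a


def prime_divisors(n):
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def x_to_q_to_m(m, f, q):
    """x^(q^m) mod f, obtained by raising to the q-th power m times."""
    h = poly_rem([0, 1], f, q)
    for _ in range(m):
        h = poly_powmod(h, q, f, q)
    return h


def is_irreducible_rabin(f, q):
    """Rabin's test for f over F_q, q prime; f is a low-to-high list."""
    f = trim([c % q for c in f])
    n = len(f) - 1
    if n < 1:
        return False
    x = poly_rem([0, 1], f, q)
    for p in prime_divisors(n):                   # the gcd steps, line [1]
        h = x_to_q_to_m(n // p, f, q)
        g = poly_gcd(f, poly_sub(h, x, q), q)
        if len(g) != 1:                           # gcd is not a constant
            return False
    return poly_sub(x_to_q_to_m(n, f, q), x, q) == []
```

For example, `is_irreducible_rabin([1, 1, 0, 0, 1], 2)` tests $x^4 + x + 1$ over the field with two elements, and `is_irreducible_rabin([1, 0, 1, 0, 1], 2)` tests $x^4 + x^2 + 1 = (x^2 + x + 1)^2$. The sketch recomputes every power from scratch, as in the independent computation described for Rabin's algorithm.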
The basic idea of this theorem is to exploit the known structure of the lattice of the Galois subfields of $\mathbb{F}_{q^n}$. This determines the possible degrees of the irreducible factors of an $n$th degree polynomial. It remains to search for irreducible factors of degree dividing maximal divisors of $n$. A well-known result due to Gauss establishes that, for $i \ge 1$, the polynomial $x^{q^i} - x \in \mathbb{F}_q[x]$ is the product of all monic irreducible polynomials in $\mathbb{F}_q[x]$ whose degree divides $i$. Therefore, in order to check the lattice of possible degrees of the irreducible factors of a polynomial $f$ of degree $n$, it is sufficient to consider $\gcd(f, x^{q^{n_i}} - x \bmod f)$ for every maximal divisor $n_i$ of $n$.

The computation of $x^{q^{n_i}} \bmod f$ in Rabin's algorithm is done by repeated squaring independently for each value $n_1, \ldots, n_k$. The corresponding gcd is also taken separately. In Sect. 4, we consider some variants on the way of computing the powers presented in [8] and [7]. Our method also extends to them.
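Gauss's product formula has a simple consequence that can be verified by counting: comparing degrees on both sides, the degrees $d$ dividing $i$ must satisfy $\sum_{d \mid i} d\, I_d = q^i$. The sketch below checks this identity for small cases; the function names are hypothetical and the count $I_d$ is computed from the Möbius formula of the introduction.

```python
def mobius(k):
    """Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= k:
        if k % d == 0:
            k //= d
            if k % d == 0:
                return 0
            result = -result
        d += 1
    return -result if k > 1 else result


def num_irreducible(q, n):
    """Number of monic irreducible polynomials of degree n over F_q."""
    return sum(mobius(k) * q ** (n // k)
               for k in range(1, n + 1) if n % k == 0) // n


def degree_identity_holds(q, i):
    """Check that the degrees of the irreducible factors of x^(q^i) - x add up to q^i."""
    total = sum(d * num_irreducible(q, d)
                for d in range(1, i + 1) if i % d == 0)
    return total == q ** i


if __name__ == "__main__":
    print(all(degree_identity_holds(q, i)
              for q in (2, 3, 4, 5) for i in range(1, 13)))
```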
For later comparison with the average-case result, we now give the worst-case cost of Rabin's algorithm. The cost of Rabin's algorithm is dominated by the cost of computing the exponentiations. It is easy to show, using the prime number theorem, that the number of operations in $\mathbb{F}_q$ for performing the exponentiations is given by $\tau_1 C_q\, n M(n) \log\log\log n$.
3 Analysis of Rabin's Irreducibility Test

The exact analysis of Rabin's irreducibility test is done in several stages. Let us denote by step $i$, $1 \le i \le k$, the gcd computation in line [1] of the algorithm. First, we study the probability that a polynomial does not survive step $i$ of the algorithm. We obtain an expression that is not easy to estimate. Thus, second, we provide tight upper and lower bounds for it. Finally, we use the previous results combined with the associated cost of performing step $i$ to compute the average-case analysis of Rabin's algorithm.

We fix the notation for the rest of the paper. The degree $n$ of the polynomial being tested for irreducibility satisfies $n = \prod_{i=1}^{k} p_i^{\alpha_i}$, with $p_1, \ldots, p_k$ the distinct prime divisors of $n$. Given the structure of the algorithm, there is a step for every prime divisor of $n$. As a consequence, we can analyze the algorithm in a natural way by considering the family of numbers $n$ with fixed primes $p_1, \ldots, p_k$, and varying exponents $\alpha_1, \ldots, \alpha_k$. We denote by $P_i$ the set of divisors of $n_1, \ldots, n_{i-1}$, where $n_j = n/p_j$. In other words, $P_i$ contains the set of all degrees already checked when we start the $i$th step. Thus, we have the initial condition $P_1 = \emptyset$. For any $j \ge 1$, we denote by $Q_j$ the degrees considered on step $j$, that is,

$$Q_j = \left\{ p_1^{e_1} \cdots p_k^{e_k} \;:\; 0 \le e_j \le \alpha_j - 1;\; 0 \le e_s \le \alpha_s,\; s \ne j \right\}. \qquad (3)$$

Obviously, we have $P_i = \bigcup_{j=1}^{i-1} Q_j$.
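To make the sets of already-checked degrees concrete, here is a small sketch (hypothetical function names, naive divisor enumeration) that lists, for a given degree $n$, the degrees covered by each gcd step (the divisors of $n/p_j$) and the accumulated sets of degrees checked before each step. The prime divisors are taken in increasing order, which is an assumption of this sketch rather than something the text fixes.

```python
def prime_divisors(n):
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def divisors(m):
    return {d for d in range(1, m + 1) if m % d == 0}


def degree_sets(n):
    """Return (primes, Q, P): Q[j-1] is the set for step j,
    P[i-1] is the set of degrees checked before step i (P[0] is empty)."""
    primes = prime_divisors(n)
    Q = [divisors(n // p) for p in primes]
    P = [set()]
    for step_degrees in Q:
        P.append(P[-1] | step_degrees)
    return primes, Q, P


if __name__ == "__main__":
    primes, Q, P = degree_sets(12)
    print(primes)   # prime divisors of 12
    print(Q)        # divisors of 6, then divisors of 4
    print(P)        # empty set, then the accumulated unions
```

For $n = 12$ the first step already covers the degrees 1, 2, 3 and 6, and the second step only adds the degree 4, which illustrates why later steps can only catch factors of comparatively large degree.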
We draw the reader's attention to (3). It indicates that, for $i \ge 2$, $P_i$ presents a range of values growing with $i$, and including small values. This means that polynomials that present irreducible factors of degree dividing $n_i$ but have no factors of degree dividing $n_1, \ldots, n_{i-1}$ have large degree values in most of the cases. This will turn out to be an important remark for our analysis. In particular, it will imply that most polynomials that pass the first gcd pass all the gcds. In other words, in most of the cases, the first gcd will be the only one that effectively discards polynomials. We quantify these comments in a precise sense in this section. Other conclusions are drawn in the last section of the paper.

Theorem 2. With the notation above, let $P_i(q, n)$ be the probability that a random monic polynomial of degree $n$ over $\mathbb{F}_q$ contains an irreducible factor whose degree is in $P_i$, for $i \ge 2$. Then, as $n = \prod_{i=1}^{k} p_i^{\alpha_i}$ approaches infinity with $p_1, \ldots, p_k$ fixed and $\alpha_1, \ldots, \alpha_k$ varying, we have

$$l(i, q, n) \;\le\; P_i(q, n) \;\le\; u(i, q, n)$$
wit u(i q n) := 1 − L(i q n) = 1 − 1 − 1 l(i q n) := 1 − U (i q n) = 1 − 1 − q
5
q3 2 1+ q
q
1 q
q
e
e1−
1− q−1 +
P
p P
P
p P
1 p
q 1 q−1 pqp 2
− p1
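Proof. Let $\mathcal{I}$ be the collection of all monic irreducible polynomials in $\mathbb{F}_q[x]$, and denote by $|\omega|$ the degree of an element $\omega \in \mathcal{I}$. Formally, all monic polynomials with no irreducible factors whose degree belongs to $P_i$ can be written as [4], [13]

$$\prod_{\omega \in \mathcal{I},\, |\omega| \notin P_i} \left(1 + \omega + \omega^2 + \cdots\right) = \prod_{\omega \in \mathcal{I},\, |\omega| \notin P_i} (1 - \omega)^{-1}.$$

As usual, we consider a formal variable $z$, and the substitution $\omega \mapsto z^{|\omega|}$. This transformation produces generating functions for the number of polynomials with no irreducible factors belonging to $P_i$:

$$\prod_{\omega \in \mathcal{I},\, |\omega| \notin P_i} \left(1 - z^{|\omega|}\right)^{-1} = \prod_{p \notin P_i} (1 - z^p)^{-I_p} = \frac{1}{1 - qz} \prod_{p \in P_i} (1 - z^p)^{I_p}. \qquad (4)$$

In the following, we find upper and lower bounds in the generating function using an exp-log argument. First notice that

$$\prod_{p \in P_i} (1 - z^p)^{I_p} = e^{\sum_{p \in P_i} I_p \log(1 - z^p)}. \qquad (5)$$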
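This last function is analytic in $|z| < 1/q$, and converges at $z = 1/q$. Thus, we can use singularity analysis in (4) at $z = 1/q$ (see [5], and [12]). Since the logarithm is negative, for $L(i, q, n)$ we use the upper bound in (2) to find:

$$\sum_{p \in P_i} I_p \log\!\left(1 - \frac{1}{q^p}\right) \;\ge\; q \log\!\left(1 - \frac{1}{q}\right) + \sum_{p \in P_i,\, p \ne 1} \frac{q^p - q}{p} \log\!\left(1 - \frac{1}{q^p}\right)$$

$$= q \log\!\left(1 - \frac{1}{q}\right) - \sum_{p \in P_i,\, p \ne 1} \frac{1}{p} \left( 1 + \sum_{j \ge 1} \left( \frac{1}{j+1} - \frac{q}{j} \right) \frac{1}{q^{jp}} \right)$$

$$= q \log\!\left(1 - \frac{1}{q}\right) + 1 - \sum_{p \in P_i} \frac{1}{p} - \sum_{p \in P_i,\, p \ne 1} \frac{1}{p} \sum_{j \ge 1} \left( \frac{1}{j+1} - \frac{q}{j} \right) \frac{1}{q^{jp}}.$$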
The bound follows by exponentiating the last equality. For U (i q n) we use the lower bound in (2) to nd: q qp − 1 qp − p q−1 p
Ip log(1 − 1 q p ) p2P
p2P
=− p2P
q 1 + p q−1
+ p2P j1
− p2P
j1
=− p2P
q 1 + p q−1
q 1+
p2P
q
p2P
1 q jp
1 pq p
q (q − 1) q (q − 1) 1 − − j+1 j (j + 1)q 1
3
+
1 pq p
q (q − 1) 1 q (q − 1) − − j+1 j (j + 1)q p
q 1 + p q−1
+
p2P
log(1 − 1 q p )
1 pq p
+1−
1 qj
q q−1
log(1 − 1 q)
The bound follows by exponentiating the last equality. The theorem is then proved by taking the complements to 1 of $U(i, q, n)$ and $L(i, q, n)$.

Both bounds depend on $e^{-\sum_{p \in P_i} 1/p}$. If we let the exponents tend to infinity, this sum tends to $\prod_{j=1}^{k} p_j/(p_j - 1)$, capturing the prime decomposition of $n$. If $n_1 = n/p_1$, then every degree checked by the algorithm in every step $i > 1$ will be a multiple of $p_1^{\alpha_1}$. Then, if we fix $p_1, \ldots, p_i$, and we let the exponents $\alpha_1, \ldots, \alpha_i$ tend to infinity, we have that

$$\sum_{p \in P_i \setminus P_1} \frac{1}{p} \to 0,$$

and so the probability that the chosen polynomial has an irreducible factor of degree in $P_i$ but not in $P_1$ is upper bounded by $1 - e\,(1 - 1/q)^q$. This upper bound is 0.32 for $q = 2$ but only 0.076 for $q = 7$, and decreases to 0 when $q$ tends to infinity. This fact means that for most of the cases, the first gcd will be the only one that effectively discards polynomials.

The method in the previous theorem provides the main term of the desired asymptotics. Due to the form of the generating function, we expect the appearance of fluctuations in the lower order terms. These are not captured by our method. The exact expected cost of Rabin's algorithm can be derived from Theorem 2. Recall the cost of arithmetic operations from Sect. 1, and the definitions of $L(i, q, n)$ and $U(i, q, n)$ from Theorem 2.
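The two numerical facts quoted above can be reproduced in a few lines. This is a minimal sketch with hypothetical function names: the first function evaluates $1 - e\,(1 - 1/q)^q$, and the second compares the sum of reciprocals of the divisors of $n$ with the product of $p/(p-1)$ over the primes $p$ dividing $n$, for a degree with large exponents.

```python
import math


def first_gcd_escape_bound(q):
    """The quantity 1 - e * (1 - 1/q)^q."""
    return 1 - math.e * (1 - 1 / q) ** q


def reciprocal_divisor_sum(prime_powers):
    """Sum of 1/d over all divisors d of n, for n given as {prime: exponent}."""
    total = 1.0
    for p, a in prime_powers.items():
        total *= sum(p ** (-e) for e in range(a + 1))   # geometric factor
    return total


def limiting_product(primes):
    """Product of p / (p - 1) over the given primes."""
    result = 1.0
    for p in primes:
        result *= p / (p - 1)
    return result


if __name__ == "__main__":
    for q in (2, 3, 7, 101):
        print(q, round(first_gcd_escape_bound(q), 4))
    print(reciprocal_divisor_sum({2: 10, 3: 10}), limiting_product([2, 3]))
```

The printed values for $q = 2$ and $q = 7$ round to 0.32 and 0.076, and the bound shrinks roughly like $1/(2q)$ for larger fields.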
Theorem 3. The expected number $C(q, n)$ of operations used by Rabin's algorithm to test a random polynomial $f$ of degree $n$ over $\mathbb{F}_q$ is lower bounded by

$$\left( \frac{1}{p_1} + L(k+1, q, n) + \sum_{i=2}^{k} \frac{L(i, q, n)}{p_i} \right) \left( n\, \tau_1 C_q M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} L(i, q, n) \right) + \tau_2\, L(k+1, q, n) \right) \log n\, M(n)$$

and upper bounded by

$$\left( \frac{1}{p_1} + U(k+1, q, n) + \sum_{i=2}^{k} \frac{U(i, q, n)}{p_i} \right) \left( n\, \tau_1 C_q M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} U(i, q, n) \right) + \tau_2\, U(k+1, q, n) \right) \log n\, M(n).$$

Proof. For $i = 1, \ldots, k$, the cost of computing $\gcd(f, x^{q^{n_i}} - x \bmod f)$ is

$$\left( n_i\, \tau_1 C_q + \tau_3 \log n \right) M(n).$$

The probability of a polynomial being discarded before step $i$ is $P_i(q, n)$. Thus, step $i$ is executed with probability $1 - P_i(q, n)$. We may define $P_1(q, n) = 0$ since no polynomial is discarded before executing the algorithm. The last step of Rabin's algorithm has cost

$$\left( n\, \tau_1 C_q + \tau_2 \log n \right) M(n)$$

and is executed with probability $1 - P_{k+1}(q, n)$. As a consequence the expected cost of Rabin's algorithm is

$$\sum_{i=1}^{k} \left( 1 - P_i(q, n) \right) \left( \frac{n}{p_i}\, \tau_1 C_q + \tau_3 \log n \right) M(n) + \left( 1 - P_{k+1}(q, n) \right) \left( n\, \tau_1 C_q + \tau_2 \log n \right) M(n). \qquad (6)$$

The theorem is proved after applying the bounds presented in Theorem 2.
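The expected-cost expression (6) is a weighted sum of per-step costs, and it can be evaluated directly once the discard probabilities are known. The sketch below is only an illustration: the function name and parameters are hypothetical, the constants of the cost model are left as arguments with placeholder defaults of 1, the natural logarithm is used for $\log n$, and the constant attached to the last step is assumed to be the division constant.

```python
import math


def expected_cost(n, primes, discard_prob, c_q, m_n,
                  tau1=1.0, tau2=1.0, tau3=1.0):
    """Evaluate a sum of the shape of (6).

    primes        the distinct prime divisors p_1..p_k of n
    discard_prob  list of k+1 values; entry i-1 is the probability of being
                  discarded before step i (so the first entry is 0)
    c_q           number of products of one q-th power by repeated squaring
    m_n           the value of the multiplication cost M(n)
    tau1..tau3    constants of the cost model (placeholder values, assumed)
    """
    k = len(primes)
    log_n = math.log(n)          # assumption: natural logarithm
    total = 0.0
    for i, p in enumerate(primes):
        step_cost = ((n // p) * tau1 * c_q + tau3 * log_n) * m_n
        total += (1 - discard_prob[i]) * step_cost
    last_cost = (n * tau1 * c_q + tau2 * log_n) * m_n
    total += (1 - discard_prob[k]) * last_cost
    return total


if __name__ == "__main__":
    # degree 6 with school multiplication, nothing ever discarded
    print(expected_cost(6, [2, 3], [0.0, 0.0, 0.0], c_q=1, m_n=36))
```

Plugging in exact or bounded discard probabilities for the steps shows how strongly the total is dominated by the first exponentiation when most reducible inputs are rejected early.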
4 Variants

The analysis of Rabin's algorithm can be adapted to other algorithms with similar divisor conditions. In this section, we briefly show that this is the case for two variants of Rabin's method. As we will see, both variants give different ways of computing the exponentiations inside the algorithm. However, they use the same divisor construction as in Rabin's irreducibility test.
First we comment on an algorithm due to von zur Gathen & Shoup ([8], §7). They calculate $x^{q^{n/p_i}} \bmod f$ independently for each value $p_1, \ldots, p_k$ computing trace maps (Algorithm 5.2 in [8]). Using fast arithmetic, their algorithm uses $\tau_1 n M(n)$ operations for each modular exponentiation. There are at most $O(\log n / \log\log n)$ prime divisors of $n$, thus the total worst-case cost of their algorithm is $O(n^2 \log^3 n + M(n) \log q)$. The space requirement of the algorithm is $O(n)$ elements in $\mathbb{F}_q$, as in the other algorithms of this paper.

It is interesting to note that neither Rabin's algorithm nor von zur Gathen & Shoup's algorithm is better than the other from a worst-case perspective. For instance, when $n \approx \log q$, von zur Gathen & Shoup's algorithm has a better worst-case behavior, but when $n \approx q$, Rabin's algorithm behaves better. We note that they also give another algorithm with cost $O(n^{1.7} + M(n) \log q)$ in time. However, this time is achieved by using fast matrix multiplication and it has a space requirement of $O(n \log n)$ elements in $\mathbb{F}_q$. Thus, it seems to be of mainly theoretical interest.

We now focus on the average-case analysis of von zur Gathen & Shoup's algorithm. The proof is similar to the one in Theorem 3.

Corollary 1. The expected number of operations used by von zur Gathen & Shoup's algorithm to test a random polynomial $f$ of degree $n$ over $\mathbb{F}_q$ is lower bounded by

$$\left( \frac{1}{p_1} + L(k+1, q, n) + \sum_{i=2}^{k} \frac{L(i, q, n)}{p_i} \right) \left( n\, \tau_1 M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} L(i, q, n) \right) + \tau_2\, L(k+1, q, n) \right) \log n\, M(n)$$

and upper bounded by

$$\left( \frac{1}{p_1} + U(k+1, q, n) + \sum_{i=2}^{k} \frac{U(i, q, n)}{p_i} \right) \left( n\, \tau_1 M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} U(i, q, n) \right) + \tau_2\, U(k+1, q, n) \right) \log n\, M(n).$$
The second variant for the computation of $x^{q^{n_i}} \bmod f$, for $1 \le i \le k$, is due to Gao & Panario ([7], §2). The key idea in this algorithm is to sort the exponents $n_i$ in increasing order, and then to compute, for suitable $i$,

$$\left( x^{q^{n_i}} \right)^{q^{\,n_{i+1} - n_i}} = x^{q^{n_{i+1}}} \bmod f.$$

This allows a telescopic computation of the cost with some savings in the worst-case analysis. More precisely, they prove that this variant correctly tests for polynomial irreducibility, and uses $O(n M(n) \log q)$ operations in $\mathbb{F}_q$ in the worst case. Thus, it behaves better than Rabin's method in the worst case. Their algorithm uses $\tau_1 (n_{i+1} - n_i) M(n) \log q$ operations to compute the $i$th modular exponentiation. We now give an average-case estimate for the cost of this algorithm.

Corollary 2. The expected number of operations used by Gao & Panario's algorithm to test a random polynomial $f$ of degree $n$ over $\mathbb{F}_q$ is lower bounded by

$$\left( \frac{1}{p_2} - \frac{1}{p_1} + L(k+1, q, n) + \sum_{i=2}^{k} L(i, q, n) \left( \frac{1}{p_{i+1}} - \frac{1}{p_i} \right) \right) \left( n\, \tau_1 \log q\, M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} L(i, q, n) \right) + \tau_2\, L(k+1, q, n) \right) \log n\, M(n)$$

and upper bounded by

$$\left( \frac{1}{p_2} - \frac{1}{p_1} + U(k+1, q, n) + \sum_{i=2}^{k} U(i, q, n) \left( \frac{1}{p_{i+1}} - \frac{1}{p_i} \right) \right) \left( n\, \tau_1 \log q\, M(n) \right) + \left( \tau_3 \left( 1 + \sum_{i=2}^{k} U(i, q, n) \right) + \tau_2\, U(k+1, q, n) \right) \log n\, M(n).$$
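The saving obtained by sorting the exponents and reusing earlier powers can be seen by counting how many $q$-th power steps each strategy needs. The following sketch uses hypothetical function names and is only a counting illustration, not an implementation of either published algorithm: it compares the independent computation, which starts from $x$ for every exponent, with the chained computation, which continues from the previous power.

```python
def prime_divisors(n):
    primes, d = [], 2
    while d * d <= n:
        if n % d == 0:
            primes.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        primes.append(n)
    return primes


def frobenius_step_counts(n):
    """Return (exponents, increments, independent_total, chained_total).

    exponents   the sorted values n/p for primes p | n, followed by n
    increments  steps needed when each power is continued from the previous
    """
    exponents = sorted(n // p for p in prime_divisors(n)) + [n]
    increments = [exponents[0]] + [b - a for a, b in zip(exponents, exponents[1:])]
    return exponents, increments, sum(exponents), sum(increments)


if __name__ == "__main__":
    for n in (12, 30, 64, 210):
        print(n, frobenius_step_counts(n))
```

The chained total always telescopes to $n$, whereas the independent total also pays for every maximal divisor separately.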
5
Conclusions
This paper deals with Rabin’s algorithm for testing the irreducibility of polynomials over nite elds. The correctness of the algorithm is based on Theorem 1. This theorem characterizes the possible degrees of a polynomial’s irreducible factors. However, a direct implementation as suggested by the theorem does not lead to an e cient algorithm. Indeed, rst we note the presence of redundancies in the computations. Since in each gcd we are checking the appearance of irreducible factors of degrees dividing a maximal divisor of n, it is possible that two gcds check some common degrees. For instance, all gcds check the linear factors. Of course, this fact is intrinsic to the theorem. On the other hand, in most of the cases only the rst gcd is computed. The polynomials that survive the rst gcd are likely to survive all gcds. Thus, the e ect of the redundancies is not crucial. A much more important weakness of these algorithms is the large computation involved on the exponentiation for the rst gcd. We indicate that the bottleneck of all algorithms discussed in this paper is the computation of exponentiations modulo a polynomial. This is, of course, a problem of independent interest. A di erent probabilistic algorithm for testing the irreducibility of polynomials over nite elds is due to Ben-Or [2]. A detailed analysis of Ben-Or’s algorithm in terms of the Buchstab function is given in [14]. It involves the study of the expected smallest degree on a random polynomial over nite elds. This paper presents an average-case analysis of Rabin’s irreducibility test and some variants, when considering random inputs. This is the case in the
Daniel Panario and Alfredo Viola
main application of the algorithms, i.e., when finding an irreducible polynomial by trial and error. We also prove that most of the polynomials that are not irreducible are discarded by the first step of Rabin's algorithm.

Acknowledgement. Part of this work was done while the authors were visiting the University of Waterloo. For the invitations, support and hospitality, the first author would like to thank Bruce Richmond and the Department of Combinatorics and Optimization, while the second author would like to thank Ian Munro and the Department of Computer Science. The work of Alfredo Viola was supported in part by Proyecto BID-CONICYT 14 /94.
References
1. Aho, A., Hopcroft, J., and Ullman, J. The Design and Analysis of Computer Algorithms. Addison-Wesley, Reading MA, 1974.
2. Ben-Or, M. Probabilistic algorithms in finite fields. In Proc. 22nd IEEE Symp. Foundations Computer Science (1981), pp. 394–398.
3. Cantor, D., and Kaltofen, E. On fast multiplication of polynomials over arbitrary algebras. Acta. Inform. 28 (1991), 693–701.
4. Flajolet, P., Gourdon, X., and Panario, D. Random polynomials and polynomial factorization. In Proc. 23rd ICALP Symp. (1996), vol. 1099 of Lecture Notes in Computer Science, Springer-Verlag, pp. 232–243.
5. Flajolet, P., and Odlyzko, A. Singularity analysis of generating functions. SIAM Journal on Discrete Mathematics 3 (1990), 216–240.
6. Galois, E. Sur la théorie des nombres. In Écrits et mémoires d'Évariste Galois, R. Bourgne and J.-P. Azra, Eds. Gauthier-Villars, 1830, pp. 112–128.
7. Gao, S., and Panario, D. Tests and constructions of irreducible polynomials over finite fields. In Foundations of Computational Mathematics, F. Cucker and M. Shub, Eds., Springer Verlag, 1997, pp. 346–361.
8. von zur Gathen, J., and Shoup, V. Computing Frobenius maps and factoring polynomials. Comput. complexity 2 (1992), 187–224.
9. Gauss, C. Untersuchungen über Höhere Mathematik. Chelsea, New York, 1889.
10. Knuth, D. The art of computer programming, vol. 2: seminumerical algorithms, 3 ed. Addison-Wesley, Reading MA, 1997.
11. Lidl, R., and Niederreiter, H. Finite fields, vol. 20 of Encyclopedia of Mathematics and its Applications. Addison-Wesley, 1983.
12. Odlyzko, A. Asymptotic enumeration methods. In Handbook of Combinatorics, R. Graham, M. Grötschel, and L. Lovász, Eds. Elsevier, 1996, pp. 1063–1229.
13. Panario, D. Combinatorial and algebraic aspects of polynomials over finite fields. Tech. Rep. 306/97, Department of Computer Science, University of Toronto, 1997. PhD Thesis.
14. Panario, D., and Richmond, B. Analysis of Ben-Or's polynomial irreducibility test. Preprint, 1997.
15. Rabin, M. Probabilistic algorithms in finite fields. SIAM J. Comp. 9 (1980), 273–280.
16. Schönhage, A. Schnelle Multiplikation von Polynomen über Körpern der Charakteristik 2. Acta Inf. 7 (1977), 395–398.
17. Schönhage, A., and Strassen, V. Schnelle Multiplikation großer Zahlen. Computing 7 (1971), 281–292.
A Chip Search Problem on Binary Numbers
Peter Damaschke
FernUniversität, Theoretische Informatik II, 58084 Hagen, Germany
Peter.Damaschke@fernuni-hagen.de

Abstract. Suppose that we have an unknown point $d$ in the interval $(0, 1)$ and an unbounded reservoir of chips (pebbles) on both points 0 and 1. In every step, we can either move two pebbles from points $x$ and $y$ to $(x+y)/2$, or we can ask whether $d < x$ or $d > x$, but only if there is currently a pebble at $x$. Our aim is to determine an interval of length $2^{-n}$ including $d$, and we are interested in the exact number of necessary moves in the worst case, especially for the first values of $n$. First we analyze a natural GREEDY strategy solving this problem in roughly $n^2/6$ moves, which improves our previous $n^2/4$ result in [5]. On the other hand, $n^2/12$ is a lower bound. Our analysis allows us to compute the exact worst-case number of moves $g(n)$ that GREEDY takes for the first values of $n$, although a nice general expression for this is missing. GREEDY sends pebbles only to points being binary approximations of the target $d$. Moreover, our analysis will show that GREEDY is optimal among all strategies sharing this property. Hence any such strategy needs $\Omega(n^2)$ moves in the worst case. It might seem that sending pebbles to other points brings no advantages. So it is surprising that, without the mentioned restriction, we can achieve an asymptotic result of $O(n^{1+2/\sqrt{\log n}})$ moves, by an acceleration technique. Hence GREEDY is not optimal in general; nevertheless it seems to be the best choice for small (i.e. practical?) $n$, since it is simple, and we have no strategy beating $g(n)$ if n < 0. An open problem is to determine the maximum $n$ for which $g(n)$ moves are optimal. In a final section we discuss a way to save tests if $d$ is very small, and we briefly consider a variant of the problem where a test at $x$ destroys a pebble lying there. To our best knowledge, the present problem has not been studied before [5]. Like some other combinatorial search problems, it is motivated by possible use in analytical chemistry. Further, it is an example of a search problem with agents where the costs depend in a complicated way on the location of the target object. Search problems of similar type arise in robotics and in data retrieval from storage media.
1 Introduction

1.1 Problem Statement
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 11–22, 1998. © Springer-Verlag Berlin Heidelberg 1998

Consider the following situation. There is an unknown point $d$ in the real interval $(0, 1)$ and an unbounded reservoir of pebbles on both points 0 and 1. We are allowed to perform the following actions:
MOVE($x$, $y$): Choose two pebbles located at points $x$ and $y$ and move them both to the midpoint $(x+y)/2$.
TEST($x$): If there is a pebble at point $x$ then ask whether $d < x$ or $d > x$. (We may w.l.o.g. assume that $d$ has an infinite binary expansion, hence the case $d = x$ never occurs.)

Given a positive integer $n$, our goal is to find that interval $(i/2^n, (i+1)/2^n)$ $(i = 0, \ldots, 2^n - 1)$ including $d$. In other words, we want to find the first $n$ bits of the binary expansion of $d$. Trivially, $n$ tests are sufficient and necessary in the worst case; the interesting matter is the number of moves required.

This is a binary search problem where the costs of queries are not uniform (as in the trivial problem) but depend in a certain way on the locations of the queried items. Problems of similar type have been studied e.g. for searching on ordered lists [10] [11]. On the other hand, search procedures can often be described and studied in terms of pebble games; cf. [1] as an example. Search problems with agents in unordered structures, such as graphs or geometric environments, have also been studied in the literature, see e.g. [3] [12].

We introduced the present problem in [5], for it has a funny (but nevertheless seriously meant) motivation concerning chemical threshold tests, as described below. Another combinatorial problem motivated by chemical test series is the well-studied group testing problem, see e.g. [2] [7] [8] [9] [13]. In fact, the problem considered here came up in the context of a generalized setting of group testing [5]. The use of combinatorial methods in chemistry, for detecting substances or determining concentrations, has a longer tradition (cf. e.g. [6]), and recently, computational chemical manipulations with merge and test steps, possibly performed by robots, are also discussed in the context of DNA computing. (However, the latter is not immediately related to our subject.)

Suppose that we have two collections of tubes of equal size, each containing a unit of liquid. Each tube on one side contains a unit of a solvent B (e.g. water), each tube on the other side contains a substance A. The substance A dissolved in B has a certain effect when appearing in at least some concentration $d$. We wish to determine this unknown threshold $d$, up to a precision of $2^{-n}$. Clearly, we can test whether $d < x$ or $d > x$ if a solution of concentration $x$ is at hand. So we can approximate $d$ simply by binary search, testing successively the binary approximations of $d$. (Definitions follow in 1.2.) Perhaps the obvious idea would be to take one tube, merge equal parts of A and B first, and to add, after each test, the suitable amount of A or B, in order to achieve the next test concentration. But any numerical example shows that these amounts will rapidly become complicated numbers, and more seriously, they decrease exponentially. Thus precise measurements of several tiny quantities of liquid are required. So we propose another natural approach: We only work with equal quantities. If our current knowledge about $d$ is $x < d < y$ and two tubes with concentrations $x$ and $y$ are still available then merging them yields two tubes of the average concentration $(x+y)/2$ which is the proper point for the next test. So nothing must be measured, and the manipulations are easy to handle.
In view of the potential application, the exact number of moves for small $n$ is more important than the asymptotic complexity. Perhaps $n > 10$ makes no sense in practice. As one of our results, less than 20 moves are sufficient in the worst case if $n \le 10$; the average case may be even better. So we have less than 20 very simple steps instead of 10 delicate steps. This should make our approach quite reasonable.

In Sect. 2 we propose a simple GREEDY strategy requiring roughly $n^2/6$ moves which improves our first $n^2/4$ result from [5]. In Sect. 3 we achieve $O(n^{1+\epsilon})$ moves for any fixed $\epsilon > 0$, but this acceleration has an effect only for impractically large $n$. The main open problems are to prove nontrivial lower bounds and to find out the number $n$ up to which GREEDY remains optimal. In Sect. 4 we briefly discuss improvements for very small $d$.

We suppose that the tests are non-destructive in the sense that they do not affect the tested solution. If we have destructive tests then we must remove one pebble from point $x$ after each TEST($x$) step. In this paper we concentrate upon the version of the problem with non-destructive tests. Only in Sect. 5 do we consider the destructive version.

1.2 Preliminaries
We introduce some useful notation. Obviously, the points where pebbles can be placed at all are binary rational numbers, i.e. real numbers $x$ with finite binary expansion $0.x_1 \ldots x_n$, $x_i \in \{0, 1\}$. For convenience we omit the prefix "0." and simply write $x = x_1 \ldots x_n$. We will loosely identify the point $x$, the number $x$, and the binary word $x$ that represents this number. So we may speak e.g. of "numbers of length $n$". Particularly, the number $1/2$ is represented by the word 1. In order to avoid confusion with the number 1, we use the special symbol I to represent the number 1. Let $d$ be the unknown point we search for. W.l.o.g. $d$ has an infinite binary representation $d_1 d_2 d_3 \ldots$

The current state of the search is given by a multi-set of numbers (resp. words over alphabet $\{0, 1\}$), written in brackets, together with an interval, called the candidate interval. Namely, each element of the multi-set indicates a pebble on this number, and the candidate interval is the smallest interval known to include $d$. Numbers 0 and I are omitted, since we anyhow supposed an unbounded number of pebbles there. A point (number, word) is said to be present or absent if it occurs resp. does not occur in the current state.

The initial state is [ ] with candidate interval (0, I), and the next state is necessarily [1 1]. Then we can test point 1 (i.e. $1/2$). If we learn, for instance, $d < 1$ then the new candidate interval is (0, 1), and we may next generate the state [01 01 1] for testing 01. If we now learn $d > 01$ then we get the candidate interval (01, 1). It is natural to replace now 01 and 1 by two copies of 011, which results in the state [01 011 011], and to test 011. Assume that $d > 011$, i.e. the candidate interval becomes (011, 1). Next we wish to test point 0111 to determine $d_4$, but a pebble is missing at point 1, hence we need a further move in between,
so the next states are [01 011 011 1 1] and [01 011 0111 0111 1]. This is the smallest example where more moves than tests are required.

The binary approximation of $d$ of length $n$ is the number $a_n := d_1 \ldots d_{n-1} 1$. For technical reasons we define $a_{-1} := I$ and $a_0 := 0$. These binary approximations have a nice property: For any $k \ge 1$ there exist uniquely determined $i < j < k$ such that $a_k = (a_i + a_j)/2$, and moreover, we always have $j = k - 1$. We will call $a_{k-1}$ and this special $a_i$ the suppliers of $a_k$ (since they can together supply $a_k$ with a pair of pebbles).
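The supplier property can be checked numerically on the introductory example. The following is a minimal sketch with exact fractions; the function names `approx` and `suppliers` are illustrative choices made here, not notation from the paper, and the second supplier is found by plain exhaustive search.

```python
from fractions import Fraction

def approx(bits, k):
    """a_k = 0.d_1...d_{k-1}1 as an exact fraction; a_0 = 0 and a_{-1} = I = 1."""
    if k == -1:
        return Fraction(1)
    if k == 0:
        return Fraction(0)
    value = Fraction(1, 2 ** k)                    # the final digit 1
    for j in range(1, k):
        value += Fraction(bits[j - 1], 2 ** j)     # the digits d_1..d_{k-1}
    return value

def suppliers(bits, k):
    """Return (i, k-1) with a_k = (a_i + a_{k-1}) / 2, searching all i < k-1."""
    hits = [i for i in range(-1, k - 1)
            if (approx(bits, i) + approx(bits, k - 1)) / 2 == approx(bits, k)]
    assert len(hits) == 1                          # uniqueness, as stated in the text
    return hits[0], k - 1

# Example d = 0.0111...: a_4 = 0111 is supplied by a_3 = 011 and a_1 = 1.
print([suppliers([0, 1, 1, 1], k) for k in range(1, 5)])
```

For the example, the supplier pairs come out as (I, 0) for $a_1$, then $(a_0, a_1)$, $(a_1, a_2)$ and $(a_1, a_3)$, which matches the states listed above.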
2 The GREEDY Algorithm

In this section we discuss a first basic $O(n^2)$ algorithm for our search problem, called GREEDY. Although it will turn out to be not optimal, it deserves a closer analysis, for the following reasons:
- The constant factors hidden in the complexities of the accelerated strategies developed later will depend on the exact complexity of GREEDY.
- GREEDY is optimal within a restricted class of strategies where pebbles are sent only to the approximations $a_k$ of $d$. Interestingly enough and perhaps counter-intuitively, this proves that any better strategy cannot restrict itself to this sequence of points.
- We have no strategy that beats GREEDY for small values of $n$, which is perhaps the interesting case in practice.

2.1 Description of GREEDY
Remember that the initial state is [ ] where numbers 0 and I are omitted. GREEDY is so simple that we may describe it informally. It finds the first $n$ binary positions of $d$, and thus a candidate interval of length $2^{-n}$.

Algorithm GREEDY. Repeat the following steps until $d_n$ is known:
- Find the largest $k$ such that $a_k$ is absent but both suppliers of $a_k$ are present.
- Remove a copy of both suppliers from the state and replace them by two copies of $a_k$.
- If $a_k$ occurs for the first time then test $a_k$. If $d < a_k$ then $d_k := 0$ else $d_k := 1$. Since now $a_{k+1}$ is known, compute the former supplier $a_i := 2a_{k+1} - a_k$ of $a_{k+1}$.

Note that our introductory example in 1.2 with 4 tests and 5 moves follows the GREEDY strategy. Note that $a_{-1}$ and $a_0$ are permanently present. From this we easily conclude that, in any state, there exists an index $k$ demanded above, hence GREEDY can always execute the next move. Furthermore it is clear by the rule of GREEDY that every $a_k$ ($k \ge 1$) occurs at most twice in any produced state. As an immediate consequence, GREEDY will reach $a_n$ after $O(n^2)$ moves.

Proposition 1. GREEDY terminates after $n$ tests and $O(n^2)$ moves with a correct candidate interval of length $2^{-n}$ that has $a_n$ as one of its endpoints. The total number of used pebbles is bounded by $2n$.

In the following we shall give a tighter analysis which improves our earlier result in [5].
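The informal description can be turned into a small simulation. The sketch below is one possible reading of GREEDY, not the author's code: pebbles are counted per index $k$ of $a_k$, the second supplier is taken to be the most recent approximation whose test had the opposite outcome (or the endpoint 0 resp. I if there is none), and the names `other_supplier`, `greedy_moves` and `worst_case` are hypothetical.

```python
from itertools import product

def other_supplier(bits, k):
    """Index i < k-1 of the second supplier of a_k (the first one is a_{k-1})."""
    if k == 1:
        return -1                        # a_1 is the midpoint of a_0 = 0 and a_{-1} = I
    last = bits[k - 2]                   # d_{k-1}, the outcome of the test at a_{k-1}
    for j in range(k - 2, 0, -1):        # most recent test with the opposite outcome
        if bits[j - 1] != last:
            return j
    return -1 if last == 1 else 0        # no such test: the endpoint is I resp. 0

def greedy_moves(bits):
    """Number of moves until d_n is known, for d = 0.d_1 d_2 ... d_n ..."""
    n = len(bits)
    pebbles = {}                         # pebbles[k] = number of pebbles on a_k, k >= 1

    def present(i):
        return i <= 0 or pebbles.get(i, 0) > 0   # 0 and I hold unboundedly many pebbles

    known = 0                            # number of digits of d learned so far
    moves = 0
    while known < n:
        # only a_1..a_{known+1} have a known position; take the largest admissible k
        for k in range(known + 1, 0, -1):
            i = other_supplier(bits, k)
            if not present(k) and present(i) and present(k - 1):
                break
        else:
            raise RuntimeError("no admissible move")
        for s in (i, k - 1):             # remove one copy of each supplier
            if s > 0:
                pebbles[s] -= 1
        pebbles[k] = 2                   # two copies arrive at a_k
        moves += 1
        if k == known + 1:               # a_k occurs for the first time: test it
            known += 1
    return moves

def worst_case(n):
    """Maximum number of moves over all digit strings of length n."""
    return max(greedy_moves(list(b)) for b in product((0, 1), repeat=n))

print(greedy_moves([0, 1, 1, 1]))        # the introductory example
```

On the introductory example $d = 0.0111\ldots$ this reproduces the 5 moves for 4 tests, and exhaustive search over all digit strings is feasible for the small $n$ that matter in practice.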
2.2 Levels of Points and a Backwards Analysis
Consider an instance of our approximation problem. By symmetry we assume that the first test yields $d < 1/2$, otherwise switch the roles of numbers 0 and I. Consider the set of the first binary approximations of $d$, linearly ordered according to their length (rather than to their natural $<$ ordering). We partition this ordered set $(a_{-1}, a_0, a_1, \ldots, a_n)$ into consecutive levels, due to the following observation:

Proposition 2. There exists a unique partition of the set of binary approximations of $d$ into subsets of consecutive $a_k$'s, called levels, such that:
(1) $a_{-1}$ and $a_0$ form separate levels.
(2) The suppliers of the first point of each further level are the last points of the two previous levels.
(3) The suppliers of any further point $a_k$ in a level are: the last point of the previous level, and $a_{k-1}$.

We omit the simple proof. Note that the levels correspond to the blocks of consecutive 0's or 1's in the binary expansion of $d$, and thus to the monotonicity intervals of the real values of the $a_k$'s.

Now consider an arbitrary sequence of moves (not necessarily that stipulated by GREEDY) which eventually delivers a pebble to point $a_n$ and enters only binary approximations $a_k$ ($k < n$) of $d$ in between. For $i = 1, \ldots, n$ let $p_i$ denote the number of pairs of pebbles arriving at point $a_i$ during this process; note that pebbles always arrive pairwise.

Lemma 1. We have:
(1) $p_n \ge 1$.
(2) If $a_i$ ($i < n$) is not the last point of some level then $p_i \ge p_{i+1}/2$.
(3) If $a_i$ ($i < n$) is the last point of some level then $p_i \ge S_{i+1}/2$, where $S_{i+1}$ is the sum of all $p_j$ of the next level, plus $p_j$ of the first point of the next but one level.
(4) The total number of moves is the sum of all $p_i$.

Proof. Item (1) is trivial. For (2) observe that $a_i$ is a supplier of $a_{i+1}$, hence one pebble from $a_i$ must be forwarded for every pair of pebbles entering $a_{i+1}$. Item (3) is proved similarly: Note that $a_i$ is common supplier of all mentioned points. Item (4) is clear since every move transports one pair of pebbles to some destination point.
Theorem 1. For GREEDY we always have equality in cases (1), (2), (3). Hence GREEDY performs the minimum number of moves among all algorithms that reach $a_n$ and enter only binary approximations of $d$.

Proof. The idea is that GREEDY never makes redundant moves, that is, any pair of pebbles delivered to some $a_i$ ($i < n$) will be used later.
(1) GREEDY stops immediately after delivering the first pair of pebbles to $a_n$, so we have $p_n = 1$.

(2) Consider any $a_i$ being not the last point in its level. GREEDY will send the next pebbles to $a_i$ only in the following situation: No move into an absent point with index larger than $i$ is possible, $a_i$ is absent, but both suppliers of $a_i$ are present. It follows that all points with larger index than $i$ in the $a_i$ level are also absent, since they all share a common supplier with $a_i$. After the current move we have two pebbles on $a_i$. Now we immediately see that the next move whose destination is some point with index larger than $i$ will get into $a_{i+1}$. We easily conclude that every pebble arriving at $a_i$ will later contribute to a pair of pebbles in $a_{i+1}$, with the possible exception of one pebble from the final pair which may not be requested by $a_{i+1}$. Thus we get $p_i = \lceil p_{i+1}/2 \rceil$.

(3) We similarly argue for the last point $a_i$ of any level. GREEDY will send the next pebbles to $a_i$ only in the following situation: No move into an absent point with index larger than $i$ is possible, $a_i$ is absent, but both suppliers of $a_i$ are present. Let $L$ be the next level and $a_j$ the first point of the next but one level. After the current move we have two pebbles on $a_i$. If some point of $L$ is present then the next move will get into some point of $L \cup \{a_j\} \setminus \{a_{i+1}\}$. Otherwise (all points of $L$ are absent) the next move whose destination is some point with index larger than $i$ will get into $a_{i+1}$. Again we easily conclude that every pebble arriving at $a_i$, possibly except the very last, will later contribute to a pair of pebbles in $L \cup \{a_j\}$, hence $p_i = \lceil S_{i+1}/2 \rceil$.

So we have a recursive formula that allows us to analyze the number of moves executed by GREEDY. The theorem shows particularly that GREEDY has, for every $n$, the minimum worst-case bound among all algorithms using only binary approximations of $d$. However, if we drop this restriction then we can construct asymptotically faster strategies, as shown later.
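The recursive formula can be evaluated backwards from $a_n$. The sketch below is an illustration under two assumptions made here: the equalities of Theorem 1 are read with rounding up, i.e. $p_i = \lceil p_{i+1}/2 \rceil$ and $p_i = \lceil S_{i+1}/2 \rceil$ (pebbles arrive in pairs), and $a_i$ is taken to be the last point of its level exactly when $d_i \ne d_{i+1}$, following the remark that levels correspond to blocks of equal digits. The function names are hypothetical.

```python
from itertools import product

def moves_by_recursion(bits):
    """Sum of the p_i of Lemma 1, computed backwards for d = 0.d_1...d_n."""
    n = len(bits)
    p = [0] * (n + 2)                    # p[1..n]; p[n+1] stays 0 as a sentinel
    p[n] = 1                             # GREEDY stops after the first pair on a_n
    for i in range(n - 1, 0, -1):
        if bits[i - 1] == bits[i]:       # d_i == d_{i+1}: a_i is not last in its level
            need = p[i + 1]
        else:                            # a_i is the last point of its level
            j = i + 1                    # the next level is a_{i+1}..a_j
            while j < n and bits[j] == bits[j - 1]:
                j += 1
            # all points of the next level plus the first point of the next but one
            need = sum(p[i + 1:j + 1]) + p[j + 1]
        p[i] = -(-need // 2)             # ceil(need / 2): one pebble per requested pair
    return sum(p[1:n + 1])

def worst_case_by_recursion(n):
    """Maximum of the recursion over all digit strings of length n."""
    return max(moves_by_recursion(list(b)) for b in product((0, 1), repeat=n))

print(moves_by_recursion([0, 1, 1, 1]))  # the example with levels {a_1}, {a_2, a_3, a_4}
```

For $d = 0.0111\ldots$ the recursion gives $p_4 = p_3 = p_2 = 1$ and $p_1 = \lceil 3/2 \rceil = 2$, so 5 moves in total, in agreement with the introductory example.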
2.3 Complexity Bounds for GREEDY
With help of the formula of Lemma 1 and Thm. 1 we can compute the worst-case number of moves $g(n)$ used by GREEDY, for increasing $n$. Since each new $p_i$ only depends on the $p_j$ in the actual level and one more $p_j$ value, and the number of moves is simply the sum of all $p_i$, we can conveniently use dynamic programming. From any item we have only to keep the total number of moves so far and the information from the current level, and we may remove any item that is dominated by another item and thus cannot lead to a worst instance. It turns out that, in the worst cases, the levels are typically short, and so this process can even be executed by hand. However, we omit the details of this stupid work and only present the results for the smallest $n$:

n     1  2  3  4  5  6   7   8   9  10  11  12  13  14  15
g(n)  1  2  3  5  7  9  11  13  16  19  22  26  30  35  40
Unfortunately, the recursion formula itself is too bizarre for obtaining a nice expression for g(n) in general, but at least we can use backwards analysis to
prove fairly close upper and lower bounds. Remember the main idea of the proof of Thm. 1: All pebbles arriving at some point $a_i$ ($i < n$) are forwarded to later points, possibly except one pebble. As a byproduct, this implies:

Lemma 2. As soon as GREEDY has finished its work, there are exactly two pebbles on $a_n$, and not more than one pebble on each other point. Hence the total number of moved pebbles is at most $n + 1$.

This strengthens the trivial observation in Prop. 1.

Theorem 2. GREEDY performs $g(n) \le (n^2 + 4n + 1)/6$ moves.
Proof. We define the weight of a state to be the following sum: Every pebble located at point $a_k$ contributes a summand $k$. By Lemma 2, the total number of used pebbles is at most $n + 1$. Since pebbles leave $a_{-1}$ always along with a partner from $a_0$, at most half of them start from $a_{-1}$. Hence the initial weight is at least $-(n+1)/2$. Also by Lemma 2, the weight of the final state is at most $n + \sum_{i=1}^{n} i = n^2/2 + 3n/2$. So the weight has been increased by at most $n^2/2 + 2n + 1/2$. On the other hand, each move adds at least 3 to the weight, which yields the asserted bound.

A square lower bound is given by:

Proposition 3. For any $n$, there are instances where GREEDY needs at least $n^2/12 + n/2$ moves.

Proof. Consider the instance consisting of $n/2$ levels of length 1, followed by a level of length $n/2$. Since the point in the last singleton level is the common supplier of all members of the long level, we must bring $n/2$ pebbles to this point, in order to reach eventually $a_n$ after $n/2$ further moves. That is, we must achieve a weight of $n^2/4$ by moves through the levels of length 1. Each of these moves raises the weight by exactly 3, hence the assertion follows.

Together with Thm. 1 this gives immediately:

Corollary 1. Any strategy searching for an unknown point via its binary approximations only requires $\Omega(n^2)$ moves in the worst case.
3 Accelerated Strategies

3.1 Reducing the Exponent Arbitrarily
In view of Cor. 1 we can reduce the exponent in the number of moves only by leaving the street of $a_k$'s. In fact, we can construct search algorithms with much better asymptotic behaviour. Figuratively speaking, the idea is to send out, from time to time, an expedition of pebbles to explore the next few digits of $d$, and then to deliver subsequently more pebbles for the further search process, to the more precise address found by the expedition.
We start with a technical lemma. For an integer $p$, let $I(p)$ denote the number of digits 1 in the binary expansion of $p$, and for positive $p$ let $Z(p)$ be the length of the suffix consisting of a 1 followed by 0's, that means, $p/2^{Z(p)-1}$ is integer, but $p/2^{Z(p)}$ is not. We define $Z_n(p) = \min\{Z(p), n\}$.

Lemma 3. Let $Z_n(s) = n$. For $p \le 2^{n-1}$ we have
$\sum_{k=0}^{p-1} Z_n(s-k) = n + 2p - 2 - I(p-1)$.
For general $p \le s$ we have
$\sum_{k=0}^{p-1} Z_n(s-k) = n + 2p - 2 - \lfloor p/2^{n-1} \rfloor - I((p \bmod 2^{n-1}) - 1)$.

Proof. The assertion is true for $p = 1$. Consider $p < 2^{n-1}$. Since $Z_n(s-p) = Z(p) = I(p-1) - I(p) + 2$ in this range, by induction we have
$\sum_{k=0}^{p} Z_n(s-k) = n + 2p - 2 - I(p-1) + Z_n(s-p) = n + 2p - 2 - I(p) + 2 = n + 2(p+1) - 2 - I(p)$.
Particularly, for $p = 2^{n-1}$ we get $n + 2p - 2 - I(p-1) = 2^n - 1$ and $Z_n(s-p) = n$ again. For arbitrary $p \le s$, the process is repeated $\lfloor p/2^{n-1} \rfloor$ times, with a remainder of $p \bmod 2^{n-1}$ steps. Thus we obtain:
$\sum_{k=0}^{p-1} Z_n(s-k) = \lfloor p/2^{n-1} \rfloor (2^n - 1) + n + 2(p \bmod 2^{n-1}) - 2 - I((p \bmod 2^{n-1}) - 1) = n + 2p - 2 - \lfloor p/2^{n-1} \rfloor - I((p \bmod 2^{n-1}) - 1)$.
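The identity of Lemma 3 is easy to confirm by brute force for small parameters. The sketch below assumes that $s$ is any positive multiple of $2^{n-1}$ (which is what $Z_n(s) = n$ amounts to) and checks the general formula only when $p \bmod 2^{n-1} \ne 0$, so that $I$ is never evaluated at $-1$. The helper names are illustrative.

```python
def ones(p):
    """I(p): number of digits 1 in the binary expansion of p >= 0."""
    return bin(p).count("1")

def Z(p):
    """Z(p): length of the suffix '10...0' of the binary expansion of p > 0."""
    return (p & -p).bit_length()         # lowest set bit 2^t has bit length t + 1

def Zn(p, n):
    return min(Z(p), n)

def lhs(s, p, n):
    """Left-hand side: sum of Z_n(s - k) for k = 0..p-1."""
    return sum(Zn(s - k, n) for k in range(p))

def check_lemma3(n, multiples=3):
    """Compare both formulas of Lemma 3 with the direct sum."""
    block = 2 ** (n - 1)
    for m in range(1, multiples + 1):
        s = m * block                    # any s with Z_n(s) = n
        for p in range(1, s + 1):
            r = p % block
            if p <= block and lhs(s, p, n) != n + 2 * p - 2 - ones(p - 1):
                return False
            if r != 0 and lhs(s, p, n) != n + 2 * p - 2 - p // block - ones(r - 1):
                return False
    return True

print(all(check_lemma3(n) for n in range(1, 9)))
```

For instance, with $n = 3$ and $s = 8$ the first five summands are 3, 1, 2, 1, 3, and their sum 10 equals $3 + 10 - 2 - 1 - I(0)$.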
Next we give the fast delivering routine. Note that merging 0 and some $y$ gives two copies of $0y$, and similarly, merging I and $y$ gives two copies of $1y$.

Lemma 4. Let $x$ be any number of length $n$. Then we can deliver $q$ pebbles to point $x$ by less than $n + q$ moves.

Proof. Let $x = x_1 \ldots x_n$. Let $s_k = x_k \ldots x_n$ denote the suffix of $x$ starting with $x_k$. In the following we only consider states of the following kind: They contain only numbers $s_k$, and every such number, except $s_1 = x$, occurs at most once. These states can be compactly represented by two integers, namely the number of pebbles on $x$ and another integer $S$ defined as follows: The binary expansion of $S$ is $S_n \ldots S_2$, such that $S_k = 1$ holds if $s_k$ is present, and $S_k = 0$ if $s_k$ is absent. We proceed as follows. If $S = 0$ then merge 0 and I which yields twice $1 = s_n$. In the next moves $i = 1, \ldots, n-1$, merge one copy of $s_{n-i+1}$ with 0 if $x_{n-i} = 0$, or with I if $x_{n-i} = 1$, respectively. This way we eventually deliver a pair of pebbles to $x$, and we obtain $S = 2^{n-1} - 1$ with binary expansion $1 \ldots 1$. If $S \ne 0$ then find the smallest $k$ such that $S_k = 1$ and execute the same procedure as above, but starting with $s_k$. The obvious effect is that we deliver the next pair of pebbles to $x$ and replace $S$ by $S - 1$ using $k - 1$ moves. (Observe that this routine has a similar structure as GREEDY on a single level.) Together with Lemma 3 this proves that we deliver $p$ pairs of pebbles to point $x$ after exactly $n + 2p - 2 - \lfloor p/2^{n-1} \rfloor - I((p \bmod 2^{n-1}) - 1)$ moves. This term is at most $n + 2p - 2$. The $q$ pebbles to be delivered are contained in $p \le (q+1)/2$ pairs, and the assertion follows.

First we are interested in the exponent only. Since $n$ moves is a trivial lower bound, the following result is tight in some sense. Before proving the theorem we remark that any search strategy working on ground interval $[0, I]$ also works,
appropriately adapted, on any other interval $[l, r]$. This is clear for simple geometric reasons, since the basic actions MOVE and TEST commute with translations, scaling, and reflection. In particular, if the endpoints $l, r$, or symmetrically $I - r, I - l$, have finite binary expansions and agree in all bits except the last one then nothing has to be changed, only this prefix of common bits must be prepended to every word in the states of the search process. This invariance property is essential in the following.

Theorem 3. For any positive integer $k$ there is an algorithm that finds a candidate interval of length $2^{-n}$ by $O(n^{1+1/k})$ moves.

Proof. For $k = 1$ we can take any $O(n^2)$ algorithm, such as GREEDY. The theorem is proved by induction from $k$ to $k + 1$. Partition the word $a_n$ into roughly $n^{1/(k+1)}$ blocks of $n^{k/(k+1)}$ consecutive digits each. Assume that the digits of the first $b$ blocks are already determined, and block $b + 1$ has to be explored now. (Initially we have $b = 0$.) By the invariance property and the induction hypothesis, the digits in the next block can be found by $O(n)$ moves, provided that enough pebbles are available at the endpoints of the current candidate interval. (Note that the endpoints have always the above mentioned special form, since the prefix of bits from all preceding blocks is already fixed.) Since the total number of pebbles moved by any search algorithm is at most twice the number of moves, $O(n)$ pebbles are required for searching the block. So deliver $O(n)$ pebbles to both endpoints of the candidate interval. By Lemma 4, this can be done by $b\,n^{k/(k+1)} + O(n)$ moves. Altogether we need $O(n^{1+1/(k+1)})$ moves for the search expeditions within the blocks and $O(n^{2/(k+1)})\, n^{k/(k+1)} + O(n^{1+1/(k+1)})$ moves for delivering. This is a total of $O(n^{1+1/(k+1)})$.

If we proceed as in the above proof then the constant factor is roughly doubled in each step of the induction. Hence we obtain a bound of $O(2^k n^{1+1/k})$. Choosing $k = \sqrt{\log n}$, so that $2^k = n^{1/k}$, we get:

Corollary 2. There is an algorithm that finds a candidate interval of length $2^{-n}$ by $O(n^{1+2/\sqrt{\log n}})$ moves.
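The delivering routine of Lemma 4, on which the proof of Theorem 3 relies, can be sketched as a direct simulation with exact fractions. This is an illustration under assumptions made here: $x$ is given as a digit list ending in 1, the reservoirs at 0 and I are not tracked, and the function name `deliver_pairs` is hypothetical. The move count is compared against the closed form of Lemma 3 for $p \le 2^{n-1}$.

```python
from collections import Counter
from fractions import Fraction

def deliver_pairs(x_bits, pairs):
    """Deliver `pairs` pairs of pebbles to x = 0.x_1...x_n; return (moves, pebbles on x)."""
    n = len(x_bits)
    assert x_bits[-1] == 1               # a number of length n ends with the digit 1
    # suffix points s_k = 0.x_k...x_n for k = 1..n (s_1 = x, s_n = 1/2)
    s = {k: sum(Fraction(b, 2 ** j) for j, b in enumerate(x_bits[k - 1:], start=1))
         for k in range(1, n + 1)}
    state = Counter()                    # pebbles on the suffix points
    moves = 0
    for _ in range(pairs):
        present = [k for k in range(2, n + 1) if state[s[k]] > 0]
        if present:
            k = min(present)             # smallest k with S_k = 1
        else:
            state[s[n]] += 2             # MOVE(0, I): two pebbles on 1/2 = s_n
            moves += 1
            k = n
        while k > 1:                     # carry one pebble from s_k down to s_1 = x
            state[s[k]] -= 1
            endpoint = Fraction(x_bits[k - 2])     # x_{k-1}: merge with 0 or with I
            target = (s[k] + endpoint) / 2
            assert target == s[k - 1]
            state[target] += 2           # one copy stays behind, one is forwarded
            moves += 1
            k -= 1
    return moves, state[s[1]]

n = 4
for p in range(1, 2 ** (n - 1) + 1):
    moves, on_x = deliver_pairs([1, 0, 1, 1], p)
    print(p, moves, n + 2 * p - 2 - bin(p - 1).count("1"), on_x)
```

The first pair costs $n$ moves and leaves one pebble on every proper suffix; afterwards each further pair costs as many moves as the distance from the nearest occupied suffix, which is the counter behaviour described in the proof.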
3.2 Faster than GREEDY for Small n?
Although the last corollary is interesting as an asymptotic result, we see that the exponent is considerably reduced only for $n$ beyond any imaginable application. The more important question is the exact number of necessary moves for small $n$. Let us revisit the first step of the above induction and estimate the exact number of moves. Now we work with $n/q$ blocks of length at most $q + 1$. (We add 1 because of ceiling.) By Lemma 2 and Thm. 2, GREEDY needs at most $q + 2$ pebbles and $q^2/6 + q + 1$ moves in each block. Hence all expeditions take $nq/6 + n + n/q$ moves. The delivering routines must cover a total way of $n^2/q + n^2/q^2 - n - n/q$ and transport a total of $n + 2n/q - q - 2$ pebbles. Due to Lemma 4, the sum of these terms is the number of moves. Choosing $q \approx \sqrt{6n}$ gives:
Corollary 3. There is an algorithm that finds a candidate interval of length $2^{-n}$ by at most $\sqrt{2/3}\, n^{3/2} + 7n/6$ moves.
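The arithmetic behind Corollary 3 can be spelled out as follows. This is a sketch under the assumption that the three contributions read $nq/6 + n + n/q$ (expeditions), $n^2/q + n^2/q^2 - n - n/q$ (total way) and $n + 2n/q - q - 2$ (pebbles), and that rounding of $q$ is ignored.

```latex
\begin{align*}
\text{moves}
  &\le \Big(\frac{nq}{6} + n + \frac{n}{q}\Big)
     + \Big(\frac{n^2}{q} + \frac{n^2}{q^2} - n - \frac{n}{q}\Big)
     + \Big(n + \frac{2n}{q} - q - 2\Big) \\
  &= \frac{nq}{6} + \frac{n^2}{q} + \frac{n^2}{q^2} + n + \frac{2n}{q} - q - 2 .
\intertext{For $q = \sqrt{6n}$ we have $nq/6 = n^2/q = n^{3/2}/\sqrt{6}$, $\; n^2/q^2 = n/6$, and
$2n/q - q - 2 = -\tfrac{2}{3}\sqrt{6n} - 2 < 0$, hence}
\text{moves}
  &\le \frac{2}{\sqrt{6}}\, n^{3/2} + n + \frac{n}{6}
   = \sqrt{2/3}\; n^{3/2} + \frac{7n}{6} .
\end{align*}
```

The choice $q = \sqrt{6n}$ balances the two leading terms $nq/6$ and $n^2/q$, which is why both contribute the same amount $n^{3/2}/\sqrt{6}$.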
This is better than GREEDY not before n = 0. In fact, we do not know any strategy beating GREEDY for small (i.e. practical) $n$. On the other hand, we do not have a good approach for proving nontrivial lower bounds. So it remains an interesting puzzle to find the first number $n$ where less than $g(n)$ moves are sufficient.
4 Very Small d
For a given target $d$, define $n_0$ as the length of the maximum prefix of $d$ consisting of digits 0 only, and let $n_1 = n - n_0$ be the precision demanded by the user, i.e. the number of bits not including the leading 0's. That means $2^{-n_0-1} < d < 2^{-n_0}$. If $d$ is very small and thus $n_0$ makes up a considerable fraction of $n$ then we can achieve better results than the general worst case bounds for moves and tests. This is briefly and informally discussed below.

Considering GREEDY, we can analyze the moves within the first long level of length $n_0$ and within the final $n_1$ points separately, since they are only loosely coupled: Perform backwards analysis on the last $n_1$ points; as a byproduct this also yields $p = p_{n_0}$. For each state, define an integer $S$ as follows: The binary expansion of $S$ is $S_1 \ldots S_{n_0-1}$, such that $S_k = 1$ holds if $a_k$ is present and $S_k = 0$ if $a_k$ is absent. It is an easy observation [5] that GREEDY sets $S = 2^{n_0-1} - 1$ in the first $n_0$ moves, and then always subtracts 1 from $S$ while delivering the next pair of pebbles to $a_{n_0}$, and the number of moves in any such round is the distance of the rightmost digit 1 in $S$ to position $n_0$. From this and Lemma 3 we see that the total number of moves in this level is exactly $n_0 + 2p - 2 - I(p-1)$. Using Lemma 2, this is bounded by $n_0 + n_1 = n$, since $p$ is bounded in terms of $n_1$. Thus we can restrict the quadratic term to the $n_1$ part:

Theorem 4. If $2^{n_0-1} > n_1$, GREEDY performs at most $n + g(n_1)$ moves.
We may even save tests for determining $n_0$, although at a cost of additional moves: If we promptly perform a test at each new point then $n_0 + 1$ tests are executed until $n_0$ is found, whereas roughly $\log n_0 + \log\log n_0 + \log\log\log n_0 + \ldots$ tests would be sufficient [4]. Hence, if $n_0$ is expected to be large and tests are expensive then we may use another strategy with fewer tests for determining $n_0$. On the other hand, the maximum position queried by the Bentley-Yao strategy can be excessively larger than $n_0$. So the strategy for finding $n_0$ should be a hybrid of successive search and the test-efficient Bentley-Yao strategy, depending on the ratio of costs. Let $C$ be the assumed cost of each test whereas each move has unit cost. We wish to minimize the total costs for finding $n_0$. Let $\ell + 1$ be the actual number of moves executed so far, i.e. we have the state $[1\ 01\ 001\ \ldots\ 0^\ell 1\ 0^\ell 1]$ where $0^\ell$ denotes the symbol 0 repeated $\ell$ times. As long as $\ell < C$, one should use
the Bentley-Yao strategy, since the total costs of moves are still bounded by $C$, but each test has a cost of $C$ alone. In contrast, if $\ell > C$ is reached then one should switch to another strategy: Test any node $\ell = Ct^2$ $(t = 1, 2, 3, \ldots)$ until $\ell > n_0$ and then find $n_0$ by ordinary binary search. By this quadratic growth we achieve $O(\sqrt{n_0})$ costs for tests until $\ell > n_0$ and $O(\sqrt{n_0})$ costs for the moves exceeding $n_0$ (and $O(\log n_0)$ for the final tests which is a lower order term). Any slower or faster growth would perturb this balance, and a standard calculation shows that $C$ is the best factor. Finally note that there might be only one pebble instead of two at point $a_{n_0}$ now, but this aggravates only marginally the initial situation for determining the next $n_1$ positions of $d$.
5 The Case of Destructive Tests
In this variant of the problem, one of the two pebbles arriving at any new point must be removed, since the test at this point destroys one pebble. This modifies the GREEDY algorithm and the recursion formula for the $p_i$. Namely, we must add 1 to $p_{i+1}$ resp. $S_{i+1}$, representing the pebble that is lost by the test at point $a_i$.

Theorem 5. In the case of destructive tests, GREEDY executes at most $(2n^2 + 5n + 1)/6$ moves.

Proof. We can adopt the weight argument from above. Now we also have to count the contributions of those pebbles which are removed from any new point, and the total number of pebbles is at most $2n + 1$ rather than $n + 1$. Hence, in contrast to Thm. 2, the weight bound becomes $(3n+1)/2 + 2\sum_{i=1}^{n} i = (2n^2 + 5n + 1)/2$. Again, the number of moves is at most a third of that.

Proposition 4. For any $n$, there are instances where GREEDY needs $n^2/4 + n/2$ moves.

Proof. Here consider the instance where all levels have length 1. One easily sees that $p_{n+1-i} = \lceil i/2 \rceil$, and the assertion follows together with Lemma 1 (4).
References
1. J.A.Aslam, A.Dhagat: Searching in the presence of linearly bounded errors, 23rd ACM STOC 1991, 486-493
2. A.Bar-Noy, F.K.Hwang, I.Kessler, S.Kutten: A new competitive algorithm for group testing, Discrete Applied Math. 52 (1994)
3. R.A.Baeza-Yates, J.C.Culberson, G.J.E.Rawlins: Searching in the plane, Info. and Computation 106 (1993), 234-252
4. J.L.Bentley, A.C.Yao: An almost optimal algorithm for unbounded searching, Info. Proc. Letters 5 (1976), 82-87
5. P.Damaschke: The algorithmic complexity of chemical threshold testing, 3rd Italian Conference on Algorithms and Complexity CIAC'97, Rome 1997, Lecture Notes in Computer Science 1203 (Springer 1997), 205-216
6. J.A.Decker Jr.: Hadamard transform spectrometry: a new analytical technique, Analytical Chemistry 44 (1972), 127-134
7. D.Z.Du, F.K.Hwang: Competitive group testing, Discrete Applied Math. 45 (1993)
8. D.Z.Du, H.Park: On competitive group testing, SIAM J. Comp. 23 (1994), 1019-1025
9. D.Z.Du, G.L.Xue, S.Z.Sun, S.W.Cheng: Modifications of competitive group testing, SIAM J. Comp. 23 (1994), 82-96
10. S.W.Hornick, S.R.Maddila, E.P.Mücke, H.Rosenberger, S.S.Skiena, I.G.Tollis: Searching on a tape, IEEE Trans. Comp. 39 (1990), 1265-1271
11. T.C.Hu, M.L.Wachs: Binary search on a tape, SIAM J. Computing 16 (1987), 573-590
12. E.Koutsoupias, C.Papadimitriou, M.Yannakakis: Searching a fixed graph, Proc. 23rd ICALP'96, Lecture Notes in Computer Science 1099 (Springer 1996), 280-289
13. E.Triesch: A group testing problem for hypergraphs of bounded rank, Discrete Applied Math. 66 (1996), 185-188
Uniform Service Systems with k Servers
Esteban Feuerstein
Depto. de Computacion, FCEyN, Universidad de Buenos Aires
Instituto de Ciencias, Universidad de General Sarmiento, Argentina
efeuer
[email protected]
Abstract. We consider the problem of k servers situated on a uniform metric space that must serve a sequence of requests, where each request consists of a set of locations of the metric space and can be served by moving a server to any of the nodes of the set. The goal is to minimize the total distance traveled by the servers. This problem generalizes a problem presented by Chrobak and Larmore in [7]. We give lower and upper bounds on the competitive ratio achievable by on-line algorithms for this problem, and consider also interesting particular cases.
1 Introduction
During the last decade considerable attention has been devoted to competitive analysis of on-line algorithms. On-line problems have a variety of relevant applications in computer science, logistics, economics and robotics. Probably the most famous on-line problem is the Paging Problem [15], that is, the problem of managing a two-level memory, one level of limited capacity and fast access time (the cache) and the other one with slow access time but potentially unlimited capacity. An algorithm for this problem must determine which page of the cache to evict upon a page fault, with the goal of minimizing the total number of page faults incurred for serving a sequence of requests. On-line paging algorithms must decide which page to replace without knowledge of future requests. On-line algorithms are in general evaluated using competitive analysis [11]: an on-line algorithm for a certain problem is said to be c-competitive if the cost incurred by it to serve any input is at most c times the cost charged to the optimal (off-line) algorithm for that input plus a constant. One of the most challenging on-line problems is the k-server problem [12], in which k servers, located on a metric space, must serve a sequence of requests at points of the metric space. An on-line algorithm for that problem tries to minimize the total distance traveled by the servers, deciding which server is moved
This work was partially supported by the KIT program of the European Community (Project DYNDATA), by University of Buenos Aires' Programacion para Investigadores Jovenes, project EX070/J Algoritmos Eficientes para Problemas On-line con Aplicaciones and by UBACYT project Modelos y Tecnicas de Optimizacion Combinatoria.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 23-32, 1998. © Springer-Verlag Berlin Heidelberg 1998
to each point in an on-line way. The many extensions and generalizations of the Paging Problem also include the weighted version of Paging [14], the access graph model of [5], Metrical Task Systems [6] and Request-answer games [4]. In [7], Chrobak and Larmore proposed a family of on-line problems, namely Metrical Service Systems (MSS). In an instance of MSSw, one server situated on a metric space must serve a sequence of requests, where each request consists of a set of nodes of the metric space (of size at most w) and can be served by moving the server to any of the nodes of the set. The goal is to minimize the total distance traveled by the server. An important particular case of MSS is when the metric space is uniform, i.e. when all the distances are equal (uniform-MSS). Both MSS and uniform-MSS are particular cases of Metrical Task Systems, but not of the k-server problem, as each request specifies different alternative nodes to cover. In this paper we present the generalization of uniform-MSSw to the case in which $k \ge 1$ servers are used. We call this problem (k, w)-Uniform Service Systems (abbreviated as USS(k,w)). It is a well known fact that the k-server problem on uniform metric spaces is isomorphic to the paging problem with a cache of size k. Analogously, USS(k,w) can be seen as the following generalization of the Paging problem: given a set U of pages, an on-line algorithm with a cache of size k must deal with a finite sequence of requests, each of which consists of a subset $r \subseteq U$ of size at most w. Each request is served by having in the cache at least one element of r. In the remainder of this paper we shall use this Paging-oriented terminology rather than that of server problems. The problem, in both its server and paging versions, has several natural applications. As an example, consider a distributed network with virtual-circuit routing, in which each processor may have at most a constant number of simultaneously enabled connections.
If the data needed to perform some task are replicated over the network, a processor has the alternative of communicating with different processors, being forced, in general, to close its connection with some other one. Another example is given by k mobile servers that can give their service in any branch of each of the clients' companies. The decision about where to serve each request will influence the total time needed to process a sequence of requests, as will the decision of which server to assign to each request. The problem treated in this paper is a sort of dual of the one considered in [8], in which every request consists of a set of pages, all of which have to be present in the cache to serve the request. Other related work has been done by Ausiello et al. [3, 2], Alborzi et al. [1] and by Feuerstein et al. [9]. In [3, 2] the problem of efficiently serving a sequence of requests in a metric space presented in an on-line fashion is considered. At every moment, a server may decide which of the requests to serve, with the goal of minimizing the total completion time. A similar approach is taken in [1] where it is assumed that a fixed number of clients present sequences of requests in a
metric space, that must be served by a single server. At any time, each client has at most one request to be served, after which a new one may be presented. They consider different cost models, namely the make-span, total completion time and maximum response time. The main difference with the previously cited works is that, as the requests are threaded, the time in which requests are presented depends on the order in which the server processes previous requests. Finally, [9] introduces the generalization of Paging to the case where there are many threads of requests. That models situations in which the requests come from more than one independent source. Hence, apart from deciding how to serve a request, at each stage it is necessary to decide which request to serve among several possibilities. The difference with the approach taken in this paper is that, in the setting of [9], all the requests that are not served at some stage are repeated in the next one, while here a brand-new set of pages may be requested. We show that no on-line algorithm for USS(k,w) can achieve a competitive ratio better than $\binom{k+w}{w} - 1$, and we present an $O(k \cdot \min(k^w, w^k))$-competitive algorithm. For any fixed value of k (and arbitrary w) this is at most a constant factor away from the lower bound. However, for k tending to infinity the upper bound is a constant times k away from optimality. We conjecture that the same algorithm achieves a competitive ratio of $O(\min(k^w, w^k))$, and therefore obtains an optimal (up to a constant factor) competitive ratio also when k tends to infinity, but we have not proved it in the general case. However, we have proved it for w = 2 when the requests satisfy certain restrictions that will be explained later. Our algorithm solves at each step an instance of an NP-complete problem. In Section 4 we present a polynomial-time algorithm that achieves a competitive ratio of 2k with a cache of size 2k against an adversary with a cache of size k.
The lower and upper bounds we obtain generalize the results of [7] regarding uniform metric spaces.
2 The General Case
The following is a lower bound on the competitive ratio of any on-line algorithm for this problem.
Theorem 1. If $c < \binom{k+w}{w} - 1$ then no on-line algorithm for USS(k,w) is c-competitive.
Proof. We consider a universe U of cardinality k + w. Given an on-line algorithm A, we construct a sequence $\sigma_A$ in the following way: each request of $\sigma_A$ is to the w pages that are not present in A's cache, and hence A faults at each request. We can assume that A evicts only one page at each fault, otherwise we could construct an algorithm A' behaving in that way and such that for every sequence $\sigma$, $C_{A'}(\sigma) \le C_A(\sigma)$ (for any algorithm ALG, $C_{ALG}(\sigma)$ denotes the cost incurred by ALG to serve a sequence $\sigma$ of requests; the sequence is omitted when it is clear from the context). Hence, if we let $|\sigma_A| = n$, we have $C_A = n$. Consider now the family F of all the subsets of U of cardinality w. We have that $|F| = \binom{k+w}{w}$.
At any moment, the configuration of the cache of any algorithm (off-line or on-line) can be associated to the set f of the pages of U that are not present in the algorithm's cache, with $f \in F$. Consider a family of adversaries ADV such that each adversary is in one of the possible $\binom{k+w}{w}$ configurations, with the exception of the configuration of A. There are $\binom{k+w}{w} - 1$ such adversaries, and all of them can serve each request r of $\sigma_A$ with no cost. Now assume that A evicts some page $x \notin r$ to make place for a page $y \in r$. Then, the only adversary whose cache would coincide with A's cache after A's move evicts y and brings x, but only after serving the request r. Then we can think of all adversaries together as paying a cost of 1 for each request. Hence, we have that $\sum_i C_{ADV_i} = n$. As there are in total $\binom{k+w}{w} - 1$ adversaries, the average cost is $n / \left(\binom{k+w}{w} - 1\right)$. Hence there is at least one i such that $C_{ADV_i}$ is not more than average, and the ratio $C_A / C_{ADV_i}$ is at least $\binom{k+w}{w} - 1$.
In the following we propose an algorithm for USS(k,w). The algorithm, which we call the Hitting Set algorithm (HS), is $k \cdot \min\left(\frac{k^{w+1}-k}{k-1},\ \sum_{i=0}^{k-2} w^i + w^k\right)$-competitive. Notice that $\sum_{i=0}^{k-2} w^i + w^k \le 2w^k$.
HS divides the sequence of requests in phases, the first phase starting with the first request of the sequence. Each time HS cannot answer the current request it behaves as follows. First, it computes a minimum cardinality set H of pages that intersects all the requests that produced a fault during the current phase. If $|H| \le k$ then H is brought into the cache. If there is more than one set with minimum cardinality, then the one that can be brought into the cache with minimum cost is chosen. Otherwise, if $|H| > k$ the current phase is finished and a new phase starts with the present request.
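The behaviour of HS described above can be sketched in Python as follows. This is a minimal illustration rather than the author's implementation: the class name `HittingSetPager`, the helper `min_hitting_sets`, the brute-force enumeration of hitting sets (exponential time), the eviction order and the tie-breaking among equally cheap hitting sets are all assumptions made for the example.

```python
from itertools import combinations


def min_hitting_sets(requests, universe):
    """Return all minimum-cardinality hitting sets of `requests` (brute force)."""
    pages = sorted(universe)
    for size in range(len(pages) + 1):
        found = [set(c) for c in combinations(pages, size)
                 if all(set(c) & r for r in requests)]
        if found:
            return found
    return []


class HittingSetPager:
    """Hypothetical sketch of HS for a cache of size k."""

    def __init__(self, k, universe):
        self.k = k
        self.universe = set(universe)
        self.cache = set()
        self.phase = []   # requests that produced a fault in the current phase
        self.cost = 0     # number of pages brought into the cache

    def serve(self, request):
        r = frozenset(request)
        if r & self.cache:
            return                      # served without cost
        self.phase.append(r)
        candidates = min_hitting_sets(self.phase, self.universe)
        if len(candidates[0]) > self.k:
            # No hitting set fits: a new phase starts with the present request.
            self.phase = [r]
            candidates = min_hitting_sets(self.phase, self.universe)
        # Among minimum hitting sets, take one that is cheapest to bring in.
        best = min(candidates, key=lambda h: len(h - self.cache))
        self._load(best)

    def _load(self, hitting_set):
        missing = hitting_set - self.cache
        self.cost += len(missing)
        self.cache |= missing
        # Evict pages outside the hitting set until the cache fits again.
        for page in sorted(self.cache - hitting_set):
            if len(self.cache) <= self.k:
                break
            self.cache.remove(page)
```

For example, with k = 2 and universe {0, ..., 4}, the requests {0,1}, {1,2}, {3,4} fit in one phase (the set {1,3} hits all of them), while a further request {0,2} forces a new phase because the triangle on 0, 1, 2 together with {3,4} needs three pages.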
The constraint of choosing the next configuration of the cache so as to minimize the (Hamming) distance to the current one is not really used in the proof of the following theorem, but it is a necessary condition to prove the better bound of $O(k^w)$ that we conjecture is achieved by the algorithm. Notice that the lazy version of HS, which instead of bringing into the cache a complete hitting set H brings just one page of H useful to serve the current request, can be forced to pay as much as HS, simply by repeating, after each new request, all the previous requests of the phase.
Theorem 2. HS is $k \cdot \min\left(\frac{k^{w+1}-k}{k-1},\ \sum_{i=0}^{k-2} w^i + w^k\right)$-competitive for USS(k,w).
Proof. By definition of HS, all the requests of a phase plus the first one of the following phase need at least k + 1 different pages to be answered. Therefore, we know that the adversary pays at least a cost of 1 for each phase. We will show that HS may fault on at most $\min\left(\frac{k^{w+1}-k}{k-1},\ \sum_{i=0}^{k-2} w^i + w^k\right)$ different requests during a phase. By definition, during a phase HS does not fault twice on the same request, and the cost for each single request is at most k; therefore the total cost of HS during each phase is not more than k times the number of faults, and the thesis follows. We will separately prove that:
i) the maximum number of faults during a phase is at most $\frac{k^{w+1}-k}{k-1} = \sum_{i=1}^{w} k^i$, and
ii) it is at most $\sum_{i=0}^{k-2} w^i + w^k$.
i) Let us first see that the maximum number of faults of a phase can be at most $\sum_{i=1}^{w} k^i$. We will proceed by induction on w.
Basis (w = 2). Let H be the final hitting set of a phase, i.e. the last set of cardinality not greater than k such that all the requests of the phase could be served by having H in the cache. Let $v \in H$. After at most k + 1 different requests including v, it is clear that any hitting set of cardinality less than k + 1 contains v. Hence HS may fault at most k + 1 times for each of the at most k pages in H. Summing over all pages of H we get that the number of different requests of a phase may not exceed $k^2 + k$.
Inductive step. Suppose that the thesis is true for w − 1; we will prove it for w. Consider the family $F_v$ of all the requests containing a particular page v and the family $F'_v = \{\rho - \{v\} : \rho \in F_v\}$ of the requests of $F_v$ without v. Obviously, for every $\rho' \in F'_v$ we have that $|\rho'| \le w - 1$, and every hitting set of $F_v$ not containing v must be a hitting set of $F'_v$. By inductive hypothesis, after at most $1 + \sum_{i=1}^{w-1} k^i$ requests of this type every hitting set of $F'_v$ must be of cardinality at least k + 1, and hence HS will necessarily keep v to cover $F_v$ until the phase is over. Therefore, summing over all pages of the final hitting set we get that the maximum number of faults in a phase is at most $k\left(1 + \sum_{i=1}^{w-1} k^i\right) = k + \sum_{i=2}^{w} k^i = \sum_{i=1}^{w} k^i$.
ii) We will now prove, by induction on k, that the maximum possible number of faults in a phase is $\sum_{i=0}^{k-2} w^i + w^k$.
Basis (k = 1). The first request of the phase gives at most w alternatives, and hence there are at most w different configurations in which that request can be served.
Inductive step. We assume the thesis is true for k − 1 and will prove it for k. Consider the first request r such that the size of the minimum hitting set of all the requests so far in the phase becomes k. That request can be covered in at most w different ways. By inductive hypothesis, for each of these ways there may be at most $\sum_{i=0}^{k-3} w^i + w^{k-1}$ requests before the other k − 1 elements of the hitting set become fixed. Then the total number of faults during a phase can be at most $1 + w\left(\sum_{i=0}^{k-3} w^i + w^{k-1}\right) = 1 + \sum_{i=1}^{k-2} w^i + w^k = \sum_{i=0}^{k-2} w^i + w^k$.
From the previous Theorem we get the following corollary regarding the performance of the algorithm HS for MSSw on uniform metric spaces (MSSw is the class of Metrical Service Systems where all requested sets have cardinality at most w). Our bound coincides with that of [7] for this particular case.
Corollary 1. HS is w-competitive for MSSw on uniform metric spaces.
To conclude the analysis of the performance of HS for USS(k,w), we will quantify the gap between the upper bound of Theorem 2 and the lower bound of Theorem 1 in the two limit cases, i.e. when either w or k tends to infinity and the other value is fixed. If we denote as C the competitive ratio achieved by algorithm HS (that is, $C = k \cdot \min\left(\frac{k^{w+1}-k}{k-1},\ \sum_{i=0}^{k-2} w^i + w^k\right)$), we have that for any fixed k, there is a value w(k) such that $w \ge w(k) \Rightarrow C = k\left(\sum_{i=0}^{k-2} w^i + w^k\right)$. Conversely, for any fixed w there exists a value k(w) such that $k \ge k(w) \Rightarrow C = k \cdot \frac{k^{w+1}-k}{k-1}$. Applying Stirling's formula we get the following values for the ratio $C / \binom{k+w}{w}$ (recall that $\sum_{i=0}^{k-2} w^i + w^k \le 2w^k$):
$$\lim_{w \to \infty} \frac{C}{\binom{k+w}{w}} \approx \frac{k^{k+1}\sqrt{2\pi k}}{e^k}, \qquad \lim_{k \to \infty} \frac{C}{\binom{k+w}{w}} \approx \lim_{k \to \infty} \frac{w^w \sqrt{2\pi w}}{e^w}\, k .$$
Note that in the first case the ratio tends to a constant, while in the second case it grows proportionally to the value of k.
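The gap between the two bounds can be tabulated numerically for small parameters. The sketch below assumes that the lower bound reads C(k+w, w) − 1 and that the upper bound reads k·min(Σ_{i=1..w} k^i, Σ_{i=0..k−2} w^i + w^k), which is how the garbled formulas of Theorems 1 and 2 appear to parse; the function names are hypothetical.

```python
from math import comb


def lower_bound(k, w):
    """Assumed lower bound of Theorem 1: binom(k + w, w) - 1."""
    return comb(k + w, w) - 1


def upper_bound(k, w):
    """Assumed upper bound of Theorem 2 for the HS algorithm."""
    by_w = sum(k**i for i in range(1, w + 1))         # = (k^(w+1) - k)/(k - 1) for k > 1
    by_k = sum(w**i for i in range(0, k - 1)) + w**k  # sum_{i=0}^{k-2} w^i + w^k
    return k * min(by_w, by_k)


if __name__ == "__main__":
    for k in range(1, 5):
        for w in range(2, 5):
            lo, hi = lower_bound(k, w), upper_bound(k, w)
            print(f"k={k} w={w} lower={lo} upper={hi} ratio={hi / lo:.2f}")
```

For k = 1 both functions return w, which matches the statement that the bound coincides with the one known for a single server on uniform metric spaces.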
3 The Acyclic Case for w = 2
In the particular case of requests of cardinality 2, every request can be seen as an edge of a graph whose nodes are the pages of the universe U, and any set of requests can be seen as a graph on U. In this case algorithm HS reduces to computing a minimum vertex cover of the subgraph determined by the requests of the current phase. In the following we will show that if the graph of the requests is acyclic, then HS obtains a competitive ratio that is at most a factor of 2 away from optimality. Actually, to prove this we need a weaker restriction on the input sequence: we only need that the (at most) $k^2 + k$ requests that form each phase do not form cycles. This particular case of USS(k,2) is called acyclic-USS(k,2). We need some definitions and preliminary results before stating and proving the theorem.
Definition 1. Given a graph G and a node n of G, we say that n is:
fixed if $n \in H$ for every minimum cardinality vertex cover H of G;
free if there exist two minimum cardinality vertex covers H and H' of G such that $n \in H$ and $n \notin H'$;
forbidden if there is no minimum cardinality vertex cover H of G such that $n \in H$.
Making some abuse of notation, we will see the pages of U as the nodes of the graph induced by the requests of the phase. Besides, we will continue referring to Hitting Sets instead of Vertex Covers.
Lemma 1. After a request involving a free and a forbidden node, the number of free nodes in HS's cache is decremented exactly by the cost paid by HS.
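The three node statuses of Definition 1 can be checked on small graphs by enumerating all minimum vertex covers. The following Python sketch does this by brute force; the function name `classify_nodes` and the exhaustive enumeration are assumptions made for illustration and are only practical for tiny instances.

```python
from itertools import combinations


def classify_nodes(nodes, edges):
    """Label each node as 'fixed', 'free' or 'forbidden' (Definition 1).

    All minimum-cardinality vertex covers are enumerated by brute force.
    """
    nodes = sorted(nodes)
    covers = []
    for size in range(len(nodes) + 1):
        covers = [set(c) for c in combinations(nodes, size)
                  if all(u in c or v in c for u, v in edges)]
        if covers:
            break               # smallest size with at least one cover
    status = {}
    for n in nodes:
        hits = sum(n in c for c in covers)
        if hits == len(covers):
            status[n] = "fixed"       # in every minimum cover
        elif hits == 0:
            status[n] = "forbidden"   # in no minimum cover
        else:
            status[n] = "free"        # in some but not all minimum covers
    return status
```

On the path a-b-c the only minimum cover is {b}, so b is fixed and a, c are forbidden; on a single edge a-b both endpoints are free.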
Proof. Suppose the request is to the pair {x, y}, where x is free and y is forbidden. Then, after the request, x will be part of every minimum hitting set at least until the size of a minimum hitting set is incremented, and by definition of HS this request is served by bringing x into the cache, possibly together with some other nodes. In general, HS will replace a set Z of free nodes by a set X, with $x \in X$ and $|Z| = |X|$. We will proceed by induction on the cardinality of X and Z.
Basis. If HS replaces some node z by x, the cost is 1, and the number of free nodes in the cache is decremented by 1, because z was free and x is now fixed.
Induction. We will consider the following two cases:
(1) There is only one neighbor z of x with $z \in Z$. Then we have two possibilities:
(a) All the neighbors of z (except x) were already in the cache. Then there is a minimum Hitting Set obtained replacing z by x, a contradiction because we are supposing that the minimum change is of cardinality bigger than one.
(b) Some neighbors $v_1, \ldots, v_j$ of z must be brought to the cache so as to cover the edges $(z, v_i)$ that remain uncovered because of the eviction of z. The changes done to bring each $v_i$ into the cache are the same that would have been done if a request $\{v_i, w\}$ had been requested, for some forbidden node w. By induction, all the nodes brought in such cases remain fixed, and hence the same happens in this case. Therefore, all the nodes brought in this case are fixed.
(2) There are j > 1 neighbors $z_1, \ldots, z_j$ of x with $z_i \in Z$, $i = 1, \ldots, j$. Since we are considering the acyclic case, all the requests of the phase form a forest. Let T be the subtree of that forest induced by $Z \cup X$. Let $T_i$, $i = 1, \ldots, j$, be the subtrees of T rooted at $z_1, \ldots, z_j$ respectively, and define $X_i = X \cap T_i$ and $Z_i = Z \cap T_i$ for $i = 1, \ldots, j$. As $|Z| = |X|$, and because $x \in X$, we have that $|X - \{x\}| = |Z| - 1$. Note that $X - \{x\} = \bigcup_{i=1}^{j} X_i$. Then there is at least one i such that $|X_i| < |Z_i|$. But in this case replacing only $Z_i$ by $X_i \cup \{x\}$ would give a hitting set of the same or smaller cardinality than the previous hitting set, contradicting the hypothesis that replacing Z by X was a minimum cost change (or contradicting the minimality hypothesis of the previous hitting set).
Theorem 3. HS is $O(k^2)$-competitive for acyclic-USS(k,2).
Proof. We already know that the adversary incurs a cost of at least 1 during each phase, and we will bound HS's cost during the phase. For this we will use the following potential function: $\Phi = (k+2)|H| - |f| - k \cdot t$, where H is a minimum hitting set, f is the set of free nodes in HS's cache and t is the number of non-trivial connected components (trees), all referred to the graph induced by the requests of the current phase that produced a fault.
The value of $\Phi$ is 1 after the first request of a phase, and is never greater than $k^2 + k$. Therefore, if we show that $\Phi$ is incremented at least by the cost paid by HS for each move, we have that the total cost charged to HS during the phase is not greater than $k^2 + k$. We will now see that this holds. The first thing to note is that requests involving at least one fixed node are served by HS without cost (as by definition of HS fixed nodes are present in the cache). Those requests are ignored by the algorithm, and produce no variation in the potential function. Hence we have to analyze the variation of $\Phi$ in the remaining cases, which are:
Forbidden/Forbidden: In this case, the cardinality of a minimum hitting set is incremented by one, and the cost paid by HS is exactly 1. We can distinguish two sub-cases:
The requested nodes are both trivial trees, and hence the number of free nodes and the number of non-trivial trees increase by 1. Then we have that $\Delta\Phi = (k+2)(|H|+1) - (k+2)|H| - 1 - k(t+1) + kt = k + 2 - 1 - k = 1$.
At least one of the requested nodes was part of some non-trivial tree. Then the number of non-trivial connected components does not increase, and the number of free nodes may increase at most by $|H| + 1$. Hence $\Delta\Phi \ge (k+2)(|H|+1) - (k+2)|H| - |H| - 1 = k + 2 - |H| - 1 \ge 1$.
Forbidden/Free: In this case, the size of the minimum hitting set remains unchanged, and the number of non-trivial connected components does not increase. By Lemma 1, the decrement in the number of free nodes in cache is equal to the cost paid by HS, yielding an increment in $\Phi$ equal to that cost.
Free/Free: The size of the minimum hitting set and the number of free nodes in cache do not change. As for the number of non-trivial connected components, it must necessarily decrease by 1, and therefore we have $\Delta\Phi = -k(t-1) + kt = k$. But k is always greater than the cost paid by HS. Note that here we use the fact that the graph of the requests of a phase is acyclic.
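The potential used in this proof can be evaluated on small request graphs. The sketch below assumes the potential has the form (k + 2)·|H| − |f| − k·t, which is what the case analysis above suggests; the function name `potential`, the brute-force computation of minimum vertex covers and the union-find component count are illustrative choices, not part of the paper.

```python
from itertools import combinations


def potential(k, edges, cache):
    """Assumed potential (k + 2)*|H| - |f| - k*t for the current phase.

    edges : requests of the phase that produced a fault, as node pairs
    cache : pages currently held by the on-line algorithm
    """
    nodes = sorted({v for e in edges for v in e})
    # All minimum vertex covers, by brute force (tiny inputs only).
    for size in range(len(nodes) + 1):
        covers = [set(c) for c in combinations(nodes, size)
                  if all(u in c or v in c for u, v in edges)]
        if covers:
            break
    h = len(covers[0])
    free = {n for n in nodes
            if 0 < sum(n in c for c in covers) < len(covers)}
    f = len(free & set(cache))
    # Count connected components with a small union-find structure;
    # every node here has an incident edge, so all components are non-trivial.
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    t = len({find(n) for n in nodes})
    return (k + 2) * h - f - k * t
```

After a single request {a, b} with a in the cache the value is (k + 2) − 1 − k = 1, and adding a disjoint request {c, d} (the first Forbidden/Forbidden sub-case) raises it by exactly 1.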
4 A Polynomial-Time Algorithm
HS has an important drawback regarding its efficiency. In fact, at each fault HS has to solve an instance of the Hitting Set problem, which is NP-complete, even in the case of w = 2 [10]. We propose a polynomial-time on-line algorithm that with a cache of size 2k is 2k-competitive against adversaries with cache size k for USS(k,2). The algorithm (and its analysis) is based on the ideas of a well-known 2-approximate polynomial algorithm for the Minimum Vertex Cover problem (see for example [13]). We call this algorithm AHS, and it works as follows: The sequence of requests is divided in phases, each phase ending after the k-th fault of the algorithm. A request r is answered in the following way: if at least one of the pages of r is present in the cache, then nothing is done. Otherwise both pages of r are brought into the cache and marked, evicting unmarked pages. If there are no unmarked pages to evict (i.e. there have already been k faults in the current phase), then all pages are unmarked and the phase is over.
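The marking rule of AHS can be sketched in Python as follows. The class name `ApproxHittingSetPager`, the choice of which unmarked page to evict, and the reading that the request ending a phase is served as the first fault of the next phase are assumptions made for this illustration; the cache size 2k is taken from the statement that AHS needs twice the adversary's memory.

```python
class ApproxHittingSetPager:
    """Hypothetical sketch of AHS for requests of exactly two distinct pages."""

    def __init__(self, k):
        self.k = k
        self.capacity = 2 * k     # assumed cache size: twice the adversary's
        self.cache = set()
        self.marked = set()
        self.faults_in_phase = 0
        self.cost = 0             # number of pages brought into the cache

    def serve(self, request):
        a, b = request
        if a in self.cache or b in self.cache:
            return                # request served without cost
        if self.faults_in_phase == self.k:
            # No unmarked page is left: unmark everything, the phase is over.
            self.marked.clear()
            self.faults_in_phase = 0
        for page in (a, b):
            if len(self.cache) >= self.capacity:
                # Evict an arbitrary unmarked page (smallest one, for determinism).
                victim = next(p for p in sorted(self.cache)
                              if p not in self.marked)
                self.cache.remove(victim)
            self.cache.add(page)
            self.marked.add(page)
            self.cost += 1
        self.faults_in_phase += 1
```

With k = 1, the requests (1, 2), (3, 4), (3, 5) cost 4 in total: the first two each bring in two pages and the third is served by page 3, which is already cached.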
AHS follows in a certain way the same philosophy of HS, but keeping a 2-approximate hitting set of the requests of the phase instead of a minimum hitting set. Because of this fact, it needs a memory twice as big as that of the adversary to prevent infinite sequences of requests that could be served by the adversary with constant cost and that could otherwise produce an unbounded cost each time the optimal hitting set does not coincide with the one computed by AHS.
Theorem 4. Algorithm AHS with cache-size 2k is 2k-competitive against adversaries with cache-size k for USS(k,2).
Proof. Trivially the total cost of AHS during a phase is 2k. On the other hand, consider the k requests that produced a fault during a phase plus the first request of the following phase. By definition of AHS, these requests are k + 1 pairwise disjoint sets, and hence they need at least k + 1 different pages to be answered. So we conclude that the adversary must necessarily have at least one fault to serve all of them.
Algorithm AHS can be naturally extended to deal with requests of size at most w, and almost the same proof of Theorem 4 holds for the following Theorem. The idea is that a w-approximate solution to the Minimum Hitting Set problem with sets of cardinality at most w may be obtained by an algorithm that considers the sets one by one and, if the current set is uncovered, adds all its elements to the hitting set. This corresponds to extending the approximate solution of Minimum Vertex Cover of graphs based on a maximal matching to an approximate solution of Minimum Vertex Cover of w-hypergraphs based on the computation of a maximal w-dimensional matching.
Theorem 5. Algorithm AHS with cache-size wk is wk-competitive against adversaries with cache-size k for USS(k,w).
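The one-pass w-approximation mentioned above (take every element of each set that is still uncovered) is short enough to write out. The function name `approx_hitting_set` is hypothetical and the sketch is offline, i.e. it is not the paging algorithm itself.

```python
def approx_hitting_set(sets):
    """Greedy w-approximate hitting set for sets of cardinality at most w.

    The sets whose elements get added are pairwise disjoint, so any hitting
    set needs at least one element per such set; the result is therefore at
    most w times larger than a minimum hitting set.
    """
    hitting = set()
    for s in sets:
        s = set(s)
        if not (hitting & s):   # current set is still uncovered
            hitting |= s        # add all of its elements
    return hitting
```

For the sets {1,2}, {2,3}, {4,5}, {5,6} the sketch returns {1, 2, 4, 5}, twice the size of the optimum {2, 5}, which is within the factor w = 2.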
5 Open Problems and Future Research
The main open problem is to close the gap between the lower bound of Theorem 1 and the upper bound of Theorem 2. As we stated before, we believe that our algorithm HS achieves better performance than what has been proved. An interesting subject of future research is to extend USS(k,w) to non-uniform metric spaces. This would extend both the work in this paper and the work by Chrobak and Larmore [7] on Metrical Service Systems, where only one server is considered.
Acknowledgments. I am very grateful to Amos Fiat and Alberto Marchetti-Spaccamela for useful discussions during early stages of this work.
References
1. H. Alborzi, E. Torng, P. Uthaisombut, and S. Wagner. The k-client problem. In Proc. of Eighth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 1995.
2. G. Ausiello, E. Feuerstein, S. Leonardi, L. Stougie, and M. Talamo. Competitive algorithms for the traveling salesman. In Proc. of Workshop on Algorithms and Data Structures (WADS'95), Springer-Verlag, 1995.
3. G. Ausiello, E. Feuerstein, S. Leonardi, L. Stougie, and M. Talamo. Serving requests with on-line routing. In Proc. of 4th Scandinavian Workshop on Algorithm Theory (SWAT'94), pages 37-48, Springer-Verlag, July 1995.
4. S. Ben-David, A. Borodin, R. Karp, G. Tardos, and A. Wigderson. On the power of randomization in on-line algorithms. Algorithmica, 11:2-14, 1994.
5. A. Borodin, Sandy Irani, P. Raghavan, and B. Schieber. Competitive paging with locality of reference. In Proc. of 23rd ACM Symposium on Theory of Computing, pages 249-259, 1991.
6. A. Borodin, N. Linial, and M. Saks. An optimal online algorithm for metrical task system. Journal of the Association for Computing Machinery, 39(4):745-763, 1992.
7. M. Chrobak and L. Larmore. The server problem and on-line games. In On-line Algorithms, pages 11-64, AMS-ACM, 1992.
8. E. Feuerstein. Paging more than one page. In Proceedings of the Second Latin American Symposium on Theoretical Informatics (LATIN95), pages 272-287, Springer-Verlag, 1995. An improved version of this paper will appear in Theoretical Computer Science (1997).
9. E. Feuerstein and A. Strejilevich de Loma. On multi-threaded paging. In Proceedings of the 7th International Symposium on Algorithms and Computation (ISAAC'96), Springer-Verlag, 1996.
10. M. R. Garey and D. S. Johnson. Computers and Intractability - A Guide to the Theory of NP-completeness. W.H. Freeman and Company, San Francisco, 1979.
11. A. Karlin, M. Manasse, L. Rudolph, and D. Sleator. Competitive snoopy caching. Algorithmica, 3:79-119, 1988.
12. M.S. Manasse, L.A. McGeoch, and D.D. Sleator. Competitive algorithms for server problems. Journal of Algorithms, 11(2):208-230, 1990.
13. R. Motwani. Lecture Notes on Approximation Algorithms. Technical Report, Stanford University.
14. P. Raghavan and M. Snir. Memory versus randomization in on-line algorithms. RC 15622, IBM, 1990.
15. D.D. Sleator and R.E. Tarjan. Amortized efficiency of list update and paging rules. Communications of ACM, 28(2):202-208, 1985.
Faster Non-linear Parametric Search with Applications to Optimization and Dynamic Geometry
David Fernandez-Baca
Department of Computer Science, Iowa State University, Ames, IA 50011, USA
fernande@cs.iastate.edu
Abstract. A technique for accelerating certain applications of parametric search to non-linear problems is presented, together with its applications to optimization on weighted graphs and to two problems in dynamic geometry on points moving in straight-line trajectories: computing the minimum diameter over all time and finding the time at which the length of the maximum spanning tree is minimized.
1 Introduction
The main result of this paper is a technique for accelerating Megiddo's method of parametric search [13,14] for various applications involving non-linear functions. Our result is related to Cole's improvement on Megiddo's method [7,8]. However, Cole's method, as originally described, is restricted to problems that are, in a sense, linear, while Megiddo's method does not have this limitation. Our approach yields logarithmic-factor speedups over previous algorithms for certain applications of parametric search to dynamic geometry (i.e., geometry of moving sets of points) and to problems involving graphs with non-linear edge and/or vertex weight functions. Prior to stating our results more precisely, we provide some background. Parametric search problems are characterized by having an underlying problem P(t), which depends on a real-valued parameter t, and which can be solved for any fixed t by invoking an algorithm A. The goal of parametric search is to locate a value $t^*$ such that the solution to $P(t^*)$ satisfies some specified property. The search is guided by an oracle that can resolve any value $t_0$; i.e., determine whether or not $t_0 \le t^*$. In many applications of parametric search (in fact, in all those discussed here), the time needed to resolve a value will be of the same order of magnitude as the time to solve $P(t_0)$. To make this discussion concrete, let us consider a parametric search problem that shall be addressed in more detail in Sect. 4, and which was earlier considered by Gupta et al. [10]. There are n points that move along straight lines at constant, but possibly different, speeds. The diameter of the point set is the maximum distance between two points in the set. The dynamic diameter problem is to
Supported in part by the National Science Foundation under grant CCR-95 0946.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 33-41, 1998. © Springer-Verlag Berlin Heidelberg 1998
find the time $t^*$ at which the diameter is minimized. This problem has a fixed-parameter version, namely, to find the diameter for a static point set, which can be solved in O(n lg n) time [17]. Also, any value $t_0$ can be resolved in O(n lg n) time by relying on the fixed-parameter algorithm [10]. Megiddo's method of parametric search [13,14] gives a general framework for solving search problems. Megiddo showed that if P(t) can be solved in polynomial time, then $t^*$ can be found in polynomial time. He also showed that if, for fixed t, P(t) can be solved by an algorithm that does D parallel steps and uses W processors, then $t^*$ can be found in O(T D log W) time, where T is the time needed by the oracle to resolve a value. This result has proved to be a powerful approach to solving a variety of problems in optimization and geometry [1,2,6,20]. Part of the appeal of the technique is that it is phrased as an easy-to-use black box: provided a few reasonable problem characteristics are identified, Megiddo's result can be invoked. For example, the existence of an O(lg n)-step O(n)-processor parallel algorithm to compute the diameter for a static point set (see, e.g., [11]) immediately implies the existence of an $O(n \lg^3 n)$ dynamic diameter algorithm [10]. Cole [7] showed that, under certain conditions, the running time of parametric search can be reduced to O(T (D + log W)), which typically brings an improvement of at least a logarithmic factor over Megiddo's method. The key requirements for Cole's technique to apply are that the parallel algorithm for P have bounded degree (i.e., processors pass information only to a bounded number of other processors in each step) and that the dependency of the input numbers on t be linear. Unfortunately, while the first condition often applies, linearity does not hold for several cases of interest, including the dynamic diameter problem.
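To see why linearity fails for the dynamic diameter problem, note that the squared distance between two points moving linearly is a quadratic polynomial in t. The Python sketch below makes this explicit; the representation of a moving point as a (position, velocity) pair and both function names are assumptions for illustration, and the diameter is computed by a simple O(n²) scan rather than by the O(n lg n) algorithm cited in the text.

```python
def squared_distance_poly(p, q):
    """Return (a, b, c) with |p(t) - q(t)|^2 = a*t^2 + b*t + c.

    Each moving point is ((x0, y0), (vx, vy)): position at t = 0 and velocity.
    """
    (px, py), (pvx, pvy) = p
    (qx, qy), (qvx, qvy) = q
    dx, dy = px - qx, py - qy           # offset at time 0
    dvx, dvy = pvx - qvx, pvy - qvy     # relative velocity
    return (dvx * dvx + dvy * dvy,
            2 * (dx * dvx + dy * dvy),
            dx * dx + dy * dy)


def squared_diameter(points, t):
    """Squared diameter of the point set at time t (brute force over pairs)."""
    best = 0
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            a, b, c = squared_distance_poly(points[i], points[j])
            best = max(best, a * t * t + b * t + c)
    return best
```

Comparing two such squared distances at an unknown time amounts to locating that time relative to the roots of a quadratic, which is the kind of non-linear comparison the paper targets.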
Here we show that, under conditions similar to those required by Cole's method, parametric search problems can be solved in $O(T (D + \log W))$ time, even when the dependencies on $t$ are non-linear. This result will enable us to obtain faster algorithms for dynamic diameter, dynamic maximum Euclidean spanning tree, and graph problems with non-linear weights.

Organization of the Paper. Section 2 reviews Megiddo's method and Cole's improvement to introduce the terminology and basic techniques to be used here, and to establish the context for our work. Section 3 presents our improvement on Megiddo's method for non-linear parametric search. Section 4 presents applications of our result. Section 5 summarizes and further discusses our work.
2 Parametric Search
We shall first review Megiddo's method. Suppose that there is a parallel comparison-based algorithm $A$ that solves the parametric problem $\mathcal{P}(t)$ for any fixed value of $t$. $A$ uses $W$ processors and does at most $D$ parallel steps. To locate $t^*$, Megiddo's method simulates the execution of $A$ at $t^*$. In the simulation, the input numbers for $A$ are replaced
Faster Non-linear Parametric Search with Applications
by continuous functions of $t$ and, thus, the arguments to $A$'s operations are also functions of $t$. Therefore, determining $A$'s computation path depends on being able to compare functions of $t$ at $t^*$ without actually knowing $t^*$. Of interest to us in this paper will be the case where the functions manipulated are bounded-degree polynomials. In this case, the outcome of a comparison between functions $f$ and $g$ is resolved by computing the roots of $f - g$ and using an oracle to find the pair of roots between which $t^*$ lies. The roots associated with a comparison are called its critical values.

Each parallel step of $A$ carries out $O(W)$ comparisons, which generate a total of $O(W)$ critical values. These can all be resolved with $O(\log W)$ oracle calls and $O(W)$ overhead by repeatedly carrying out the following halving step. First, in $O(W)$ time [5], locate the median critical value $t_m$ and resolve it using the oracle. Next, use the result to resolve at least half of the critical values. For example, if $t^* \le t_m$, then $t^* \le t_i$ for every critical value $t_i \ge t_m$; the situation is analogous when $t^* > t_m$. This leads to the following result.

Theorem 1 (Megiddo). Let $A$ be a parallel algorithm that carries out $D$ parallel steps, each of which uses at most $W$ processors. Then $A$'s computation path at $t^*$ can be determined with $O(D \log W)$ oracle calls and $O(WD)$ overhead.

We should point out two issues. First, the above statement does not address the problem of actually locating $t^*$. What the procedure described gives us is an interval $I$ such that $A$ follows the same computation path for all $t \in I$. It turns out that, for most problems of interest, including all problems discussed here, it is possible to obtain $t^*$ from $I$ in the same order of time as one can solve $\mathcal{P}(t)$ for fixed $t$; see, e.g., [14] or [20]. Second, Megiddo actually proved the above theorem for the case where the dependency on $t$ is linear. However, the extension to a larger class of functions is straightforward.
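The halving step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of [13,14]: the function name `locate_interval` and the convention that `oracle(t)` returns `True` exactly when $t^* \le t$ are hypothetical, and the median is found by `statistics.median_low` (which sorts) rather than by the linear-time selection of [5].

```python
import statistics

def locate_interval(critical_values, oracle):
    """Narrow down t* by repeated halving.

    oracle(t) is assumed to return True when t* <= t and False when t* > t.
    Returns (lo, hi, calls): no critical value lies strictly between lo and
    hi, t* lies in (lo, hi], and `calls` counts the oracle invocations.
    """
    lo, hi = float("-inf"), float("inf")
    pending = list(critical_values)
    calls = 0
    while pending:
        # Median of the unresolved critical values (sorting-based stand-in
        # for linear-time selection).
        tm = statistics.median_low(pending)
        calls += 1
        if oracle(tm):
            # t* <= tm: every critical value >= tm is resolved.
            hi = tm
            pending = [t for t in pending if t < tm]
        else:
            # t* > tm: every critical value <= tm is resolved.
            lo = tm
            pending = [t for t in pending if t > tm]
    return lo, hi, calls
```

Each round discards at least half of the pending values, so a list of $W$ critical values needs at most $\lfloor \log_2 W \rfloor + 1$ oracle calls in this sketch.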
The dynamic diameter problem falls within the scope of Thm. 1. When simulated, standard comparison-based diameter algorithms require distance comparisons and the resolution of the signs of determinants (to compare angles), both of which are comparisons between bounded-degree polynomials in $t$.

Cole [7] devised a clever scheme that accelerates parametric search for a subset of the problems that can be solved by Megiddo's method. Algorithm $A$ is of bounded degree if it can be modeled by a bounded-degree undirected graph, called the communication graph, each of whose nodes represents a processor and where an edge between two nodes indicates that the processors may exchange data.

Theorem 2 (Cole). Let $A$ be a bounded-degree parallel algorithm. Then $A$'s computation path at $t^*$ can be determined with $O(D + \log W)$ oracle calls and with $O(WD)$ overhead, if the functions $A$ manipulates are linear.

It is instructive to review the main ideas behind Cole's proof. Algorithm $A$ is viewed as a combinational circuit consisting of interconnected gates. A gate is said to be active if all its inputs are available. Whereas in Megiddo's method, all comparators that are active at the beginning of a parallel step are resolved
before proceeding to the next step, Cole's method has the curious property that it lets some comparisons lag behind, i.e., it slows the network in a sense. Despite this, the technique achieves an overall reduction in the number of oracle calls. The key is to base the decision on which comparators to execute on a weighting scheme, whereby each comparator gets a weight that is an exponentially decreasing function of its depth. Since the functions manipulated are linear, each comparator has at most one critical value, and resolving this value also resolves the comparison. A weighted version of halving is applied to resolve a set of critical values associated with active comparators, such that the total weight of the resolved comparators is at least half the total weight of all active comparators. Cole showed that only $O(D + \log W)$ rounds of weighted halving are necessary to resolve all comparators. Unfortunately, Cole's technique does not directly apply when comparators have multiple critical values, since, in this case, resolving a weighted half of the critical values associated with active comparators does not guarantee that any significant fraction of the active comparators are resolved.
3 The Main Result
We now present an approach that yields a bound identical to Cole's, but which handles the same kinds of functions as Megiddo's method. We will need a definition. Let $J = (t_L, t_R)$ be an interval on the real line and let $d$ be an integer. Then, the mapping $f : J \to \mathbb{R}$ is a $d$th-degree piecewise polynomial function ($d$-ppf) if there exist numbers $t_L = t_0 < t_1 < \cdots < t_{k-1} < t_k = t_R$ such that, for $0 < i \le k$ and all $t \in (t_{i-1}, t_i)$, $f(t) = f_i(t)$ for some polynomial function $f_i$ of degree at most $d$. The $t_i$'s are called breakpoints. Observe that the minimum, maximum, and sum of two $d$-ppfs are themselves $d$-ppfs. Note also that it is straightforward to represent a $d$-ppf by providing the coefficients of the polynomials describing the function within each subinterval.

Suppose $A$ is a bounded-degree algorithm. We assume that in every parallel step, each processor carries out a simple operation on its inputs such as an addition or a comparison, after which the result is forwarded to a subset of its neighbors. We assume that the data exchanged through interprocessor links is numeric, and that each of $A$'s operations has the property that if the numerical inputs are replaced by $d$-ppfs having a total of $r$ breakpoints between them, the output will be a $d$-ppf with at most $Cr$ breakpoints, where $C$ is a constant that depends only on $d$ and on the maximum degree of a node in the processor graph. The minimum, maximum, and sum operations, as well as certain other kinds of operations, satisfy this property (see, e.g., [9]). Thus, our assumptions hold for networks of comparators such as those used for sorting, where each processor takes two arguments at each step and produces the maximum and minimum for transmission to its neighbors. In particular, if the inputs to such a network are quadratic functions, then, at every step of the computation, the sorter will manipulate piecewise quadratic functions.
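To make the $d$-ppf representation concrete, here is a minimal Python sketch for the case $d = 2$: it computes the maximum of two quadratics on a bounded interval as a list of breakpoints plus one coefficient triple per subinterval. The names `evaluate`, `real_roots`, and `upper_envelope`, and the coefficient-triple encoding `(a, b, c)` for $at^2 + bt + c$, are illustrative assumptions and are not taken from the paper.

```python
import math

def evaluate(poly, t):
    """Value of a*t^2 + b*t + c for poly = (a, b, c)."""
    a, b, c = poly
    return (a * t + b) * t + c

def real_roots(a, b, c):
    """Sorted distinct real roots of a*t^2 + b*t + c."""
    if a == 0:
        return [] if b == 0 else [-c / b]
    disc = b * b - 4 * a * c
    if disc < 0:
        return []
    if disc == 0:
        return [-b / (2 * a)]
    r = math.sqrt(disc)
    return sorted([(-b - r) / (2 * a), (-b + r) / (2 * a)])

def upper_envelope(f, g, lo, hi):
    """max(f, g) on the bounded interval (lo, hi) as a 2-ppf.

    Returns (breakpoints, pieces); pieces[i] is the quadratic in force on
    the i-th subinterval. Candidate breakpoints are the roots of f - g.
    """
    diff = tuple(x - y for x, y in zip(f, g))
    cuts = [t for t in real_roots(*diff) if lo < t < hi]
    bounds = [lo] + cuts + [hi]
    pieces = []
    for left, right in zip(bounds, bounds[1:]):
        mid = (left + right) / 2  # sample point inside the subinterval
        pieces.append(f if evaluate(f, mid) >= evaluate(g, mid) else g)
    return cuts, pieces
```

For instance, the maximum of $t^2$ and the constant $1$ on $(-2, 2)$ has breakpoints at $-1$ and $1$. A tangency (double root of $f - g$) is reported as a breakpoint even though the active piece does not change there; a fuller version would merge such pieces.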
We have the following result.

Theorem 3. Let $A$ be a bounded-degree parallel algorithm. Then $A$'s computation path at $t^*$ can be determined with $O(D + \log W)$ oracle calls and with $O(WD)$ overhead, when the inputs are polynomials of degree at most $d$, for some fixed $d$.

Proof. We successively simulate $A$'s parallel steps, maintaining along the way an interval $I = (t_L, t_R)$ containing $t^*$. Initially, $I = (-\infty, +\infty)$. The size of $I$ will remain constant or decrease after each step. Let $f_c^I$ denote the function describing an input $c$ to one of $A$'s operations within interval $I$. We assume that prior to the beginning of parallel step $i$, we have complete descriptions within $I$ of all $f_c^I$ such that $c$ is an input to step $i$. Let $b_c^I$ be the number of breakpoints of $f_c^I$ that lie in the interior of $I$. Our goal is to maintain the following invariant throughout the simulation. At the beginning of the $i$th parallel step,

$$\sum \{\, b_c^I : c \text{ is an input to a processor at step } i \,\} \le W. \qquad (*)$$
Note that (*) is true at the beginning of the simulation, since the inputs are assumed to be polynomial functions of degree at most $d$; i.e., they have zero breakpoints. To maintain (*), we use the following strategy at step $i$:
(1) Carry out all the operations in the current step; this will yield a set of inputs for step $i + 1$, each of which is a $d$-ppf.
(2) Form a list $L$ by collecting all the breakpoints of the functions $f_c^I$, where $c$ is an input to step $i + 1$.
(3) Apply halving repeatedly, until the size of $L$ is at most $W$. This results in an interval $I' \subseteq I$.
(4) For each input $c$ to step $i + 1$, construct a description of $f_c^{I'}$.
(5) Set $I = I'$.

Because of steps (3) and (5), the above strategy maintains (*). Let $e$ be any operation carried out in step (1); observe that there are $O(W)$ such $e$'s. Suppose the inputs to $e$ are $c_1, \ldots, c_k$. Then, by our earlier assumption, the output of $e$ has at most $C \sum_i b_{c_i}^I$ breakpoints, where $C$ depends only on $d$ and the maximum degree of a node in the communication graph. The total time to execute $e$ can thus be shown to be $O(C \sum_i b_{c_i}^I)$. Hence, if (*) holds before step (1) is executed, all the operations in the step can be executed in $O(CW) = O(W)$ time. Step (2) takes time proportional to the number of breakpoints in all the functions; i.e., the time is $O(W)$. Step (3) starts with $O(CW)$ critical values and must end with at most $W$ values; the number of halvings required is only $O(\log C) = O(1)$, each of which requires one oracle call. The total overhead for median computations is $O(W)$. Step (5) takes $O(1)$ time.

To summarize, the whole simulation requires $O(D)$ oracle calls, with an overhead of $O(WD)$. After the last parallel step, we are left with $O(W)$ critical values. To resolve them, we execute $O(\log W)$ oracle calls, with a total of $O(W)$ overhead. Thus, the total number of oracle calls is $O(D + \log W)$.
Remark. It is instructive to compare the above result to Cole's technique. In the latter, some comparisons are delayed to achieve a smaller total number of oracle calls, but only one computation path is simulated at any given time. In our approach, the comparisons are executed level by level, as soon as inputs are available; however, we maintain multiple computation paths by storing $d$-ppfs. To keep things manageable, we use halving to ensure that, at any parallel step, the average number of different computation paths followed by a comparator is $O(1)$.

The following result can be shown using similar techniques; for brevity, we omit the proof.

Theorem 4. Let $A$ be an algorithm consisting of $W$ independent parallel executions of a $D$-step sequential algorithm. Then, $A$'s computation path at $t^*$ can be found using $O(D + \log W)$ oracle calls and $O(DW)$ overhead.
4 Applications
4.1 Graph Problems with Polynomial Weights
We consider problems whose fixed-parameter version involves finding a minimum-weight (maximum-weight) subgraph of an edge- and/or vertex-weighted graph $G$. Two problems that can be expressed this way are computing minimum spanning trees and finding a minimum cut. Suppose that the weights are $d$th-degree polynomial functions of $t$ that are concave (convex) within some interval $I$. Thus, the function $Z$ describing the least-weight (maximum-weight) solution is a concave (convex) $d$-ppf. The problem is to locate the value $t^*$ that maximizes (minimizes) $Z$ within $I$. Toledo [19] showed how such problems can be solved using Megiddo's method. Thm. 3 allows us to reduce the running time of these algorithms by a log factor in some cases. For the non-linear parametric minimum spanning tree problem, this allows us to reduce the running time from $O(T_{MST}(m, n) \log^2 n)$ to $O(T_{MST}(m, n) \log n)$, where $T_{MST}(m, n)$ is the time to compute a minimum spanning tree in an $n$-vertex, $m$-edge graph. We note that the $O(T_{MST}(m, n) \log n)$ bound had already been achieved for the linear version of the problem [14].

4.2 Dynamic Computational Geometry
We consider two problems involving sets of points moving at constant speed along straight lines in the plane: finding the time at which the diameter is minimized and finding the time at which the length of the maximum spanning tree is minimized.

Diameter. The diameter of a set $P$ of $n$ points is the maximum distance between any two points in the set. It is well known that, in the static case, the diameter can be computed in $O(n \log n)$ time [17]. The algorithm uses the fact that the farthest pair of points must consist of two points lying on $CH(P)$,
the convex hull of $P$. Indeed, once $CH(P)$ is known, the diameter can be found in $O(n)$ time [17].

Suppose the points move at constant speeds along straight-line trajectories and let $Z(t)$ be the diameter of a point set $P$ at time $t$. The dynamic diameter problem is to find the time $t^*$ at which $Z(t)$ is minimized. A previous algorithm for the dynamic diameter problem runs in $O(n \log^3 n)$ time [10]; we will show how to improve this to $O(n \log^2 n)$. As in [10], we find it more convenient to work with squares of distances. $Z^2(t)$ is the upper envelope of the set $\{\mathrm{dist}^2(p(t), q(t)) : p, q \in P\}$, where each element is a convex quadratic function of $t$; thus, $Z^2(t)$ is a piecewise quadratic convex function of $t$. The oracle must take a value $t_0$ and determine the position of $t^*$ relative to $t_0$ by determining whether $Z^2$ is increasing, decreasing, or neither at $t_0$. This can be accomplished in $O(n \log n)$ time by invoking the static diameter algorithm [10].

To locate $t^*$, we simulate the execution of a parallel algorithm $A$ for evaluating $Z(t)$ for fixed $t$. The procedure is a straightforward parallelization of the earlier-mentioned static diameter algorithm. $A$ has three phases. The first applies Miller and Stout's bounded-degree parallel convex hull algorithm, which has $W = O(n)$ and $D = O(\log n)$ [15]. By Thm. 3, simulating this phase takes $O(n \log^2 n)$ time. After phase one is complete, we will have an interval $I$ containing $t^*$ such that the points on the convex hull, as well as their relative order on the hull's boundary, are the same for all $t$ in $I$.

In the second phase each hull vertex finds the farthest hull vertex from it; the pairs of vertices thus found are called antipodal. This is done by carrying out at most $n$ independent binary searches in parallel, where each search locates the farthest point for a single hull vertex. By Thm. 4 the second phase can also be simulated in time $O(n \log^2 n)$. In fact, since the convex hull is fixed, we can use a faster, $O(n)$, oracle, leading to a running time of $O(n \log n)$.
Finally, among the $O(n)$ antipodal pairs of points generated, we look for one that gives the maximum distance. This is done by the standard tournament algorithm, which has $O(\log n)$ parallel steps and uses $O(n)$ processors [11]; thus, by Thm. 3, the running time using the faster oracle is $O(n \log n)$. Hence, the total time for the dynamic diameter problem is $O(n \log^2 n)$.

Maximum Spanning Tree. A maximum spanning tree (MaxST) for a set $P$ of $n$ points is the tree of maximum total length that spans $P$. Suppose the points move at constant speeds along straight-line trajectories and let $W(t)$ be the length of the MaxST of a point set $P$ at time $t$. The dynamic MaxST problem is to find the time $t^*$ at which $W(t)$ is minimized. A previous algorithm for this problem runs in $O(n \log^4 n)$ time [12]; we show how to improve this to $O(n \log^3 n)$.

For any fixed $t$, the value of $W(t)$ can be computed in $O(n \log n)$ time using an algorithm by Monma et al. [16]. As observed by Katoh et al. [12], this static algorithm can be parallelized directly. For this we need a procedure that preprocesses a set of points $S$ so that, given any point $p$, the farthest neighbor of $p$ in $S$ can be found in $O(\log |S|)$ time. This is done by first constructing the farthest-point Voronoi diagram [17] of $S$. Now, given a query point $p$, the farthest point in $S$ can be found in $O(\log |S|)$ time through point location in the diagram. The farthest-point Voronoi diagram of $S$ can be constructed in $O(\log |S|)$ time with $O(|S|)$ processors using the EREW convex hull algorithm of Amato et al. [4]. Using this procedure, we get an $O(\log^2 n)$-time, $O(n)$-processor parallel EREW algorithm for the static MaxST problem.

To solve the dynamic problem, we first observe that the function $W(t)$ describing the cost of the maximum spanning tree is convex [12]. Thus, we can use the sequential static algorithm as an oracle to resolve any value $t_0$ in $O(n \log n)$ time. The algorithm $A$ to be simulated in the search for $t^*$ is the parallel MaxST algorithm. This algorithm has bounded degree; we can therefore invoke Thm. 3 to obtain an $O(n \log^3 n)$ algorithm for the dynamic MaxST problem.
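Returning to the dynamic diameter problem, the role of the oracle can be illustrated with a brute-force Python sketch. It is only a stand-in under stated assumptions: it examines all $O(n^2)$ pairs rather than using the $O(n \log n)$ static diameter algorithm of [10], the function name `diameter_oracle` and the input encoding (a list of `((x, y), (vx, vy))` position/velocity pairs with integer or rational entries, at least two points) are hypothetical, and exact arithmetic is assumed so that ties for the maximum are detected reliably.

```python
from itertools import combinations

def diameter_oracle(points, t0):
    """Locate t* (a minimizer of the squared diameter Z^2) relative to t0.

    points: list of ((x, y), (vx, vy)); a point is at (x + vx*t, y + vy*t).
    Returns +1 if t* > t0, -1 if t* < t0, and 0 if t0 itself minimizes Z^2.
    """
    best, slopes = None, []
    for (p, vp), (q, vq) in combinations(points, 2):
        # Relative position at time t0 and relative velocity of the pair.
        dvx, dvy = vp[0] - vq[0], vp[1] - vq[1]
        x = (p[0] - q[0]) + dvx * t0
        y = (p[1] - q[1]) + dvy * t0
        value = x * x + y * y            # squared distance at t0
        slope = 2 * (x * dvx + y * dvy)  # its derivative at t0
        if best is None or value > best:
            best, slopes = value, [slope]
        elif value == best:
            slopes.append(slope)
    # Z^2 is a maximum of convex quadratics: its right and left derivatives
    # at t0 are the largest and smallest slopes among the pairs attaining it.
    if max(slopes) < 0:
        return +1   # Z^2 is decreasing at t0, so the minimum lies to the right
    if min(slopes) > 0:
        return -1   # Z^2 is increasing at t0, so the minimum lies to the left
    return 0
```

With one point moving from the origin at unit speed toward a stationary point at $(10, 0)$, the sketch reports that the minimum lies to the right of $t_0 = 5$, to the left of $t_0 = 12$, and exactly at $t_0 = 10$.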
5 Discussion
We have presented a technique that leads to a log-factor speedup for certain applications of parametric search to non-linear problems. Our method differs from Megiddo's or Cole's in that it allows the simulation of the fixed-parameter algorithm to follow several paths simultaneously. It should be pointed out that the $O(n \log^2 n)$ dynamic diameter algorithm presented here simulates the optimal but highly impractical AKS sorting network [3]. On the positive side, our technique can also be used to improve slower but more practical algorithms, such as the dynamic diameter algorithm of Schwerdt et al. [18].

We conjecture that none of the algorithms presented here is optimal and, in particular, that there exist optimal $O(n \log n)$ algorithms for both the dynamic diameter and maximum spanning tree problems. However, we also believe that proving this will require ideas beyond those presented here.

Acknowledgement. The author thanks Naoki Katoh for clarifications regarding the maximum spanning tree problem.
References
1. Pankaj Agarwal and Jiri Matousek. Ray shooting and parametric search. SIAM J. Computing, 22:794–806, 1993.
2. Pankaj K. Agarwal, Micha Sharir, and Sivan Toledo. Applications of parametric search to geometric optimization. Journal of Algorithms, 17:292–318, 1994.
3. M. Ajtai, J. Komlos, and E. Szemeredi. Sorting in c log n parallel steps. Combinatorica, pages 1–19, 1983.
4. Nancy M. Amato, Michael T. Goodrich, and Edgar A. Ramos. Parallel algorithms for higher-dimensional convex hulls. In Proceedings 35th IEEE Symp. on Foundations of Computer Science, pages 683–694, 1994.
5. Manuel Blum, Robert W. Floyd, Vaughan Pratt, Ronald L. Rivest, and Robert E. Tarjan. Time bounds for selection. Journal of Computer and System Sciences, 7(4):448–461, 1973.
6. Bernard Chazelle, Herbert Edelsbrunner, Leonidas Guibas, and Micha Sharir. Diameter, width, closest line pair, and parametric searching. Discrete Comput. Geom., 10:183–196, 1993.
7. Richard Cole. Slowing down sorting networks to obtain faster sorting algorithms. J. Assoc. Comput. Mach., 34(1):200–208, 1987.
8. Richard Cole. Parallel merge sort. SIAM J. Computing, 17(4):770–785, 1988.
9. David Fernandez-Baca and Giora Slutzki. Optimal parametric search on graphs of bounded tree-width. Journal of Algorithms, 22:212–240, 1997.
10. P. Gupta, Ravi Janardan, and Michiel Smid. Fast algorithms for collision and proximity problems involving moving geometric objects. Computational Geometry: Theory and Applications, 6:371–391, 1996.
11. Joseph Ja'Ja'. An Introduction to Parallel Algorithms. Addison-Wesley, Reading, MA, 1992.
12. Naoki Katoh, Takeshi Tokuyama, and Kazuo Iwano. On minimum and maximum spanning trees of linearly moving sets of points. Discrete Comput. Geom., 13:161–176, 1995.
13. Nimrod Megiddo. Combinatorial optimization with rational objective functions. Math. Oper. Res., 4:414–424, 1979.
14. Nimrod Megiddo. Applying parallel computation algorithms in the design of serial algorithms. J. Assoc. Comput. Mach., 30(4):852–865, 1983.
15. Russ Miller and Quentin F. Stout. Efficient parallel convex hull algorithms. IEEE Trans. Computers, C-37:1605–1618, 1988.
16. Clyde Monma, Michael Paterson, Subhash Suri, and Frances Yao. Computing Euclidean maximum spanning trees. Algorithmica, 5:407–419, 1990.
17. Franco P. Preparata and Michael Ian Shamos. Computational Geometry: An Introduction. Springer-Verlag, 1985.
18. Jörg Schwerdt, Michiel Smid, and Stefan Schirra. Computing the minimum diameter for moving points: an exact implementation using parametric search. In 13th ACM Symposium on Computational Geometry, Nice, France, 1997.
19. Sivan Toledo. Maximizing non-linear concave functions in fixed dimension, pages 429–447. World Scientific, Singapore, 1993.
20. Sivan Toledo and Micha Sharir. Extremal polygon containment problems. Computational Geometry: Theory and Applications, 4:99–118, 1994.
Super-State Automata and Rational Trees
Frederique Bassino^1, Marie-Pierre Beal^2, and Dominique Perrin^1
1 Institut Gaspard Monge, Universite de Marne-la-Vallee, 2, rue de la butte verte, 93166 Noisy le Grand Cedex, France
2 Institut Gaspard Monge, Universite Paris 7
http://www-igm.univ-mlv/ bassino,beal,perrin
Abstract. We introduce the notion of super-state automata constructed from other automata. This construction is used to solve an open question about enumerative sequences in rational trees. We prove that any $\mathbb{N}$-rational sequence $s = (s_n)_{n \ge 0}$ of nonnegative integers satisfying the Kraft inequality $\sum_{n \ge 0} s_n k^{-n} \le 1$ is the enumerative sequence of leaves by height of a $k$-ary rational tree. This result had been conjectured and was known only in the case of strict inequality. We also give a new proof of a result about enumerative sequences of nodes in $k$-ary rational trees.
1 Introduction
In this paper, we introduce the notion of super-state automata, which allows us to solve an open problem about enumerative sequences of leaves in rational trees stated in [10]. It is a refinement of the determinization algorithm that takes multiplicities into account. This notion can informally be stated as follows. Let $A$ be a finite automaton or a multigraph (we disregard the labeling). A super-state automaton, constructed from the automaton $A$, has states composed of unordered lists of states of $A$ such that the list of followers of all states of a super-state can be partitioned into super-states. Assume that $A$ has an initial state $i$ and consider the tree that is a development of $A$ from the initial state: each node of this tree is associated with one state of $A$, and the sons of a node are associated with the followers of the state associated with their father, the root being associated with the initial state of $A$. This tree is rational, as it has only a finite number of non-isomorphic subtrees. Moreover, as its subtrees are identified with super-states, it can have a more compact representation. This transformation of the tree is accompanied by a loss of information. Nevertheless, it keeps some interesting properties of the ordinary tree, like the number of leaves or the number of nodes at each height.

We use this notion of super-states to solve an open question about sequences of integers that can be realized as the enumerative sequences of leaves in a $k$-ary rational tree. We also give an alternative proof of a result proved in [3] about sequences of integers that can be realized as the enumerative sequences of nodes in a $k$-ary rational tree. These problems are linked with coding and symbolic dynamics. They can be considered as extensions of results of Huffman, Kraft, McMillan and Shannon on source coding.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 42–52, 1998. © Springer-Verlag Berlin Heidelberg 1998
If $s = (s_n)_{n \ge 1}$ is the enumerative sequence of leaves of a rational tree, $s$ is $\mathbb{N}$-rational, that is, $s_n$ is the number of paths of length $n$ going from an initial state to a final state in a finite multigraph or a finite automaton. If the tree is $k$-ary, $s$ satisfies the Kraft inequality for the integer $k$: $\sum_{n \ge 1} s_n k^{-n} \le 1$. In the first part of this paper, we study the converse of the above property. Consider for example the series $s(z) = 3z^2/(1 - z^2)$. We have $s(1/2) = 1$ and we can obtain $s$ as the enumerative sequence of the tree of Fig. 1 associated with the prefix code $X = (aa)^*(ab + ba + bb)$ on the binary alphabet $\{a, b\}$.
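As a quick check of the example, the Kraft sum for $k = 2$ can be evaluated directly. The expansion below assumes only the series $s(z) = 3z^2/(1 - z^2)$ given above; the code $X$ contributes exactly three words of each even length $2n$, namely $(aa)^{n-1}ab$, $(aa)^{n-1}ba$ and $(aa)^{n-1}bb$.

```latex
s(z) = \frac{3z^2}{1 - z^2} = \sum_{n \ge 1} 3\,z^{2n},
\qquad
s\!\left(\tfrac{1}{2}\right) = \sum_{n \ge 1} 3 \cdot 4^{-n}
  = 3 \cdot \frac{1/4}{1 - 1/4} = 1 .
```

So the sequence $s_{2n} = 3$, $s_{2n+1} = 0$ meets the Kraft inequality with equality, which is precisely the case not covered by the earlier result for strict inequality.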
Fig. 1. Tree associated with $3z^2(z^2)^*$

Known constructions allow one to obtain a sequence $s$ satisfying the Kraft inequality as the enumerative sequence of leaves of a $k$-ary tree, or as the enumerative sequence of leaves of a (perhaps not $k$-ary) rational tree. These two constructions lead in a natural way to the problem of building a tree that is both rational and $k$-ary. This question was already considered in [10], where it was conjectured that any $\mathbb{N}$-rational sequence satisfying the Kraft inequality is the enumerative sequence of leaves of a $k$-ary rational tree. The case of strict inequality was solved in [3]. In this paper, we completely settle the conjecture, and the proof we give works in both cases.

A variant of the problem consists in replacing the enumerative sequence of leaves by the enumerative sequence of all nodes. Soittola [11] has characterized the series which are the enumerative sequences of nodes in a rational tree. The problem of a similar characterization for rational $k$-ary trees remains open in the general case. In [2], this question was solved for $\mathbb{N}$-rational series $t$ that satisfy certain necessary conditions (namely, two obvious ones: $t_0 = 1$ and $t_n \le k\,t_{n-1}$ for all $n \ge 1$; and a less obvious one, proved to be necessary in [2]: the convergence radius of $t$ is strictly greater than $1/k$) and one more condition: $t$ has a primitive linear representation. In this case there is a $k$-ary rational tree whose enumerative sequence of nodes by height is $t$. In the second part of this paper, we give a new proof of this result.

Proofs and algorithms used to establish the results are based on automata theory and on the theory of nonnegative matrices. Unlike in [2], we do not use any symbolic dynamics construction like state-splitting. But we use basic results of the Perron-Frobenius theory, and a very simple lemma, that we call the "weight lemma", due to B. Marcus [8] (see also [7]), and already used by R. Adler,
D. Coppersmith and M. Hassner in [1] to construct some finite-state codes with sliding block decoders for constrained channels. With this new method, the trees obtained in many examples have smaller representations.
2 Preliminaries
In this section, we introduce the notion of super-state automaton. We also recall definitions about rational sequences and results from the Perron-Frobenius theory of nonnegative matrices.

2.1 Super-State Automata
Let $A$ be a finite state automaton $(Q, E)$, where $Q$ is the set of states and $E$ the set of edges. In the following, the labeling alphabet will always be reduced to one letter, say $z$, but some definitions can be extended to more general automata. So the labeling will not be represented in pictures. Automata can hence be seen as multigraphs, since several (equally labeled) edges going from a state $p$ to a state $q$ may exist. Some initial or final states may also sometimes be specified.

In order to establish the first result, we shall use a super-state automaton, constructed from an automaton $A$ whose states have a positive integral valuation. We denote by $v(q)$ the valuation of a state $q$. We also choose and fix a positive integer $m$. A super-state automaton, according to the valuation $v$ and the integer $m$, is an automaton $B = (Q', E')$ whose states, called super-states, are unordered (or commutative) $r$-tuples $(q_1, q_2, \ldots, q_r)$ of states of $A$, with $1 \le r \le m$. We extend the definition of the valuation to the super-states, and even to any $r$-tuple of states, as being the sum of the valuations of their components:

$$v((q_1, q_2, \ldots, q_r)) = \sum_{j=1}^{r} v(q_j).$$

Let $(q_1, q_2, \ldots, q_r)$ be a super-state. If $q$ is a state of $A$, we denote by $u_q$ the unordered tuple obtained by concatenation of the ending states of edges of $A$ going out of state $q$. If $(q_1, q_2, \ldots, q_r)$ is a super-state, we denote by $u_{(q_1, q_2, \ldots, q_r)}$ the unordered concatenation of all $u_{q_1}, u_{q_2}, \ldots, u_{q_r}$. Now we partition $u_{(q_1, q_2, \ldots, q_r)}$ into several unordered $r$-tuples ($1 \le r \le m$), in such a way that all parts, but possibly one, have a valuation divisible by $m$. Such a partition can be obtained by applying the following lemma, which is a key point in the state-splitting process used to construct coding schemes for constrained channels (see [7] and [4]):

Lemma 1 (weight lemma). Let $v_1, v_2, \ldots, v_m$ be positive integers. Then there is a nonempty subset $S \subseteq \{1, 2, \ldots, m\}$ such that $\sum_{q \in S} v_q$ is divisible by $m$.

Therefore the partition into super-states is obtained as follows: if $u_{(q_1, q_2, \ldots, q_r)}$ has at most $m$ (unordered) components, there is nothing to do. If not,
consider the first $m$ ones $(r_1, r_2, \ldots, r_m)$. By the weight lemma, there is a subset $S$ of $\{1, 2, \ldots, m\}$ such that $\sum_{i \in S} v(r_i)$ is divisible by $m$. The tuple composed of the $r_i$ with $i \in S$ is a super-state that is the first part of the partition. The process is iterated with the remaining components of $u_{(q_1, q_2, \ldots, q_r)}$. We either get a decomposition into super-states whose valuations are all equal to zero modulo $m$, or a decomposition into super-states all but one of whose valuations have this property, the last one being equal to a nonzero value modulo $m$. After the choice of such a partition, we define the output edges in $B$ of state $(q_1, q_2, \ldots, q_r)$ as the edges of a multigraph ending in the super-states of the partition. If a super-state $u$ appears $t$ times in the decomposition, we have $t$ edges from $(q_1, q_2, \ldots, q_r)$ to $u$ in the multigraph. Note here that the automaton $B$ is a finite state automaton, since there is only a finite number of super-states. The $r$-tuples are always unordered. This means that all components commute. A state of $A$ can also appear several times in the same super-state as different components.

Example 1. We associate with each state of the automaton $A$ (Fig. 2) a valuation, shown in the square beside the state. We also fix the integer $m$; it is equal to the valuation of state 1, that is 3. We then obtain the super-state automaton $B$ (Fig. 2).
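The weight lemma and the partition procedure built on it can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' construction: the lemma is realized by the standard pigeonhole argument on prefix sums (which returns a run of consecutive indices, one valid choice of $S$ among possibly many), and the names `divisible_subset` and `partition_into_superstates`, as well as the dictionary encoding of the valuation, are hypothetical.

```python
def divisible_subset(values, m):
    """Indices (0-based) of a nonempty run among the first m values whose
    sum is divisible by m.

    Among the m + 1 prefix sums (including the empty one), two share a
    residue modulo m; the values strictly between them form the run.
    """
    assert len(values) >= m >= 1
    seen = {0: 0}          # residue of a prefix sum -> length of that prefix
    total = 0
    for j in range(1, m + 1):
        total += values[j - 1]
        r = total % m
        if r in seen:
            return list(range(seen[r], j))
        seen[r] = j
    raise AssertionError("unreachable by the pigeonhole principle")

def partition_into_superstates(states, valuation, m):
    """Split an unordered list of states into tuples of at most m states.

    Every part except possibly the last has valuation divisible by m.
    """
    rest = list(states)
    parts = []
    while len(rest) > m:
        # Apply the weight lemma to the first m remaining components.
        idx = divisible_subset([valuation[q] for q in rest[:m]], m)
        chosen = set(idx)
        parts.append(tuple(sorted(rest[i] for i in idx)))
        rest = [q for i, q in enumerate(rest) if i not in chosen]
    if rest:
        parts.append(tuple(sorted(rest)))
    return parts
```

Sorting each part is just one way to obtain a canonical form for unordered tuples, so that equal super-states compare equal.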
Fig. 2. Automaton A and the super-state automaton B
2.2 Rational Sequences of Nonnegative Numbers
A sequence $s = (s_n)_{n \ge 0}$ of nonnegative integers is said to be $\mathbb{N}$-rational if $s_n$ is the number of paths of length $n$ going from a state in $I$ to a state in $F$ in a finite directed graph $G$, where $I$ and $F$ are two special subsets of states, the initial and final states respectively. We say that the triple $(G, I, F)$ is a representation of the sequence $s$. This definition is usually given for the series $\sum_{n \ge 0} s_n z^n$ instead of the sequence $(s_n)_{n \ge 0}$. Any $\mathbb{N}$-rational sequence $s$ satisfies a recurrence relation with integral coefficients. It is however not true that every sequence of nonnegative integers satisfying a linear recurrence relation is $\mathbb{N}$-rational (see [6], page 93). A well-known result in the theory of finite automata allows us to use a particular representation of an $\mathbb{N}$-rational sequence $s$. One can choose a representation $(G, i, F)$ of $s$ with a unique initial state $i$, such that no edge enters state $i$ and no edge goes out from any state of $F$. Such a representation is called a normalized representation.

We recall that a tree $T$ on a set of nodes $N$ with a root $r$ is a function $T : N - \{r\} \to N$ which associates to each node $n$ distinct from the root its father $T(n)$, in such a way that, for each node $n$, there is a nonnegative integer $h$ such that $T^h(n) = r$. The integer $h$ is the height of the node $n$. A tree is $k$-ary if each node has at most $k$ sons; a node without sons is called a leaf. A tree is said to be rational if it admits only a finite number of non-isomorphic subtrees. If $T$ is a tree, we denote by $l(T)$ its enumerative sequence of leaves by height, that is, the sequence of numbers $s_n$, where $s_n$ is the number of leaves at height $n$. The sequence $s = l(T)$ of a $k$-ary tree is also the length distribution of a prefix code over a $k$-letter alphabet. Thus the corresponding series $s(z) = \sum_{n \ge 0} s_n z^n$ satisfies the Kraft inequality: $s(1/k) \le 1$. We shall say that the strict Kraft inequality is satisfied when $s(1/k) < 1$. Note that equality is reached when each node of the tree has exactly zero or $k$ sons.
Conversely, the McMillan construction establishes that for any series $s$ satisfying the Kraft inequality, there is a $k$-ary tree $T$ such that $s = l(T)$. Moreover, if the series satisfies the Kraft equality, then the internal nodes have exactly $k$ sons. But the tree obtained is not rational in general.
If $T$ is a rational tree, the sequence $l(T)$ is ℕ-rational. Conversely, it is easy to see that an ℕ-rational sequence $s$ is the enumerative sequence of leaves of a rational tree. Such a tree can be obtained by developing a chosen normalized representation of $s$: its root corresponds to the initial state of the graph, and if a node of the tree at height $n$ corresponds to a state $i$ of the graph which has $r$ outgoing edges ending in the states $j_1, j_2, \ldots, j_r$, then it admits $r$ sons at height $n + 1$, corresponding respectively to the states $j_1, j_2, \ldots, j_r$. The leaves of this tree correspond to the final states of the normalized representation. The maximal number of sons of a node is then equal to the maximal number of edges going out from any state of the graph of this representation.
Even if the sequence $s$ satisfies the Kraft inequality, the above construction does not lead in general to a $k$-ary rational tree. The aim of the first result of this paper is to get a $k$-ary rational tree $T$ such that $s = l(T)$.
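A minimal sketch of how the leaves of a $k$-ary tree (equivalently, a prefix code) can be built from a finite length distribution satisfying the Kraft inequality. This is the textbook greedy, level-by-level construction and is only assumed to match the spirit of the construction cited above; the function name and the encoding of nodes as base-$k$ integers are choices made here.

```python
def canonical_prefix_code(s, k):
    """s[n] = number of leaves wanted at height n (s[0] must be 0).
    Returns a list of words (tuples of digits in range(k)) forming a
    prefix code, or raises ValueError if the Kraft inequality fails."""
    words = []
    next_value = 0  # first unused node at the current height, as an integer
    for n in range(1, len(s)):
        next_value *= k  # descend one level: every unused node has k sons
        if next_value + s[n] > k ** n:
            raise ValueError("Kraft inequality violated")
        for value in range(next_value, next_value + s[n]):
            digits, rest = [], value
            for _ in range(n):  # write `value` in base k on n digits
                rest, d = divmod(rest, k)
                digits.append(d)
            words.append(tuple(reversed(digits)))
        next_value += s[n]
    return words

code = canonical_prefix_code([0, 1, 1, 2], 2)
```

For the distribution (0, 1, 1, 2) and k = 2 the Kraft sum is exactly 1, and the resulting tree has only nodes with zero or two sons.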
Let $s$ be an ℕ-rational sequence and let $(G, i, F)$ be a normalized representation of $s$. If we identify the initial state $i$ and all final states of $F$ in a single state still denoted $i$, we get a new graph denoted $\bar{G}$, which is strongly connected. The sequence $s$ is then the length distribution of the paths of first return to state $i$, that is, of finite paths going from $i$ to $i$ without going through state $i$ in between. Using the terminology of symbolic dynamics, the graph $\bar{G}$ can be seen as an irreducible shift of finite type (see, for example, [4], [5] or [7]).
We denote by $M$ the adjacency matrix associated to the graph $\bar{G}$, that is, the matrix $M = (m_{ij})_{1 \le i, j \le n}$, where $n$ is the number of nodes of $\bar{G}$ and where $m_{ij}$ is the number of edges going from state $i$ to state $j$. By the Perron-Frobenius theorem (see [7]), the nonnegative matrix $M$ associated to the strongly connected graph $\bar{G}$ has a positive eigenvalue of maximal modulus denoted by $\lambda$, also called the spectral radius of the matrix. Indeed, $\lambda$ only depends on the series $s$, since $1/\lambda$ is the minimal modulus of the poles of $1/(1 - s)$. It is known that the series $s$ satisfies the strict Kraft inequality $s(1/k) < 1$ (resp. the equality $s(1/k) = 1$) if and only if $\lambda < k$ (resp. $\lambda = k$). The dimension of the eigenspace of $\lambda$ is equal to one and there is a positive eigenvector (componentwise) associated to $\lambda$. When $\lambda$ is an integer, the matrix admits a positive integral eigenvector. When $\lambda < k$, where $k$ is an integer, the matrix admits a $k$-approximate eigenvector, that is, by definition, a positive integral vector $v$ with $M v \le k v$.
A $k$-approximate eigenvector for the irreducible graph $\bar{G}$ associated to a normalized representation $(G, i, F)$ of a sequence can be computed with Franaszek's algorithm (see for example [3]). We associate to each node of $G$ the value of the corresponding component of the approximate eigenvector of the graph $\bar{G}$. The initial and the final states have the same value since they correspond to the same state of $\bar{G}$.
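To make the notion of a $k$-approximate eigenvector concrete, here is a small sketch that searches for a positive integral vector $v$ with $M v \le k v$ by a monotone fixed-point iteration. It is not Franaszek's algorithm as given in the cited references, only a simple stand-in under the assumption that the spectral radius is strictly less than $k$; in the equality case it is not guaranteed to stop, hence the iteration cap. All names are hypothetical.

```python
def approximate_eigenvector(M, k, max_iter=10_000):
    """Search for a positive integral v with M v <= k v (componentwise).
    Start from v = (1, ..., 1) and raise components until the
    inequality holds everywhere."""
    n = len(M)
    v = [1] * n
    for _ in range(max_iter):
        Mv = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        # -(-a // k) is the ceiling of a / k for integers
        new_v = [max(v[i], -(-Mv[i] // k)) for i in range(n)]
        if new_v == v:
            return v
        v = new_v
    raise RuntimeError("no fixed point found within max_iter iterations")

def is_approximate_eigenvector(M, v, k):
    """Check that v is positive and that M v <= k v componentwise."""
    n = len(M)
    return all(x > 0 for x in v) and all(
        sum(M[i][j] * v[j] for j in range(n)) <= k * v[i] for i in range(n))

# Hypothetical strongly connected multigraph with spectral radius sqrt(3) < 2
M = [[0, 3],
     [1, 0]]
v = approximate_eigenvector(M, 2)
```

At a fixed point the ceiling of $(Mv)_i / k$ is at most $v_i$ for every $i$, which is exactly the condition $M v \le k v$.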
3
Enumerative Sequence of Leaves
In what follows we state and prove, by the use of super-state automata, the first result about the enumerative sequences of leaves of rational trees:
Theorem 1. Let $s = (s_n)_{n \ge 1}$ be an ℕ-rational sequence of nonnegative integers and let $k$ be an integer such that $\sum_{n \ge 1} s_n k^{-n} \le 1$. Then there is a $k$-ary rational tree having $s$ as enumerative sequence of leaves by height.
Proof. Let us consider an ℕ-rational sequence $s$ and an integer $k$ such that $\sum_{n \ge 1} s_n k^{-n} \le 1$, and let $A = (G, i, F)$ be a normalized representation of $s$. We denote by $M$ the adjacency matrix of the graph $\bar{G}$ obtained from $G$ by identifying $i$ with the final states, and by $\lambda$ its spectral radius. Hence $\lambda \le k$. We compute a $k$-approximate eigenvector $v = (v_1, v_2, \ldots, v_n)^t$ of the graph $\bar{G}$. By definition, we have $M v \le k v$. The vector $v$ is used as a valuation, also denoted by $v$, of the states of $A$. Next we define a super-state automaton $B$ associated to the automaton $A$, the valuation $v$, and the integer $m = v_i$, where $i$ is the initial state of $A$. We shall now consider the part of $B$ accessible from the initial super-state which has, as unique component, the initial state of $A$.
Frederique Bassino, Marie-Pierre Beal, and Dominique Perrin
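Let $u$ be a super-state of $B$. If $u$ is composed of $n_j$ copies of the state $j$ of $A$, for $1 \le j \le n$, its valuation is $v(u) = \sum_{1 \le j \le n} n_j v_j$. We associate to each super-state $u$ another integer, denoted by $w(u)$, and defined by:
$$w(u) = \left\lceil \frac{v(u)}{m} \right\rceil.$$
Note that $w(i) = 1$. Let us now suppose that $u$ has $t$ outgoing edges ending in the super-states $u_1, u_2, \ldots, u_t$. The sum of the valuations of $u_1, u_2, \ldots, u_t$ is equal to $\sum_{1 \le j \le n} n_j (M v)_j$. As $M v \le k v$, we have, for all $j$, $(M v)_j \le k v_j$. Thus:
$$\sum_{j=1}^{t} v(u_j) \le k\, v(u),$$
or equivalently,
$$\sum_{j=1}^{t-1} \frac{v(u_j)}{m} + \frac{v(u_t)}{m} \le k\, \frac{v(u)}{m}.$$
By construction of the super-state automaton, $v(u_j)/m$ is an integer for $1 \le j \le t - 1$. Since $k \lceil v(u)/m \rceil$ is an integer that bounds the left-hand side above, we get:
$$\sum_{j=1}^{t-1} \frac{v(u_j)}{m} + \left\lceil \frac{v(u_t)}{m} \right\rceil \le k \left\lceil \frac{v(u)}{m} \right\rceil.$$
Therefore we obtain:
$$\sum_{j=1}^{t} w(u_j) \le k\, w(u).$$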
Finally we develop the multigraph $B$. In order to get a $k$-ary rational tree admitting $s$ as enumerative sequence of leaves, we associate to each super-state $u$, at any height, $r = w(u)$ nodes. Since $r$ nodes at height $l$ have at most $kr$ sons at height $l + 1$, corresponding to the nodes associated to the super-states that follow $u$, it is possible to associate to each of them at most $k$ sons at the next height. The initial super-state itself corresponds to one node, the root of the tree. The tree is then $k$-ary. The final states of $A$ are always themselves super-states. This is possible since their valuation is equal to $v(i) = m$. The leaves of the tree are then the nodes corresponding to a final state of $A$. As there is only a finite number of super-states, the tree is also rational.
Remark 1. The case of equality in the Kraft inequality is just a particular case of the above construction. The valuations of the super-states are then all divisible by $m$ and the vector $v$ is an eigenvector: $M v = k v$. If a super-state $u$ has $t$ outgoing edges ending in the super-states $u_1, u_2, \ldots, u_t$, then, by construction of the super-state automaton, the valuations $v(u_1), v(u_2), \ldots, v(u_{t-1})$ of the super-states are equal to zero modulo $m$. As $M v = k v$, $v(u_t)$ is also divisible by $m$. Therefore the valuations of all super-states in the tree are divisible by $m$.
Remark 2. In order to get a $k$-ary rational tree, instead of developing the super-state automaton $B$, we can apply the state-splitting algorithm to the automaton $B$ according to the approximate eigenvector $w$. Indeed, if we call $M'$ the adjacency matrix of $B$, we have
$$M' w \le k w.$$
We then obtain an automaton with at most $k$ outgoing edges for each state. Moreover, the initial state will not be split during the state-splitting process since, by construction, $w_i = w(i) = 1$ (for further details see [3]).
Example 2. Let $s$ be the series defined by: s(z) =
z z + (1 − z ) (1 − 5z 3 )
A normalized representation of $s$ is given by the automaton $A$ (Fig. 2). In this figure, the valuation $v(q)$ of a state $q$ is given in the square beside the representation of the state. Note that the final state 4 has the same valuation ($v(4) = 3$) as the initial state 1. A $k$-ary rational tree $T$, having $s$ as enumerative sequence of leaves, is given in Fig. 3. The components of the super-states are given inside the states, and the number of small black balls above a super-state $u$ is equal to the number $w(u) = \lceil v(u)/3 \rceil$ of nodes represented by $u$. The final state 4 corresponds to the leaves of the tree.
4
Enumerative Sequence of Nodes
A variant of the previous problem consists in replacing the enumerative sequence of leaves by the enumerative sequence of all nodes. Let $T$ be a tree. We define the enumerative sequence $t$ of nodes by height of the tree $T$ by $t = (t_n)_{n \ge 0}$, where $t_n$ is the number of nodes of $T$ at height $n$.
In this section, we give a new proof of a result about enumerative sequences of nodes by height of rational trees. This result has been obtained in [2] by making use of dynamic operations such as an extended notion of state-splitting. The alternative proof we give here is based on the construction of a super-state automaton and is simpler than the previous one. The trees obtained with this new method have very compact representations.
Let $t$ be an ℕ-rational series. A linear representation of $t$ is a triple $(l, M, c)$, where $l$ is a nonnegative integral row vector, $c$ is a nonnegative integral column vector, and $M$ is a nonnegative integral matrix, with:
$$\forall n \ge 0, \quad t_n = l M^n c.$$
The linear representation is said to be primitive if $M$ is a primitive matrix, that is, there is an integer $m$ such that $M^m > 0$ or, equivalently, the graph associated with $M$ is strongly connected and the g.c.d. of the lengths of its cycles is 1. In the following result, all but the last conditions are necessary.
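The two notions just defined, the sequence $t_n = l M^n c$ given by a linear representation $(l, M, c)$ and the primitivity of $M$, can be sketched as follows. The helper names are hypothetical; the primitivity test relies on Wielandt's bound, which says that for a primitive $n \times n$ matrix the power $(n-1)^2 + 1$ is already positive.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][x] * B[x][j] for x in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def sequence_from_representation(l, M, c, n_max):
    """Return [t_0, ..., t_n_max] with t_n = l M^n c
    (l is a row vector and c a column vector, both given as lists)."""
    values = []
    row = list(l)  # row = l M^n for the current n
    for _ in range(n_max + 1):
        values.append(sum(row[j] * c[j] for j in range(len(c))))
        row = [sum(row[i] * M[i][j] for i in range(len(M)))
               for j in range(len(M))]
    return values

def is_primitive(M):
    """True if some power M^m with m >= 1 is entrywise positive."""
    n = len(M)
    # only the zero pattern matters, so work with 0/1 entries
    B = [[1 if x else 0 for x in r] for r in M]
    P = B
    for _ in range((n - 1) ** 2 + 1):  # checks B^1, ..., B^((n-1)^2 + 1)
        if all(x > 0 for r in P for x in r):
            return True
        P = [[1 if x else 0 for x in r] for r in mat_mul(P, B)]
    return False

# Hypothetical example: a representation of the Fibonacci numbers
fib = sequence_from_representation([1, 0], [[1, 1], [1, 0]], [1, 0], 4)
```

The matrix used here is primitive, whereas the permutation matrix [[0, 1], [1, 0]] is strongly connected but has only cycles of even length, so it is not.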
Fig. 3. Tree T
Theorem 2. Let $t(z) = \sum_{n \ge 0} t_n z^n$ be an ℕ-rational series and $k \in \mathbb{N}$ such that: $t_0 = 1$; for all $n \ge 1$, $t_n \le k\, t_{n-1}$; the convergence radius of $t$ is strictly greater than $1/k$; and $t$ has a primitive linear representation. Then $(t_n)_{n \ge 0}$ is the enumerative sequence of nodes by height in a $k$-ary rational tree.
Proof. We denote by $1/\lambda$ the convergence radius of $t$. Let $(i, M, c)$ be a primitive linear representation of $t$. As $M$ is primitive and $\lambda < k$, there is an integer $n_0$ such that:
$$M M^{n_0} c \le k M^{n_0} c.$$
We set $d = M^{n_0} c$, and denote by $t'$ the sequence obtained from $t$ after $n_0$ shifts:
$$\forall n \ge 0, \quad t'_n = t_{n+n_0} = i M^n M^{n_0} c = i M^n d.$$
The sequence $t'$ admits $(i, M, d)$ as a primitive linear representation. We call $A$ the multigraph whose adjacency matrix is $M$. We have $t'_0 = t_{n_0} = i \cdot d$. We define a super-state automaton $B$ whose states contain only one occurrence of one state of $A$. Therefore the number of super-states is equal to the number of states of $A$. The followers of a super-state are then its followers in $A$. We associate to each super-state $u$ an integer $w(u)$ defined as the weighted number of final states contained in the super-state $u$: $w(u) = u \cdot d$. As $M d \le k d$, we get for each super-state $u$:
$$u M d \le k\, u \cdot d.$$
Hence we get that for any super-state $u$ whose followers in $B$ are the super-states $u_1, u_2, \ldots, u_t$:
$$\sum_{j=1}^{t} u_j \cdot d = u M d \le k\, u \cdot d,$$
or equivalently:
$$\sum_{j=1}^{t} w(u_j) \le k\, w(u).$$
We define a tree rooted at the initial super-state $i$, by developing the super-state automaton $B$. We associate $w(u)$ nodes to each super-state $u$. The initial
super-state itself ($i$) corresponds to $t_{n_0}$ nodes, since $i \cdot d = t_{n_0}$. As the $r = w(u)$ nodes associated to a super-state $u$ at height $l$ have at most $kr$ sons at height $l + 1$, corresponding to the nodes associated to the super-states that follow $u$, we associate to each of them at most $k$ sons in such a way that any node at the next height has exactly one father. We finally complete the first $n_0$ levels to get a $k$-ary rational tree $T$ having $t$ as enumerative sequence of nodes by height.
In this paper, we characterized the sequences of nonnegative numbers that can be realized as the enumerative sequences of leaves in $k$-ary rational trees. But only a partial similar result concerning the enumerative sequences of nodes in such trees is known, and the problem remains open in the general case.
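Returning to the proof of the theorem on enumerative sequences of nodes, the search for the integer $n_0$ with $M M^{n_0} c \le k M^{n_0} c$ and the check of the growth condition $t_n \le k\, t_{n-1}$ can be sketched as below. The function names, the bound `max_n` and the example data are assumptions made for this illustration only.

```python
def mat_vec(M, x):
    """Product of the matrix M with the column vector x."""
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def find_n0(M, c, k, max_n=1000):
    """Return (n0, d) with d = M^n0 c for the smallest n0 <= max_n such
    that M d <= k d componentwise, or None if no such n0 is found."""
    d = list(c)
    for n0 in range(max_n + 1):
        Md = mat_vec(M, d)
        if all(Md[i] <= k * d[i] for i in range(len(d))):
            return n0, d
        d = Md
    return None

def growth_condition_holds(t, k):
    """Check t_0 = 1 and t_n <= k * t_(n-1) on the finite prefix t."""
    return t[0] == 1 and all(t[n] <= k * t[n - 1] for n in range(1, len(t)))

# Hypothetical primitive representation with k = 2
result = find_n0([[1, 1], [1, 0]], [1, 0], 2)
```

In this example one shift is enough: the vector $c = (1, 0)$ fails the inequality in its second component, while $d = M c = (1, 1)$ satisfies $M d = (2, 1) \le 2 d$.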
References
1. R. L. Adler, D. Coppersmith, and M. Hassner. Algorithms for sliding block codes. I.E.E.E. Trans. Inform. Theory, IT-29:5-22, 1983.
2. F. Bassino, M.-P. Béal, and D. Perrin. Enumerative sequences of leaves and nodes in rational trees. Theoret. Comput. Sci., 1997. (to appear).
3. F. Bassino, M.-P. Béal, and D. Perrin. Enumerative sequences of leaves in rational trees. In ICALP'97, LNCS 1256, pages 76-86. Springer, 1997.
4. M.-P. Béal. Codage Symbolique. Masson, 1993.
5. M.-P. Béal and D. Perrin. Symbolic dynamics and finite automata. In G. Rozenberg and A. Salomaa, editors, Handbook of Formal Languages, volume 2, chapter 10. Springer-Verlag, 1997.
6. J. Berstel and C. Reutenauer. Rational Series and their Languages. Springer-Verlag, 1988.
7. D. Lind and B. Marcus. An Introduction to Symbolic Dynamics and Coding. Cambridge, 1995.
8. B. Marcus. Factors and extensions of full shifts. Monats. Math., 88:239-247, 1979.
9. D. Perrin. Arbres et séries rationnelles. C.R.A.S. Paris, Série I, 309:713-716, 1989.
10. D. Perrin. A conjecture on rational sequences. In R. Capocelli, editor, Sequences, pages 267-274. Springer-Verlag, 1990.
11. A. Salomaa and M. Soittola. Automata-theoretic Aspects of Formal Power Series. Springer-Verlag, Berlin, 1978.
An Eilenberg Theorem for Words on Countable Ordinals
Nicolas Bedon and Olivier Carton
Institut Gaspard Monge, Université de Marne-la-Vallée
2, rue de la Butte Verte, F-93166 Noisy-le-Grand Cedex
{Nicolas.Bedon,Olivier.Carton}@univ-mlv.fr
http://www-igm.univ-mlv.fr/~{bedon,carton}
Abstract. We present in this paper an algebraic approach to the theory of languages of words on countable ordinals. The algebraic structure used, called an ω1-semigroup, is an adaptation of the one used in the theory of regular languages of ω-words. We show that finite ω1-semigroups are equivalent to automata. In particular, the proof gives a new algorithm for determinizing automata on countable ordinals. As in the cases of finite words and ω-words, a syntactic ω1-semigroup can effectively be associated with any regular language of words on countable ordinals. This result is used to prove an Eilenberg type theorem: there is a one-to-one correspondence between varieties of ω1-languages and pseudo-varieties of ω1-semigroups.
1
Introduction
Finite semigroups are the algebraic counterpart of automata. The first deep result using semigroup recognition is due to Schützenberger [16]. He proved that the syntactic semigroup of a regular language L is finite and aperiodic (i.e. group-free) iff L belongs to the smallest class of languages containing the letters and closed under product and finite boolean operations. The idea of using algebraic properties of syntactic semigroups to classify regular languages was developed by Eilenberg [8], who showed that there exists a one-to-one correspondence between pseudo-varieties of semigroups (classes of semigroups closed under taking subsemigroups, quotients and finite direct products) and certain classes of languages, the varieties of languages. This theorem is known as the variety theorem. Since that time the theory of varieties of regular languages has been widely developed (see [13] for example). It has been shown that these algebraic methods are a very powerful tool. For instance, they can be used to obtain decidability results as in [17].
Büchi [5] introduced automata on ω-words in order to prove the decidability of the monadic second-order logic of the integers. The algebraic study of such regular languages was first developed by Pécuchet [9,10], but a more satisfying approach, the theory of ω-semigroups, is due to Wilke [18] and Perrin and Pin [12]. There are two important advantages in using the algebraic approach instead of automata theory when dealing with infinite words. First, there exists a canonical algebraic structure associated with any regular language, but we do not know yet how to associate a canonical automaton with a regular language of infinite words. Second, the complementation algorithm is difficult for automata but immediate for algebraic structures. Schützenberger's theorem was extended to the case of ω-words by Perrin [11], and Eilenberg's theorem was adapted by Wilke [18].
Büchi [6] introduced automata on ordinals to analyze the decidability of the corresponding logics. Wojciechowski [19] defined regular expressions. Choueka [7] also studied similar automata for ordinals less than $\omega^n$. The connections between temporal logic on ordinals and automata have been recently investigated in [14]. The first author gave an algebraic approach to the recognition of words on ordinals less than $\omega^n$ in [3]. The generalization of ω-semigroups which he introduced is equivalent to a particular class of Büchi automata. Schützenberger's theorem was extended to words of length less than $\omega^n$ in [4].
In this paper, we develop an algebraic theory for words over all countable ordinals. We define a generalization of semigroups called ω1-semigroups. We show that, when finite, ω1-semigroups are equivalent to Büchi automata on words of countable ordinal length for defining sets of words. The proof of the equivalence between finite ω1-semigroups and automata gives in particular a new algorithm for the determinization of automata. We also show that, in analogy with the case of finite words, one can associate with any rational language L a canonical ω1-semigroup, called the syntactic ω1-semigroup of L, which has the property of being the smallest one among all ω1-semigroups recognizing L. Finally, Eilenberg's one-to-one correspondence between varieties of languages and pseudo-varieties of finite ω1-semigroups is extended.
Thus, ω1-semigroups provide a tool to classify regular languages, generalizing the case of finite words. We point out that all constructions given in this paper are effective.
The paper is organized as follows. Section 2 is devoted to basic notation and definitions. In Sect. 3, ω1-semigroups are defined. The equivalence between automata and finite ω1-semigroups is proved in Sect. 4. The syntactic ω1-semigroup is introduced in Sect. 5 and Eilenberg's theorem is extended in Sect. 6.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 53-64, 1998. © Springer-Verlag Berlin Heidelberg 1998
2
Notation and Basic Definitions
This section is devoted to basic notation and definitions on ordinals, words, and automata.
2.1
Ordinals
We refer the reader to [15] for a complete introduction to the theory of ordinals. In this paper, ordinals are usually denoted by lower-case Greek letters like α, β, γ, and λ for a limit ordinal. We identify the linear order on ordinals with membership. An ordinal α is then identified with the set of ordinals smaller than α. We mainly use ordinals to index sequences. For an ordinal α, a sequence x of length α, or an α-sequence, over a set E is a function from α into E and is denoted by $(x_\gamma)_{\gamma < \alpha}$. When α = β + 1 is a successor ordinal, an α-sequence x is also denoted by $(x_\gamma)_{\gamma \le \beta}$. A sequence $(\alpha_\gamma)_{\gamma < \beta}$ of ordinals less than α is cofinal with α if, for any δ < α, there exists γ < β such that $\delta < \alpha_\gamma < \alpha$. In this paper, we use only countable ordinals (except for ω1). We recall that a limit ordinal is countable iff there exists an ω-sequence of countable ordinals which is cofinal with it.
2.2
Words
Let A be a finite set called the alphabet whose elements are called letters. For an ordinal α, an α-sequence of letters is also called a word of length α or an α-word over A. The set of all words of countable length over A is denoted by $A^{<\omega_1}$. A subset of $A^{<\omega_1}$ is called a language or an ω1-language. Let $\beta < \delta \le \alpha$ be three ordinals and x be an α-word. The factor $y = x[\beta, \delta[$ is the word of length $\delta - \beta$ defined by $y_\gamma = x_{\beta+\gamma}$ for $\gamma < \delta - \beta$. A factorization of a word is a decomposition of the word into a sequence of factors. The factorization is determined by the positions where the word is cut. More formally, we have the following definition.
Definition 1. Let α be an ordinal and x be a word of length α. A factorization of x is a strictly increasing sequence $(\alpha_\gamma)_{\gamma \le \beta}$ such that $\alpha_0 = 0$ and $\alpha_\beta = \alpha$. If the sequence $(\alpha_\gamma)_{\gamma < \beta}$ is cofinal with α, the factorization is said to be cofinal. For γ < β, the factorization defines the factor $x_\gamma = x[\alpha_\gamma, \alpha_{\gamma+1}[$.
Let $(x_\gamma)_{\gamma < \beta}$ be a sequence of words, each word $x_\gamma$ being of length $\alpha_\gamma$. The concatenation of this sequence of words is the word y of length $\alpha = \sum_{\gamma < \beta} \alpha_\gamma$ such that for any ordinal δ < β, we have $y[\beta_\delta, \beta_{\delta+1}[ = x_\delta$ where $\beta_\delta = \sum_{\gamma < \delta} \alpha_\gamma$. If $(\alpha_\gamma)_{\gamma \le \beta}$ is a factorization of a word x, the concatenation of the sequence of factors $x_\gamma = x[\alpha_\gamma, \alpha_{\gamma+1}[$ defined by the factorization is equal to the word x. If y is a word and α is an ordinal, we denote by $y^\alpha$ the concatenation of the sequence $(x_\gamma)_{\gamma < \alpha}$ where $x_\gamma = y$ for any γ < α.
Example 1. Let A = {a, b} be the alphabet. The word $x = (ab)^\omega$ is the ω-word such that $x_i = a$ if i is even and $x_i = b$ otherwise. The words $x[0, 4[ = abab$ and $x[1, \omega[ = (ba)^\omega$ are two factors of x.
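The factors of the word $(ab)^\omega$ discussed in the example above can be checked with a small sketch in which an ω-word is modelled as a function from positions to letters. This only covers words of length at most ω, not arbitrary countable ordinals, and the helper names are hypothetical.

```python
def omega_power(u):
    """The omega-word u^omega for a nonempty finite word u,
    as a function giving the letter at position i."""
    return lambda i: u[i % len(u)]

def factor(x, start, stop):
    """Finite factor x[start, stop[ of a word given as a function."""
    return "".join(x(i) for i in range(start, stop))

def suffix(x, start):
    """The factor x[start, omega[ of an omega-word, again as a function."""
    return lambda i: x(start + i)

x = omega_power("ab")
prefix = factor(x, 0, 4)   # the factor x[0, 4[
tail = suffix(x, 1)        # the factor x[1, omega[
```

Comparing finite prefixes of `tail` with those of `omega_power("ba")` illustrates, without proving it, that $x[1, \omega[ = (ba)^\omega$.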
2.3
Automata
Büchi automata [6] on transfinite words are a generalization of usual (Kleene) automata on finite words, with a second transition function for limit ordinals. States reached at limit points depend only on states reached before.
Definition 2. An automaton $\mathcal{A}$ is a 5-tuple $(Q, A, E, I, F)$ where Q is the finite set of states, A a finite alphabet, $E \subseteq (Q \times A \times Q) \cup (\mathcal{P}(Q) \times Q)$ the set of transitions, $I \subseteq Q$ the set of initial states and $F \subseteq Q$ the set of final states.
Transitions are either of the form $(q, a, q')$ or of the form $(P, q)$ where P is a subset of Q. We now explain how these automata are used to define languages. Before describing a path in an automaton, we define the cofinal set of a sequence at some point.
Definition 3. Let c be an α-sequence of states and λ ≤ α a limit ordinal. We denote by $\mathrm{cof}_\lambda(c)$ the set of states q such that there exists a sequence $(\beta_\gamma)_{\gamma < \omega}$ cofinal with λ satisfying $c_{\beta_\gamma} = q$ for any γ < ω.
Equivalently, a state q belongs to $\mathrm{cof}_\lambda(c)$ iff for any ordinal γ < λ, there exists an ordinal β with γ < β < λ such that $c_\beta = q$. We come now to the definition of a path in an automaton.
Definition 4. Let $\mathcal{A} = (Q, A, E, I, F)$ be an automaton. A path c of length α from p to q in $\mathcal{A}$ is an (α + 1)-sequence of states $(q_\gamma)_{\gamma \le \alpha}$ such that: $q_0 = p$ and $q_\alpha = q$; for any β < α, there exists $a_\beta \in A$ such that $(q_\beta, a_\beta, q_{\beta+1})$ is a transition of $\mathcal{A}$; for any limit ordinal λ ≤ α, $(\mathrm{cof}_\lambda(c), q_\lambda)$ is a transition of $\mathcal{A}$.
The word $u = (a_\gamma)_{\gamma < \alpha}$ is called a label of the path c. The content C(c) of c is $\{q \mid \exists \gamma \le \alpha,\ q = q_\gamma\}$. We denote the existence of such a path c by
$$c : p \xrightarrow[C(c)]{u} q.$$
The path is successful iff $p \in I$ and $q \in F$. We denote by $L(\mathcal{A})$ the class of labels of successful paths. A word is accepted (or recognized) by $\mathcal{A}$ if it belongs to $L(\mathcal{A})$. An automaton is deterministic iff it satisfies the following conditions: it has only one initial state; for any $q \in Q$ and $a \in A$ there is at most one state $q'$ such that $(q, a, q')$ is a transition of $\mathcal{A}$; and for any $P \in \mathcal{P}(Q)$ there is at most one state $q'$ such that $(P, q')$ is a transition of $\mathcal{A}$. A language X is rational if there exists an automaton $\mathcal{A}$ such that $X = L(\mathcal{A})$.
Example 2. The deterministic automaton $\mathcal{A} = (\{1, 2, 3\}, \{a, b\}, E, \{1\}, \{1, 2\})$ with
$$E = \{(1, a, 2), (1, b, 3), (2, a, 2), (2, b, 2), (3, a, 3), (3, b, 3), (\{1, 2\}, 1), (\{2\}, 1), (\{3\}, 3)\}$$
is pictured in Fig. 1. This automaton recognizes words over {a, b} beginning with an a and having letter a at all limit positions.
Since in this paper we are only interested in words of countable length, we write $L(\mathcal{A})$ instead of $L(\mathcal{A}) \cap A^{<\omega_1}$ for any automaton $\mathcal{A}$.
Fig. 1. Automaton A.
2.4
Algebraic Definitions
We will use, in this paper, classical notions from universal algebra and especially semigroups. We refer the reader to [1,13] for definitions. A pair $(s, e)$ of elements of a semigroup is linked if e is idempotent and $se = s$. Two linked pairs $(s, e)$ and $(s', e')$ of elements of a semigroup S are conjugate if there exist $x, y \in S^1$ such that $e = xy$, $e' = yx$ and $s' = sx$.
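The definitions of linked pairs and of conjugacy can be tested mechanically on a finite semigroup given by its multiplication. The sketch below is only an illustration; the function names and the example semigroup ({0, 1} under ordinary multiplication) are assumptions made here.

```python
def linked_pairs(elements, mul):
    """All pairs (s, e) with e idempotent and s e = s."""
    idempotents = [e for e in elements if mul(e, e) == e]
    return [(s, e) for s in elements for e in idempotents if mul(s, e) == s]

def are_conjugate(pair1, pair2, elements, mul):
    """(s, e) and (s2, e2) are conjugate if e = xy, e2 = yx and s2 = sx
    for some x, y in S^1 (S with a neutral element adjoined)."""
    (s, e), (s2, e2) = pair1, pair2
    ONE = object()  # the adjoined neutral element of S^1
    extended = list(elements) + [ONE]

    def m(a, b):
        if a is ONE:
            return b
        if b is ONE:
            return a
        return mul(a, b)

    return any(m(x, y) == e and m(y, x) == e2 and m(s, x) == s2
               for x in extended for y in extended)

elements = [0, 1]
mul = lambda a, b: a * b
pairs = linked_pairs(elements, mul)
```

In this small example the pair (1, 0) is not linked because $1 \cdot 0 \ne 1$, and the linked pairs (0, 0) and (1, 1) are not conjugate since no x gives $0 \cdot x = 1$.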
3
ω1 -Semigroups
We now turn to the definition of the main algebraic structure of the paper. This is a generalization of semigroups, where associative products of any countable ordinal length are allowed. We will need the following theorem, which is a particular case of a more general one due to Ramsey.
Theorem 1 (Ramsey). Let S be a finite semigroup and x be an ω-sequence over S. There is then a cofinal factorization $(k_i)_{i \le \omega}$ of x and a linked pair $(s, e)$ of S such that $x_{k_0} \cdots x_{k_1 - 1} = s$ and $x_{k_j} \cdots x_{k_{j+1} - 1} = e$ for $j \ge 1$.
We now define the notion of an ω1-semigroup. Roughly speaking, an ω1-semigroup is a set S equipped with a product which maps any sequence of countable length over S to an element of S. This notion generalizes the usual notion of a semigroup where the product is defined on finite sequences of elements.
Definition 5. An ω1-semigroup is a set S equipped with a function $\pi : S^{<\omega_1} \to S$, defined on the sequences of countable length over S, which satisfies the following properties:
(1) For any element $s \in S$, $\pi(s) = s$.
(2) For any word x over S, and any factorization $(\alpha_\gamma)_{\gamma \le \beta}$ of x, if y is the word of length β over S defined by $y_\gamma = \pi(x[\alpha_\gamma, \alpha_{\gamma+1}[)$, then $\pi(x) = \pi(y)$.
The former property states that the image by the product π of a sequence of length 1 is the unique element of the sequence. The latter one states that the function π satisfies a generalization of the usual associativity. In the case of semigroups, it suffices to assume the associativity over sequences of length 3 (i.e. $s_1(s_2 s_3) = (s_1 s_2) s_3$) to ensure that the product of a finite sequence does not depend on the order of evaluation. In our case, we have to guarantee the associativity for any factorization of any sequence of countable length.
An ω1-semigroup is not a usual algebra since the product does not have a finite arity. Even if the ω1-semigroup is finite, the description of the product is not finite since the product of any sequence of countable length must be given. We will actually see later that the product of a finite ω1-semigroup can be described in a finite manner. Even if the notion of an ω1-semigroup does not really fit into the general framework of universal algebra, the following notions are self-explanatory: morphism of ω1-semigroups, quotient of ω1-semigroups, sub-ω1-semigroup, congruence of ω1-semigroups. For an ω1-semigroup S, we denote by $S^1$ the ω1-semigroup obtained by adding a neutral element to S.
The following example corresponds to the free semigroup $A^+$ over a finite alphabet A.
Example 3. Let A be an alphabet and let $A^{<\omega_1}$ be the set of words over A of countable length. The concatenation maps any sequence of words of $A^{<\omega_1}$ to a word of $A^{<\omega_1}$. It can easily be verified that $A^{<\omega_1}$ equipped with the concatenation as the product is an ω1-semigroup. This ω1-semigroup is actually the free ω1-semigroup on A. Furthermore, if S is an ω1-semigroup with π as the product, the function π is a morphism from the ω1-semigroup $S^{<\omega_1}$ into S.
In semigroup theory, the product of a sequence $s_1, \ldots, s_n$ of elements is usually denoted as the word $s_1 \cdots s_n$ by mere concatenation. This does not lead to any confusion. We will follow this notation and we will usually not distinguish between a sequence of elements of an ω1-semigroup and the product of these elements.
The following definition is just a generalization of recognizability with usual semigroups.
Definition 6. Let S and T be two ω1-semigroups. A morphism $\varphi : S \to T$ recognizes a subset X of S if $\varphi^{-1}(\varphi(X)) = X$. We say that T recognizes X if there exists a morphism from S into T recognizing X. For a finite alphabet A, an ω1-language $X \subseteq A^{<\omega_1}$ is said to be recognizable if there exists a finite ω1-semigroup which recognizes X.
We have already pointed out that an ω1-semigroup is not really a usual algebra since the product does not have a finite arity. Nevertheless, when the ω1-semigroup is finite, it is possible to describe the product in a finite way. It has been proved by Wilke [18] that it is possible to define, on a finite semigroup, a product for sequences of length ω. We extend this result by defining a product for any sequence of countable length. Furthermore, this product is entirely determined by the products of the form $\pi(s^\omega)$ and the product of finite sequences. We first introduce a definition.
Definition 7. Let S be a semigroup. A function $\omega : S \to S$ which maps s to $s^\omega$ (the function is written in postfix notation) is said to be compatible with S iff it satisfies, for any $s, t \in S$: $s(ts)^\omega = (st)^\omega$; and $(s^n)^\omega = s^\omega$ for any n > 0.
A compatible function will always be written in postfix notation. Thus, $s^\omega$ denotes both the constant sequence of length ω and the image of s by ω. This ambiguity is justified by the following property. A finite semigroup equipped with a compatible function can be turned into a finite ω1-semigroup, and the image of s by ω is then the product of the sequence $s^\omega$. Since we do not distinguish between a sequence of elements of an ω1-semigroup and the product of these elements, this ambiguity is natural. The following theorem states that for a finite set S, it is equivalent to endow S with a structure of ω1-semigroup or to endow it with a structure of semigroup with an additional compatible function ω.
Theorem 2. Let S be an ω1-semigroup whose product is denoted by π. The binary product $s \cdot t = \pi(st)$ naturally endows S with a structure of semigroup and the function ω given by $s^\omega = \pi(s^\omega)$ is compatible. Conversely, let S be a finite semigroup and ω be a compatible function. The semigroup S can be uniquely endowed with a structure of ω1-semigroup such that $\pi(s^\omega) = s^\omega$.
Proof. Let S be an ω1-semigroup whose product is denoted by π. It may be easily checked that the binary product defined by $s \cdot t = \pi(st)$ is associative and that the function ω defined by $s^\omega = \pi(s^\omega)$ is compatible.
Assume now that S is a finite semigroup and that ω is a compatible function. We define a product π from the sequences of countable length over S to S. The function π is defined by induction on the length of the word x. If $x = s_0$ is of length 1, we set $\pi(x) = s_0$. If the length of x is a successor ordinal α + 1, we set $\pi(x) = \pi(x[0, \alpha[) \cdot x_\alpha$. If the length of x is a limit ordinal λ, we first choose a strictly increasing ω-sequence $(\alpha_\gamma)_{\gamma < \omega}$ cofinal with λ and we set $s_\gamma = \pi(x[\alpha_\gamma, \alpha_{\gamma+1}[)$. By Theorem 1 applied to the sequence $(s_\gamma)_{\gamma < \omega}$, we have a linked pair $(s, e)$. We then set $\pi(x) = s e^\omega$. The value of $\pi(x)$ does not depend on the choice of the sequence $(\alpha_\gamma)_{\gamma < \omega}$ and of the linked pair $(s, e)$. If $(\alpha'_\gamma)_{\gamma < \omega}$ and $(s', e')$ are other choices, the linked pair $(s', e')$ is conjugate with $(s, e)$ and $s e^\omega = s' e'^\omega$. It can be easily verified by induction that the function π endows S with a structure of ω1-semigroup.
From now on, we will not distinguish between a finite ω1-semigroup and a finite semigroup with an additional compatible function ω. We point out that both notions do not coincide anymore when the ω1-semigroups considered are not finite.
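The two compatibility conditions, $s(ts)^\omega = (st)^\omega$ and $(s^n)^\omega = s^\omega$, can be checked exhaustively on a finite semigroup. The sketch below assumes the semigroup is given by a multiplication function and the candidate ω by a dictionary; the names and the two-element example (where $xy = y$) are hypothetical.

```python
def power(s, n, mul):
    """The n-th power of s for n >= 1."""
    result = s
    for _ in range(n - 1):
        result = mul(result, s)
    return result

def is_compatible(elements, mul, omega):
    """Check s (t s)^w = (s t)^w and (s^n)^w = s^w on a finite semigroup.
    The powers of s take at most |S| distinct values, all reached for
    n = 1..|S|, so testing these exponents covers every n > 0."""
    size = len(elements)
    for s in elements:
        for t in elements:
            if mul(s, omega[mul(t, s)]) != omega[mul(s, t)]:
                return False
        for n in range(1, size + 1):
            if omega[power(s, n, mul)] != omega[s]:
                return False
    return True

elements = ["e", "f"]
mul = lambda x, y: y                 # right-zero semigroup: x y = y
good = {"e": "f", "f": "f"}          # constant omega-power
bad = {"e": "e", "f": "f"}           # identity map as candidate
```

With these data `is_compatible(elements, mul, good)` holds, while the identity map fails the first condition for s = e and t = f.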
4
Automata and ω1 -Semigroups
The following theorem states that the notion of recognizability by ω1-semigroups is equivalent to the one by automata for words of countable length.
Theorem 3. An ω1-language is rational iff it is recognizable.
The proof of the previous theorem exhibits effective methods to obtain a finite ω1-semigroup from an automaton, and conversely a deterministic automaton from a finite ω1-semigroup, such that both the ω1-semigroup and the automaton recognize the same language. It follows that nondeterministic automata are not more powerful than deterministic ones for words of countable length.
Theorem 4 (Büchi). Let $\mathcal{A}$ be an automaton. There exists a deterministic automaton $\mathcal{B}$ such that $L(\mathcal{B}) = L(\mathcal{A})$.
We now briefly describe how to construct a finite ω1-semigroup equivalent to a given automaton $\mathcal{A} = (Q, A, E, I, F)$. We first consider the set $T = \mathcal{P}(Q)$ of all subsets of Q. The union naturally endows T with a structure of semigroup. The set $K = \mathcal{P}(T)$ of all subsets of T is equipped with a structure of semiring. We then consider the semigroup $S = K^{Q \times Q}$ of $Q \times Q$-matrices with coefficients in K. The idea is to encode the automaton by matrices of S, and to associate to any word x a matrix $\mu(x)$ such that
$$\mu(x)_{p,q} = \{\, l \subseteq Q \mid \exists\, c : p \xrightarrow[l]{x} q \,\}.$$
It is then pure routine to define a compatible function ω on S which turns S into an ω1-semigroup in such a way that μ becomes a morphism from the free ω1-semigroup on A into S. The ω1-semigroup S then recognizes the language $L(\mathcal{A})$. The converse construction is rather technical and involves methods from semigroup theory like semigroup expansions.
5
The Syntactic ω1 -Semigroup
In this section, we prove that with any rational subset X of an ω1-semigroup S, one can effectively associate a canonical finite ω1-semigroup synt(X) which divides any ω1-semigroup recognizing X. This syntactic ω1-semigroup is the quotient of S by a syntactic congruence $\sim_X$ we now define. This syntactic congruence is actually the counterpart of Arnold's congruence [2] for rational languages of ω-words.

Definition 8. Let S be an ω1-semigroup and X a subset of S. For any x, y ∈ S, we say that $x \sim_X y$ iff for any positive integer m and elements $s_0, \ldots, s_m \in S^1$,

$$ s_0(\cdots(((x s_1)^{\omega} s_2)^{\omega} s_3)^{\omega} \cdots)^{\omega} s_m \in X \iff s_0(\cdots(((y s_1)^{\omega} s_2)^{\omega} s_3)^{\omega} \cdots)^{\omega} s_m \in X. $$

For m = 1, the expression should be understood as $s_0 x s_1$.
Theorem 5. Let X be a rational subset of an ω1-semigroup S. The relation $\sim_X$ is a congruence of ω1-semigroups of finite index and the quotient $S/{\sim_X}$ of S by $\sim_X$ divides any ω1-semigroup recognizing X. The ω1-semigroup $S/{\sim_X}$ is called the syntactic ω1-semigroup of X and is denoted by synt(X).
In particular, synt(X) is finite and is smaller than any ω1-semigroup recognizing X. The following proposition states that the computation of the syntactic ω1-semigroup of a rational subset X of an ω1-semigroup can be done in any ω1-semigroup T recognizing X and that this computation is effective when T is finite.

Proposition 1. Let X be a rational subset of an ω1-semigroup S and let φ : S → T be an onto morphism recognizing X. Let P = φ(X) be the image of X in T. Let $\sim_P$ be the equivalence relation defined by $x \sim_P y$ iff for any positive integer m and any elements $t_0, \ldots, t_m \in T^1$,

$$ t_0(\cdots(((x t_1)^{\omega} t_2)^{\omega} t_3)^{\omega} \cdots)^{\omega} t_m \in P \iff t_0(\cdots(((y t_1)^{\omega} t_2)^{\omega} t_3)^{\omega} \cdots)^{\omega} t_m \in P. $$

Then $x \sim_X y$ iff $\varphi(x) \sim_P \varphi(y)$, and $\mathrm{synt}(X) = T/{\sim_P}$.
Example 4. Let X be the set of words over A whose length is a countable limit ordinal. The syntactic ω1-semigroup synt(X) of X is {e, f} with $e^2 = fe = e$, $ef = f^2 = f$ and $e^{\omega} = f^{\omega} = f$. The canonical morphism φ from the free ω1-semigroup over A onto synt(X) maps any letter a ∈ A to e, and $X = \varphi^{-1}(f)$.
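The two-element structure of Example 4 is small enough to check by machine. The following Python sketch is only an illustration under stated assumptions: it presents a finite ω1-semigroup by a binary product and an ω-power alone, and the names `prod`, `omega` and `evaluate`, as well as the tuple encoding of terms, are hypothetical and not taken from the paper.

```python
# Sketch of the two-element structure {e, f} of Example 4 (assumed encoding).
# e: class of words of successor length; f: class of words of limit length.
E, F = "e", "f"
ELEMENTS = (E, F)

def prod(x, y):
    # |uv| = |u| + |v| is a limit ordinal exactly when |v| is, so the
    # class of a product is the class of its right factor.
    return y

def omega(x):
    # |u^omega| = |u| * omega is always a limit ordinal.
    return F

def evaluate(term):
    """Evaluate a term: a letter (str), ('cat', s, t) or ('omega', s)."""
    if isinstance(term, str):
        return E  # the canonical morphism maps every letter to e
    if term[0] == "cat":
        return prod(evaluate(term[1]), evaluate(term[2]))
    if term[0] == "omega":
        return omega(evaluate(term[1]))
    raise ValueError(f"unknown term: {term!r}")

def product_is_associative():
    return all(prod(prod(x, y), z) == prod(x, prod(y, z))
               for x in ELEMENTS for y in ELEMENTS for z in ELEMENTS)

def omega_is_compatible():
    # Checks (xy)^omega = x (yx)^omega and (x^k)^omega = x^omega for k = 2, 3.
    for x in ELEMENTS:
        for y in ELEMENTS:
            if omega(prod(x, y)) != prod(x, omega(prod(y, x))):
                return False
        power = x
        for _ in range(2):
            power = prod(power, x)
            if omega(power) != omega(x):
                return False
    return True
```

For instance, `evaluate(("omega", "a"))` gives `f` (length ω, a limit ordinal), while `evaluate(("cat", ("omega", "a"), "b"))` gives `e` (length ω + 1, a successor ordinal).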
6 Varieties
In this section we extend the Eilenberg one-to-one correspondence between pseudo-varieties of semigroups and varieties of rational sets of finite words (see [8,13]) to languages of words recognized by ω1-semigroups. We first define both notions of a pseudo-variety of ω1-semigroups and of a variety of ω1-languages. All ω1-semigroups considered in this section are finite, except free ω1-semigroups.

6.1 The Correspondence Theorem
A pseudo-variety of ω1-semigroups is a class of ω1-semigroups closed under division and finite product. We will denote pseudo-varieties of ω1-semigroups by bold letters.

Example 5. The class of commutative ω1-semigroups (satisfying xy = yx) is a pseudo-variety of ω1-semigroups.

Before defining the notion of a variety of ω1-languages, we need the notion of a residual of a language.

Definition 9. Let S be an ω1-semigroup, X ⊆ S and t ∈ S. We call residuals of X by t the subsets $t^{-1}X = \{s \in S \mid ts \in X\}$, $Xt^{-1} = \{s \in S \mid st \in X\}$ and $Xt^{-\omega} = \{s \in S \mid (st)^{\omega} \in X\}$.

It may be checked that if X is recognized by an ω1-semigroup T, the residuals $t^{-1}X$, $Xt^{-1}$ and $Xt^{-\omega}$ are also recognized by T. In particular, a rational language X has a finite number of residuals.
Definition 10. A variety of ω1-languages 𝒱 is a function which associates to any alphabet A a class A𝒱 of rational ω1-languages over A such that:
- for any alphabet A, A𝒱 is a boolean algebra;
- for any alphabet A, if X ∈ A𝒱 and x is a word over A, then $x^{-1}X \in A𝒱$, $Xx^{-1} \in A𝒱$ and $Xx^{-\omega} \in A𝒱$;
- if φ is a morphism from the free ω1-semigroup over A into the free ω1-semigroup over B and X ∈ B𝒱, then $\varphi^{-1}(X) \in A𝒱$.

We now use pseudo-varieties of ω1-semigroups to give a classification of regular ω1-languages by means of the properties of their syntactic ω1-semigroup. If 𝐕 is a pseudo-variety of ω1-semigroups and A an alphabet, we denote by A𝒱 the set of languages over A recognized by an ω1-semigroup of 𝐕. Since a pseudo-variety of ω1-semigroups is closed under division, a language X belongs to A𝒱 iff its syntactic ω1-semigroup synt(X) belongs to 𝐕. It is straightforward to verify that if 𝐕 is a pseudo-variety of ω1-semigroups, then 𝒱 is a variety of ω1-languages. We are now able to state the main theorem which extends Eilenberg's theorem to words of countable length (see [1, p. 65] or [13, Cor. 4.8]).

Theorem 6. The map 𝐕 ↦ 𝒱 is a bijection between pseudo-varieties of ω1-semigroups and varieties of ω1-languages.
Proof. The proof essentially mimics the proof for the case of finite words. It is proved that two pseudo-varieties 𝐕 and 𝐖 satisfy 𝐕 ⊆ 𝐖 iff for any alphabet A, A𝒱 ⊆ A𝒲.

6.2 Examples
We finally give some examples of correspondences between pseudo-varieties of ω1-semigroups and varieties of ω1-languages. The following theorem extends the well-known result for finite words.

Theorem 7. For any ω1-language X, the following conditions are equivalent.
(1) X is star-free, i.e., described by a regular expression using products and boolean operations.
(2) X is defined by a first order formula.
(3) The syntactic ω1-semigroup of X is aperiodic, i.e., contains no nontrivial group.

We now come to two examples really relevant to words on ordinals. Both theorems extend the result corresponding to the pseudo-variety $J_1$ of semigroups. We first introduce some notation. For an ordinal $\alpha = \omega^{\alpha_1} n_1 + \cdots + \omega^{\alpha_k} n_k$ written in Cantor normal form, we denote $\deg \alpha = \alpha_1$ (see [15, p. 6 ]) with the convention $\deg 0 = -\infty$. For a word x and a letter a, we denote by $|x|_a$ the ordinal isomorphic (for the order) to the well ordered set $\{\gamma < |x| : x_\gamma = a\}$. If the word x is finite, $|x|_a$ is the number of occurrences of a in x. For an integer m, we define the equivalence relation $\sim_m$ on words over A by

$$ x \sim_m y \iff \forall a \in A,\ \min(m, \deg |x|_a) = \min(m, \deg |y|_a). $$
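The relation just defined only looks at the degree of each letter count, so it is easy to experiment with. The Python sketch below is an illustration under assumptions that are not in the paper: each ordinal $|x|_a$ is given directly in Cantor normal form as a list of (exponent, coefficient) pairs with natural-number exponents (so only ordinals below $\omega^\omega$ are covered), and the names `deg` and `equivalent` are hypothetical.

```python
import math

NEG_INF = -math.inf  # stands for the convention deg 0 = -infinity

def deg(cnf):
    """Degree of an ordinal given in Cantor normal form.

    cnf is a list of (exponent, coefficient) pairs with strictly
    decreasing exponents and positive coefficients; [] encodes 0.
    """
    return cnf[0][0] if cnf else NEG_INF

def equivalent(m, counts_x, counts_y, alphabet):
    """Test x ~_m y from the per-letter ordinals |x|_a and |y|_a.

    counts_x and counts_y map each letter to the Cantor normal form
    of its ordinal of occurrences; a missing letter counts as 0.
    """
    return all(
        min(m, deg(counts_x.get(a, []))) == min(m, deg(counts_y.get(a, [])))
        for a in alphabet
    )

# x = a^omega b : |x|_a = omega, |x|_b = 1
x = {"a": [(1, 1)], "b": [(0, 1)]}
# y = ab : |y|_a = |y|_b = 1
y = {"a": [(0, 1)], "b": [(0, 1)]}
# z = aaa : |z|_a = 3, and b does not occur
z = {"a": [(0, 3)]}
```

With these sample words, x and y contain the same letters and are therefore equivalent for m = 0, but they are separated for m = 1 because a occurs ω times in x and only finitely often in y.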
The relations $\sim_m$ extend the notion of content to words of non-finite length [1, p. 130]. In particular, two words x and y satisfy $x \sim_0 y$ iff the same letters appear in x and y.

Theorem 8. An ω1-language is a union of $\sim_0$-classes iff its syntactic ω1-semigroup satisfies the equations $x^2 = x$, $xy = yx$ and $x^{\omega} = x$.

Theorem 9. An ω1-language is a union of $\sim_m$-classes for some integer m iff its syntactic ω1-semigroup satisfies the equations $x^2 = x$, $xy = yx$ and $(xy)^{\omega} = x^{\omega} y^{\omega}$.
7 Concluding Remarks
In the case of finite words, there are +-varieties and ∗-varieties depending on whether the empty word is considered or not. In this paper, we have excluded the empty word but it could easily be taken into account by slightly modifying the definitions. The ω1-semigroups should then be replaced by ω1-monoids with a neutral element and almost all results remain true.

The usual Eilenberg theorem for finite words can be extended to varieties without complementation [13]. This approach uses ordered semigroups. Ordered ω1-semigroups may be easily defined to obtain a similar result for words of countable length.

Acknowledgement. We would like to thank Dominique Perrin for suggesting to consider automata on ordinals and to investigate algebraic properties of such languages.
References

1. J. Almeida. Finite semigroups and universal algebra, volume 3 of Series in algebra. World Scientific, 1994.
2. A. Arnold. A syntactic congruence for rational ω-languages. Theoretical Computer Science, 39:333–335, 1985.
3. N. Bedon. Automata, semigroups and recognizability of words on ordinals. IGM report 96-5, to appear in International Journal of Algebra and Computation.
4. N. Bedon. Star-free sets of words on ordinals. IGM report 97-8, submitted to Information and Computation.
5. J. R. Büchi. On a decision method in the restricted second-order arithmetic. In Proc. Int. Congress Logic, Methodology and Philosophy of Science, Berkeley 1960, pages 1–11. Stanford University Press, 1962.
6. J. R. Büchi. Transfinite automata recursions and weak second order theory of ordinals. In Proc. Int. Congress Logic, Methodology, and Philosophy of Science, Jerusalem 1964, pages 2–23. North-Holland, 1965.
7. Y. Choueka. Finite automata, definable sets, and regular expressions over $\omega^n$-tapes. J. Comp. Syst. Sci., 17:81–97, 1978.
8. S. Eilenberg. Automata, languages and machines, volume B. Academic Press, 1976.
9. J.-P. Pécuchet. Étude syntaxique des parties reconnaissables de mots infinis. Lecture Notes in Computer Science, 226:294–303, 1986.
10. J.-P. Pécuchet. Variétés de semigroupes et mots infinis. Lecture Notes in Computer Science, 210:180–191, 1986.
11. D. Perrin. Recent results on automata and infinite words. In M. P. Chytil and V. Koubek, editors, Mathematical foundations of computer science, volume 176 of Lecture Notes in Computer Science, pages 134–148, Berlin, 1984. Springer.
12. D. Perrin and J.-E. Pin. Semigroups and automata on infinite words. In J. Fountain and V. A. R. Gould, editors, NATO Advanced Study Institute Semigroups, Formal Languages and Groups, pages 49–72. Kluwer academic publishers, 1995.
13. J.-E. Pin. Handbook of formal languages, volume 1, chapter Syntactic semigroups, pages 679–746. Springer, 1997.
14. S. Rohde. Alternating automata and the temporal logic of ordinals. PhD thesis, University of Illinois, Urbana-Champaign, 1997.
15. J. G. Rosenstein. Linear orderings. Academic Press, New York, 1982.
16. M. P. Schützenberger. On finite monoids having only trivial subgroups. Information and Control, 8:190–194, 1965.
17. D. Thérien and T. Wilke. Temporal logic and semidirect products: An effective characterization of the until hierarchy. In Proceedings of the 37th Annual Symposium on Foundations of Computer Science, 1996. To appear.
18. T. Wilke. An Eilenberg theorem for ∞-languages. In Automata, Languages and Programming: Proc. of 18th ICALP Conference, pages 588–599. Springer, 1991.
19. J. Wojciechowski. Finite automata on transfinite sequences and regular expressions. Fundamenta Informaticae, 8(3-4):379–396, 1985.
Maximal Groups in Free Burnside Semigroups

Alair Pereira do Lago

Instituto de Matemática e Estatística, Universidade de São Paulo
Rua do Matão, 1 1, 55 8-9 São Paulo SP, Brazil
[email protected] p.br
Abstract. We prove that any maximal group in the free Burnside semigroup defined by the equation $x^n = x^{n+m}$ for any n ≥ 1 and any m ≥ 1 is a free Burnside group satisfying $x^m = 1$. We show that such group is free over a well described set of generators whose cardinality is the cyclomatic number of a graph associated to the J-class containing the group. For n = 2 and for every m ≥ 2 we present examples with 2m − 1 generators. Hence, in these cases, we have infinite maximal groups for large enough m. This allows us to prove important properties of Burnside semigroups for the case n = 2, which was almost completely unknown until now. Surprisingly, the case n = 2 presents simultaneously the complexities of the cases n = 1 and n ≥ 3: the maximal groups are cyclic of order m for n ≥ 3 but they can have more generators and be infinite for n ≤ 2; there are exactly $2^{|A|}$ J-classes and they are easily characterized for n = 1 but there are infinitely many J-classes and they are difficult to characterize for n ≥ 2.
1 Introduction
Take a finite set of generators A with |A| letters and take n and m integers satisfying the restrictions n ≥ 1 and m ≥ 1. The set of all words with letters in A (including the empty word 1) is denoted by $A^*$ and $A^+$ is the set $A^* \setminus \{1\}$. Let π be the relation $\{(x^{n+m}, x^n) \mid x \in A^+\}$ and let ∼ be the smallest congruence that contains π. The congruence class of a word $w \in A^*$ will be denoted by $\widehat{w}$ and M will denote the set $\{\widehat{w} \mid w \in A^*\}$. The canonical projection from $A^*$ to M induces the multiplication $\widehat{u} \cdot \widehat{v} = \widehat{uv}$. Thus (M, ·) is a monoid and we have that $(M \setminus \{\widehat{1}\}, \cdot)$ is a semigroup since $\widehat{1} = \{1\}$. These are the free Burnside monoid and the free Burnside semigroup with |A| generators defined by the equation $x^n = x^{n+m}$. If we allow n = 0 the monoid is also a group and is called a free Burnside group. Since $\widehat{1} = \{1\}$ for n > 0, we will study the free Burnside semigroups by studying the free Burnside monoids.

For the study of the structure of a semigroup, we need the Green relations J, D, R, L and H defined as usual. They are equivalences and, in particular, the J-class of an element x is the class of all elements y that generate the same principal ideal MxM. All the definitions and properties can be seen, for instance, in the book of Lallement [17]. A semigroup is said to be finite J-above if its J-classes are finite and, given a particular J-class, there are finitely many J-classes above it, where the order is induced by the inclusion of the principal ideals.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 65–75, 1998. © Springer-Verlag Berlin Heidelberg 1998

We will now exhibit some structural properties of the free Burnside semigroups and groups. When |A| = 1, the free Burnside semigroups and the free Burnside groups are cyclic and finite. Their structure is very simple, hence we will suppose that |A| > 1 from now on.

The problem of determining whether the free Burnside groups are finite or not is extremely complex and it has been studied since 1902. In particular, they are finite for m ≤ 4 and for m = 6 due to Burnside, Sanov, and M. Hall; and they are infinite for m divisible by $2^{13}$ or by any odd number not smaller than 665 due to Novikov and Adyan [3,1], Lysenok [18], and Ivanov [14,15].

In order to classify the structure of the free Burnside semigroups we have to split them into three cases: n = 1, n = 2 and n ≥ 3. Our main interest here is the case n = 2, whose structure was almost completely unknown until now.

We will comment on the case n = 1 first. The idempotent semigroup, that is, the free Burnside semigroup for which n = 1 and m = 1, is finite and has been completely known [11] since 1950. In fact, from this classical paper of Green and Rees, we know that the free Burnside semigroups in which n = 1 and m ≥ 1 are finite for every alphabet A if and only if the free Burnside groups in which m ≥ 1 are finite for every alphabet A. In particular, the semigroups are infinite for $m \geq 2^{13} \cdot 665$ and finite for m ≤ 4. However, in every case, some finiteness properties remain. For instance, there are exactly $2^{|A|}$ J-classes. It was known since 1950 that any maximal group in these semigroups should be a homomorphic image of a free Burnside group with the same periodicity m and sufficient number of generators. However, only in 1990 the work of Kadourek and Polák [16] proved that these maximal groups are isomorphic to the free Burnside groups.
All these proofs strongly depend on the fact that n = 1 and cannot be easily extended to other values of n.

Let us consider now the free Burnside semigroups in the cases n ≥ 2. Using Thue-Morse words [21] and work of Brzozowski et al. [2], we know that these free Burnside semigroups are infinite (we are considering |A| ≥ 2) and they have infinitely many J-classes. These are practically all the properties known until now for the case n = 2. While these semigroups are infinite, Brzozowski conjectured that some finiteness properties should remain true for the free Burnside semigroups. In particular, he conjectured that for m = 1 any congruence class $\widehat{w}$ should be recognizable. Motivated by Brzozowski's conjecture, de Luca and Varricchio [4,6], McCammond [19], do Lago [8,9], and Guba [13,12] produced a sequence of papers that led to the discovery of many structural properties of the free Burnside semigroups in which n ≥ 3 and m ≥ 1. The reader is referred to our work [10] for the history and proofs of these properties. Indeed, we effectively define a rewriting system Σ from π and show that the congruences generated by π and by Σ are the same when n ≥ 2. All these properties depend uniquely on a property of Σ, called stability, and we know that Σ is stable when n ≥ 3 and m ≥ 1. Some of the properties that follow from the stability of Σ are:

(1) Σ is a rewriting system with the Church-Rosser property;
(2) the word problem is solvable;
(3) there is a unique word with minimum length in each congruence class;
(4) the frame of the R-classes is a tree;
(5) there is a characterization of R and L-classes;
(6) there is a characterization of regular and irregular D-classes;
(7) the irregular H-classes are all trivial;
(8) the maximal groups are cyclic of order m;
(9) the semigroup is finite J-above;
(10) the Brzozowski conjecture holds.
The problem of whether or not the stability of the rewriting system Σ holds for n = 2 and m = 1 remains open. In case it holds, it would imply our set of structural properties also for this case. However, it does not hold for n = 2 and m ≥ 2 (as one can verify in our work [10]) and new techniques have to be found in these cases.

The main result of this paper presents a property that holds for any n ≥ 1 and any m ≥ 1. We prove that any maximal group in the free Burnside semigroup is a free Burnside group satisfying $x^m = 1$. This contains the results of Kadourek and Polák, who studied the case n = 1. Furthermore, we show that such group is free over a set of generators whose cardinality is the cyclomatic number of a graph associated to the J-class containing the group and we present a transparent description of the set of generators. If Σ is stable (in particular, for n ≥ 3), this graph is a cycle and the maximal groups are cyclic groups of order m. Our proofs hold for any n ≥ 1 and any m ≥ 1. This is particularly interesting since all the previous techniques could only be applied in the particular cases where they were used.

For every m ≥ 2 we present examples with 2m − 1 generators both for n = 1 and for n = 2. Hence, in these cases, we have infinite maximal groups for large enough m. This immediately implies that the J-classes of these groups are infinite, that these semigroups are not finite J-above and also that the congruence class associated to an element of such an infinite group is not recognizable (see Thm. 2.1 in [5] for this implication). We also present an example of a congruence class in the case n = 2 and m = 2 that has two different words of minimum length. Most properties that hold if Σ is stable fail if n = 2 and m ≥ 2. Finding these examples in the case n = 2 required the use of powerful computing resources. However, all statements relative to this example except Conjecture 4 are proved here.
Summarizing, this paper presents new and powerful techniques which allow us to prove important properties of the Burnside semigroups for n = 2, a case almost completely unknown until now. Somewhat surprisingly the case n = 2 presents simultaneously the complexities of the cases n = 1 and n ≥ 3. While the maximal groups are cyclic of order m for n ≥ 3, they can have more generators and be infinite for n ≤ 2. While there are exactly $2^{|A|}$ J-classes and they are easily characterized in the case n = 1, there are infinitely many J-classes and they are difficult to characterize for n ≥ 2.
To conclude the introduction, we wish to point out some connections and motivations of this area, related to Computer Science. The study of periodicities in words appears in many important areas of Computer Science like pattern matching and searching algorithms. The free Burnside structures are defined by equations which impose the equivalence of certain powers of words in any context, as we just saw. This paper together with previous work shows that these Burnside structures behave very differently depending on the amount of periodicities involved (measured by the values of m and of n). In view of this variation, understanding the underlying combinatorics is an intriguing adventure and might also shed important light on other questions in Computer Science which depend on periodicities.

Recall that the word problem for a fixed relation of words is undecidable in general, even for a finite relation. In our case, the relation is $\{(x^{n+m}, x^n) \mid x \in A^+\}$, which depends on n ≥ 1 and m ≥ 1, and the word problem can be formulated as follows: given words x and y, we want to know whether or not we can obtain y from x by successive substitutions of n-powers by the respective (n + m)-powers (or vice-versa). The answer we have for this word problem strongly depends on the values of n and m, as described previously. One important consequence of the Brzozowski conjecture is an application which allows an easy solution of the word problem. Indeed, if we prove constructively the Brzozowski conjecture, we have a finite automaton, depending on x, which solves the above instance of the word problem. This includes all cases where n ≥ 3. Unfortunately, in most of the cases where n = 1 or n = 2, the Brzozowski conjecture does not hold. We can still possibly solve the word problem. Assume that the fundamental graph can be effectively obtained and the word problem is decidable for n = 0 and m ≥ 1. Then Thm. 2 implies that the word problem is decidable for n ≥ 1 and the same m. In particular, this holds when n = 1 and m is large enough, even though the Brzozowski conjecture does not hold in this case. For cases like n = 2, the word problem remains open.
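The formulation of the word problem by successive substitutions can be made concrete with a bounded search. The Python sketch below is not a decision procedure and is not the method of the paper: it explores only words of length at most `max_len`, so a negative answer is inconclusive in general. The function names `neighbours` and `congruent_within` are hypothetical.

```python
from collections import deque

def neighbours(w, n, m, max_len):
    """Words obtained from w by one substitution u^n <-> u^(n+m)."""
    result = set()
    length = len(w)
    for i in range(length):
        for p in range(1, (length - i) // n + 1):
            u = w[i:i + p]
            # Expansion: replace an occurrence of u^n by u^(n+m).
            if w[i:i + n * p] == u * n and length + m * p <= max_len:
                result.add(w[:i] + u * (n + m) + w[i + n * p:])
            # Contraction: replace an occurrence of u^(n+m) by u^n.
            if w[i:i + (n + m) * p] == u * (n + m):
                result.add(w[:i] + u * n + w[i + (n + m) * p:])
    return result

def congruent_within(x, y, n, m, max_len):
    """Breadth-first search for y starting from x.

    Returns True if y is reachable through words of length <= max_len.
    A False result only means that no derivation was found in this bound.
    """
    seen = {x}
    queue = deque([x])
    while queue:
        w = queue.popleft()
        if w == y:
            return True
        for v in neighbours(w, n, m, max_len):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False
```

For example, with n = 1 and m = 1 the search finds that `abab` and `ab` are congruent, whereas it finds no derivation from `ab` to `ba`, which is consistent with the fact that the congruence preserves the first letter of a word.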
2 Free Burnside Groupoids

Categories and groupoids generalize the concepts of monoids and groups. For their definitions and their importance in the study of monoids, as well as many properties and other definitions we are going to use, we refer the reader to the work of Tilson [22]. If C is any category and v is an object of C, the subcategory $C_v$ formed by all elements that start and end in v is a monoid and is called the local monoid of v.

In our work all graphs are directed and are seen as sets of arcs. The set of vertices and the incidence functions will be implicitly assumed. Take G a strongly connected graph and let V be its set of vertices. Let T be a spanning tree of G. We recall that the cyclomatic number of G is the cardinality of G \ T. We will denote by K (and also by $G^*$) the free category generated by G, that is, the set of all paths in G including the empty path $1_v$ for each v ∈ V. Take ≡ the smallest congruence on K that contains the set $\{(x^m, 1_v) \mid v \in V,\ x \in K_v\}$. The quotient category B = K/≡ is the free Burnside category generated by G satisfying $x^m = 1$. Since G is strongly connected, B is a groupoid and is called the free Burnside groupoid generated by G. Since B is a groupoid, all monoids $B_v$ are isomorphic groups. Our first result is a relation between the free Burnside groupoid B and the free Burnside groups.

Theorem 1. Given the definitions above, the local groups $B_v$ for v ∈ V are all isomorphic to the free Burnside group defined by the equation $x^m = 1$ over a set of generators whose cardinality is the cyclomatic number of G.

This proof is too long and complex to fit in this paper. Complete proofs of all the results stated in this paper, together with other properties of the Burnside semigroups, will appear in [7]. In particular, the proof of Thm. 1 makes explicit a generating set of the Burnside group under consideration as a function of G \ T.
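The cyclomatic number used in Theorem 1 can be computed by building a spanning tree and counting the arcs left outside it. The following Python sketch is an illustration only; it assumes that the spanning tree is taken in the underlying undirected graph, and the function name and the (name, start, end) encoding of arcs are hypothetical.

```python
def spanning_tree_and_chords(vertices, arcs):
    """Split the arcs of a connected graph into a spanning tree T and G \\ T.

    arcs is a list of (name, start, end) triples; orientation is ignored
    when building the tree (an assumption of this sketch).
    Returns (tree, chords); len(chords) is the cyclomatic number.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree, chords = [], []
    for name, start, end in arcs:
        root_start, root_end = find(start), find(end)
        if root_start == root_end:
            chords.append(name)      # the arc closes a cycle
        else:
            parent[root_start] = root_end
            tree.append(name)        # the arc joins two components
    if len(tree) != len(parent) - 1:
        raise ValueError("the graph is not connected")
    return tree, chords

# A directed cycle on three vertices has cyclomatic number 1.
cycle = [("x", "u", "v"), ("y", "v", "w"), ("z", "w", "u")]
# A single vertex with three loops has cyclomatic number 3.
loops = [("a", "v", "v"), ("b", "v", "v"), ("c", "v", "v")]
```

In the second sample graph the spanning tree is empty, so every loop is a generator, which matches the fact that the local group of a one-vertex graph with k loops is the free Burnside group on k generators.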
3 D-Entrances and Doors
Given words $u, v, x, w \in A^*$ we say that (u, x, v) is a factorization of w if w = uxv. We frequently denote the factorization by w = u · x · v or say that it is an occurrence of x as a factor of w. We define the partial order ≤ on the set of all factorizations of w as follows: (u, x, v) ≤ (u′, x′, v′) iff |u| ≥ |u′| and |v| ≥ |v′|, that is, iff the occurrence of x lies inside the occurrence of x′. Given $(u_1, x_1, v_1), \ldots, (u_k, x_k, v_k)$, a sequence of factorizations of w which is an anti-chain, we say that it is ordered by occurrence if $|u_i| < |u_{i+1}|$ for i = 1, …, k − 1.

We define the relation J′ over $A^*$ as follows: u J′ v iff $\widehat{u}$ J $\widehat{v}$. In a similar way we define the relations R′, L′, H′ and D′ in terms of R, L, H and D, which are defined in M.

Consider a word w. We define Rent(w), the R-entrance of w, as being the shortest prefix u of w such that u R′ w and we define Lent(w), the L-entrance of w, as being the shortest suffix u of w such that u L′ w. We say that w is an R-entrance if Rent(w) = w and we say that w is an L-entrance if Lent(w) = w. We say that a word $u \in A^*$ is a D-entrance of w if there is a minimal factorization (x, u, y) of w such that u D′ w. This factorization (x, u, y) will be called a window of w. We say that w is a D-entrance if w is a D-entrance of w, which is equivalent to saying that w is both an R-entrance and an L-entrance. A word w can have several factors that are D-entrances of w. A door is any word awb with a, b ∈ A such that aw D′ awb D′ wb while w is not D′-related to awb. Any two factorizations (u, x, v) and (u′, x′, v′) of w such that x and x′ are both doors or both D-entrances of w are incomparable.

Proposition 1. A word w is a door iff w has exactly two different windows and they are $(1, \mathrm{Rent}(w), \mathrm{Rent}(w)^{-1} w)$ and $(w\,\mathrm{Lent}(w)^{-1}, \mathrm{Lent}(w), 1)$.

Proposition 2. Let w be a word, let $(u_0, x_0, v_0), \ldots, (u_k, x_k, v_k)$ be the sequence ordered by occurrence of all windows of w, and let $(u'_1, x'_1, v'_1), \ldots, (u'_{k'}, x'_{k'}, v'_{k'})$ be the sequence ordered by occurrence of all factorizations of w such that $x'_i$ is a door and $x'_i$ D′ w. Then we have that k′ = k, that $u'_i = u_{i-1}$ and that $v'_i = v_i$ for i = 1, …, k.
The following property is an immediate extension of an old result of I. Simon [20] for m = 1 and implies that the frame of the R-classes is a tree:

Lemma 1. Let $w, w' \in A^+$ be such that w R′ w′. Then there exist $u, u' \in A^*$ and a ∈ A such that Rent(w) = ua, that Rent(w′) = u′a and that u ∼ u′. In particular, we have that Rent(w) ∼ Rent(w′).
4 Fundamental Graphs
Consider an idempotent $e \in M \setminus \{\widehat{1}\}$ and let $D_e$ be its D-class. Let $D = \{\widehat{d} \in D_e \mid d \text{ is a D-entrance}\}$. From Prop. 1 we have that a door w starts with the D-entrance Rent(w) and ends with the D-entrance Lent(w). Using Lemma 1 and its dual, two congruent doors w ∼ w′ start with congruent D-entrances and end with congruent D-entrances. Now we can define the fundamental graph of $D_e$ as being $G = \{\widehat{w} \mid w \text{ is a door and } \widehat{w} \in D_e\}$, where the set of vertices is D and the incidence functions are well defined by taking the start of $\widehat{w}$ to be $\widehat{\mathrm{Rent}(w)}$ and the end of $\widehat{w}$ to be $\widehat{\mathrm{Lent}(w)}$. One can prove that the graph G is strongly connected.

We define the free category $K = G^*$ generated by G, the congruence ≡ and the free Burnside groupoid B = K/≡ as in Sect. 2. Given p ∈ K, a path in G, we denote its canonical projection on B (the congruence class by ≡) by $\overline{p}$. We define the partial function c from $A^*$ to K as follows. Let w be a word such that $\widehat{w} \in D_e$ and let $(u_1, x_1, v_1), \ldots, (u_k, x_k, v_k)$ be the sequence ordered by occurrence of all factorizations of w such that $x_i$ is a door and $x_i$ D′ w for i = 1, …, k. If k > 0 then we define the contents of w as $c(w) = (\widehat{x_1}, \widehat{x_2}, \ldots, \widehat{x_k})$, which is in turn a path in G. If k = 0, we can take the unique window (u, d, v) of w and define $c(w) = 1_{\widehat{d}}$.

The next theorem characterizes when two words w, w′ are congruent in terms of the fundamental graph. Its proof is combinatorial and complex.

Theorem 2. Given two words w and w′ such that $\widehat{w}, \widehat{w'} \in D_e$, we have that w ∼ w′ iff the following conditions hold:
(1) Rent(w) ∼ Rent(w′);
(2) Lent(w) ∼ Lent(w′);
(3) $\overline{c(w)} = \overline{c(w')}$.

Using Thm. 2 and Thm. 1 we can obtain our main result.
Theorem 3. Any maximal group in a free Burnside semigroup that satisfies $x^{n+m} = x^n$ is a free Burnside group satisfying $x^m = 1$. Moreover, such group is free over a set of generators whose cardinality is the cyclomatic number of the fundamental graph of the D-class that contains the group.

Proof. Let $\widehat{d} \in D$ and let $H_d$ be its H-class. First let $g \in H_d$ and let w be any word such that $\widehat{w} = g$. Then $\widehat{\mathrm{Rent}(w)} = \widehat{\mathrm{Lent}(w)} = \widehat{d}$, its content c(w) is a path from $\widehat{d}$ to $\widehat{d}$ and $\overline{c(w)} \in B_d$. From Thm. 2, the value $\overline{c(w)}$ does not depend on the particular choice of w and we have defined a function from $H_d$ to $B_d$. Besides, this function is one-to-one due to the same theorem and maps $\widehat{d}$ to $1_d$. Since for every path p ∈ K there exists a word $w \in A^*$ such that c(w) = p, that the class of Rent(w) is the start of p and that the class of Lent(w) is the end of p, we have that this function is surjective, and hence it is a bijection. On the other hand, if we consider x, y ∈ M such that $x \cdot e \cdot y = \widehat{d}$, this multiplication by x to the left and by y to the right defines a bijection between the H-class $H_e$ of e and $H_d$. The composition of these two bijections establishes a bijection from $H_e$ to $B_d$ which maps e onto $1_d$. We can verify that this is indeed an isomorphism of groups.
5 Some Examples for the Case n = 2
We will suppose n = 2 and m ≥ 2 in this section. Furthermore, the symbols 3, 5, 7 will denote the prefixes of length 3, 5, 7, respectively, of the word bababab. (Ex: 5 = babab.) Our example mentioned in Sect. 1 is the H-class of $\widehat{55}$ and we will present some of its properties in this section. I. Simon studied the D-class of $\widehat{55}$ in [20] for the case n = 2 and m = 1. The counter-example that proves that the rewriting system Σ is not stable for n = 2 and m ≥ 2, as we have mentioned in Sect. 1, is also related to this D-class [10].

Given two words u and w we will denote by $\#_u(w)$ the remainder of the division by m of the number of occurrences of u as a factor of w. Given a function f with domain $A^*$, we say that the congruence ∼ preserves f if f(w) = f(w′) for any words w and w′ such that w ∼ w′. It is not hard to verify that ∼ preserves the functions $\#_a$, $\#_b$, $\#_{bb}$ and $\#_{bbabb}$. Given a set U of words, we define the function U-characteristic of w by

$$ \chi_U(w) = \begin{cases} 1, & \text{if } w \in A^* U A^*; \\ 0, & \text{otherwise.} \end{cases} $$

Given any two words w and w′, if ∼ preserves the function $\chi_U$ and if $\chi_U(w) \neq \chi_U(w')$ then one can prove that w and w′ are not D′-related. We define the sets $U_1 = abab(b^m)^*ba = \{ababb^{mk}ba \mid k = 0, 1, \ldots\}$, $U_2 = bab(b^m)^*bab$ and $U_3 = ab(b^m)^*baba$. For i = 1, 2 and 3, we denote the function $\chi_{U_i}$ simply by $\chi_i$. One can prove that the congruence ∼ preserves these functions.

Given a fundamental graph G and a path p in G, we define σ(p) in the following way:
- if $p = 1_{\widehat{d}}$ for a D-entrance d, then we choose σ(p) to be a shortest word in $\widehat{d}$;
- if p = a for an arrow a ∈ G, then we choose σ(p) to be a shortest word in a such that σ of the empty path at the start of a is a prefix of σ(a) and σ of the empty path at the end of a is a suffix of σ(a);
- if p = aq for q ∈ K and a ∈ G, then we define $\sigma(p) = \sigma(a)\,\sigma(1_v)^{-1}\,\sigma(q)$, where v is the start of q.

One can prove that σ(p) is a shortest word whose contents is p, i.e., c(σ(p)) = p.

Example 1. The graph G presented in Fig. 1 is a subgraph of the fundamental graph of the D-class of $\widehat{55}$.
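[Figure: directed graph G with vertices $\widehat{d_0}$, $\widehat{d_1}$, $\widehat{d_2}$ and arcs $\widehat{A}$, $\widehat{B}$, $\widehat{C}$, $\widehat{D_1}, \ldots, \widehat{D_{m-1}}$, $\widehat{E_1}, \ldots, \widehat{E_{m-1}}$.]

The D-entrances:
$d_0$: a33a
$d_1$: 35ba
$d_2$: ab53

The doors:
A: a35ba
B: 353
C: ab53a
$D_1$: a373a, …, $D_{m-1}$: $a35(ab)^{m-1}3a$
$E_1$: ab535ba, …, $E_{m-1}$: $ab53^{m-1}5ba$

Fig. 1. Graph G defined for n = 2 and m ≥ 2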
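The words listed in Fig. 1 can be written out and inspected mechanically. The Python sketch below is an illustration under assumptions: the general forms $D_i = a35(ab)^i3a$ and $E_i = ab53^i5ba$ and the endpoints of the arcs are the ones stated in the accompanying proof, and all function and variable names are hypothetical. It counts occurrences of a factor and the number of arcs, from which the cyclomatic number of the pictured graph follows.

```python
T3, T5 = "bab", "babab"          # the symbols 3 and 5

ENTRANCES = {
    "d0": "a" + T3 + T3 + "a",   # a33a
    "d1": T3 + T5 + "ba",        # 35ba
    "d2": "ab" + T5 + T3,        # ab53
}

def door_d(i):
    return "a" + T3 + T5 + "ab" * i + T3 + "a"      # a35(ab)^i 3a

def door_e(i):
    return "ab" + T5 + T3 * i + T5 + "ba"           # ab53^i 5ba

def occurrences(u, w):
    """Number of (possibly overlapping) occurrences of u as a factor of w."""
    return sum(1 for k in range(len(w) - len(u) + 1) if w.startswith(u, k))

def count_mod(u, w, m):
    """The quantity #_u(w): occurrences of u in w, reduced modulo m."""
    return occurrences(u, w) % m

def arcs(m):
    """Arcs of the graph of Fig. 1 as (name, word, start, end)."""
    result = [
        ("A", "a" + T3 + T5 + "ba", "d0", "d1"),
        ("B", T3 + T5 + T3, "d1", "d2"),
        ("C", "ab" + T5 + T3 + "a", "d2", "d0"),
    ]
    for i in range(1, m):
        result.append((f"D{i}", door_d(i), "d0", "d0"))
        result.append((f"E{i}", door_e(i), "d2", "d1"))
    return result

def cyclomatic_number(m):
    # For a connected graph: number of arcs - number of vertices + 1.
    return len(arcs(m)) - len(ENTRANCES) + 1
```

Under these assumptions each door starts with its start entrance and ends with its end entrance, each $D_i$ and $E_i$ contains exactly 6 + i occurrences of the letter a, and the pictured graph has 2m + 1 arcs on 3 vertices, hence cyclomatic number 2m − 1.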
Proof. Consider the words d0 , d1 , d , A, B, C as de ned in Fig. 1. For i in 1 m − 1 , we de ne D = a35(ab) 3a and E = ab53 5ba. Note that if we wish to prove that two words w and w0 satisfy w D0 w0 then it is su cient to prove that each one is a factor of a word in the congruence class of the other. The functions #u and U are used to prove that two given words are not congruent or are not in the same D0 -class. Two cases require a special proof, m − 1. however. We must prove that 35(ab) 3 D0 55 D0 b53 5b for i = 1 These proofs are quite technical and will not be completely done here. In the ^ 3 has exactly one maximal case of 35(ab) 3, we prove that any word in 35(ab) factor of period ab with at least four letters. In the case of b53 5b, we prove that ^ 5b and that they are not contiguous. there are exactly two such factors in b53 ^ 3 or These facts imply that 55 can not be a factor of any word in either 35(ab)
^ 5b and that 35(ab) 3 D0 55 D0 b53 5b. in b53 We will prove that the words d0 , d1 and d are non-congruent D-entrances in the D0 -class of 55. First we will prove that the words d0 , d1 and d are in the D0 -class of 55. They are all factors of 555 and thus factors of 555m 55. Since 55 is a factor of (ab) +m (ba) +m (ab) (ba) = a33a = d0 , since 55 is a factor (3ba) = 35ba = d1 , since 55 is a factor of ab55m3 = of 355mba = (3ba) +m +m (ab3) = ab53 = d , all the words d0 , d1 and d in the D0 -class of 55. (ab3) Now, we will prove that they are D-entrances. Since 3 (d0 ) = 1 = 0 = 3 (d0 a−1 ), we have that d0 D0 d0 a−1 . Since 1 (d0 ) = 1 = 0 = 1 (a−1 d0 ), we have that d0 D0 a−1 d0 and it follows that d0 is a D-entrance. In a similar way, 1 (d1 ) = 1 =
Maximal Groups in Free Burnside Semigroups
73
0 = 1 (d1 a−1 ) and (d1 ) = 1 = 0 = (b−1 d1 ) imply that d1 is a D-entrance. Finally, (d ) = 1 = 0 = (d b−1 ) and 3 (d ) = 1 = 0 = 3 (a−1 d ) imply that d is a D-entrance. To conclude, #bb (d0 ) = 1 = 0 = #bb (d1 ) = #bb (d ) d0 d . Furthermore, since the congruence preserves the implies that d1 d1 . rst letter in a congruence class, we have that d Fix an i in 1 m − 1 . Since d0 , d1 and d are D-entrances, one can verify that d0 = Rent (A) = Rent (D ) = Lent (C) = Lent (D ), that d1 = Rent (B) = Lent (A) = Lent (E ) and that d = Rent (C) = Rent (E ) = Lent (B). Using Lemma 1 and its dual, we have that no such words are either in the same 1 m−1 i . H0 -class, or can be congruent to each other. Take j Dj . Since Since #a (D ) = 6 + i = 6 + j = #a (Dj ), we have that D Ej . #a (E ) = 6 + i = 6 + j = #a (Ej ), we have that E We will prove that the words A, B and C are doors in the D0 -class of 55. The words A, B and C are all factors of 555 which in turn is a factor of 555m 55. Since either d0 or d1 are factors of A, B and C, it follows that these words are in the D0 -class of 55. Since 1 (a−1 Aa−1 ) = 1 (35b) = 0 = 1 = 1 (A), it follows immediately that A is a door. Since (b−1 Bb−1 ) = (ab5ba) = 0 = 1 = (B), it follows immediately that B is a door. Since 3 (a−1 Ca−1 ) = 3 (b53) = 0 = 1 = 3 (C), it follows immediately that C is a door. m − 1, are doors in the We will prove that the words D , for i = 1 m − 1 . First we will prove that D D0 D0 -class of 55. Fix an i in 1 55. Since d0 is a factor of D , it remains to prove that D is a factor of a word in 55. Let x = 5(ab) = b(ab) + = (ba) + b = (ba) 5 and let y = 5x5m−1 . Note that y = (5x5m−1 ) = 5x5m x5m−1 = 5(ba) 55m 5 (ab) 5m−1 5 (ba) 5 5(ab) 5m−1 = 5xx5m−1 . Then, by a simple induction on k 0, one can 5xk 5m−1 . Since 5 = (ba) b (ba) +m b = (ba)m− x and 5 = prove that y k +m m− m b(ab) = x(ab) , we have that y 5xm 5m−1 = 5 xm 5 5m− b(ab) m− m m− m− m− m− m− xx x (ab) 5 (ba) x x(ab) 5 555m− = 5m . 
Finally, (ba) we can verify that D is a factor of y, which in turn is a factor of y m 55 55. Now, we will prove that D is a door. Since (1 d0 b(ab) 3a) and 5m 55 (a3b(ab) d0 1) are two di erent windows of D , since 3x3 = a−1 D a−1 D0 55 as discussed before, D is a door due to Prop. 1. m − 1, are doors in the D0 We will prove that the words E , for i = 1 class of 55. Fix an i in 1 m − 1 . First we will prove that E D0 55. Since d1 is a factor of E , it remains to prove that E is a factor of a word in 55.1 First we will prove that 3(53 )m 53 353. Since 3 is both a pre x and a su x of 5 we have that 3(53 )m 53 3m 3(53 )m 533m = 3m− (33 ba)(33 ba)m (33 ba)b3m− 3m− (33 ba)(33 ba)b3m− = 3m 3533m 353. This implies that 555 5(53 )m 55. 3 −1 (3ba)(3ba)m (3ba)b3 −1 = Note also that 3 53 = 3 −1 (3ba)(3ba) b3 −1 m m−1 m−1 m 5 555 5 5(53 ) 55 = 5m (53 )m− 5 3 53 55 3 5 53 . Hence, 55 m m− m m m−1 m−1 53 5 53 55 = (5 (53 ) 5 ) 55 3 55 = z 55 (3 55) where z = 5 (53 ) 1
The proof of this fact is relatively short, even though it is more complex than in the case of $D_i$. It is quite simple to verify, even though it was difficult to obtain. The very existence of the words $E_i$ in the D-class of 55, a fact revealed by our computations, was surprising to us.
Alair Pereira do Lago
5m (53 )m−1 5m−1 . Applying again, we obtain 55 z (55) (3 55) z 55 (3 55) z 55(3 55) (3 55)m 55(3 55)m . Finally, since E is a factor of 55(3 55)m 55, we complete the proof of E D0 55. Now, we will prove that E is a door. As seen before, b53 5b = a−1 E a−1 D0 55. Since (1 d 3 −1 5ba) and (ab53 −1 d1 1) are two different windows of E , Prop. 1 implies that E is a door.

This completes the proof that the graph G is well defined and that it is a subgraph of the fundamental graph of the D-class of 55. Example 1 implies that the cyclomatic number of the fundamental graph of the D-class of 55 is at least m − 1. Using Thm. 3, we have the following corollary.

Corollary 1. Every group H-class in the D-class of 55 is a free Burnside group of index m with at least m − 1 generators.

Considering the graph G defined in Fig. 1, we have that (AB C D) and (DAB C) have the same length. If n = and m = , defining B the local group (DAB C) around d0 , then AB C D = DAB C and we obtain that (AB C D) due to Thm. . Hence we get the following corollary:

Corollary 2. Consider n = , m = and the graph G defined in Fig. 1. The words (AB C D) and (DAB C) are two different shortest words in the same congruence class.

In fact, we can prove that d0 , d1 , d , A, B, C, D and E are the unique shortest words in their respective congruence classes. This implies that (AB C D) = a−1 5575a−1 and (D AB C) = a−1 5755a−1. Besides, using complex computational results, we verified that the graph presented in Fig. 1 is the whole fundamental graph of the D-class in question for the cases m = or m = 3. A general proof of this fact depends on combinatorial results that have to be characterized.

For the case m = , we can obtain a result similar to Cor. 1 independently of Thm. 3. For this case, m − 1 = 3 and we can exhibit three words in the H0 -class of 55 whose congruence classes generate a group isomorphic to the 3-generated free Burnside group for m = . This free group is Z Z Z and the words in question are 555, 575 and 55355. The subgroup generated by their congruence classes is 55 555 575 5^ 5355 5] 755 555355 575355 ^ 5755355 ^ and one can prove that they are different from each other by analyzing the values of #a , #b and #bbabb .

Conjecture 4. The fundamental graph of the D-class of 55 is the graph G presented in Fig. 1.
Acknowledgements. I would like to thank Imre Simon and Cristina Gomes Fernandes for their suggestions.
Positive Varieties and Infinite Words

Jean-Eric Pin

LIAFA, CNRS and Université Paris VII, 2 Place Jussieu, 75251 Paris Cedex 05, France
[email protected]
Abstract. Carrying on the work of Arnold, Pécuchet and Perrin, Wilke has obtained a counterpart of Eilenberg's variety theorem for finite and infinite words. In this paper, we extend this theory to classes of languages that are closed under union and intersection, but not necessarily under complement. As an example, we give a purely algebraic characterization of various classes of recognizable sets defined by topological properties (open, closed, $F_\sigma$ and $G_\delta$) or by combinatorial properties.
1 Introduction
Carrying on the work of Arnold [1], Pécuchet [7,8] and Perrin [9,10,11], Wilke [25,26] has obtained a counterpart of Eilenberg's variety theorem for finite and infinite words. The word "and" is emphasized in the last sentence, because it is really important to work simultaneously with finite and infinite words. The fitness of this approach was corroborated by recent contributions [13,14,18,25,27].

A variant of the notion of syntactic semigroup was recently proposed by the author [17]. The key idea is to define a partial order on syntactic semigroups, leading to the notion of ordered syntactic semigroups. The resulting extension of Eilenberg's variety theory makes it possible to treat classes of languages that are closed under union and intersection, but not necessarily under complement, a major difference with the original theory.

The aim of this paper is to extend this theory to infinite words, thus completing the table below. In the setting proposed by Wilke, semigroups are not suitable any more. They can be replaced by ω-semigroups, which are, roughly speaking, semigroups equipped with an infinite product [13,25,27].
Finite words                      Finite or infinite words

Eilenberg [4]                     Wilke [25,26]
Varieties of semigroups           Varieties of ω-semigroups
+-varieties                       ∞-varieties

Pin [17]                          This paper
Var. of ordered semigroups        Var. of ordered ω-semigroups
Positive +-varieties              Positive ∞-varieties
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 76-87, 1998. © Springer-Verlag Berlin Heidelberg 1998
The ordered approach has many interesting consequences but leads to a complete rewriting of the theory. We have selected three examples, of rather different nature, to convince the reader of the power of this new approach.

In Sect. 6, we give a purely algebraic characterization of four classes of recognizable sets defined by topological properties. This includes in particular the class of deterministic ω-languages (i.e. recognized by a deterministic Büchi automaton). Similar results were known only for topological classes closed under complement. Note that all these characterizations are effective, since a simple algorithm to compute the syntactic ω-semigroup of a recognizable ω-language was given in [13].

In Sect. 7 we address a question originally considered by Pécuchet [7,8]. Since every recognizable ω-language can be written as a finite union of languages of the form $XY^\omega$, where $X$ and $Y$ are recognizable languages, the question arose whether this result could be relativized to varieties. Theorem 3, which extends the results of Pécuchet, gives such a result for varieties of ordered semigroups. It requires the concept of weak recognizability, introduced by Perrin [11], and the notion of ordered Büchi automaton, which might be interesting in itself.

Our last example, developed in Sect. 8, is a good illustration of the problems that arise when trying to generalize a given class of recognizable languages to infinite words. Indeed, as was observed by Pécuchet, there are at least three natural ways to associate a class of ω-languages with a variety of finite semigroups $\mathbf{V}$. One can consider the sets recognized by a finite ω-semigroup $S$ whose semigroup part belongs to $\mathbf{V}$, or those weakly recognized by a semigroup of $\mathbf{V}$, or finally, inspired by McNaughton's theorem, the boolean combinations of sets of the form $\overrightarrow{L}$ where $L$ is recognized by a semigroup of $\mathbf{V}$. We will see how these three classes relate to each other in our case study, the shuffle ideals.
Due to the lack of space, no proofs are given, but they can be found in [14]. For more details, the reader is referred to [15,16,20] for the variety theory for finite words and to [13,14,19,24,25,26] for the theory of ω-languages.
2 Notations and Basic Definitions

Let $A$ be a finite alphabet. The free monoid on $A$ is denoted $A^*$ and the free semigroup, $A^+$. The set of infinite words on $A$ is denoted $A^\omega$. Finally, $A^\infty$ denotes the set of finite or infinite words. In this paper, a subset $X$ of $A^\infty$ will be systematically identified with the pair $(X_+, X_\omega)$, where $X_+ = X \cap A^+$ and $X_\omega = X \cap A^\omega$. We now briefly review the standard definition of a Büchi automaton and introduce the notion of an ordered Büchi automaton.
2.1 Büchi Automata
A Büchi automaton is a 5-tuple $\mathcal{A} = (Q, A, E, I, F)$ where

(1) $(Q, A, E)$ is a finite (non deterministic) automaton,
(2) $I$ and $F$ are subsets of $Q$, called the sets of initial and final states.

A finite path in $\mathcal{A}$ is successful if its origin is in $I$ and its end is in $F$. An infinite path $p$ is successful if its origin is in $I$ and if some state of $F$ occurs infinitely often in $p$. The set of finite (respectively infinite) words recognized by $\mathcal{A}$ is the set of the labels of all successful finite (respectively infinite) paths of $\mathcal{A}$. A set of finite (respectively infinite) words is recognizable (or regular) if it is recognized by a finite Büchi automaton.

A Büchi automaton $\mathcal{A} = (Q, A, E, I, F)$ is said to be deterministic if $I$ is a singleton and if $E$ contains no pair of transitions of the form $(q, a, q_1)$, $(q, a, q_2)$ with $q_1 \neq q_2$. In particular, each word $u$ is the label of at most one path starting from the initial state. It follows that an infinite word is accepted by $\mathcal{A}$ if and only if it has infinitely many prefixes accepted by $\mathcal{A}$. Therefore, if $L$ denotes the set of finite words recognized by $\mathcal{A}$, then

$\overrightarrow{L} = \{ u \in A^\omega \mid u \text{ has infinitely many prefixes in } L \}$

is the set of infinite words recognized by $\mathcal{A}$.
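The acceptance condition of a deterministic Büchi automaton can be tried out on ultimately periodic words $uv^\omega$. The following Python fragment is only a sketch under stated assumptions: the function name `accepts_lasso`, the dictionary encoding of the transition function and the two-state example automaton are illustrative choices, not part of the original text.

```python
def accepts_lasso(delta, initial, final, u, v):
    """Return True if the deterministic Buchi automaton accepts u v^omega.

    delta   : dict mapping (state, letter) to a state (partial function)
    initial : the unique initial state
    final   : set of final states
    u, v    : finite words given as strings; v must be non-empty
    """
    if not v:
        raise ValueError("the periodic part v must be non-empty")
    state = initial
    for letter in u:
        state = delta.get((state, letter))
        if state is None:          # the run dies: there is no infinite path
            return False
    first_block = {}               # state at the start of a v-block -> block index
    hits = []                      # hits[i] is True if block i visits a final state
    while state not in first_block:
        first_block[state] = len(hits)
        hit = False
        for letter in v:
            state = delta.get((state, letter))
            if state is None:
                return False
            hit = hit or state in final
        hits.append(hit)
    # The blocks from first_block[state] onward repeat forever.
    return any(hits[first_block[state]:])


# Hypothetical example: L is the set of words over {a, b} ending with b,
# so the accepted infinite words are those containing infinitely many b's.
delta = {(0, "a"): 0, (0, "b"): 1, (1, "a"): 0, (1, "b"): 1}
print(accepts_lasso(delta, 0, {1}, "a", "ab"))
print(accepts_lasso(delta, 0, {1}, "b", "a"))
```

Since the automaton is deterministic, the run on $uv^\omega$ is unique, and it is enough to detect the first repetition of a state at a block boundary.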
2.2 Ordered Büchi Automata
An ordered automaton is a Büchi automaton $\mathcal{A} = (Q, A, E, I, F)$ in which the set of states $Q$ is equipped with a partial order $\le$ satisfying the two following conditions, for all $p, q, q' \in Q$ and $a \in A$:

(1) if $p \le q$ and $(q, a, q') \in E$, there exists a state $p' \le q'$ such that $(p, a, p') \in E$,
(2) the set $F$ of final states is an order ideal: if $q \in F$ and $p \le q$, then $p \in F$.

Condition (1) can be extended by transitivity as follows:

(3) if $p \le q$ and if there is a path from $q$ to $q'$ labeled by $u$, there exists a state $p' \le q'$ such that $u$ is the label of a path from $p$ to $p'$.

If $\mathcal{A}$ is a deterministic automaton, condition (1) can be simplified as follows:

(4) if $p \le q$ and if $q \cdot a$ is defined, then $p \cdot a$ is defined and satisfies $p \cdot a \le q \cdot a$.
3 Ordered Algebraic Structures

We introduce in this section our main algebraic tools: ordered semigroups, ω-semigroups and syntactic ω-semigroups.

3.1 Ordered Semigroups
A quasi-order is a reflexive and transitive relation. Given a quasi-order $\preceq$, the relation $\sim$ defined by $x \sim y$ if and only if $x \preceq y$ and $y \preceq x$ is an equivalence relation, called the equivalence relation associated with $\preceq$. If this equivalence relation is the equality relation, the relation $\preceq$ is an order.

A relation $\le$ on a semigroup $S$ is stable if, for every $x, y, z \in S$, $x \le y$ implies $xz \le yz$ and $zx \le zy$. An ordered semigroup is a semigroup $S$ equipped with a stable order relation $\le$ on $S$. Note that any semigroup, equipped with the equality relation, becomes an ordered semigroup.

Let $S$ be an ordered semigroup. An ordered subsemigroup of $S$ is a subset $T$ of $S$ such that $t, t' \in T$ implies $tt' \in T$. An order ideal of $S$ is a subset $I$ of $S$ such that, if $x \le y$ and $y \in I$, then $x \in I$.

A congruence on an ordered semigroup $S$ is a stable quasi-order which is coarser than or equal to $\le$. In particular, the order relation $\le$ is itself a congruence. If $\preceq$ is a congruence on $S$, then the equivalence relation $\sim$ associated with $\preceq$ is a congruence on $S$. Furthermore, there is a well-defined stable order on the quotient set $S/{\sim}$, given by $[s] \le [t]$ if and only if $s \preceq t$. Thus $(S/{\sim}, \le)$ is an ordered semigroup, also denoted $S/{\preceq}$.

Given two ordered semigroups $S$ and $T$, their product $S \times T$ is the ordered semigroup defined on the set $S \times T$ by the law $(s, t)(s', t') = (ss', tt')$ and the order given by $(s, t) \le (s', t')$ if and only if $s \le s'$ and $t \le t'$.

A morphism of ordered semigroups $\varphi : S \to T$ is a semigroup morphism from $S$ into $T$ such that, for every $x, y \in S$, $x \le y$ implies $\varphi(x) \le \varphi(y)$. A morphism of ordered semigroups $\varphi : S \to T$ is an isomorphism if and only if $\varphi$ is a bijective semigroup morphism and, for every $x, y \in S$, $x \le y$ is equivalent with $\varphi(x) \le \varphi(y)$.

Let $S$ and $T$ be two ordered semigroups. Then $S$ is a quotient of $T$ if there exists a surjective morphism from $T$ onto $S$, and $S$ divides $T$ if $S$ is a quotient of an ordered subsemigroup of $T$. Division is a quasi-order on ordered semigroups. Furthermore, one can show that two finite ordered semigroups divide each other if and only if they are isomorphic.
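For a finite semigroup given by its multiplication table, the definitions of stable order and order ideal can be checked exhaustively. The following sketch assumes a hypothetical encoding (`mul` as a dictionary on pairs, `leq` as a set of pairs) and a hypothetical two-element example; it is meant as an illustration of the definitions only.

```python
def is_stable_order(elements, mul, leq):
    """Check that leq is a partial order on elements that is stable under mul."""
    reflexive = all((x, x) in leq for x in elements)
    antisymmetric = all(x == y for (x, y) in leq if (y, x) in leq)
    transitive = all((x, z) in leq
                     for (x, y) in leq for (y2, z) in leq if y == y2)
    stable = all((mul[(x, z)], mul[(y, z)]) in leq
                 and (mul[(z, x)], mul[(z, y)]) in leq
                 for (x, y) in leq for z in elements)
    return reflexive and antisymmetric and transitive and stable


def is_order_ideal(subset, leq):
    """Check that subset is closed downward with respect to leq."""
    return all(x in subset for (x, y) in leq if y in subset)


# Hypothetical example: e is an identity, z is a zero, and z is below e.
elements = ["e", "z"]
mul = {("e", "e"): "e", ("e", "z"): "z", ("z", "e"): "z", ("z", "z"): "z"}
leq = {("e", "e"), ("z", "z"), ("z", "e")}
print(is_stable_order(elements, mul, leq))
print(is_order_ideal({"z"}, leq), is_order_ideal({"e"}, leq))
```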
A subset $X$ of $A^+$ is recognized by an ordered semigroup $S$ if there exists an order ideal $I$ of $S$ and a semigroup morphism $\varphi : A^+ \to S$ such that $X = \varphi^{-1}(I)$. It is not difficult to see that a language is recognizable if and only if it is recognized by a finite ordered semigroup.

The syntactic congruence of a recognizable subset $X$ of $A^+$ is the stable quasi-order $\le_X$ defined on $A^+$ by $u \le_X v$ if and only if, for every $x, y \in A^*$,

$xvy \in X \Rightarrow xuy \in X$

The equivalence relation $\sim_X$ associated with $\le_X$ is called the syntactic equivalence of $X$. Thus $u \sim_X v$ if and only if, for every $x, y \in A^*$,

$xuy \in X \Leftrightarrow xvy \in X$

The ordered semigroup $S(X) = A^+/{\le_X}$ is the ordered syntactic semigroup of $X$ and the order relation on $S(X)$ the syntactic order of $X$.
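When $X = \varphi^{-1}(P)$ for a surjective morphism $\varphi$ onto a finite semigroup $S$, the syntactic quasi-order can be read off in $S$: $s \le t$ if and only if, for all contexts $x, y \in S^1$, $xty \in P$ implies $xsy \in P$. The sketch below implements this observation; the function name, the use of `None` for the empty context and the two-element example (for the language $A^*aA^*$) are assumptions made for the illustration.

```python
from itertools import product


def syntactic_quasi_order(elements, mul, accepting):
    """Pairs (s, t) such that every context accepting t also accepts s."""
    def act(x, s, y):
        # None stands for the empty context (the identity of S^1).
        r = s if x is None else mul[(x, s)]
        return r if y is None else mul[(r, y)]

    contexts = [None] + list(elements)
    return {
        (s, t)
        for s, t in product(elements, repeat=2)
        if all(act(x, s, y) in accepting
               for x, y in product(contexts, repeat=2)
               if act(x, t, y) in accepting)
    }


# Hypothetical example: e is the image of the words without a,
# z the image of the words containing a, and X is the inverse image of {z}.
elements = ["e", "z"]
mul = {("e", "e"): "e", ("e", "z"): "z", ("z", "e"): "z", ("z", "z"): "z"}
order = syntactic_quasi_order(elements, mul, {"z"})
print(sorted(order))
```

In this example the result is an order in which z lies below e, in agreement with the convention that the accepting set is an order ideal.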
3.2 Transition Semigroup of a Büchi Automaton
A standard construction associates to each finite automaton a finite semigroup, its transition semigroup. We define in this section the ordered version of this notion. Let $\mathcal{A} = (Q, A, E, I, F)$ be a Büchi automaton. Denote by $\mathcal{R}(Q)$ the semigroup of relations on $Q$, under composition of relations. Then the map $\mu : A^+ \to \mathcal{R}(Q)$, defined by

$\mu(u) = \{ (q, q') \in Q \times Q \mid \text{there is a path of label } u \text{ from } q \text{ to } q' \}$

is a morphism of semigroups, and the semigroup $S(\mathcal{A}) = \mu(A^+)$ is called the transition semigroup of $\mathcal{A}$.

If $\mathcal{A}$ is ordered, $S(\mathcal{A})$ is naturally equipped with a relation $\le$ defined by $u' \le u$ if and only if, for all $(p, q) \in u$, there exists $q' \le q$ such that $(p, q') \in u'$. Equivalently, $u' \le u$ if and only if, for all $(p, q) \in u$ and for all $p' \le p$, there exists $q' \le q$ such that $(p', q') \in u'$. One can show that this relation is a congruence of ordered semigroup on $(S(\mathcal{A}), =)$, but it is not necessarily an order. The ordered semigroup $SO(\mathcal{A}) = S(\mathcal{A})/{\le}$ is called the transition ordered semigroup of $\mathcal{A}$.
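The construction above can be sketched in a few lines of Python: relations are represented as frozen sets of pairs, the semigroup is obtained by closing the letter relations under composition, and the relation on $S(\mathcal{A})$ is tested directly from its definition. All names and the example automaton are hypothetical choices for the illustration.

```python
def letter_relation(edges, a):
    """Relation on states induced by the letter a."""
    return frozenset((p, q) for (p, b, q) in edges if b == a)


def compose(r, s):
    """Composition of relations: first r, then s."""
    return frozenset((p, q2) for (p, q) in r for (q1, q2) in s if q == q1)


def transition_semigroup(edges, alphabet):
    """All relations induced by non-empty words, computed by closure."""
    generators = [letter_relation(edges, a) for a in alphabet]
    elements = set(generators)
    frontier = list(elements)
    while frontier:
        r = frontier.pop()
        for g in generators:
            rg = compose(r, g)
            if rg not in elements:
                elements.add(rg)
                frontier.append(rg)
    return elements


def rel_leq(u1, u, state_leq):
    """u1 is below u: every pair (p, q) of u is matched by a pair (p, q1) of u1 with q1 below q."""
    return all(any(p1 == p and (q1, q) in state_leq for (p1, q1) in u1)
               for (p, q) in u)


# Hypothetical example: two-state automaton for "contains a", with 1 below 0.
edges = {(0, "b", 0), (0, "a", 1), (1, "a", 1), (1, "b", 1)}
state_leq = {(0, 0), (1, 1), (1, 0)}
semigroup = transition_semigroup(edges, "ab")
mu_a, mu_b = letter_relation(edges, "a"), letter_relation(edges, "b")
print(len(semigroup), rel_leq(mu_a, mu_b, state_leq), rel_leq(mu_b, mu_a, state_leq))
```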
3.3 ω-Semigroups
Finite semigroups can be viewed as a two-sided algebraic counterpart of finite automata that recognize finite words. In the case of infinite words, they can be replaced by ω-semigroups, which are, roughly speaking, semigroups equipped with an infinite product. More formally, an ω-semigroup is a two-sorted algebra $S = (S_+, S_\omega)$ equipped with the following operations:

(1) A product $S_+ \times S_+ \to S_+$, that associates to each pair $(s, t) \in S_+ \times S_+$ an element of $S_+$ denoted $st$,
(2) A mixed product $S_+ \times S_\omega \to S_\omega$, that associates to each pair $(s, t) \in S_+ \times S_\omega$ an element of $S_\omega$ denoted $st$,
(3) An infinite product $S_+^{\mathbb{N}} \to S_\omega$, that associates to each infinite sequence $s_0, s_1, s_2, \ldots$ of elements of $S_+$ an element of $S_\omega$ denoted $s_0 s_1 s_2 \cdots$.

These three operations satisfy all possible types of associativity (a precise definition can be found in [13]). In particular, $S_+$, equipped with the binary operation, is a semigroup and for all $s, t \in S_+$ and $u \in S_\omega$, $s(tu) = (st)u$. The infinite product of the sequence $s, s, s, \ldots$ is denoted $s^\omega$.

In particular, we denote by $A^\infty$ the ω-semigroup $(A^+, A^\omega)$ equipped with the usual concatenation product.

Given two ω-semigroups $S = (S_+, S_\omega)$ and $T = (T_+, T_\omega)$, a morphism of ω-semigroups $S \to T$ is a pair $\varphi = (\varphi_+, \varphi_\omega)$ consisting of a semigroup morphism $\varphi_+ : S_+ \to T_+$ and of a mapping $\varphi_\omega : S_\omega \to T_\omega$ preserving the mixed product and the infinite product: for every sequence $(s_n)_{n \in \mathbb{N}}$ of elements of $S_+$,

$\varphi_\omega(s_0 s_1 s_2 \cdots) = \varphi_+(s_0)\varphi_+(s_1)\varphi_+(s_2) \cdots$

and for every $s \in S_+$, $t \in S_\omega$, $\varphi_+(s)\varphi_\omega(t) = \varphi_\omega(st)$. In the sequel, we shall omit the subscripts, and use the simplified notation $\varphi$ instead of $\varphi_+$ and $\varphi_\omega$.

Algebraic concepts like subsemigroup, quotient, division and product are easily adapted to ω-semigroups.
3.4 Ordered ω-Semigroups
An ordered ω-semigroup is an ω-semigroup $(S_+, S_\omega)$ equipped with two partial orders on $S_+$ and $S_\omega$ which are stable under the operations of ω-semigroup:

(1) for all $s, s', t \in S_+$, $s \le s'$ implies $ts \le ts'$ and $st \le s't$,
(2) for all $s, s' \in S_+$ and for all $u \in S_\omega$, $s \le s'$ implies $su \le s'u$,
(3) if $(s_n)_{n \in \mathbb{N}}$ and $(s'_n)_{n \in \mathbb{N}}$ are two sequences of elements of $S_+$ such that $s_n \le s'_n$ for all $n$, then $s_0 s_1 s_2 \cdots \le s'_0 s'_1 s'_2 \cdots$.

A subset $X = (X_+, X_\omega)$ of $A^\infty$ is recognized by an ordered ω-semigroup $S$ if there exists an order ideal $I = (I_+, I_\omega)$ of $S$ and a morphism of ordered ω-semigroups $\varphi : A^\infty \to S$ such that $X_+ = \varphi^{-1}(I_+)$ and $X_\omega = \varphi^{-1}(I_\omega)$.
4 Ordered Syntactic ω-Semigroup
The syntactic congruence of a recognizable subset $X$ of $A^\infty$ is the quasi-order $\le_X$ defined on $A^+$ by $u \le_X v$ if and only if, for every $x, y \in A^*$ and for every $z \in A^\omega$,

$xvy \in X_+ \Rightarrow xuy \in X_+$    (1)
$xvyz \in X_\omega \Rightarrow xuyz \in X_\omega$    (2)
$x(vy)^\omega \in X_\omega \Rightarrow x(uy)^\omega \in X_\omega$    (3)

and on $A^\omega$ by $u \le_X v$ if and only if, for every $x \in A^*$,

$xv \in X_\omega \Rightarrow xu \in X_\omega$

The syntactic ordered ω-semigroup of $X$, denoted by $S(X)$, is the quotient of $A^\infty$ by the syntactic congruence of $X$. It is a finite object, that can be effectively constructed, given a Büchi automaton recognizing $X$ (see [13]).
5 The Variety Theorem
We state in this section our extended version of the variety theorem. A variety of ordered semigroups is a class of finite ordered semigroups closed under division and finite product. If $\mathbf{V}$ is a variety of ordered semigroups, a $\mathbf{V}$-language is a language recognized by an ordered semigroup of $\mathbf{V}$, or, equivalently, whose syntactic ordered semigroup belongs to $\mathbf{V}$.

Similarly, a variety of ordered ω-semigroups is a class of finite ordered ω-semigroups closed under division and finite product. If $\mathbf{V}$ is a variety of ordered ω-semigroups, denote by $\mathcal{V}(A^\infty)$ the set of recognizable subsets of $A^\infty$ recognized by an ordered ω-semigroup of $\mathbf{V}$. This is also the set of subsets of $A^\infty$ whose syntactic ordered ω-semigroup belongs to $\mathbf{V}$.

An ∞-class of recognizable sets is a correspondence $\mathcal{C}$ which associates a set $\mathcal{C}(A^\infty)$ of recognizable sets of $A^\infty$ with every finite alphabet $A$. In particular, the correspondence $\mathbf{V} \to \mathcal{V}$ associates an ∞-class of recognizable sets with every variety of ordered ω-semigroups. The variety theorem states in particular that this correspondence is one-to-one. It also gives an abstract description of the classes $\mathcal{V}$ arising in this way.

Given a subset $X$ of $A^\infty$, a word $u \in A^*$ and an infinite word $v$ of $A^\omega$, set

$u^{-1}X = \{ x \in A^\infty \mid ux \in X \}$
$Xu^{-\omega} = \{ x \in A^+ \mid (xu)^\omega \in X \}$
$Xv^{-1} = \{ x \in A^+ \mid xv \in X \}$

A positive ∞-variety is an ∞-class $\mathcal{V}$ such that

(1) for every alphabet $A$, $\mathcal{V}(A^\infty)$ is closed under finite union and finite intersection,
(2) for every semigroup morphism $\varphi : A^+ \to B^+$, $X \in \mathcal{V}(B^\infty)$ implies $\varphi^{-1}(X) \in \mathcal{V}(A^\infty)$,
(3) if $X \in \mathcal{V}(A^\infty)$, then, for all $u \in A^*$, $u^{-1}X \in \mathcal{V}(A^\infty)$ and $Xu^{-\omega} \in \mathcal{V}(A^\infty)$, and for all $u \in A^\omega$, $Xu^{-1} \in \mathcal{V}(A^\infty)$.

It is important to remember that the elements of a positive ∞-variety are sets of finite or infinite words. The variety theorem can now be stated.

Theorem 1. The correspondence $\mathbf{V} \to \mathcal{V}$ defines a bijection between varieties of ordered ω-semigroups and positive ∞-varieties.
Varieties are conveniently defined by identities. Let $(u, v)$ be a pair of words of $A^+$. An ordered semigroup $S$ satisfies the identity $u \le v$ if and only if $\varphi(u) \le \varphi(v)$ for every morphism of ordered semigroups $\varphi : A^+ \to S$. Similarly, let $(u, v)$ be a pair of words of $A^\infty$. An ordered ω-semigroup $S$ satisfies the identity $u \le v$ if and only if $\varphi(u) \le \varphi(v)$ for every morphism of ordered ω-semigroups $\varphi : A^\infty \to S$. Given a set $E$ of identities, the class of all finite ordered ω-semigroups that satisfy all the identities of $E$ is a variety of ordered ω-semigroups, denoted $[[E]]$.
In a finite semigroup, the subsemigroup generated by an element $x$ contains a unique idempotent, denoted by $x^\pi$. We will also adopt this notation for identities (this can be rigorously justified, see [21]). For instance, an ordered ω-semigroup $S$ belongs to the variety $[[x^\pi y^\omega \le x^\omega]]$ if and only if, for every $x, y \in S_+$, $x^\pi y^\omega \le x^\omega$. Note that $x^\pi$ is an idempotent of $S_+$, while $x^\omega$ and $x^\pi y^\omega$ are elements of $S_\omega$.
6 Topological Classes
Our ordered version of the variety theorem is perfectly suited to give algebraic characterizations of certain topological properties of recognizable ω-languages. Similar results [25,26,13] were so far limited to classes closed under complement.

Recall that the topology on $A^\omega$ is defined by considering $A$ as a discrete space and by taking the product topology. In particular the open sets of $A^\omega$ are the sets of the form $XA^\omega$ for some $X \subseteq A^+$. A set is closed if its complement is open. Open sets form the first level $\Sigma_1$ of the Borel hierarchy. The second level $\Pi_2$ consists of the countable intersections of open sets. There is a dual hierarchy, whose first level is the class $\Pi_1$ of closed sets. The second level $\Sigma_2$ consists of the countable unions of closed sets. One can show that the recognizable sets of $\Pi_2$ are exactly the deterministic ω-languages. These classes have the following algebraic characterization.

Theorem 2. Let $X$ be a recognizable subset of $A^\omega$ and let $S(X)$ be its syntactic ordered ω-semigroup. Then:

(1) $X$ is in $\Sigma_1$ if and only if $S(X)$ belongs to $[[x^\pi y z^\omega \le x^\omega]]$,
(2) $X$ is in $\Pi_1$ if and only if $S(X)$ belongs to $[[x^\omega \le x^\pi y z^\omega]]$,
(3) $X$ is in $\Pi_2$ if and only if $S(X)$ belongs to $[[(x^\pi y^\pi)^\omega \le (x^\pi y^\pi)^\pi x^\omega]]$,
(4) $X$ is in $\Sigma_2$ if and only if $S(X)$ belongs to $[[(x^\pi y^\pi)^\pi x^\omega \le (x^\pi y^\pi)^\omega]]$.

7 Weak Recognizability
It is a well-known fact that every recognizable set of $A^\omega$ is a finite union of sets of the form $XY^\omega$, where $X$ and $Y$ are recognizable languages. Pécuchet [7,8] succeeded in relativizing this result to varieties of semigroups. We give in this section an even stronger version, that works for varieties of ordered semigroups.

We first need some auxiliary definitions. Let $S$ be a finite ordered semigroup and let $\varphi : A^+ \to S$ be a morphism of semigroups. A linked pair is a pair $(s, e) \in S \times S$ such that $e^2 = e$ and $se = s$. These linked pairs play an important role in the study of finite ω-semigroups (see [13]). Denote by $\downarrow s$ the order ideal generated by an element $s$ of $S$. Thus, by definition, $\downarrow s = \{ t \in S \mid t \le s \}$. An ω-language is called $\varphi$-simple if it is of the form $\varphi^{-1}(\downarrow s)\,(\varphi^{-1}(\downarrow e))^\omega$ where $(s, e)$ is a linked pair. Note that $\varphi^{-1}(\downarrow s) = \varphi^{-1}(\downarrow s)\,\varphi^{-1}(\downarrow e)$ and $(\varphi^{-1}(\downarrow e))^+ = \varphi^{-1}(\downarrow e)$. Furthermore, these two languages are, by construction, recognized by the ordered semigroup $S$. Finally, let us say that an ω-language is weakly recognized by $\varphi$ if it is a finite union of $\varphi$-simple ω-languages. Our theorem can now be stated:

Theorem 3. Let $\mathbf{V}$ be a variety of ordered semigroups, and let $Z$ be a subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is recognized by an ordered Büchi automaton whose ordered transition semigroup belongs to $\mathbf{V}$,
(2) $Z$ is weakly recognized by an ordered semigroup of $\mathbf{V}$,
(3) $Z$ is a finite union of sets of the form $XY^\omega$, where $XY$ and $Y^+$ are $\mathbf{V}$-languages.

The reader may observe that this result does not fit directly into the theory of varieties, as developed in Sect. 5. Actually, the connection between weak recognizability and recognizability is a difficult topic, that cannot be covered here. But in some cases (see for instance Theorems 5, 7 and 9 below), weak recognizability is equivalent with recognizability.
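The linked pairs and the order ideals generated by single elements are easy to enumerate in a finite ordered semigroup. The sketch below assumes the same hypothetical encoding as a multiplication table and a set of order pairs; the two-element example is an illustration, not taken from the text.

```python
def linked_pairs(elements, mul):
    """Pairs (s, e) with e idempotent and s e = s."""
    idempotents = [e for e in elements if mul[(e, e)] == e]
    return {(s, e) for s in elements for e in idempotents if mul[(s, e)] == s}


def down_set(s, leq):
    """Order ideal generated by s: all t with t below s."""
    return {t for (t, u) in leq if u == s}


# Hypothetical example: e is an identity, z is a zero, and z is below e.
elements = ["e", "z"]
mul = {("e", "e"): "e", ("e", "z"): "z", ("z", "e"): "z", ("z", "z"): "z"}
leq = {("e", "e"), ("z", "z"), ("z", "e")}
print(sorted(linked_pairs(elements, mul)))
print(sorted(down_set("e", leq)), sorted(down_set("z", leq)))
```

Each linked pair $(s, e)$ of this list gives one candidate $\varphi$-simple ω-language once a morphism $\varphi$ is fixed.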
8 Shuffle Ideals
As an illustration of our results, we investigate in this section the analog for infinite words of some standard classes of recognizable languages. Recall that a shuffle ideal is a finite union of languages of the form $A^*a_1A^*a_2A^* \cdots A^*a_nA^*$, where $n > 0$ and $a_1, \ldots, a_n$ are letters of $A$. It was shown in [22] that a language is a shuffle ideal if and only if its syntactic ordered monoid satisfies the identity $x \le 1$. It is equivalent to saying that its syntactic ordered semigroup satisfies the identities $xy \le x$ and $yx \le x$.

Several extensions to ω-languages can be proposed. A first possibility is to consider finite unions of sets of the form $A^*a_1A^*a_2 \cdots A^*a_kA^\omega$.

Theorem 4. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a finite union of sets of the form $A^*a_1A^*a_2 \cdots A^*a_kA^\omega$, where $a_1, \ldots, a_k \in A$,
(2) $Z$ is of the form $\overrightarrow{L}$, where $L$ is a shuffle ideal,
(3) $S(Z)$ satisfies the identities $xy \le x$, $yx \le x$ and $x^\pi y^\omega \le x^\omega$.

Another possibility is to consider ω-languages whose syntactic ordered ω-semigroup satisfies the identities $xy \le x$ and $yx \le x$.
Theorem 5. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a positive boolean combination of sets of the form $A^*a_1A^*a_2 \cdots A^*a_kA^\omega$, with $k > 0$, $a_1, \ldots, a_k \in A$, or $(A^*a)^\omega$, with $a \in A$,
(2) $Z$ is a finite union of sets of the form $XY^\omega$, where $XY$ and $Y^+$ are shuffle ideals,
(3) $Z$ is weakly recognized by an ordered semigroup of $[[xy \le x,\ yx \le x]]$,
(4) $Z$ is weakly recognized by an ordered semigroup of $[[x^\pi y \le x^\pi,\ yx^\pi \le x^\pi]]$,
(5) $S(Z)$ satisfies the identities $xy \le x$ and $yx \le x$,
(6) $S(Z)$ satisfies the identities $x^\pi y \le x^\pi$ and $yx^\pi \le x^\pi$.

The equivalence of conditions (5) and (6) seems to contradict Theorem 1 since the varieties of ordered ω-semigroups $[[xy \le x,\ yx \le x]]$ and $[[x^\pi y \le x^\pi,\ yx^\pi \le x^\pi]]$, which are distinct, define the same class of ω-languages. But in fact, they define two different classes of +-languages, and hence, two different classes of ∞-languages, which explains the paradox.

The previous results lead us to consider the variety of ordered semigroups defined by the identity $xy \le x$. The corresponding languages are finite unions of languages of the form $a_0A^*a_1A^*a_2A^* \cdots a_kA^*$, where $a_0, a_1, \ldots, a_k \in A$. We have again two possible generalizations.
Theorem 6. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a finite union of sets of the form $a_0A^*a_1A^* \cdots a_kA^\omega$, where $a_0, a_1, \ldots, a_k \in A$,
(2) $Z$ is of the form $\overrightarrow{L}$, where $S(L) \in [[xy \le x]]$,
(3) $S(Z)$ satisfies the identities $xy \le x$ and $x^\pi y^\omega \le x^\omega$.

The other possibility is to consider the ω-languages whose syntactic ordered ω-semigroup satisfies the identity $xy \le x$.
Theorem 7. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a positive boolean combination of sets of the form $a_0A^*a_1A^* \cdots a_kA^\omega$ or $(A^*a)^\omega$, where $a, a_0, \ldots, a_k \in A$,
(2) $Z$ is a finite union of sets of the form $XY^\omega$, where $S(XY), S(Y^+) \in [[xy \le x]]$,
(3) $Z$ is weakly recognized by an ordered semigroup of $[[xy \le x]]$,
(4) $S(Z)$ satisfies the identity $xy \le x$.
One can also consider boolean combinations of shuffle ideals. These languages are called piecewise testable and have been intensively studied. They admit a simple, but deep algebraic characterization, discovered by Simon [23]. Recall that a finite semigroup is J-trivial if two elements that generate the same ideal are equal. A recognizable language is piecewise testable if and only if its syntactic semigroup is J-trivial. There are again two possible extensions to infinite words, which are more precise versions of results by Pécuchet [7,8].
Theorem 8. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a boolean combination of sets of the form $A^*a_1A^*a_2 \cdots A^*a_kA^\omega$, where $a_1, \ldots, a_k \in A$,
(2) $Z$ is a boolean combination of sets of the form $\overrightarrow{L}$, where $L$ is a shuffle ideal,
(3) $Z = \overrightarrow{L}$, where $L$ is piecewise testable,
(4) $Z$ is a boolean combination of sets of the form $\overrightarrow{L}$, where $L$ is piecewise testable,
(5) $S_+(Z)$ is J-trivial and $S(Z)$ satisfies the identity $(x^\pi y^\pi)^\pi x^\omega = (x^\pi y^\pi)^\pi y^\omega$.
Theorem 9. Let $Z$ be a recognizable subset of $A^\omega$. The following conditions are equivalent:

(1) $Z$ is a boolean combination of sets of the form $A^*a_1A^*a_2 \cdots A^*a_kA^\omega$, with $a_1, \ldots, a_k \in A$, or $(A^*a)^\omega$, with $a \in A$,
(2) $Z$ is a boolean combination of sets of the form $XY^\omega$, where $XY$ and $Y^+$ are shuffle ideals,
(3) $Z$ is a finite union of sets of the form $XY^\omega$, where $XY$ and $Y^+$ are piecewise testable,
(4) $Z$ is a boolean combination of sets of the form $XY^\omega$, where $XY$ and $Y^+$ are piecewise testable,
(5) $Z$ is weakly recognized by a J-trivial semigroup,
(6) $S_+(Z)$ is J-trivial.
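The basic sets used throughout this section have a simple operational reading: a finite word belongs to $A^*a_1A^* \cdots A^*a_kA^*$ exactly when $a_1 \cdots a_k$ is a scattered subword of it, and an ultimately periodic word $uv^\omega$ belongs to $A^*a_1A^* \cdots A^*a_kA^\omega$ exactly when the finite prefix $uv^k$ already contains $a_1 \cdots a_k$ as a scattered subword. The following sketch illustrates this; the function names and the representation of a shuffle ideal by a list of pattern words are assumptions made for the example.

```python
def is_subword(pattern, word):
    """True if pattern is a scattered subword of word."""
    remaining = iter(word)
    # "letter in remaining" consumes the iterator up to the first match.
    return all(letter in remaining for letter in pattern)


def in_shuffle_ideal(word, patterns):
    """Membership of a finite word in the union of the ideals given by patterns."""
    return any(is_subword(p, word) for p in patterns)


def in_omega_shuffle_set(u, v, patterns):
    """Membership of u v^omega in the corresponding set of infinite words.

    v must be non-empty; len(p) copies of v suffice for a pattern p, because
    a greedy search finds each letter within the next copy of v or never.
    """
    if not v:
        raise ValueError("the periodic part v must be non-empty")
    return any(is_subword(p, u + v * len(p)) for p in patterns)


print(in_shuffle_ideal("bbab", ["ab"]), in_shuffle_ideal("ba", ["ab"]))
print(in_omega_shuffle_set("b", "ba", ["aab"]), in_omega_shuffle_set("b", "ba", ["c"]))
```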
References

1. A. Arnold, A syntactic congruence for rational ω-languages, Theoret. Comput. Sci. 39, (1985) 333-335.
2. J. R. Büchi, Weak second-order arithmetic and finite automata, Z. Math. Logik und Grundl. Math. 6, (1960) 66-92.
3. J. R. Büchi, On a decision method in restricted second-order arithmetic, in Proc. 1960 Int. Congr. for Logic, Methodology and Philosophy of Science, Stanford Univ. Press, Stanford, (1962) 1-11.
4. S. Eilenberg, Automata, languages and machines, Vol. B, Academic Press, New York (1976).
5. R. McNaughton, Testing and generating infinite sequences by a finite automaton, Information and Control 9, (1966) 521-530.
6. D. Muller, Infinite sequences and finite machines, in Switching Theory and Logical Design, Proc. Fourth Annual Symp. IEEE, (1963) 3-16.
7. J.-P. Pécuchet, Variétés de semigroupes et mots infinis, in B. Monien and G. Vidal-Naquet eds., STACS 86, Lecture Notes in Computer Science 210, Springer, (1986) 180-191.
8. J.-P. Pécuchet, Etude syntaxique des parties reconnaissables de mots infinis, in Proc. 13th ICALP, (L. Kott ed.) Lecture Notes in Computer Science 226, Springer, Berlin, (1986) 294-303.
9. D. Perrin, Variétés de semigroupes et mots infinis, C.R. Acad. Sci. Paris 295, (1982) 595-598.
10. D. Perrin, Recent results on automata and infinite words, in Mathematical Foundations of Computer Science, Lecture Notes in Computer Science 176, Springer, Berlin, (1984) 134-148.
11. D. Perrin, An introduction to automata on infinite words, in Automata on Infinite Words (Nivat, M. ed.), Lecture Notes in Computer Science 192, Springer, Berlin, (1984) 2-17.
12. D. Perrin and J.-E. Pin, First order logic and star-free sets, J. Comput. System Sci. 32, (1986), 393-406.
13. D. Perrin and J.-E. Pin, Semigroups and automata on infinite words, in NATO Advanced Study Institute Semigroups, Formal Languages and Groups, J. Fountain (ed.), Kluwer academic publishers, (1995), 49-72.
14. D. Perrin and J.-E. Pin, Mots infinis, to appear (LITP report 97-04), (1997). Accessible on the web: http://liafa.jussieu.fr/~jep.
15. J.-E. Pin, Variétés de langages formels, Masson, Paris (1984); English translation: Varieties of formal languages, Plenum, New-York (1986).
16. J.-E. Pin, Finite semigroups and recognizable languages: an introduction, in NATO Advanced Study Institute Semigroups, Formal Languages and Groups, J. Fountain (ed.), Kluwer academic publishers, (1995), 1-32.
17. J.-E. Pin, A variety theorem without complementation, Russian Mathematics (Iz. VUZ) 39 (1995), 80-90.
18. J.-E. Pin, A negative answer to a question of Wilke on varieties of ω-languages, Information Processing Letters, (1995), 197-200.
19. J.-E. Pin, Logic, Semigroups and Automata on Words, Annals of Mathematics and Artificial Intelligence 16 (1996), 343-384.
20. J.-E. Pin, Syntactic semigroups, in Handbook of language theory, G. Rozenberg and A. Salomaa (ed.), Springer Verlag, 1997.
21. J.-E. Pin and P. Weil, A Reiterman theorem for pseudovarieties of finite first-order structures, Algebra Universalis 35 (1996), 577-595.
22. J.-E. Pin and P. Weil, Polynomial closure and unambiguous product, Theory Comput. Systems 30 (1997), 1-39.
23. I. Simon, Piecewise testable events, Proc. 2nd GI Conf., Lecture Notes in Computer Science 33, Springer, Berlin, (1975) 214-222.
24. W. Thomas, Automata on infinite objects, in Handbook of Theoretical Computer Science, vol B, Formal models and semantics, Elsevier, (1990) 135-191.
25. T. Wilke, An Eilenberg theorem for ∞-languages, in Automata, Languages and Programming, Lecture Notes in Computer Science 510, Springer Verlag, Berlin, Heidelberg, New York, (1991), 588-599.
26. T. Wilke, An algebraic theory for regular languages of finite and infinite words, Int. J. Alg. Comput. 3, (1993), 447-489.
27. T. Wilke, Locally threshold testable languages of infinite words, in STACS 93, P. Enjalbert, A. Finkel, K.W. Wagner (Eds.), Lecture Notes in Computer Science 665, Springer, Berlin, (1993) 607-616.
Unfolding Parametric Automata
Marcos Veloso Peixoto (1) and Laurent Fribourg (2)
(1) U.F.F., Praça do Valonguinho, s.n. 45, Niterói, R.J., Brazil
(2) L.I.E.N.S (URA 1327 CNRS), 45, rue d'Ulm, 75005 Paris, France
Abstract. In this paper, we describe a method for proving properties of parametric automata. These properties have the form $p(L, S) \Rightarrow \varphi(S)$, where $p$ is a predicate characterizing the automaton, $L$ is a list of parametric actions accepted by the automaton, $S$ is a parametric state, and $\varphi$ a logic formula. In previous work, we proposed a bottom-up evaluation process for computing the set of all the consequences of $p(L, S)$, then showing that it contains $\varphi(S)$. Such forward reasoning methods often do not terminate. We propose here instead a backward reasoning method which is driven by the conclusion $\varphi$. The method consists essentially in transforming the program defining $p$ by unfolding all the recursive clauses that do not leave $\varphi$ invariant. We have successfully applied this method to the verification of some functional properties of a parametric sliding window protocol.
1 Introduction

As pointed out in [1], most of the work on the semantics of interacting processes relies on the notion of automata. On the one hand, each process can be modeled by a finite-state automaton (or, more generally, by a transition system). On the other hand, synchronization operations, such as synchronized product [2], can be defined over automata in order to model interaction between components of a system of processes. The verification of properties of process systems is often done by specifying the property as a formula of a certain logic, and interpreting it over the finite state graph of the automaton modeling the system. Unfortunately, the number of states of the graph very often explodes combinatorially as soon as a value of some parameter of the system increases. For this reason it is highly useful to model interacting processes, not by finite state machines, but by generic automata whose states (and actions) contain non instantiated parameters: each parametric state represents a priori an infinite number of states.

We have proposed to describe such parametric automata using logic programming with arithmetical constraints: each parametric automaton is a set of clauses, say $(i)$ $(1 \le i \le k)$, labeled by parametric actions $a_i(A_i)$, which express when a transition makes the system go from a state $S$ to a new state $S'$. (The symbols $A_i$, $S$ and $S'$ denote vectors of arithmetical variables constrained by some arithmetical condition associated with the transition.) In order to prove properties of such parametric automata, we proposed in [8] to describe the property as an inclusion of the lists of actions accepted by the system automaton into the list of actions accepted by an automaton characteristic of the property that one wants to prove. Equivalently, this amounts to expressing properties of the system under the form:

$aut_{system}(L, S_1) \wedge aut_{observer}(L, S_2) \Rightarrow \varphi(S_1, S_2)$

where the observer automaton $aut_{observer}(L, S_2)$ denotes a parametric automaton which intends to collect information about the list of actions accepted by the system automaton, and $\varphi(S_1, S_2)$ is an arithmetical formula. Since the intersection of two parametric automata is a parametric automaton, the above property can be put under the form:

$aut(L, S_1, S_2) \Rightarrow \varphi(S_1, S_2)$   (*)

where $aut$ denotes a parametric automaton. In previous work, in order to prove (*), we first computed by bottom-up evaluation of the associated logic program an arithmetic formula $\xi(S_1, S_2)$ characteristic of $aut(L, S_1, S_2)$. The proof of (*) then reduced to the proof of the arithmetic formula $\xi(S_1, S_2) \Rightarrow \varphi(S_1, S_2)$ (see [8] for details). The problem with this method is that bottom-up evaluation procedures often do not terminate. Here, we propose instead to prove the above formula by transforming the program defining $aut$ so that each of its recursive clauses leaves invariant the arithmetical conclusion $\varphi(S_1, S_2)$. The transformation is done by unfolding.

The plan of this paper is as follows: In section 2 we present the notion of parametric automata and explain how we use it to model systems of concurrent processes. In section 3 we introduce our basic proof procedure (based on unfolding) for proving properties of parametric automata. In section 4 we describe the form of the clauses that are generated in our framework by unfolding. In section 5 we refine the basic procedure using the notion of cyclic clauses. In section 6 we apply the procedure to finite automata and in section 7 we present the conclusion of this work.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 88-101, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Automata with Arithmetical Constraints

The idea of using logic programs for modeling concurrent systems with parameters was explored at the beginning of the eighties by Azema, Bochmann and others [3,4]. They used Prolog for simulating and testing protocols at a generic level: parameters did not need to be instantiated with constant values. In [8], we proposed to use logic programming methods for proving properties of such communicating systems with parameters. We used the notion of generalized automata with arithmetic constraints: that is, a state machine whose actions and states contain arithmetic parameters. With each transition is associated an arithmetic constraint that must be satisfied by the action and state parameters for enabling the transition.

Formally, a generalized automaton with arithmetical constraints (or more simply an arithmetical automaton) is a logic program made of a base clause of the form

$p([\,], S) \mathrel{:-} \psi_0(S)$

and of recursive clauses of the form $(1 \le i \le k)$

$p([a_i(A_i) \mid L], S') \mathrel{:-} p(L, S),\ \psi_i(S, S', A_i)$

where $[\,]$ denotes the empty list, $[\cdot \mid \cdot]$ the cons operator, $L$ a list variable, $S$, $S'$ and $A_i$ vectors of integer variables, $a_i$ an uninterpreted function symbol, $\psi_0$ and $\psi_i$ arithmetic formulas. Given a recursive rule of the above form, we say that $S$ is the source state and $S'$ is the target state of the recursive rule. We also say that the values of $S$ that satisfy $\psi_0$ are the initial states. The first argument of $p$ represents a list $L$ of actions $a_i$ accepted by the automaton. The second argument of $p$ represents the state $S$ reached by the automaton when executing the list of actions $L$. Each recursive rule corresponds to a transition from state $S$ to state $S'$ via action $a_i$. In order to make the functional dependencies between $S$ and $S'$ appear, we will use the following equivalent definition:

Definition 1. An arithmetical automaton is a logic program made of a base clause of the form

$p([\,], f_0(S)) \mathrel{:-} \psi_0(S)$

and of recursive clauses of the form $(1 \le i \le k)$

$p([a_i(A_i) \mid L], f_i(S, N_i)) \mathrel{:-} p(L, S),\ \psi_i(S, N_i, A_i)$

where $f_0$ and $f_i$ denote arithmetic functions, and $N_i$ is the vector of state variables introduced by the $i$-th transition.

Remark 1. One can retrieve the original definition by defining $f_i(S, N_i)$ to be equal to $N_i$ and identifying $N_i$ with $S'$.

Remark 2. A finite state automaton can be seen as an arithmetic automaton: each transition from a source state, say $s$, to a target state, say $t$, can be simulated by the clause

$p([a_{s,t} \mid L], t) \mathrel{:-} p(L, S),\ S = s$

Note that the answer for $L$ to the query $p(L, S)$ is the reverse of the list of actions accepted by the automaton.
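As an illustration of Remark 2, the following Python sketch enumerates the answers to the query p(L, S) for a small finite automaton. The two-state automaton, the names `TRANSITIONS`, `INITIAL` and `answers`, and the length bound are assumptions introduced only for this example. The point it is meant to show is that each clause conses the newest action in front of L, so L lists the executed actions in reverse order.

```python
from collections import deque

# Hypothetical two-state automaton (not taken from the paper):
# state 0 --a--> state 1 --b--> state 0, with initial state 0.
TRANSITIONS = [(0, "a", 1), (1, "b", 0)]
INITIAL = 0


def answers(max_len):
    """Enumerate the answers (L, S) to the query p(L, S) with len(L) <= max_len."""
    found = []
    queue = deque([((), INITIAL)])  # base clause: p([], s0)
    while queue:
        actions, state = queue.popleft()
        found.append((list(actions), state))
        if len(actions) == max_len:
            continue
        for source, action, target in TRANSITIONS:
            if state == source:  # constraint S = s of the clause
                # clause head p([a|L], t): the newest action is consed in front
                queue.append(((action,) + actions, target))
    return found


if __name__ == "__main__":
    for actions, state in answers(3):
        print(actions, state)
```

Executing a and then b from state 0 is reported as the answer L = [b, a], S = 0, which is the reversal mentioned in the remark.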
We will use arithmetic automata for modeling systems of concurrent processes: given a system described by a set of interacting processes, we first represent each component process by an arithmetic automaton; then, applying the operation of arithmetical automata synchronization (see [8]), we generate an arithmetic automaton that models the system. (The operation of synchronization is closed for arithmetic automata.) This is illustrated in example 1. We will also use arithmetic automata for collecting information about the concurrent system: in this case, the automata play the role of observers of the system (for the use of observers in concurrent systems, see [6,10]). This is illustrated in examples 2 and 3.

Example 1. The following program $\Sigma_1$ is an arithmetic automaton:

p1([], 0, 0, 0, -1)
p1([put1(A)|L], U, B+1-U, U, U-1) :- p1(L, B, S, U, W), S = 0, A = B
p1([get2(A)|L], W+1, B+S-W-1, U+1, U) :- p1(L, B, S, U, W), A = U
p1([put3(A)|L], B, 1, U, W) :- p1(L, B, S, U, W), S = 0, A = B

The state variables are B, S, U and W. Note that no new state variable is introduced by the transitions. This program is a simplified version of an automaton which models the behavior of the sliding window protocol [13,9]. The sliding window automaton was obtained by synchronizing the automata of four component processes (receiver, sender, medium1 and medium2).

Example 2. The program $\Sigma_2$ below is an observer of the above automaton $\Sigma_1$:

p2([], 0, 0)
p2([put1(A)|L], Z1+1, Z2) :- p2(L, Z1, Z2)
p2([get2(A)|L], Z1, Z2+1) :- p2(L, Z1, Z2)
p2([put3(A)|L], Z1+1, Z2) :- p2(L, Z1, Z2)

The state variables are Z1 and Z2. Argument Z1 stores the number of occurrences of actions put1, put3 in list argument L, while argument Z2 stores the number of occurrences of action get2.

Example 3. The following program $\Sigma_3$ is another observer of $\Sigma_1$:

p3([], 0, 0)
p3([put1(A)|L], C, Z2) :- p3(L, Z1, Z2), A = C
p3([get2(A)|L], Z1, D) :- p3(L, Z1, Z2), A = D
p3([put3(A)|L], C, Z2) :- p3(L, Z1, Z2), A = C

The state variables are Z1 and Z2. Argument Z1 stores the parameter value of the last action put1 or put3 in a list L, while Z2 stores the parameter value of the last action get2.

We proved in [8] the following proposition:

Proposition 1. Let $p_1$ and $p_2$ be two predicates defined by arithmetic automata. Let $p$ be defined as:

$p(L, S_1, S_2) \mathrel{:-} p_1(L, S_1),\ p_2(L, S_2)$   (*)

where $S_1$ and $S_2$ are disjoint vectors of arithmetic variables. Then, by folding/unfolding operations over (*), the predicate $p$ can be defined by an arithmetic automaton.

Example 4. a) Let us consider the arithmetic automaton $\Sigma_1$ of example 1 and the observer $\Sigma_2$ of example 2. The predicate p defined by

p(L, B, S, U, W, Z1, Z2) :- p1(L, B, S, U, W), p2(L, Z1, Z2)

can be redefined by the equivalent program:

p([], 0, 0, 0, -1, 0, 0)
p([put1(A)|L], U, B+1-U, U, U-1, Z1+1, Z2) :- p(L, B, S, U, W, Z1, Z2), S = 0, A = B
p([get2(A)|L], W+1, B+S-W-1, U+1, U, Z1, Z2+1) :- p(L, B, S, U, W, Z1, Z2), A = U
p([put3(A)|L], B, 1, U, W, Z1+1, Z2) :- p(L, B, S, U, W, Z1, Z2), S = 0, A = B

which is an arithmetic automaton.

b) Consider now the observer $\Sigma_3$ of example 3. The predicate p defined as

p(L, B, S, U, W, Z1, Z2) :- p1(L, B, S, U, W), p3(L, Z1, Z2)

can be redefined as the equivalent program:

p([], 0, 0, 0, -1, 0, 0)
p([put1(A)|L], U, B+1-U, U, U-1, B, Z2) :- p(L, B, S, U, W, Z1, Z2), S = 0, A = B
p([get2(A)|L], W+1, B+S-W-1, U+1, U, Z1, U) :- p(L, B, S, U, W, Z1, Z2), A = U
p([put3(A)|L], B, 1, U, W, B, Z2) :- p(L, B, S, U, W, Z1, Z2), S = 0, A = B

which is an arithmetic automaton.
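To make the product automaton of example 4a more concrete, here is a minimal Python simulator, given as a sketch under stated assumptions. The state updates are those of the clauses of example 1 combined with the counters of example 2; the function names `step` and `run`, the tuple encoding of the state, and the convention that actions are supplied in execution order (the list L of the clauses holds them in reverse) are choices made for this illustration only.

```python
INITIAL = (0, 0, 0, -1, 0, 0)  # (B, S, U, W, Z1, Z2), taken from the base clause


def step(state, action, a):
    """Apply one action with parameter a; return the new state, or None if refused."""
    b, s, u, w, z1, z2 = state
    if action == "put1" and s == 0 and a == b:
        return (u, b + 1 - u, u, u - 1, z1 + 1, z2)
    if action == "get2" and a == u:
        return (w + 1, b + s - w - 1, u + 1, u, z1, z2 + 1)
    if action == "put3" and s == 0 and a == b:
        return (b, 1, u, w, z1 + 1, z2)
    return None


def run(actions):
    """Run (action, parameter) pairs in execution order from the initial state."""
    state = INITIAL
    for action, a in actions:
        state = step(state, action, a)
        if state is None:
            return None  # the sequence is not accepted
    return state


if __name__ == "__main__":
    print(run([("put1", 0), ("get2", 0)]))
    print(run([("put1", 0), ("put1", 0)]))
```

In this sketch a second put1 directly after a first one is refused, because the first put1 leaves S = 1 and both put clauses require S = 0.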
3 Basic Proof Procedure

Given a predicate $p_1$ defined by an arithmetic automaton (representing a concurrent system), and a predicate $p_2$ defined by an arithmetic automaton (representing an observer), our goal is to prove properties of the form

$p_1(L, S_1) \wedge p_2(L, S_2) \Rightarrow \varphi(S_1, S_2)$   (*)

where $S_1$ and $S_2$ are disjoint vectors of arithmetic variables and $\varphi(S_1, S_2)$ is an arithmetic formula. The conjunction $p_1(L, S_1) \wedge p_2(L, S_2)$ can be replaced in equation (*) by an equivalent atom $p(L, S_1, S_2)$, which is itself defined by an arithmetic automaton $\Sigma$ (see proposition 1). Our goal then reduces to proving that:

$p(L, S_1, S_2) \Rightarrow \varphi(S_1, S_2)$

The method described in [8] consists of two steps:
(1) projecting the program $\Sigma$ onto its arithmetical argument $S$ (i.e., dropping the list argument $L$). This yields a program $\Sigma'$ defining a predicate $p'$.
(2) proving: $p'(S_1, S_2) \Rightarrow \varphi(S_1, S_2)$.

In order to prove $p'(S_1, S_2) \Rightarrow \varphi(S_1, S_2)$, we first compute an arithmetical formula $\xi(S_1, S_2)$ (equivalent to $p'(S_1, S_2)$) by bottom-up evaluation of $\Sigma'$. Then we prove that $\xi(S_1, S_2) \Rightarrow \varphi(S_1, S_2)$. The problem with this method is that the bottom-up evaluation of $\Sigma'$ does not terminate in general (although some procedures, such as Revesz's one [12], are guaranteed to terminate in particular cases). This bottom-up evaluation corresponds to the construction of the least fixed point associated with $\Sigma'$. Actually the explicit construction of the least fixed point is something stronger than needed. In order to prove the property, it suffices to show that the immediate consequence operator associated with $\Sigma'$ preserves (i.e., leaves invariant) the arithmetical conclusion $\varphi(S_1, S_2)$. We propose hereafter to show this by selecting all the recursive clauses of $\Sigma'$ which do not preserve the conclusion $\varphi(S_1, S_2)$, and by replacing each of them by an equivalent set of clauses. This replacement will be done through the program transformation rule of unfolding, and will be repeated until all the recursive clauses of the transformed program preserve the conclusion.

The elimination of the list variable $L$ at step 1 gives a program $\Sigma'$ of the form:

(0) $p'(f_0(S)) \mathrel{:-} \psi_0(S)$
(i) $p'(f_i(S, N_i)) \mathrel{:-} p'(S),\ \psi_i(S, N_i, A_i)$

In each recursive clause (i), the formula $\psi_i(S, N_i, A_i)$ can be simplified as an equivalent formula $\psi'_i(S, N_i)$, by elimination of the existential variable $A_i$.

Example 5. Let us consider the program of example 4a. By projection onto the arithmetical arguments and elimination of A, we get the program $\Sigma'$:

p'(0, 0, 0, -1, 0, 0)   (0)
p'(U, B+1-U, U, U-1, Z1+1, Z2) :- p'(B, S, U, W, Z1, Z2), S = 0   (1)
p'(W+1, B+S-W-1, U+1, U, Z1, Z2+1) :- p'(B, S, U, W, Z1, Z2)   (2)
p'(B, 1, U, W, Z1+1, Z2) :- p'(B, S, U, W, Z1, Z2), S = 0   (3)

In order to prove the property:

$p_1(L, B, S, U, W) \wedge p_2(L, Z_1, Z_2) \Rightarrow \varphi(Z_1, Z_2)$

it suffices to show:

$p'(B, S, U, W, Z_1, Z_2) \Rightarrow \varphi(Z_1, Z_2)$   (*)

Instead of trying to generate an arithmetic formula $\xi(B, S, U, W, Z_1, Z_2)$ by bottom-up evaluation of $\Sigma'$ (then to prove $\xi(B, S, U, W, Z_1, Z_2) \Rightarrow \varphi(Z_1, Z_2)$, as suggested in [7]), we will prove (*) by unfolding iteratively the clauses of $\Sigma'$ that do not preserve the arithmetic conclusion $\varphi(Z_1, Z_2)$.

Let us now define formally the notions of satisfiability and preservation of a formula by a clause.

Definition 2. Let us consider a base clause C of the form $p'(f(S)) \mathrel{:-} \psi'(S)$ and an arithmetic formula $\varphi(S)$. The clause C satisfies the formula $\varphi$ iff:

$\psi'(S) \Rightarrow \varphi(f(S))$

Definition 3. Let us consider a recursive clause D of the form $p'(f(S, N)) \mathrel{:-} p'(S),\ \psi'(S, N)$ and an arithmetic formula $\varphi(S)$. The clause D preserves the formula $\varphi$ iff:

$\psi'(S, N) \wedge \varphi(S) \Rightarrow \varphi(f(S, N))$

Remark 3. If the condition $\psi'(S)$ of the base clause C is unsatisfiable, then clause C trivially satisfies the formula $\varphi$. Likewise, if the condition $\psi'(S, N)$ of the recursive clause D is unsatisfiable, then clause D trivially preserves $\varphi$.

Example 6. Consider $\Sigma'$ of example 5 and define $\varphi(Z_1, Z_2)$ as $Z_2 + 1 \ge Z_1$. We have:
- clause (0) satisfies the arithmetic formula $\varphi$ since $0 + 1 \ge 0$.
- clauses (1) and (3) do not preserve $\varphi$ since $(S = 0 \wedge \varphi(Z_1, Z_2)) \not\Rightarrow \varphi(Z_1 + 1, Z_2)$ (i.e., $S = 0 \wedge Z_2 + 1 \ge Z_1 \not\Rightarrow Z_2 + 1 \ge Z_1 + 1$).
- clause (2) preserves $\varphi$ since $\varphi(Z_1, Z_2) \Rightarrow \varphi(Z_1, Z_2 + 1)$ (i.e., $Z_2 + 1 \ge Z_1 \Rightarrow Z_2 + 2 \ge Z_1$).

Our proof procedure would therefore unfold clauses (1) and (3).

Our basic procedure iteratively unfolds the recursive clauses that do not preserve $\varphi$. It stops when all the generated recursive clauses preserve $\varphi$. The correctness of this procedure relies on the following theorem:

Theorem 1. Let $p'$ be a predicate defined by (the projection of) an arithmetic automaton program $\Sigma'$, and $\varphi$ an arithmetic formula. Let $\Sigma''$ be a program equivalent to $\Sigma'$. Suppose that each clause of $\Sigma''$ is either a base clause satisfying $\varphi$, or a recursive clause preserving $\varphi$. Then $p'(S) \Rightarrow \varphi(S)$ holds in the least model of $\Sigma'$.
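The checks of example 6 can be mimicked by a brute-force search for counterexamples to preservation (definition 3). The Python sketch below is only an illustration: it assumes the clause updates of example 1 for the projected program, takes the property to be Z2 + 1 >= Z1, and scans a small box of integer states whose bound is arbitrary. A counterexample found in the box refutes preservation; the absence of one is only evidence, not a proof.

```python
from itertools import product


def phi(z1, z2):
    """Property of example 6: Z2 + 1 >= Z1."""
    return z2 + 1 >= z1


# Recursive clauses (1)-(3) of the projected program as (guard, update) pairs
# over states (B, S, U, W, Z1, Z2).
CLAUSES = {
    1: (lambda b, s, u, w, z1, z2: s == 0,
        lambda b, s, u, w, z1, z2: (u, b + 1 - u, u, u - 1, z1 + 1, z2)),
    2: (lambda b, s, u, w, z1, z2: True,
        lambda b, s, u, w, z1, z2: (w + 1, b + s - w - 1, u + 1, u, z1, z2 + 1)),
    3: (lambda b, s, u, w, z1, z2: s == 0,
        lambda b, s, u, w, z1, z2: (b, 1, u, w, z1 + 1, z2)),
}


def counterexample(index, bound=2):
    """Return a state in [-bound, bound]^6 where clause `index` breaks phi, or None."""
    guard, update = CLAUSES[index]
    values = range(-bound, bound + 1)
    for state in product(values, repeat=6):
        if guard(*state) and phi(state[4], state[5]):
            target = update(*state)
            if not phi(target[4], target[5]):
                return state
    return None


if __name__ == "__main__":
    for i in sorted(CLAUSES):
        print(i, counterexample(i))
```

Under these assumptions the search finds violating states for clauses (1) and (3) and none for clause (2), in line with the example.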
4 Unfolding

In this section, we describe the form of the clauses that are generated by unfolding arithmetic automata. Let

(0): $p'(f_0(S)) \mathrel{:-} \psi'_0(S)$ be the basic clause of $\Sigma'$,
(j): $p'(f_j(S, N_j)) \mathrel{:-} p'(S),\ \psi'_j(S, N_j)$ the recursive clause of $\Sigma'$,
and C the recursive clause: $p'(f_i(S, N_i)) \mathrel{:-} p'(S),\ \psi'_i(S, N_i)$.

From the definition of unfolding (see [14]), we have:
- the result of unfolding C via (0) is the base clause: $p'(f_i(f_0(S), N_i)) \mathrel{:-} \psi'_i(f_0(S), N_i),\ \psi'_0(S)$
- the result of unfolding C via (j) is the recursive clause: $p'(f_i(f_j(S, N_j), N_i)) \mathrel{:-} p'(S),\ \psi'_i(f_j(S, N_j), N_i),\ \psi'_j(S, N_j)$.

We now describe the result of performing a sequence of unfoldings. We first introduce some notation.

Definition 4. Consider an arithmetic automaton $\Sigma'$ of the form (0), ..., (k), and consider $i_0, i_1, \ldots, i_n$ belonging to $\{1, \ldots, k\}$.

We define the expression $[i_0, i_1, \ldots, i_{n-1}, 0]$ as the base clause:

$p'(f_{[i_0, i_1, \ldots, i_{n-1}]}(f_0)) \mathrel{:-} \psi'_{[i_0, i_1, \ldots, i_{n-1}]}(f_0)$.

We define the expression $[i_0, i_1, \ldots, i_{n-1}, i_n]$ as the recursive clause:

$p'(f_{[i_0, i_1, \ldots, i_n]}(S)) \mathrel{:-} p'(S),\ \psi'_{[i_0, i_1, \ldots, i_n]}(S)$

where $f_{[i_0, \ldots, i_n]}(S)$ is defined recursively by

$f_{[i_0]}(S) = f_{i_0}(S, N_{i_0})$
$f_{[i_0, \ldots, i_n]}(S) = f_{[i_0, \ldots, i_{n-1}]}(f_{i_n}(S, N_{i_n}))$

and $\psi'_{[i_0, \ldots, i_n]}(S)$ is defined by

$\psi'_{[i_0]}(S) = \psi'_{i_0}(S, N_{i_0})$
$\psi'_{[i_0, \ldots, i_n]}(S) = \psi'_{[i_0, \ldots, i_{n-1}]}(f_{i_n}(S, N_{i_n})) \wedge \psi'_{i_n}(S, N_{i_n})$

Proposition 2. Let $i_0, \ldots, i_{n-1}$ belong to $\{1, \ldots, k\}$ and $i_n$ belong to $\{0, \ldots, k\}$. We have:
- the unfolding of $i_0$ via $i_1$ generates the clause $[i_0, i_1]$.
- the unfolding of $[i_0, i_1]$ via $i_2$ generates the clause $[i_0, i_1, i_2]$.
- ...
- the unfolding of $[i_0, i_1, \ldots, i_{n-1}]$ via $i_n$ generates the clause $[i_0, i_1, \ldots, i_n]$.

Note that, if $i_n = 0$ then $[i_0, i_1, \ldots, i_n]$ is a base clause; otherwise it is a recursive clause.

We now describe the notion of unfolding a clause via a program:

Definition 5. Let $\Sigma'$ be the program (0), ..., (k), and $[i_0, i_1, \ldots, i_n]$ a recursive clause defined as above ($i_0, i_1, \ldots, i_n \neq 0$). The unfolding of $[i_0, i_1, \ldots, i_n]$ via $\Sigma'$ is the set of clauses $\{[i_0, i_1, \ldots, i_n, j]\}_{0 \le j \le k}$.
Example 7. Consider $\Sigma'$ of example 5 and $\varphi$ of example 6 ($\varphi(Z_1, Z_2) \equiv Z_2 + 1 \ge Z_1$). Our proof procedure has to unfold the recursive clauses that do not preserve $\varphi$, viz. (1) and (3) (see example 6). Let us focus on the unfolding of (3). This yields:

p'(0, 1, 0, -1, 1, 0) :- 0 = 0   [3,0]
p'(U, 1, U, U-1, Z1+2, Z2) :- p'(B, S, U, W, Z1, Z2), S = 0, B+1-U = 0   [3,1]
p'(W+1, 1, U+1, U, Z1+1, Z2+1) :- p'(B, S, U, W, Z1, Z2), B+S-W-1 = 0   [3,2]
p'(B, 1, U, W, Z1+2, Z2) :- p'(B, S, U, W, Z1, Z2), 1 = 0, S = 0   [3,3]

Condition 1 = 0 of clause [3,3] is never satisfied, so clause [3,3] trivially preserves $\varphi$. It is easy to show that clause [3,2] preserves $\varphi$, while clause [3,1] does not. So the procedure proceeds by unfolding [3,1]. This yields:

p'(0, 1, 0, -1, 2, 0) :- 0 = 0, 0+1-0 = 0   [3,1,0]
p'(U, 1, U, U-1, Z1+3, Z2) :- p'(B, S, U, W, Z1, Z2), B+1-U = 0, U+1-U = 0, S = 0   [3,1,1]
p'(U+1, 1, U+1, U, Z1+2, Z2+1) :- p'(B, S, U, W, Z1, Z2), B+S-W-1 = 0, (W+1)+1-(U+1) = 0   [3,1,2]
p'(U, 1, U, U-1, Z1+3, Z2) :- p'(B, S, U, W, Z1, Z2), 1 = 0, B+1-U = 0, S = 0   [3,1,3]

Conditions U+1-U = 0 and 1 = 0 of clauses [3,1,1] and [3,1,3] are unsatisfiable, so these clauses trivially preserve $\varphi$. Clause [3,1,2] does not preserve $\varphi$. The procedure then proceeds further by unfolding clause [3,1,2] via $\Sigma'$ (see example 8).
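The unfolding steps of example 7 amount to composing guards and state updates, as described in section 4. The following Python sketch illustrates this composition on concrete states rather than symbolically. The encoding of a clause as a (guard, update) pair and the helper names `unfold` and `unfold_via_base` are assumptions of this illustration, and the clause updates are those of example 1.

```python
BASE = (0, 0, 0, -1, 0, 0)  # base clause (0): state (B, S, U, W, Z1, Z2)

# Recursive clauses (1)-(3) as (guard, update) pairs on state tuples.
C1 = (lambda st: st[1] == 0,
      lambda st: (st[2], st[0] + 1 - st[2], st[2], st[2] - 1, st[4] + 1, st[5]))
C2 = (lambda st: True,
      lambda st: (st[3] + 1, st[0] + st[1] - st[3] - 1, st[2] + 1, st[2],
                  st[4], st[5] + 1))
C3 = (lambda st: st[1] == 0,
      lambda st: (st[0], 1, st[2], st[3], st[4] + 1, st[5]))


def unfold(clause, via):
    """Unfold `clause` via the recursive clause `via`: `via` fires first, then `clause`."""
    guard, update = clause
    via_guard, via_update = via
    return (lambda st: via_guard(st) and guard(via_update(st)),
            lambda st: update(via_update(st)))


def unfold_via_base(clause):
    """Unfold `clause` via the base clause: the derived state, or None if unsatisfiable."""
    guard, update = clause
    return update(BASE) if guard(BASE) else None


if __name__ == "__main__":
    c31 = unfold(C3, C1)      # clause [3,1]
    c312 = unfold(c31, C2)    # clause [3,1,2]
    print(unfold_via_base(C3))            # clause [3,0]
    print(c31[1]((4, 0, 5, 9, 7, 9)))     # head of [3,1] on a state with B+1-U = 0
    print(c312[1]((1, 2, 3, 2, 0, 0)))    # head of [3,1,2] on a state meeting its condition
```

On a state with S = 0 and B + 1 - U = 0, the composed update of [3,1] returns (U, 1, U, U-1, Z1+2, Z2), which agrees with the head listed in the example.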
5 Refined Procedure

Each step of the basic procedure unfolds the set of clauses that do not preserve $\varphi$. Note that each step may generate new clauses that do not preserve $\varphi$, so this procedure may not terminate. In order to avoid such cases of nontermination, we introduce a refinement of the basic procedure based on the notion of cyclic clause. Informally, a cyclic clause is a recursive clause $C_2$ obtained by iterative unfolding of a clause $C_1$ such that the condition $\psi'_2$ of $C_2$ coincides with the condition $\psi'_1$ of $C_1$. Such a clause $C_2$ (when it does not preserve $\varphi$) makes the basic procedure loop forever. Cyclic clauses are discarded (under some conditions) in the refinement of the basic procedure. As we will see, once we get a cyclic clause by unfolding, we can eliminate this clause. This new procedure often terminates for programs modeling transition systems due to the presence of cycles in these programs.

Definition 6. Let us consider the recursive clause:

$p'(f_{[i_0, \ldots, i_n]}(S)) \mathrel{:-} p'(S),\ \psi'_{[i_0, \ldots, i_n]}(S)$

Suppose that there exists $j < n$ such that $\psi'_{[i_0, \ldots, i_n]}(S) \Leftrightarrow \psi'_{[i_0, \ldots, i_j]}(S)$. Then the clause is said to be cyclic over $[i_{j+1}, \ldots, i_n]$. Furthermore, if condition

$\psi'_{[i_0, \ldots, i_j]}(S) \wedge \varphi(f_{[i_0, \ldots, i_j]}(S)) \Rightarrow \varphi(f_{[i_0, \ldots, i_n]}(S))$

holds, then the cycle $[i_{j+1}, \ldots, i_n]$ of the clause is said to be preserving w.r.t. $\varphi$.

Remark 4. The variables $N_{i_0}, \ldots, N_{i_n}$ introduced by the $i_0$-th, ..., $i_n$-th transitions do not appear explicitly in the above formulas $\psi'_{[i_0, \ldots, i_n]}(S) \Leftrightarrow \psi'_{[i_0, \ldots, i_j]}(S)$ and $\psi'_{[i_0, \ldots, i_j]}(S) \wedge \varphi(f_{[i_0, \ldots, i_j]}(S)) \Rightarrow \varphi(f_{[i_0, \ldots, i_n]}(S))$. These variables should be considered implicitly as universally quantified.

Remark 5. If the condition $\psi'(S, N)$ of a recursive clause D is unsatisfiable, then every clause issued from D by unfolding is trivially cyclic, and its cycle trivially preserves (any) property $\varphi$.

Example 8. Let us consider again the clause [3,1,2] of example 7:

p'(U+1, 1, U+1, U, Z1+2, Z2+1) :- p'(B, S, U, W, Z1, Z2), B+S-W-1 = 0, (W+1)+1-(U+1) = 0   [3,1,2]

and let us unfold it via $\Sigma'$. This yields:

p'(1, 1, 1, 0, 2, 1) :- 0+0-0 = 0, 0-0 = 0   [3,1,2,0]
p'(U+1, 1, U+1, U, Z1+3, Z2+1) :- p'(B, S, U, W, Z1, Z2), U+(B+1-U)-U = 0, U-U = 0, S = 0, B+1-U = 0   [3,1,2,1]
p'(U+2, 1, U+2, U+1, Z1+2, Z2+2) :- p'(B, S, U, W, Z1, Z2), U+1-(U+1) = 0, (B+S)-U-1 = 0   [3,1,2,2]
p'(U+1, 1, U+1, U, Z1+3, Z2+1) :- p'(B, S, U, W, Z1, Z2), B+1-W-1 = 0, W+1-U = 0, S = 0   [3,1,2,3]

Clause [3,1,2,2] can be shown to be preserving for $\varphi$. Consider now clause [3,1,2,1]. The expression $\psi'_{[3,1,2,1]}(B, S, U, W, Z)$ is $U + (B+1-U) - U = 0 \wedge U - U = 0 \wedge S = 0 \wedge B+1-U = 0$. This is equivalent to $S = 0 \wedge B+1-U = 0$, that is $\psi'_{[3,1]}(B, S, U, W, Z)$. So clause [3,1,2,1] is cyclic of cycle [2,1]. The condition of preservation of the property $\varphi$ by this cycle is

$\psi'_{[3,1]}(B, S, U, W, Z) \wedge \varphi(f_{[3,1]}(B, S, U, W, Z)) \Rightarrow \varphi(f_{[3,1,2,1]}(B, S, U, W, Z))$

that is:

$S = 0 \wedge B+1-U = 0 \wedge \varphi(Z_1+2, Z_2) \Rightarrow \varphi(Z_1+3, Z_2+1)$

that is:

$Z_2 + 1 \ge Z_1 + 2 \Rightarrow Z_2 + 2 \ge Z_1 + 3$

which is true. So the cycle of [3,1,2,1] preserves $\varphi$. Likewise one can show that clause [3,1,2,3] has a cycle [2,3] which preserves $\varphi$.

Let $I$, $J$ and $K$ be lists of integers belonging to $\{0, \ldots, k\}$. Let us denote the concatenation operation for such lists by '.', and let $J^m$ denote the list obtained by concatenating $m$ times list $J$.

Proposition 3. Suppose that clause $I.J$ is cyclic over $J$, and cycle $J$ preserves $\varphi$. Then clause $I.J^m$ is cyclic over $J^m$, and cycle $J^m$ preserves $\varphi$.

Proof. Let us suppose that clause $I.J$ is cyclic over $J$, i.e. $\psi'_{I.J}(S) \Leftrightarrow \psi'_I(S)$, and let us show by induction over $m$ that $I.J^m$ is cyclic over $J^m$, i.e. $\psi'_{I.J^m}(S) \Leftrightarrow \psi'_I(S)$. The base case for $m = 1$ is trivial. The proof of the induction step is the following:

$\psi'_{I.J^{m+1}}(S)$
$\Leftrightarrow \psi'_{I.J^m}(f_J(S)) \wedge \psi'_J(S)$ (by definition)
$\Leftrightarrow \psi'_I(f_J(S)) \wedge \psi'_J(S)$ (by induction hypothesis)
$\Leftrightarrow \psi'_{I.J}(S)$ (by definition)
$\Leftrightarrow \psi'_I(S)$ (by cyclicness of $I.J$).

Let us now suppose that $I.J$ is cyclic over $J$ and cycle $J$ preserves $\varphi$, i.e. $\psi'_I(S) \wedge \varphi(f_I(S)) \Rightarrow \varphi(f_{I.J}(S))$, and let us show by induction over $m$ that cycle $J^m$ preserves $\varphi$, i.e. $\psi'_I(S) \wedge \varphi(f_I(S)) \Rightarrow \varphi(f_{I.J^m}(S))$. The base case for $m = 1$ is trivial. For proving the induction step, let us assume that $\psi'_I(S) \wedge \varphi(f_I(S))$ holds, and let us prove $\varphi(f_{I.J^{m+1}}(S))$. From $\psi'_I(S)$ it follows $\psi'_{I.J^{m+1}}(S)$ (since $I.J^{m+1}$ is cyclic over $J^{m+1}$), i.e. $\psi'_{I.J^m}(f_J(S)) \wedge \psi'_J(S)$. On the other hand, from $\psi'_I(S)$ and $\varphi(f_I(S))$ it follows $\varphi(f_{I.J}(S))$ (since cycle $J$ preserves $\varphi$), i.e. $\varphi(f_I(f_J(S)))$. Finally, from $\psi'_{I.J^m}(f_J(S))$ and $\varphi(f_I(f_J(S)))$, it follows $\varphi(f_{I.J^m}(f_J(S)))$ (by induction hypothesis), i.e. $\varphi(f_{I.J^{m+1}}(S))$.
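The claims of example 8 about clause [3,1,2,1] can be spot-checked numerically. The sketch below is an illustration under stated assumptions, not a proof: it evaluates the composed condition and state of an index list on every state of a small integer box (the bound is arbitrary), using the clause updates of example 1 and the property Z2 + 1 >= Z1. The helper names `evaluate`, `is_cyclic` and `cycle_preserves` are hypothetical.

```python
from itertools import product

# Recursive clauses (1)-(3) as (guard, update) pairs over (B, S, U, W, Z1, Z2).
CLAUSES = {
    1: (lambda b, s, u, w, z1, z2: s == 0,
        lambda b, s, u, w, z1, z2: (u, b + 1 - u, u, u - 1, z1 + 1, z2)),
    2: (lambda b, s, u, w, z1, z2: True,
        lambda b, s, u, w, z1, z2: (w + 1, b + s - w - 1, u + 1, u, z1, z2 + 1)),
    3: (lambda b, s, u, w, z1, z2: s == 0,
        lambda b, s, u, w, z1, z2: (b, 1, u, w, z1 + 1, z2)),
}


def phi(state):
    """Property of example 6: Z2 + 1 >= Z1."""
    return state[5] + 1 >= state[4]


def evaluate(indices, state):
    """Condition and resulting state of clause [i0,...,in]; the last index fires first."""
    holds = True
    for i in reversed(indices):
        guard, update = CLAUSES[i]
        holds = holds and guard(*state)
        state = update(*state)
    return holds, state


def is_cyclic(indices, j, bound=2):
    """Bounded test: clause `indices` has the same condition as its prefix indices[:j+1]."""
    values = range(-bound, bound + 1)
    return all(evaluate(indices, st)[0] == evaluate(indices[:j + 1], st)[0]
               for st in product(values, repeat=6))


def cycle_preserves(indices, j, bound=2):
    """Bounded test of the preservation condition of definition 6 for cycle indices[j+1:]."""
    values = range(-bound, bound + 1)
    for st in product(values, repeat=6):
        holds, prefix_state = evaluate(indices[:j + 1], st)
        if holds and phi(prefix_state) and not phi(evaluate(indices, st)[1]):
            return False
    return True


if __name__ == "__main__":
    print(is_cyclic([3, 1, 2, 1], 1), cycle_preserves([3, 1, 2, 1], 1))
```

Within the scanned box, [3,1,2,1] has the same condition as [3,1] and its cycle [2,1] preserves the property, whereas [3,1,2] and [3,1] have different conditions.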
Prop. 3 can be generalized as follows:

Lemma 1. Let $I, J_1, \ldots, J_n$ be lists of integers belonging to $\{0, \ldots, k\}$. Suppose that $I.J_q$ is cyclic over $J_q$ and cycle $J_q$ preserves $\varphi$, for all $1 \le q \le n$. Then, for any list $J'$ built upon $J_1, \ldots, J_n$ by concatenation, clause $I.J'$ is cyclic over $J'$ and the cycle $J'$ preserves $\varphi$.

Lemma 2. Suppose that clause $I.J$ is cyclic over $J$, and cycle $J$ preserves $\varphi$. Let $K$ be a list such that clause $I.K$ is recursive (resp. basic) and preserves (resp. satisfies) $\varphi$. Then clause $I.J.K$ preserves (resp. satisfies) $\varphi$.

Proof. Let us suppose that $I.K$ preserves $\varphi$ and let us show that $I.J.K$ preserves $\varphi$, i.e.: $\psi'_{I.J.K}(S) \wedge \varphi(S) \Rightarrow \varphi(f_{I.J.K}(S))$. We will assume that $\psi'_{I.J.K}(S) \wedge \varphi(S)$ holds, and prove $\varphi(f_{I.J.K}(S))$. From $\psi'_{I.J.K}(S)$, i.e. $\psi'_{I.J}(f_K(S)) \wedge \psi'_K(S)$, we have $\psi'_I(f_K(S)) \wedge \psi'_K(S)$ (because $I.J$ is cyclic over $J$), that is $\psi'_{I.K}(S)$. From $\psi'_{I.K}(S)$ and $\varphi(S)$ it follows $\varphi(f_{I.K}(S))$ (because $I.K$ preserves $\varphi$), i.e. $\varphi(f_I(f_K(S)))$. From $\varphi(f_I(f_K(S)))$ and $\psi'_I(f_K(S))$ it follows $\varphi(f_{I.J}(f_K(S)))$ (because cycle $J$ preserves $\varphi$), i.e. $\varphi(f_{I.J.K}(S))$. One can show in exactly the same manner that, if $I.K$ is basic and satisfies $\varphi$, then $I.J.K$ satisfies $\varphi$.

Using lemmas 1 and 2, we can refine theorem 1 as:

Theorem 2. Let $p'$ be a predicate defined by (the projection of) an arithmetic automaton program $\Sigma'$, and $\varphi$ an arithmetic formula. Let $\Sigma''$ be a program obtained from $\Sigma'$ by applying a sequence of unfoldings via $\Sigma'$. Suppose that each clause of $\Sigma''$ is either:
- a base clause satisfying $\varphi$, or
- a recursive clause preserving $\varphi$, or
- a cyclic clause whose cycle preserves $\varphi$.
Then $p'(S) \Rightarrow \varphi(S)$ holds in the least model of $\Sigma'$.

This theorem provides the justification of the following refined procedure: iteratively unfold $\Sigma'$ via $\Sigma'$ until all the generated recursive clauses either preserve $\varphi$ or are cyclic (with cycle preserving $\varphi$). The formula $p'(S) \Rightarrow \varphi(S)$ then holds if the generated basic clauses satisfy $\varphi$.

Example 9. Let us recapitulate the results obtained by unfolding program $\Sigma'$. (See examples 5 and 6.)

(1) Initially, we have:
Clause [0] satisfies $\varphi$.
Clause [1] does not preserve $\varphi$.
Clause [2] preserves $\varphi$.
Clause [3] does not preserve $\varphi$.

(2) We have replaced clause [3] by its unfolding via $\Sigma'$. (Clause [1] has to be processed in an analogous way.) We have:
Clause [3,0] satisfies $\varphi$.
Clause [3,1] does not preserve $\varphi$.
Clause [3,2] preserves $\varphi$.
Clause [3,3] preserves $\varphi$.

(3) We have replaced clause [3,1] by its unfolding via $\Sigma'$. We have:
Clause [3,1,0] satisfies $\varphi$.
Clause [3,1,1] preserves $\varphi$.
Clause [3,1,2] does not preserve $\varphi$.
Clause [3,1,3] preserves $\varphi$.

(4) We have replaced clause [3,1,2] by its unfolding via $\Sigma'$. We have:
Clause [3,1,2,0] satisfies $\varphi$.
Clause [3,1,2,1] is cyclic of cycle [2,1], and cycle [2,1] preserves $\varphi$.
Clause [3,1,2,2] preserves $\varphi$.
Clause [3,1,2,3] is cyclic of cycle [2,3], and cycle [2,3] preserves $\varphi$.

Every clause issued from iterative $\Sigma'$-unfolding of clause [3] either satisfies $\varphi$ or preserves $\varphi$ or is cyclic (with cycle preserving $\varphi$). The same can be checked for clauses generated by iterative $\Sigma'$-unfolding of clause [1]. So, by theorem 2, we have: $p'(B, S, U, W, Z_1, Z_2) \Rightarrow Z_2 + 1 \ge Z_1$.
6 Application to Finite State Automata

The unfolding of arithmetic automata corresponds to the expansion of the state graph in the case of finite state automata. The generation of a cyclic clause corresponds to the discovery of a cycle in the state graph. More formally, we have:

Theorem 3. Each arithmetic automaton representing a finite state automaton can be unfolded into a program that contains only basic and cyclic clauses.

Example 10. Consider the automaton AUT defined by logic program $\Sigma$:

p_aut([], 0)   (0)
p_aut([put1|L], 1) :- p_aut(L, S), S = 0   (1)
p_aut([put2|L], 2) :- p_aut(L, S), S = 1   (2)
p_aut([get3|L], 1) :- p_aut(L, S), S = 2   (3)
p_aut([get4|L], 3) :- p_aut(L, S), S = 1   (4)

Unfolding $\Sigma$, we generate the equivalent program $\Sigma''$:

p_aut([], 0)   (0)
p_aut([put1], 1)   [1,0]
p_aut([get4, put1], 3)   [4,1,0]
p_aut([put2, put1], 2)   [2,1,0]
p_aut([get4, get3, put2|L], 3) :- p_aut(L, S), S = 1   [4,3,2], cycle [3,2]
p_aut([put2, get3, put2|L], 2) :- p_aut(L, S), S = 1   [2,3,2], cycle [3,2]
p_aut([get3, put2, get3|L], 1) :- p_aut(L, S), S = 2   [3,2,3], cycle [2,3]

This program is only made of basic clauses and of recursive clauses which are cyclic (their cycles are indicated next to their labels). Consider the property: for every list L of actions accepted by AUT, the number of put1, put2 is greater than or equal to the number of get3, get4. This property holds since the basic rules of $\Sigma''$ satisfy it and the cycles of the cyclic clauses preserve it. Note that the basic procedure loops forever when applied to the same program $\Sigma$.
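For finite state automata, the refined procedure can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the authors' implementation: clause conditions have the form S = s, so a clause [i0,...,in] is treated as cyclic when it requires the same source state as one of its proper prefixes, and the counting property is checked on action lists instead of on an arithmetic state. The names `source`, `unfold_all`, `weight` and `property_holds` are hypothetical.

```python
# Clauses (1)-(4) of example 10: index -> (source state, action, target state).
CLAUSES = {
    1: (0, "put1", 1),
    2: (1, "put2", 2),
    3: (2, "get3", 1),
    4: (1, "get4", 3),
}
INITIAL = 0  # state of the base clause (0)


def source(seq):
    """Source state required by clause [i0,...,in]; None if its condition is unsatisfiable."""
    # i_n fires first: the target of each later index must be the source of the earlier one.
    for later, earlier in zip(seq[1:], seq[:-1]):
        if CLAUSES[later][2] != CLAUSES[earlier][0]:
            return None
    return CLAUSES[seq[-1]][0]


def unfold_all():
    """Unfold every recursive clause until only basic and cyclic clauses remain."""
    base, cyclic = [], []
    work = [[i] for i in CLAUSES]
    while work:
        seq = work.pop()
        src = source(seq)
        if src is None:
            continue  # unsatisfiable clause: nothing to prove
        start = next((j + 1 for j in range(len(seq) - 1)
                      if source(seq[:j + 1]) == src), None)
        if start is not None:
            cyclic.append((tuple(seq), tuple(seq[start:])))
            continue
        if src == INITIAL:
            base.append(tuple(seq) + (0,))  # unfolding via the base clause (0)
        work.extend(seq + [j] for j in CLAUSES)
    return sorted(base), sorted(cyclic)


def weight(seq):
    """Number of puts minus number of gets among the actions of the clauses in seq."""
    return sum(1 if CLAUSES[i][1].startswith("put") else -1 for i in seq if i != 0)


def property_holds():
    """Basic clauses have at least as many puts as gets, and no cycle adds more gets than puts."""
    base, cyclic = unfold_all()
    return (all(weight(seq) >= 0 for seq in base)
            and all(weight(cycle) >= 0 for _, cycle in cyclic))


if __name__ == "__main__":
    print(unfold_all())
    print(property_holds())
```

This enumeration yields the cyclic clauses [2,3,2], [3,2,3] and [4,3,2] together with the basic clauses [1,0], [2,1,0] and [4,1,0]; it additionally produces the basic clause [3,2,1,0], which also satisfies the counting property.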
It can be seen from theorem 2 and theorem 3 that our refined unfolding procedure provides a decision procedure for the language emptiness problem in the case of finite state automata. (This problem can be coded as: $p_{aut}(L, S) \Rightarrow \bigwedge_{i=1}^{n} S \neq s_{f_i}$, where $s_{f_i}$ denote the final states of AUT.)

7 Final Remarks

We have described in this paper a procedure based on unfolding for proving properties of systems of communicating automata with parameters. We have successfully applied this procedure to prove some properties of a sliding window protocol where the window size was given as a parameter. (We proved for example that, whenever a message with sequence number $i$ is transmitted, it is not possible that another $i$-th message is transmitted unless an $i$-th message is delivered, see [13].) The main limitation of our procedure lies in the fact that it may in theory loop forever, generating an infinite number of noncyclic clauses, although this has not happened for any of the practical examples we treated. For such cases of non-termination, we combine such a backward reasoning procedure with forward reasoning methods. (Such combinations are proposed in [5,11].) Note also that we are only able to prove functional properties of communicating systems, but not temporal properties such as fairness or liveness.

Acknowledgments. We would like to thank Hubert Garavel, Andre Arnold and all the members of the VTT group for very helpful discussions.
References

1. A. Arnold. Verification and Comparison of Transition Systems, Proc. TAPSOFT, 1993, pp.121-135.
2. A. Arnold and M. Nivat. Comportements de processus, Colloque AFCET Les Mathématiques de l'Informatique, 1982, pp. 35-68.
3. P. Azema et. al. Specification and Verification of Distributed Systems using Prolog Interpreted Petri Nets, Proc. of 7th International Conference on Software Eng., 1984, Orlando.
4. G. Bochmann, R. Dssouli and W. Lopes de Souza. Use of Prolog for Building Protocol Design Tools, Proc. of the IFIP WG 6.1 5th Intl. Workshop on Protocol Specification, Testing and Verification, North-Holland, 1985, Toulouse-Moissac, pp.131-145.
5. P. Cousot and R. Cousot. Abstract Interpretation: A Unified Lattice Model for Static Analysis Of Programs by Construction or Approximation of Fixpoints, Conference Record of the 4th ACM Symposium on Programming Languages, 1977, Paris.
6. R. Dssouli and G. Bochmann. Error Detection with Multiple Observers, Proc. of the IFIP WG 6.1 5th Intl. Workshop on Protocol Specification, Testing and Verification, North-Holland, 1985, Toulouse-Moissac, pp.483-494.
7. L. Fribourg. Mixing List Recursion and Arithmetic. Proc. 7th IEEE Symp. on Logic in Computer Science, Santa Cruz, 1992, pp. 419-429.
8. L. Fribourg and M. Veloso Peixoto. Concurrent Constraint Automata, Technical Report LIENS 93-10, Ecole Normale Supérieure, May 1993 (poster presented at ILPS, 1993, Vancouver).
9. H. Garavel. Protocole à Fenêtre Glissante, Unpublished manuscript, 1992.
10. G. Jard, J.F. Monin and R. Groz. Experience in Implementing X.250 (a CCITT subset of Estelle) in VEDA, Proc. of the IFIP WG 6.1 5th Intl. Workshop on Protocol Specification, Testing and Verification, North-Holland, 1985, Toulouse-Moissac, pp.315-331.
11. L. Naish. Verification of Logic Programs and Imperative Programs, Constructing Logic Programs, J.M. Jacquet (ed.), Wiley, 1993, pp.143-164.
12. P. Revesz. A Closed Form for Datalog Queries with Integer Order. Proc. 3rd International Conference on Database Theory, Paris, 1990, pp. 187-201.
13. J. Richier, C. Rodriguez, J. Sifakis and J. Voiron. Verification in XESAR of the Sliding Window Protocol, Proc. of the IFIP WG 6.1 7th Intl. Workshop on Protocol Specification, Testing and Verification, North-Holland, 1987, pp.235-248.
14. H. Tamaki, T. Sato. Unfold/Fold Transformations of Logic Programs. Proc. 2nd International Logic Programming Conference, Uppsala, 1984, pp. 127-138.
Fundamental Structures in Well-Structured Infinite Transition Systems*

Alain Finkel and Philippe Schnoebelen

Lab. Specification and Verification, ENS de Cachan & CNRS URA 2236, 61, av. Pdt Wilson; 94235 Cachan Cedex; France
{finkel,phs}@lsv.ens-cachan.fr
Abstract. We suggest a simple and clean definition for Well-Structured Transition Systems [20,1], a general class of infinite state systems for which decidability results exist. As a consequence we can (1) generalize the definition in many ways, (2) find examples of (general) WSTS's in many fields, and (3) present new decidability results.
1 Introduction
Formal verification of programs and systems is a very active field for both theoretical research and practical developments, especially since impressive advances in formal verification technology proved feasible in several realistic applications from the industrial world. The highly successful model-checking approach for finite systems [7] suggested that a working verification technology could well be developed for systems with an infinite state space. This explains the considerable amount of work that has been spent in recent years on this "Verification of Infinite State Systems" (VISS) field, with a surprising wealth of positive results [28,15]. The field now has its own conference.

However, this wealth of positive results is quite disorganized. Many people investigated extensions of essentially finite formal models for which decidability results existed, and they had to carefully control which kind of extensions could be afforded.

A very interesting development in this field is the introduction of well-structured transition systems (WSTS's). These are transition systems where the existence of a well-quasi-ordering over the infinite set of states ensures the termination of several algorithmic methods. WSTS's are an abstract generalization of several specific structures and they allow general decidability results that can be applied to Petri nets, lossy channel systems, and many more. (Of course, WSTS's are not intended as a general explanation of all the decidability results one can find for specific models.)

Finkel [18,19,20] was the first to propose a definition of WSTS (actually several variant definitions). His insights came from the study of Petri nets where several decidability results rely on a monotonicity property (transitions firable from marking M are firable from any larger marking) and Dickson's lemma (inclusion between markings of a net is a well-ordering). He mainly investigated the decidability of termination, limitation and coverability-set problems. He applied the idea to several classes of fifo nets and of CFSM's (see Sect. 11).

Independently, Abdulla et al. [1] later proposed another definition. Their insights came from their study of lossy-channel systems and other families of analyzable infinite-state systems (e.g. integral relational automata [11]). They mainly investigated covering, inevitability and simulation problems. They applied the idea to timed networks [4] and lossy systems.

* This work was supported by ECOS Action U93E05 "Modèles formels du parallélisme".

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 102-118, 1998. © Springer-Verlag Berlin Heidelberg 1998
Later, Kushnarenko and Schnoebelen [25] introduced WSTS's with downward compatibility, motivated by some analysis problems raised by Recursive-Parallel Programs.

In this paper, we propose a simpler and cleaner definition of WSTS's by separating structural and effectiveness issues. (For clarity, we do not consider labeled transitions and only focus on core problems related to reachability and inevitability of sets of states.) We also show how this basic definition can be generalized in many ways, allowing many systems to be seen as (some kind of) WSTS. Indeed we show, through a large collection of examples, that WSTS's are ubiquitous provided the key notion, compatibility of transitions with a well-ordering, is explored under various angles. The following table collects all examples we mention in the paper.

[Table: one row per model, one column per kind of compatibility (upward, downward, strict, non-strict, strong, reflexive, stuttering, transitive). One mark indicates presence, a second mark indicates presence implied by a stronger one. The individual marks are not recoverable; the rows are listed below by family.]

Petri Nets and their extensions: Petri nets; post SM-nets; Petri nets with transfer arcs; Petri nets with reset arcs; BPP with inhibitory arcs
Communicating Finite State Machines: CFSM's with Lossy Channels; CFSM's with Insertion Errors; Monogeneous CFSM's [17]; Free-choice FIFO nets; Synchronizable CFSM's
Process Algebras: BPA (Basic Process Algebra); BPP (Basic Parallel Process); Normed BPA
String Rewriting: Context-free grammars; Permutation grammars
Other models: RPPS Schemes; RPPS Schemes with; Integral Relational Automata [11]
The paper uses Petri nets (§3) to introduce WSTS's (§4) before exemplifying the Finite Reachability Tree algorithm (§5). Limitation analysis leads to strict WSTS's (§6). Then §7 presents the Saturation method for reachability analysis. Other WSTS's are found in string rewriting (§8) and simple process algebra (§9). This motivates the introduction of stuttering WSTS's (§10). Other WSTS's are found in Communicating Finite State Machines (§11), motivating the introduction of downward WSTS's (§12).
2 Transition Systems

A transition system (TS) is a structure $\mathcal{S} = \langle S, \to \rangle$ where $S = \{s, t, \ldots\}$ is a set of states, and $\to \subseteq S \times S$ is any set of transitions. Transition systems may have additional structure like initial states, labels for transitions, durations, causal independence relations, etc., but in this paper we are only interested in the state-part of the behaviors.

We write $Succ(s)$ (resp. $Pred(s)$) for the set $\{s' \in S \mid s \to s'\}$ of immediate successors of $s$ (resp. $\{s' \in S \mid s' \to s\}$, the predecessors). A state with no successor is a terminal state. $\mathcal{S}$ is finitely branching if all $Succ(s)$ are finite.
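The definitions above can be made concrete with a small sketch. It assumes a finite, explicitly listed transition relation, which is only an illustration since the systems of interest in this paper are infinite; the class name `TransitionSystem` and the example states are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Set, Tuple

State = Hashable


class TransitionSystem:
    """A TS given by an explicit (finite) set of transitions s -> t."""

    def __init__(self, transitions: Iterable[Tuple[State, State]]) -> None:
        self._succ: Dict[State, Set[State]] = {}
        self._pred: Dict[State, Set[State]] = {}
        for s, t in transitions:
            self._succ.setdefault(s, set()).add(t)
            self._pred.setdefault(t, set()).add(s)

    def succ(self, s: State) -> Set[State]:
        """Immediate successors of s."""
        return set(self._succ.get(s, ()))

    def pred(self, s: State) -> Set[State]:
        """Immediate predecessors of s."""
        return set(self._pred.get(s, ()))

    def is_terminal(self, s: State) -> bool:
        """A state with no successor is terminal."""
        return not self.succ(s)


# Hypothetical three-state example: a -> b, a -> c, b -> c.
TS = TransitionSystem([("a", "b"), ("a", "c"), ("b", "c")])
```

Here `TS.succ("a")` is the set containing `b` and `c`, and `c` is the only terminal state.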
3 Petri Nets Are Well-Structured
Petri nets are a well-known model of concurrent systems giving rise to infinite TS's. Still, many questions about the behavior of Petri nets (e.g. reachability, finiteness, ...) can be decided. A key ingredient in methods for the analysis of Petri nets is the monotonicity of the firing rule. We illustrate this on the example net from Fig. 1.
[Figure 1: Petri net with places p1, p2, p3, p4 and transitions t1, t2, t3]

Fig. 1. A Petri net

The current marking, denoted by the four tokens inside the places, is $M_0 = p_1 p_1 p_2 p_3$ (also denoted by $p_1^2 p_2 p_3$). From $M_0$, $t_1$ and $t_2$ are firable. Firing $t_2$ from $M_0$ leads to $M_1 \stackrel{def}{=} p_1 p_1 p_1 p_2 p_4$. Now, if we had started from a marking $M_0'$ that includes $M_0$, then $t_1$ and $t_2$ are still firable (and maybe $t_3$ also is now firable). Firing $t_2$ from $M_0'$ will lead to an $M_1'$ with the property that the difference $M_0' - M_0$ is preserved and equals $M_1' - M_1$. This is the monotonicity property of Petri nets: transitions are compatible with inclusion between markings:

$M_1 \to M_2$ and $M_1 \subseteq M_1'$ entail $M_1' \to M_2'$ and $M_2' - M_2 = M_1' - M_1$

Another key ingredient in methods for the analysis of Petri nets is that inclusion between markings is a well-ordering:

Definition 1. An ordering relation $\le$ (over some set $X$) is a well-ordering iff, for any infinite sequence $x_0, x_1, x_2, \ldots$ in $X$, there exist indexes $i < j$ s.t. $x_i \le x_j$.

If $\le$ is a well-ordering, then any infinite sequence contains an infinite increasing subsequence: $x_{i_0} \le x_{i_1} \le x_{i_2} \le \cdots$
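The monotonicity property can be checked mechanically on the running example. The sketch below is only an illustration: the arcs of the net are an assumption, inferred from the markings listed in Fig. 3 (t1 consumes p1 and p2 and produces p3, t2 consumes p3 and produces p1 and p4, t3 consumes p4 and produces p2 and p3), and all function names are hypothetical.

```python
from collections import Counter

# Assumed encoding of the net of Fig. 1: transition -> (consumed, produced).
NET = {
    "t1": (Counter({"p1": 1, "p2": 1}), Counter({"p3": 1})),
    "t2": (Counter({"p3": 1}), Counter({"p1": 1, "p4": 1})),
    "t3": (Counter({"p4": 1}), Counter({"p2": 1, "p3": 1})),
}


def included(m: Counter, n: Counter) -> bool:
    """Inclusion between markings: every place holds at least as many tokens in n."""
    return all(n[place] >= count for place, count in m.items())


def firable(m: Counter, t: str) -> bool:
    """A transition is firable when the marking includes what it consumes."""
    consumed, _ = NET[t]
    return included(consumed, m)


def fire(m: Counter, t: str) -> Counter:
    """Fire t from m: remove the consumed tokens, add the produced ones."""
    consumed, produced = NET[t]
    if not included(consumed, m):
        raise ValueError(f"{t} is not firable")
    return (m - consumed) + produced


M0 = Counter({"p1": 2, "p2": 1, "p3": 1})
M0_BIG = M0 + Counter({"p4": 1})      # a marking that includes M0
M1 = fire(M0, "t2")                   # p1 p1 p1 p2 p4
M1_BIG = fire(M0_BIG, "t2")
```

With these definitions `M1_BIG - M1` equals `M0_BIG - M0`, which is the preserved difference described above, and t3 is firable from `M0_BIG` although it is not firable from `M0`.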
4 Well-Structured Transition Systems
The monotonicity property for Petri nets motivated the original definition of well-structured systems [20]. Here we give a simplified version:

Definition 2. A well-structured transition system (WSTS) is a TS $\mathcal{S} = \langle S, \to, \le \rangle$ equipped with an ordering $\le \subseteq S \times S$ between states s.t. $\le$ is a well-ordering, and $\le$ is compatible with $\to$, where compatible means that for all $s_1 \le t_1$ and transition $s_1 \to s_2$, there exists a transition $t_1 \to t_2$ s.t. $s_2 \le t_2$.

Thus compatibility states that $\le$ is a simulation relation in the Hennessy-Milner sense [24]. See Fig. 2 for a diagrammatic presentation of compatibility where we quantify universally over solid lines and existentially over dashed lines.

[Figure 2: diagram over the states s1, t1, s2, t2]

Fig. 2. Strong compatibility

With $\subseteq$, Petri nets give rise to well-structured transition systems.
5 Finite Reachability Tree
We assume $\mathcal{S} = \langle S, \to, \le \rangle$ is some WSTS.

Definition 3. [20] $FRT(s)$, the Finite Reachability Tree from $s$, is a directed unordered tree where nodes are labeled by states of $\mathcal{S}$. Nodes are either dead or alive. The root node is a live node $n_0$, labeled $s$ (written $n_0 : s$). A dead node has no child node. A live node $n : t$ has one child $n' : t'$ for each successor $t' \in Succ(t)$. If along the path from the root $n_0 : s$ to some node $n' : t'$ there exists a node $n : t$ ($n \neq n'$) s.t. $t \le t'$, we say that $n$ subsumes $n'$, and then $n'$ is a dead node. Otherwise, $n'$ is alive.

Thus leaf nodes in $FRT(s)$ are exactly (1) the nodes labeled with terminal states, and (2) the subsumed nodes. Fig. 3 shows $FRT(M_0)$ for our previous Petri net example. (We decorated $FRT(M_0)$ with some explanations like transition names, subsumption data, and crosses for dead nodes.)
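[Figure 3: tree rooted at $M_0 = p_1^2 p_2 p_3$ with edges labeled t1, t2, t3 and nodes labeled $M_1 = p_1 p_3^2$, $M_2 = p_1^3 p_2 p_4$, $M_3 = p_1^2 p_3 p_4$, $M_4 = p_1^3 p_2^2 p_3$, $M_5 = p_1^3 p_4^2$, $M_6 = p_1^2 p_2 p_3^2$, $M_7 = p_1^3 p_2 p_3 p_4$; dead nodes are crossed and annotated with the ancestors that subsume them]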
Fig. 3. Finite Reachability Tree for the Petri net example

For a WSTS $\mathcal{S}$, the well-ordering property ensures that all paths in $FRT(s)$ are finite because any infinite path would have to contain a subsumed node. Now König's lemma entails that if $\mathcal{S}$ is finitely branching, then $FRT(s)$ is finite (hence the name). So that $FRT(s)$ is effectively computable if (1) $\le$ is decidable, and (2) the $Succ$ mapping is computable ("$\mathcal{S}$ has effective Succ"). Clearly Petri nets have a decidable $\subseteq$ and effective Succ.

The construction of $FRT(s)$ does not require compatibility between $\le$ and $\to$. But when we have compatibility, $FRT(s)$ contains, in a finite form, sufficient information to answer several questions about computation paths starting from $s$. Assume $I \subseteq S$ is upward-closed, i.e. $s \ge t \in I$ entails $s \in I$. For Petri nets, the set $I$ can be e.g. "all markings where a given place is marked", or "all markings where a given transition is enabled", etc.
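The construction of Definition 3 can be sketched as a short recursive procedure. This is a minimal illustration rather than the authors' implementation: markings are tuples of token counts for (p1, p2, p3, p4), the arcs of the net are an assumption inferred from the markings of Fig. 3, and every name in the code is hypothetical. The last function explores the tree for a maximal path that stays inside a set I given as a predicate.

```python
from typing import Callable, Dict, List, Tuple

Marking = Tuple[int, ...]

# Assumed net of Fig. 1: transition -> (consumed, produced) over (p1, p2, p3, p4).
NET: Dict[str, Tuple[Marking, Marking]] = {
    "t1": ((1, 1, 0, 0), (0, 0, 1, 0)),
    "t2": ((0, 0, 1, 0), (1, 0, 0, 1)),
    "t3": ((0, 0, 0, 1), (0, 1, 1, 0)),
}


def leq(m: Marking, n: Marking) -> bool:
    """Inclusion between markings (componentwise comparison)."""
    return all(a <= b for a, b in zip(m, n))


def succ(m: Marking) -> List[Marking]:
    """Immediate successors of a marking."""
    return [
        tuple(x - c + p for x, c, p in zip(m, consumed, produced))
        for consumed, produced in NET.values()
        if leq(consumed, m)
    ]


def frt(m: Marking, ancestors: Tuple[Marking, ...] = ()) -> dict:
    """Build FRT(m): a node is dead as soon as some ancestor is below it."""
    dead = any(leq(a, m) for a in ancestors)
    children = [] if dead else [frt(m2, ancestors + (m,)) for m2 in succ(m)]
    return {"state": m, "dead": dead, "children": children}


def size(node: dict) -> int:
    """Number of nodes in the tree."""
    return 1 + sum(size(child) for child in node["children"])


def has_maximal_path_in(node: dict, in_i: Callable[[Marking], bool]) -> bool:
    """True iff some root-to-leaf path only visits states satisfying in_i."""
    if not in_i(node["state"]):
        return False
    if not node["children"]:
        return True
    return any(has_maximal_path_in(child, in_i) for child in node["children"])


M0: Marking = (2, 1, 1, 0)
TREE = frt(M0)
```

Under the assumed arcs, `TREE` has twelve nodes, all of its leaves are subsumed nodes, and `has_maximal_path_in(TREE, lambda m: m[1] >= 1)` holds, since the path through $p_1^3 p_2 p_4$ and $p_1^3 p_2^2 p_3$ keeps p2 marked.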
Proposition 1. $\mathcal{S}$ admits a computation starting from $s$ where all states are in a given upward-closed $I$ iff $FRT(s)$ has a maximal path where all states are in $I$.¹

When $\le$ is a well-ordering, any upward-closed $I$ can be represented via a finite basis, i.e. as $I = \uparrow s_1 \cup \cdots \cup \uparrow s_k$. When $\mathcal{S}$ has effective Succ and decidable $\le$ and $I \subseteq S$ is effectively given via a finite basis, the FRT can be used to yield:

Proposition 2. CSM, the Control State Maintainability problem, is decidable for WSTS's with effective Succ and decidable $\le$. [1]

A CSM inputs some state $s$ and a finite basis $I_f$, and answers yes iff there exists a computation starting from $s$ where all visited states are in $I$ (i.e. $I$ can be maintained), a property written "$s \models \exists\Box I$" in temporal logic.² A dual view is possible: assume $D \subseteq S$ is downward-closed.

Proposition 3. DInev, the Inevitability problem, is decidable for WSTS's with effective Succ and decidable $\le$. [1]

A DInev inputs some state $s$ and a finite co-basis for $D$ (i.e. a basis for $S \setminus D$), and answers yes iff all computations starting from $s$ eventually reach a state in $D$ (i.e. $D$ is inevitable), a property written "$s \models \forall\Diamond D$" in temporal logic. Termination is a special case of DInev because the set of terminal states is downward-closed.

Looking back to our Petri net from Fig. 1, $FRT(M_0)$ lets us see that $M_0 \models \exists\Box \uparrow p_2$, i.e. it is possible to have $p_2$ always marked. One just has to check all paths and notice the rightmost one: $M_0 \to M_2 \to M_4$.
6 Strict Compatibility
In Petri nets the monotonicity property is much stronger than just "compatibility". Finkel [20] also considered WSTS's with strict compatibility. Strict compatibility is a stronger form of compatibility: it is obtained by using the strict $<$ rather than $\le$ in Definition 2. Hence strict compatibility means that from strictly larger states it is possible to reach strictly larger states. See Fig. 4 for a diagrammatic presentation.

Many extensions of Petri nets are WSTS's with strict compatibility: e.g. finite colored Petri nets, or Valk's post-SM nets [29]. Other extensions of Petri nets [14,26] allow special kinds of arcs, e.g. reset arcs which empty the attached place when the corresponding transition is fired, transfer arcs which move all tokens from a given place to another place, and copy arcs which add a copy of every token from some place to a given place. All these extensions enjoy compatibility in the strict sense, except for nets with reset arcs which only have non-strict compatibility. Other extensions like zero-test arcs or inhibitory arcs clearly break the compatibility requirement (w.r.t. $\subseteq$).
¹ We say that $FRT(s)$ contains a state $t$ when we mean a node labeled by a state $t$.
² If we have another upward-closed (effectively given) $I'$ and if $I'$ does not contain terminal states, then $I' \models \exists\Box I$ is decidable. However when $I'$ contains terminal states, $I' \models \exists\Box I$ is not decidable in general.
[Figure 4: diagram over the states s1, t1, s2, t2]

Fig. 4. Strict compatibility
Effective WSTS's with strict compatibility have a decidable limitation problem:

Proposition 4. If $\mathcal{S}$ is a finitely branching WSTS with strict compatibility, $Succ^*(s)$ is infinite iff $FRT(s)$ contains a leaf node $n : t$ subsumed by an ancestor $n' : t'$ with $t' < t$. [20]

Corollary 1. The limitation problem is decidable for strict-WSTS's with (1) decidable $\le$ and (2) computable Succ.

As an example, $FRT(M_0)$ has a leaf strictly subsumed by one of its ancestors: we conclude that, starting from $M_0$, one can reach an infinite number of different markings. For the limitation problem, it is possible to follow ideas from [21] and build a smaller tree:

Definition 4. $RRT(s)$, the Reduced Reachability Tree, is a tree built like $FRT(s)$ except that now a node $n : t$ is dead as soon as it has an ancestor node $n' : t'$ with $t' \le t$ or $t \le t'$.

Theorem 1. If $\mathcal{S}$ is a finitely branching WSTS with strict compatibility, then $Succ^*(s)$ is infinite iff $RRT(s)$ contains a leaf node $n : t$ subsumed by an ancestor $n' : t'$ with $t' < t$.

In our Fig. 3 for example, $RRT(M_0)$ equals $FRT(M_0)$.
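Proposition 4 suggests a direct test, sketched below for Petri nets, which have strict compatibility. The sketch explores the branches of the finite reachability tree without storing the tree and reports whether some subsumed leaf lies strictly above one of its ancestors. The two nets are assumptions made for the illustration: `FIG1_NET` is the net inferred from the markings of Fig. 3, and `CYCLE_NET` is a hypothetical two-place net in which a single token moves back and forth.

```python
from typing import Dict, List, Tuple

Marking = Tuple[int, ...]
Net = Dict[str, Tuple[Marking, Marking]]  # transition -> (consumed, produced)

# Assumed net of Fig. 1 over (p1, p2, p3, p4), and a hypothetical bounded net.
FIG1_NET: Net = {
    "t1": ((1, 1, 0, 0), (0, 0, 1, 0)),
    "t2": ((0, 0, 1, 0), (1, 0, 0, 1)),
    "t3": ((0, 0, 0, 1), (0, 1, 1, 0)),
}
CYCLE_NET: Net = {
    "a": ((1, 0), (0, 1)),
    "b": ((0, 1), (1, 0)),
}


def leq(m: Marking, n: Marking) -> bool:
    return all(x <= y for x, y in zip(m, n))


def succ(m: Marking, net: Net) -> List[Marking]:
    return [
        tuple(x - c + p for x, c, p in zip(m, consumed, produced))
        for consumed, produced in net.values()
        if leq(consumed, m)
    ]


def infinitely_many_reachable(
    m: Marking, net: Net, ancestors: Tuple[Marking, ...] = ()
) -> bool:
    """True iff the FRT below m has a leaf strictly subsumed by an ancestor."""
    if any(leq(a, m) for a in ancestors):
        # m labels a subsumed leaf: check whether some ancestor is strictly below it.
        return any(leq(a, m) and a != m for a in ancestors)
    return any(
        infinitely_many_reachable(m2, net, ancestors + (m,))
        for m2 in succ(m, net)
    )
```

For instance `infinitely_many_reachable((2, 1, 1, 0), FIG1_NET)` is true, in line with the remark about $FRT(M_0)$, while `infinitely_many_reachable((1, 0), CYCLE_NET)` is false because the only subsumed leaf repeats the root marking.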
7 Saturation Methods
We speak of saturation methods when we have methods whose termination relies on the following Lemma:

Lemma 1. Assume $\le$ is a well-ordering. Then any infinite increasing sequence $I_0 \subseteq I_1 \subseteq I_2 \subseteq \cdots$ of upward-closed sets eventually stabilizes, i.e. there is a $k \in \mathbb{N}$ s.t. $I_k = I_{k+1} = I_{k+2} = \cdots$
For a WSTS $\mathcal{S}$, a saturation method can be used to compute the set $Pred^*(I)$ of all states from which one can reach a state in $I$. This relies on the following consequence of the compatibility property:

Proposition 5. If $I \subseteq S$ is upward-closed, then $Pred(I)$ is upward-closed.

In fact, all $\bigcup_{i=0,\ldots,k} Pred^i(I)$ are upward-closed, so that, in view of Lemma 1, the sequence $J_0 \subseteq J_1 \subseteq \cdots$ given by $J_0 = I$ and $J_{n+1} = J_n \cup Pred(J_n)$ eventually stabilizes at some $J_k$. Then $J_k = Pred^*(I)$. (Observe that stabilization is ensured as soon as two consecutive sets are equal.)

As an example, consider the Petri net from Fig. 1 and write $I$ for $\uparrow M_6$, i.e. all markings covering $M_6 = p_1^2 p_2 p_3^2$ (for $R \subseteq S$, $\uparrow R \stackrel{def}{=} \{ s' \in S \mid \exists s \in R,\ s \le s' \}$ denotes the upward-closure of $R$). Using
Pred (R
def
R0 ) = Pred (R)
Pred (R0 )
Pred ( R) = Pred (R)
(1) ( )
we compute Pred (I) by the saturation method: J0 = M6 J1 = J0 Pred ( M6 ) = M6 M1 M J = J1 Pred ( M6 M1 M ) Pred (M6 ) Pred (M1 ) Pred (M ) = J1 p1 p4 p31 p M0 M0 = J1 = M0 M1 M p1 p4 p31 p J3 = J Pred (J ) Pred (p1 p4 ) Pred (p31 p ) =J p1 p3 =J J4 = J3 Pred (J ) Pred ( p1 p3 ) p1 p = J3 J5 = J4 Pred (J3 p1 p ) p1 p Pred (p1 p ) = J4 = J3 | {z } =;
For Petri nets, computing $Pred^*(I)$ is quite simple as we just saw. We obtain a finite basis (or a set of residuals according to [30]'s terminology). It can be used to answer coverability questions because it is possible to cover $M$ starting from $M_0$ iff $M_0 \in Pred^*(\uparrow M)$. This algorithmic idea can be generalized for an arbitrary WSTS $\mathcal{S}$:

Definition 5. $\mathcal{S}$ has effective pred-basis when there is an algorithm computing a finite basis of $Pred(\uparrow s)$ for any $s$.

E.g. Petri nets have effective pred-basis. The extensions we mentioned (reset arcs, post-SM nets, ...) all have effective pred-basis. We made Definition 5 more general than the corresponding notion in [1] so that it also works with stuttering and transitive compatibility (§10).
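The iterated pred-basis idea can be sketched for Petri nets as follows. The sketch relies on a standard fact rather than on a formula stated in the text: the least marking from which firing a transition reaches something above $m$ is obtained by adding, place by place, the tokens consumed to the part of $m$ not supplied by the tokens produced. The two-place net and all names are hypothetical; the net is deliberately not the one of Fig. 1.

```python
from typing import Dict, FrozenSet, Iterable, Tuple

Marking = Tuple[int, ...]
Net = Dict[str, Tuple[Marking, Marking]]  # transition -> (consumed, produced)

# Hypothetical net over places (a, b): t turns one a into two b, u turns one b into one a.
EXAMPLE_NET: Net = {
    "t": ((1, 0), (0, 2)),
    "u": ((0, 1), (1, 0)),
}


def leq(m: Marking, n: Marking) -> bool:
    return all(x <= y for x, y in zip(m, n))


def minimize(markings: Iterable[Marking]) -> FrozenSet[Marking]:
    """Keep only the minimal elements: a canonical finite basis."""
    pool = set(markings)
    return frozenset(
        m for m in pool if not any(leq(n, m) and n != m for n in pool)
    )


def pred_basis(m: Marking, net: Net) -> FrozenSet[Marking]:
    """A finite basis of Pred(upward-closure of m) for a Petri net."""
    return minimize(
        tuple(c + max(0, x - p) for x, c, p in zip(m, consumed, produced))
        for consumed, produced in net.values()
    )


def pred_star_basis(targets: Iterable[Marking], net: Net) -> FrozenSet[Marking]:
    """Iterate pred_basis until the represented upward-closed set is stable."""
    basis = minimize(targets)
    while True:
        extended = set(basis)
        for m in basis:
            extended |= pred_basis(m, net)
        new_basis = minimize(extended)
        if new_basis == basis:   # equal minimal bases: equal upward-closed sets
            return basis
        basis = new_basis


def coverable(start: Marking, target: Marking, net: Net) -> bool:
    """Can some marking above target be reached from start?"""
    return any(leq(b, start) for b in pred_star_basis([target], net))
```

On this hypothetical net, `pred_star_basis([(0, 3)], EXAMPLE_NET)` stabilizes on the basis made of `(1, 0)` and `(0, 1)`: three tokens in b can be covered from any non-empty marking, and not from the empty one. Comparing minimized bases is one way to implement the stopping test, since an upward-closed set of tuples has a unique set of minimal elements.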
Now assume an upward-closed $I$ is given through a finite basis $I_f$. We compute a sequence $J^f_0, J^f_1, \ldots$ of finite sets with $J^f_0 \stackrel{def}{=} I_f$ and $J^f_{k+1} \stackrel{def}{=} J^f_k \cup \textit{pred-basis}(J^f_k)$. We stop when $\uparrow J^f_{k+1} = \uparrow J^f_k$, which can be decided by checking that for any $s \in J^f_{k+1}$ we can find a $t \in J^f_k$ with $t \le s$. Then $\uparrow J^f_k = Pred^*(I)$ (and in fact any $J^f_l$ is a basis for $\bigcup_{i=0,\ldots,l} Pred^i(I)$). Hence the following

Proposition 6. A finite basis for $Pred^*(I)$ can be computed for WSTS's with (1) decidable $\le$ and (2) effective pred-basis.

Corollary 2. For WSTS's with (1) decidable $\le$ and (2) effective pred-basis, the covering problem is decidable.³

The covering problem, called control-state reachability in [1], consists in deciding whether $s \models \exists\Diamond I$ for a state $s$ and an upward-closed $I$ (given via a finite basis $I_f$). The variant problem "does $I' \models \exists\Diamond I$?" is also decidable. The variant "does $D \models \exists\Diamond I$?" is decidable (where $D$ is downward-closed) if $\le$ is intersection-effective, i.e. there is an algorithm computing a finite basis for $(\uparrow s) \cap (\uparrow s')$ given any $s, s' \in S$.
8 String Rewriting Systems Are Well-Structured
Definition 2 is meant to abstract the fundamental requirements behind the existence of e.g. the reduced reachability tree (see Sect. 5). The interest of this abstract definition is that many classes of infinite TS's share this well-structure. Consider Context-Free Grammars (CFG's) like the following example $G$:

$S \to XY \mid aSS$
$X \to bXS \mid \varepsilon$
$Y \to aX \mid b$

A possible derivation in $G$ is

$S \to aSS \to aXYS \to aYS \to aYXY \to aYXb \to aYb \to abb$   (3)
If instead of focusing on the language generated by $G$ we emphasize the rewrite steps, then $G$ gives rise to a transition system $\mathcal{S}_G$ where states are words in $(T_G \cup N_G)^*$ and (3) now is a complete execution of $\mathcal{S}_G$, starting from $S$.

Let us now consider word-embedding: we say a word $u$ embeds into a word $v$, written $u \preceq v$, iff $u$ can be obtained by erasing letters from $v$. This gives a partial ordering, known (Higman's Lemma) to be a well-ordering. Now assume $u \preceq v$ with $u, v \in (T_G \cup N_G)^*$. If $u \to_G u'$ by some rule, then clearly the same rule can be applied in $v$ in such a way that $v \to_G v'$ with $u' \preceq v'$. Furthermore, if $u \prec v$ then $u' \prec v'$. So that $\langle \mathcal{S}_G, \preceq \rangle$ is a well-structured system with strict-compatibility. Figure 5 displays $FRT(S)$ for $\langle \mathcal{S}_G, \preceq \rangle$.

³ Surprisingly, to our knowledge, all covering algorithms for Petri nets in the literature are much more complicated than the simple iterated pred-basis approach.
[Figure 5: tree rooted at S, with children aSS and XY; the children of XY are bXSY, Y, XaX and Xb]
Fig. 5. $FRT(S)$ in $\langle \mathcal{S}_G, \preceq \rangle$

Another well-structured view is possible: if we consider words as multisets of symbols (rather than strings), we can define word inclusion: write $u \subseteq v$ when $u$ can be obtained from $v$ by a combination of erases and permutations. This yields another well-ordering (in fact, a quasi-ordering) and $\langle \mathcal{S}_G, \subseteq \rangle$ is another WSTS with strict-compatibility. When words are seen as sets of symbols rather than multisets, $\mathcal{S}_G$ is still well-structured, but not with strict-compatibility anymore.

When it comes to algorithms for the analysis of WSTS's, the precise choice of which ordering we consider is quite relevant because many decidable properties for WSTS's (e.g. coverability) are expressed in terms of the ordering itself. When a choice is possible, using a larger ordering will often yield less information but more efficient algorithms.
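Both orderings on words are easy to decide. The sketch below is a plain illustration with hypothetical function names: `embeds` tests word-embedding (erasing letters only) and `multiset_included` tests the coarser inclusion in which permutations are allowed as well.

```python
from collections import Counter


def embeds(u: str, v: str) -> bool:
    """u embeds into v: u is obtained from v by erasing letters."""
    remaining = iter(v)
    # Each letter of u must be found in v after the previously matched one.
    return all(letter in remaining for letter in u)


def multiset_included(u: str, v: str) -> bool:
    """u is included in v when both are seen as multisets of symbols."""
    count_u, count_v = Counter(u), Counter(v)
    return all(count_v[symbol] >= n for symbol, n in count_u.items())
```

For example `embeds("aYb", "aYXb")` holds, while `embeds("XY", "YX")` fails even though `multiset_included("XY", "YX")` holds. The second ordering relates more pairs of words, which is the sense in which it is the larger of the two.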
9 Basic Process Algebras Are Well-Structured
A Basic Process Algebra (BPA) is a subset of process algebra first studied in [5] where only prefixing, non-deterministic choice, sequential composition and guarded recursion are allowed. Here is an example BPA declaration $\Delta$:

$X = aYX + bX + c$
$Y = bXX + a$

and a possible derivation is

$X \xrightarrow{a} YX \xrightarrow{b} XXX \xrightarrow{c} XX \xrightarrow{c} \cdots$   (4)
BPA systems can be seen as CFG's with head-rewriting. (We lose head-rewriting when we replace sequential composition by parallel composition, yielding the Algebra of Basic Parallel Processes (BPP) from [13], which are a subclass of Petri nets.)

Because of the head-rewriting strategy, BPA systems do not have transitions compatible with word-embedding. E.g. if in the previous example, we consider $Y \preceq XXYXX$ and step $Y \to bXX$, we cannot find some $v'$ with $XXYXX \to v'$ and $bXX \preceq v'$. (The left-factor ordering is compatible with head-rewriting but it is not a well-ordering.)

However, if we restrict ourselves to Normed BPA (a class first introduced in [5]) we can find a well-structure. Formally, a BPA process is normed if it admits a terminating behavior. A BPA declaration is normed if all its processes are normed. From a CFG viewpoint, this corresponds to grammars in Greibach normal form and without useless productions. E.g. our example above is a normed BPA declaration.

With Normed BPA, the difficulty with head-rewriting can be circumvented. Let's consider $Y \preceq XXYXX$. Because $\Delta$ is normed, there exists a terminating sequence $X \to \cdots \to \varepsilon$. So that $XXYXX \to XYXX \to YXX \to bXXXX$, reaching $bXXXX$ with $bXX \preceq bXXXX$.

Definition 6. A well-ordering $\le$ over states of some TS $\mathcal{S}$ has stuttering compatibility if for all $s_1 \le t_1$ and transition $s_1 \to s_2$, there exists a non-empty sequence $t_1 \to t_2 \to \cdots \to t_n$ with $s_2 \le t_n$ and $s_1 \le t_i$ for all $i < n$. [25]

See Fig. 6 for a diagrammatic presentation of stuttering compatibility.

[Figure 6: diagram over the states s1, s2 and t1, t2, ..., tn-1, tn]

Fig. 6. Stuttering compatibility

Observe that requiring $s_1 \le t_i$ makes stuttering stronger than compatibility for $\to^+$, the transitive closure of $\to$.

Proposition 7. For a Normed BPA declaration $\Delta$, $\langle \mathcal{S}_\Delta, \preceq \rangle$ is a WSTS with stuttering compatibility.
10 Stuttering Compatibility
The name "stuttering compatibility" comes from stuttering (also "branching") bisimulation [6]. It is weaker than the usual "strong" compatibility but stronger than transitive compatibility, a notion we obtain when we do not require $s_1 \le t_i$ for $1 < i < n$. Finkel's notion of "3-structured" [20] is similar to transitive compatibility (in a framework where labels of transitions are taken into account). This class of WSTS's contains e.g. our RPPS schemes [25] and Free-Choice Fifo nets [16]. Of course there exists a notion of strict-stuttering (and strict-transitive) compatibility when $<$ instead of $\le$ is considered.

The decision methods we presented earlier accommodate stuttering compatibility:

Theorem 2. The following assertions hold:
(1) Proposition 1 and decidability of CSM and DInev generalize to WSTS's with stuttering compatibility.
(2) Prop. 5, 6 and decidability of Coverability generalize to WSTS's with transitive-reflexive stuttering.
(3) Proposition 4 and decidability of Limitation generalize to WSTS's with strict-transitive stuttering.

These decidability results do not require any additional effectiveness hypothesis. Also, the same methods (building $FRT(s)$, iterating pred-basis, ...) are used with no modification.
11 Communicating Finite State Machines Are Well-Structured
A Communicating Finite State Machine (CFSM) [8] can be seen as a Finite State Automaton (FSA) equipped with a collection $c_1, \ldots, c_n$ of $n$ fifo channels. A transition of the FSA is labeled with a send action (e.g. $q \xrightarrow{c_i!a} q'$) or a receive action (e.g. $q \xrightarrow{c_i?a} q'$).

A state (or configuration) of a CFSM is some $s = \langle q, w_1, \ldots, w_n \rangle$ where $q$ is a control state of the FSA, and each $w_i$ is a word describing the current content of channel $c_i$. In configuration $s = \langle q, w_1, \ldots, w_n \rangle$, transition $q \xrightarrow{c_i!a} q'$ is possible, reaching $s' = \langle q', w_1, \ldots, w_{i-1}, w_i a, w_{i+1}, \ldots, w_n \rangle$, a new configuration where the control state is now $q'$ and where the sent symbol $a$ has been appended after $w_i$. In $s$, transition $q \xrightarrow{c_i?a} q'$ is only possible if channel $c_i$ contains an $a$ in first position, i.e. if $w_i$ is some $a w'$. Then we can reach $\langle q', w_1, \ldots, w_{i-1}, w', w_{i+1}, \ldots, w_n \rangle$.

Fig. 7 shows an example where $P_1$ and $P_2$ are two different automata communicating via two fifo channels. This gives a CFSM $C_{P_1,P_2}$ if we see $P_1$ and $P_2$ as one single global FSA.
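[Figure 7: automaton P1 with control states p0, p1 and transitions labeled c1!a, c1!b, c2?c; automaton P2 with control states q0, q1 and transitions labeled c1?a, c1?b, c2!c; the two automata are connected by channels c1 and c2]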
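Fig. 7. A Communicating Finite State Machine

A possible behavior for this example is

$\langle p_0, q_0, \varepsilon, \varepsilon \rangle \xrightarrow{c_1!a} \langle p_1, q_0, a, \varepsilon \rangle \xrightarrow{c_1!b} \langle p_0, q_0, ab, \varepsilon \rangle \xrightarrow{c_1?a} \langle p_0, q_1, b, \varepsilon \rangle \xrightarrow{c_2!c} \langle p_0, q_0, b, c \rangle$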
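The send and receive rules can be replayed on this behavior. The sketch below is only an illustration under stated assumptions: a configuration is encoded as a pair made of a global control state and a tuple of channel contents (strings, head of the channel first), the functions `send` and `receive` are hypothetical names, and they model the effect on the channels only, without checking that the automaton of Fig. 7 actually offers the transition.

```python
from typing import Optional, Tuple

Control = Tuple[str, str]                 # global control state, e.g. ("p0", "q0")
Config = Tuple[Control, Tuple[str, ...]]  # control state and channel contents


def send(config: Config, channel: int, symbol: str, target: Control) -> Config:
    """Effect of a send action: append the symbol at the end of the channel."""
    _, channels = config
    updated = channels[:channel] + (channels[channel] + symbol,) + channels[channel + 1:]
    return (target, updated)


def receive(config: Config, channel: int, symbol: str, target: Control) -> Optional[Config]:
    """Effect of a receive action: only possible if the symbol heads the channel."""
    _, channels = config
    if not channels[channel].startswith(symbol):
        return None
    updated = channels[:channel] + (channels[channel][len(symbol):],) + channels[channel + 1:]
    return (target, updated)


# Replay of the behavior shown above (channel index 0 is c1, index 1 is c2).
C0: Config = (("p0", "q0"), ("", ""))
C1 = send(C0, 0, "a", ("p1", "q0"))
C2 = send(C1, 0, "b", ("p0", "q0"))
C3 = receive(C2, 0, "a", ("p0", "q1"))
C4 = send(C3, 1, "c", ("p0", "q0"))
```

The final configuration `C4` has control state `("p0", "q0")` with `b` left in c1 and `c` in c2, and `receive(C0, 0, "a", ("p0", "q1"))` returns `None` because c1 is empty in the initial configuration.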
Because the channels are unbounded, CFSM's generate infinite TS's $\mathcal{S}_C$ and are Turing-powerful. Recently, Abdulla and Jonsson investigated Lossy Channel Systems [2,3], i.e. CFSM's where in any configuration the system may lose any symbol from any channel. In other words, any transition $\langle q, w_1, \ldots, w_i, \ldots, w_n \rangle \to \langle q, w_1, \ldots, w', \ldots, w_n \rangle$ is possible when $w'$ is obtained by removing one symbol from $w_i$. CFSM's with lossy channels⁴ are useful as models of systems assuming unsafe communication links, e.g. the alternating bit protocol.

Several interesting decidability results for Lossy CFSM's can be explained by their well-structure [1]. Define an ordering between configurations by

$\langle q, w_1, \ldots, w_n \rangle \le \langle q', w_1', \ldots, w_n' \rangle$ iff $q = q'$ and $w_i \preceq w_i'$ (for $i = 1, \ldots, n$)

The right point of view for Lossy CFSM's is to use stuttering compatibility:

Proposition 8. For $C$ a Lossy Channel System (or a Completely Specified Protocol), $\langle \mathcal{S}_C, \le \rangle$ is a WSTS with (non-strict) stuttering compatibility.

Several other special classes of CFSM's from the literature have an underlying well-structure, often with more involved orderings: $C_{P_1,P_2}$ from Fig. 7 has transitions compatible with $\le$, defined by

$\langle p_i, q_j, w_1, w_2 \rangle \le \langle p_{i'}, q_{j'}, w_1', w_2' \rangle$ iff (by definition) $p_i = p_{i'}$, $q_j = q_{j'}$, $w_1' \in w_1 (ab)^*$ if $i = i' = 0$, $w_1' \in w_1 (ba)^*$ if $i = i' = 1$, and $w_2' \in c^* w_2$.

This variation around the prefix ordering is not a well-ordering in general, but it is a well-ordering on $Q_1 \times Q_2 \times ((ab)^* + (ab)^* a) \times c^*$, a set containing all the reachable states of $C_{P_1,P_2}$. The same approach can be generalized to all Monogeneous CFSM's [17]. Cécé used a complicated ordering [10] to show that Synchronizable CFSM's are strict WSTS's [23].

Cécé et al. introduced CFSM's with insertion errors [12]. These are CFSM's where at any time, arbitrary symbols (noise) can be inserted anywhere in the channels. These too can be seen as well-structured systems, but not as easily as Lossy CFSM's. One way is to consider the transition relation backward: with $\to^{-1}$, CFSM's with Insertion Errors are exactly Lossy CFSM's. This view can be useful for reachability analysis, and it helps understand why [12] considered forward analysis on CFSM's with Insertion Errors, rather than the usual backward analysis based on iterated pred-basis.

Another approach is to consider downward-compatibility, first introduced in [25] for the RPPS model.

Definition 7. A well-ordering $\le$ over states of some TS $\mathcal{S}$ has downward compatibility if for all $s_1 \ge t_1$ and transition $s_1 \to s_2$, there exists a $t_1 \to t_2$ with $s_2 \ge t_2$. [25]

See Fig. 8 for a diagrammatic presentation of downward compatibility.

⁴ Finkel's Completely Specified Protocols [22] are Lossy CFSM's where only the head symbol can be lost. They behave essentially like Lossy CFSM's.
[Figure 8: diagram over the states s1, t1, s2, t2]

Fig. 8. Downward compatibility
Proposition 9. For $C$, a CFSM with Insertion Errors, $\langle \mathcal{S}_C, \le \rangle$ is a WSTS with (non-strict) stuttering downward compatibility.

12 Downward Compatibility

Downward compatibility is not another generalization or specialization of the usual (upward) WSTS notion. It is a different kind of well-structure. However it had similar motivations and similar ubiquity.
Ubiquity is clear:
(1) With the usual ordering, BPP nets with inhibitory arcs (even copy arcs) have reflexive downward compatibility.
(2) With the usual ordering, BPA and RPPS [25] schemes have reflexive downward compatibility without any normedness hypothesis.
(3) With the usual ordering, permutation grammars⁵ [27] have reflexive downward compatibility.

Downward WSTS have not yet been explored in great depth. Still, some decision results exist:

Proposition 10. Assume $\mathcal{S}$ is a downward-WSTS (possibly reflexive and/or transitive), and $D \subseteq S$ is downward-closed, then $Pred^*(D)$ is downward-closed.

Proof. Assume $s_0 \in Pred^*(D)$. Then there is a $s_0 \to s_1 \to \cdots \to s_n$ with $s_n \in D$. If now $s_0' \le s_0$ then downward-compatibility entails the existence of some $s_0' \to s_1' \to \cdots \to s_m'$ with $s_m' \le s_n$. Hence $s_m' \in D$. So that $s_0' \in Pred^*(D)$.
Now assume an upward-closed $I$ is given through a finite basis $I_f$. We define a finite sequence $J_0, J_1, \ldots$ of finite sets by $J_0 \stackrel{def}{=} I_f$ and $J_{i+1} \stackrel{def}{=} J_i \cup Succ(J_i)$. We end the sequence at the first $k$ such that $\uparrow J_k = \uparrow J_{k+1}$, a condition which is bound to hold eventually.

Proposition 11. If $\mathcal{S}$ has downward compatibility (possibly reflexive), then $\uparrow J_k = \uparrow Succ^*(I)$.

Corollary 3. A finite basis for $\uparrow Succ^*(I)$ can be computed for downward (possibly reflexive) WSTS's with (1) decidable $\le$ and (2) effective Succ. The sub-covering problem is decidable for these WSTS's.

The sub-covering problem consists in deciding whether from some $s$ one can reach a state covered by some $s'$.
Conclusion

In this paper, we clarified the WSTS concept: they are TS's where transitions are compatible with a well-ordering. We presented a large collection of WSTS's, which suggested several variations around the idea of "compatibility", for which general decidability results were presented, some of them quite new, the others being slight extensions of earlier results.

Our own conclusion is that WSTS's are ubiquitous and can be found in many models of computation, provided compatibility is understood in a liberal way. Some points worth stressing are:
- The WSTS idea is a guide for relating and unifying many approaches in many different fields of computer science.
- It suggests directions for extending already known decidability results.
- Dually, our own work (currently in progress) proves that investigating the limits of WSTS is a powerful guide for undecidability results.

⁵ i.e. grammars where context-sensitive permutation rules $AB \to BA$ are allowed.
References

1. P. A. Abdulla, K. Čerāns, B. Jonsson, and T. Yih-Kuen. General decidability theorems for infinite-state systems. In Proc. 11th IEEE Symp. Logic in Computer Science (LICS'96), New Brunswick, NJ, USA, July 1996, pages 313-321, 1996.
2. P. A. Abdulla and B. Jonsson. Verifying programs with unreliable channels. In Proc. 8th IEEE Symp. Logic in Computer Science (LICS'93), Montreal, Canada, June 1993, pages 160-170, 1993.
3. P. A. Abdulla and B. Jonsson. Undecidability of verifying programs with unreliable channels. In Proc. 21st Int. Coll. Automata, Languages, and Programming (ICALP'94), Jerusalem, Israel, July 1994, volume 820 of Lecture Notes in Computer Science, pages 316-327. Springer-Verlag, 1994.
4. P. A. Abdulla and B. Jonsson. Model-checking through constraint solving. In Methods and Tools for the Verification of Infinite State Systems, Proceedings of the Grenoble-Alpe d'Huez European School of Computer Science, May 23-25, Grenoble, France. VERIMAG, St-Martin d'Hères, France, 1997.
5. J. C. M. Baeten, J. A. Bergstra, and J. W. Klop. Decidability of bisimulation equivalence for processes generating context-free languages. In Proc. Parallel Architectures and Languages Europe (PARLE'87), Eindhoven, NL, June 1987, vol. II: Parallel Languages, volume 259 of Lecture Notes in Computer Science, pages 94-111. Springer-Verlag, 1987.
6. M. C. Browne, E. M. Clarke, and O. Grumberg. Characterizing finite Kripke structures in propositional temporal logic. Theoretical Computer Science, 59(1-2):115-131, 1988.
7. J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Symbolic model checking: $10^{20}$ states and beyond. Information and Computation, 98(2):142-170, 1992.
8. G. von Bochmann. Finite state description of communication protocols. Computer Networks and ISDN Systems, 2:361-372, 1978.
9. A. Bonchatbonrat. Concise Holmesian proofs for elementary Moncher-Watson problems. In A. Mycroft, editor, Proc. 3rd Int. Conf. Theor. Crim. Deduction, London, November 1894.
10. G. Cécé. Etat de l'art des techniques d'analyse des automates finis communicants. Rapport de DEA, Université de Paris-Sud, Orsay, France, September 1993.
11. K. Čerāns. Deciding properties of integral relational automata. In Proc. 21st Int. Coll. Automata, Languages, and Programming (ICALP'94), Jerusalem, Israel, July 1994, volume 820 of Lecture Notes in Computer Science, pages 35-46. Springer-Verlag, 1994.
12. G. Cécé, A. Finkel, and S. Purushothaman Iyer. Unreliable channels are easier to verify than perfect channels. Information and Computation, 124(1):20-31, 1995.
13. S. Christensen. Decidability and decomposition in process algebras. PhD thesis CST-105-93, Dept. of Computer Science, University of Edinburgh, UK, 1993.
14. G. Ciardo. Petri nets with marking-dependent arc cardinality: Properties and analysis. In Proc. 15th Int. Conf. Applications and Theory of Petri Nets, Zaragoza, Spain, June 1994, volume 815 of Lecture Notes in Computer Science, pages 179-198. Springer-Verlag, 1994.
15. J. Esparza. More infinite results. In Proc. 1st Int. Workshop on Verification of Infinite State Systems (INFINITY'96), Pisa, Italy, Aug. 1996, volume 5 of Electronic Notes in Theor. Comp. Sci. Elsevier, 1997.
118
Alain Finkel and Philippe Schnoebelen
16. A. Finkel and A. Choquet. Fifo nets without order deadlock. Acta Informatica, 25(1):15 36, 1987. 17. A. Finkel. About monogeneous fo Petri nets. In Proc. 3rd European Workshop on Applications and Theory of Petri Nets, Varenna, Italy, Sep. 198 , pages 175 192, 1982. 18. A. Finkel. A generalization of the procedure of Karp and Miller to well structured transition systems. In Proc. 14th Int. Coll. Automata, Languages, and Programming (ICALP’87), Karlsruhe, FRG, July 1987, volume 267 of Lecture Notes in Computer Science, pages 499 5 8. Springer-Verlag, 1987. 19. A. Finkel. Well structured transition systems. Research Report 365, Lab. de Recherche en Informatique (LRI), Univ. Paris-Sud, Orsay, August 1987. 2 . A. Finkel. Reduction and covering of in nite reachability trees. Information and Computation, 89(2):144 179, 199 . 21. A. Finkel. The minimal coverability graph algorithm. In Advances in Petri Nets 1993, volume 674 of Lecture Notes in Computer Science, pages 21 243. SpringerVerlag, 1993. 22. A. Finkel. Decidability of the termination problem for completely speci cied protocols. Distributed Computing, 7:129 135, 1994. 23. M. G. Gouda and L. E. Rosier. Synchronizable networks of communicating nite state machines. Unpublished manuscript, 1985. 24. M. Hennessy and R. Milner. Algebraic laws for nondeterminism and concurrency. Journal of the ACM, 32(1):137 161, 1985. 25. O. Kouchnarenko and Ph. Schnoebelen. A model for recursive-parallel programs. In Proc. 1st Int. Workshop on Veri cation of In nite State Systems (INFINITY’96), Pisa, Italy, Aug. 1996, volume 5 of Electronic Notes in Theor. Comp. Sci. Elsevier, 1997. 26. C. Lakos and S. Christensen. A general approach to arc extensions for coloured Petri nets. In Proc. 15th Int. Conf. Applications and Theory of Petri Nets, Zaragoza, Spain, June 1994, volume 815 of Lecture Notes in Computer Science, pages 338 357. Springer-Verlag, 1994. 27. E. M¨ akinen. On permutation grammars generating context-free languages. 
BIT, 25:6 4 61 , 1985. 28. F. Moller. In nite results. In Proc. 7th Int. Conf. Concurrency Theory (CONCUR’96), Pisa, Italy, Aug. 1996, volume 1119 of Lecture Notes in Computer Science, pages 195 216. Springer-Verlag, 1996. 29. R. Valk. Self-modifying nets, a natural extension of Petri nets. In Proc. 5th Int. Coll. Automata, Languages, and Programming (ICALP’78), Udine, Italy, Jul. 1978, volume 62 of Lecture Notes in Computer Science, pages 464 476. SpringerVerlag, 1978. 3 . R. Valk and M. Jantzen. The residue of vector sets with applications to decidability problems in Petri nets. Acta Informatica, 21:643 674, 1985.
Shape Reconstruction with Delaunay Complex (Invited Paper)
Herbert Edelsbrunner
Dept. Comput. Sci., Univ. Illinois at Urbana-Champaign, and Raindrop Geomagic, Champaign, Illinois, USA. edels@cs.uiuc.edu
Abstract. The reconstruction of a shape or surface from a finite set of points is a practically significant and theoretically challenging problem. This paper presents a unified view of algorithmic solutions proposed in the computer science literature that are based on the Delaunay complex of the points.
1 Introduction
This paper considers the problem of reconstructing a shape from a given finite set of points. Solutions based on the Delaunay complex of the set are surveyed and a unified view using restricted Delaunay complexes is developed.
Problem Description. Define a shape as a subset of Euclidean space. It inherits the topology of that space and can be viewed as a topological space itself. Given a finite set of points, $S \subseteq \mathbb{R}^d$, the shape reconstruction problem asks for a shape in $\mathbb{R}^d$ that best approximates $S$. Without quantifying what it means that a shape approximates a point set, shape reconstruction remains a vague and primarily morphological problem. In spite of the importance of the problem there has been little success in phrasing it as an optimization problem. The main source of the difficulty is the fantastically rich variety of shape and form as apparent in nature around us [16,39].
One way to cope with this difficulty is to limit the range of shapes. We call the result a narrow problem specification. An example is the requirement that the shape produced be a closed surface in $\mathbb{R}^3$, assuming $S \subseteq \mathbb{R}^3$. The trouble with this problem specification is that many sets $S \subseteq \mathbb{R}^3$ do not admit any reasonable approximation by a closed surface. As a consequence, any algorithm is limited to a subclass of input sets $S$, and it is difficult to characterize this subclass other than through success and failure of the algorithm.
A wide problem specification permits the construction of any topological space. For example in $\mathbb{R}^3$, the shape can be a point, a curve, a surface, a solid, or any combination of these. The trouble with this specification is that an algorithm might not produce a closed surface even in cases where one approximating $S$ would exist and the application would prefer one.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 119-132, 1998. © Springer-Verlag Berlin Heidelberg 1998
Ramifications. Versions of the shape reconstruction problem can be found in diverse areas of science and engineering. 2-dimensional versions arise in pattern recognition, image processing and computer vision [36]. For example, solutions to boundary reconstruction from images based on Delaunay complexes have been studied by Brandt and Algazi [11] and by Robinson et al. [38]. The most common 3-dimensional version of the problem is narrow and requires that the reconstructed shape be a closed surface; see Lodha and Franke [31] for a survey of scattered point techniques for surfaces. The importance of this case stems from the fact that the closed surfaces are exactly the boundaries of the solids (3-manifolds with boundary). Given a closed surface, the solid can be physically created by modern 3D printing technology surveyed by Burns [13]. A particular algorithmic solution to the wide version of the 3-dimensional shape reconstruction problem is given by the $\alpha$-shapes proposed by Edelsbrunner and Mücke [21]. They are dual to the space filling models of molecules and have found extensive applications in molecular biology. The surface reconstruction problem in $\mathbb{R}^3$ has been generalized to the manifold learning problem in dimensions beyond 3 by Bregler and Omohundro [12]. This problem arises in the analysis of dynamical systems and of physical phenomena described by data of a fixed dimension greater than 3.
Reconstruction Methods. This paper focuses on the Delaunay approach that reconstructs a shape from the Delaunay complex of the points. This is a simplicial complex that decomposes the convex hull of $S$ by connecting the points with simplices of all possible dimensions. The shape is the underlying space of a subcomplex chosen from the Delaunay complex by some algorithm. Other approaches to the shape reconstruction problem are beyond the scope of this paper. Most solutions described in the scientific literature follow either the Delaunay or one of three other approaches.
The first other approach keeps the idea of taking subcomplexes and replaces the Delaunay complex by a different complex. Kirkpatrick and Radke [27] suggest the one-parameter family of $\beta$-skeletons to reconstruct the shape of a finite set in $\mathbb{R}^2$. The sphere-of-influence graph has been proposed by Toussaint, see [6], and used by Edelsbrunner, Rote and Welzl [22] to find shortest curves. The $\beta$-skeletons have been generalized by Veltkamp [41] and used for the reconstruction of surfaces in $\mathbb{R}^3$.
The second approach reconstructs surfaces in $\mathbb{R}^3$ from slices. Each slice represents the intersection of the shape with a plane by a collection of polygons in that plane. It is usually assumed that the slices are defined by a sequence of parallel planes. If two adjacent slices consist of a single polygon each then the reconstruction problem reduces to finding a cylindrical surface that connects the two polygons. Fuchs, Kedem and Uselton [25] describe a polynomial time algorithm for constructing minimum area and other optimal cylinders. In the general case each slice consists of several pairwise disjoint but possibly nested polygons. Solutions that first match and second connect the polygons are surveyed by Meyers, Skinner and Sloan [35]. Boissonnat and Geiger [10] combine the matching and connecting into one step using the Delaunay complex of the two slices.
The third approach takes advantage of the fact that the points in $S$ used to specify the shape have otherwise no significance. $S$ can be replaced by any set that leads to the same or a similar shape. Assuming a dense distribution along the hypothetical surface, Hoppe et al. [26] construct a signed distance function, $f : \mathbb{R}^3 \to \mathbb{R}$. The surface is defined as the zero-set, $f^{-1}(0)$, and constructed by the marching cube algorithm [32]. The limitation to sets $S$ that are dense everywhere along the surface has been partially overcome by Curless and Levoy [15] who assume $S$ is obtained by a scanner that provides, for each point, also the ray meeting the surface at that point.
Outline. Section 2 introduces the restricted Delaunay complex, which is the central notion in the unified view of solutions following the Delaunay approach. Section 3 discusses versions of the Delaunay approach that construct the restricting space from the data set. Section 4 considers versions that assume the restricting space is given, either implicitly or explicitly. Section 5 concludes this paper.
2 Restricted Delaunay Complex
This section presents the definitions needed to unify and classify the proposed solutions following the Delaunay approach to shape reconstruction. The main concept is the so-called restricted Delaunay complex first defined in full generality by Edelsbrunner and Shah [23].
Voronoi Cells. A finite set $S \subseteq \mathbb{R}^d$ induces a decomposition of the space into regions of influence. Specifically, let $\|x - p\|$ be the Euclidean distance between points $x, p \in \mathbb{R}^d$. The Voronoi cell of $p \in S$ is the set of points $x$ whose distance from $p$ is less than or equal to the distance from any other point in $S$:
\[ V_p = \{ x \in \mathbb{R}^d \mid \|x - p\| \le \|x - q\| \ \text{for all } q \in S \}. \]
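The definition of $V_p$ can be checked point by point. The following Python sketch is only an illustration under stated assumptions: the function name `nearest_sites`, the tolerance `tol`, and the three example sites are hypothetical and not taken from the paper. It reports the indices of all sites whose closed Voronoi cell contains a query point, so a point on a shared boundary face is reported for every cell that contains it.

```python
import math

def nearest_sites(x, sites, tol=1e-12):
    """Indices p such that x lies in the closed Voronoi cell V_p.

    x belongs to V_p iff ||x - p|| <= ||x - q|| for all sites q;
    `tol` (an assumed tolerance) absorbs floating-point round-off.
    """
    dists = [math.dist(x, p) for p in sites]
    d_min = min(dists)
    return [i for i, d in enumerate(dists) if d <= d_min + tol]

# Hypothetical example: three sites in the plane.
sites = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
print(nearest_sites((0.5, 0.5), sites))  # strictly inside one cell
print(nearest_sites((1.0, 0.0), sites))  # on an edge shared by two cells
print(nearest_sites((1.0, 1.0), sites))  # at a Voronoi vertex
```

The number of indices returned is the number of Voronoi cells meeting at the query point, which is the quantity the nerve construction below keeps track of.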
Each Voronoi cell is a closed and possibly unbounded convex polyhedron, see Figure 1. The Voronoi cells meet at most along common boundary faces, and together they cover the entire $\mathbb{R}^d$. The collection of Voronoi cells is denoted as $V_S = \{ V_p \mid p \in S \}$. In this paper we only consider Voronoi cells defined by finitely many (unweighted) points and the Euclidean metric. Refer to the survey article by Aurenhammer [5] for generalizations to points with weights, to infinite point sets, and to other metrics.
Nerve. The nerve of a finite collection of sets, $A$, is the set system (or set of collections) consisting of all subcollections with non-empty common intersection:
\[ \mathrm{Nrv}\, A = \{ X \subseteq A \mid \bigcap X \neq \emptyset \}. \]
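For a small collection the nerve can be enumerated directly from this definition. The sketch below is a brute-force illustration, not an algorithm from the paper; it assumes the sets are finite Python sets standing in for regions, and the names `nerve`, `closed_under_subsets` and `cover` are hypothetical. It also checks the subset-closure property that makes the nerve an abstract simplicial complex.

```python
from itertools import combinations

def nerve(sets):
    """All non-empty index tuples whose sets share at least one element."""
    result = []
    for k in range(1, len(sets) + 1):
        for idx in combinations(range(len(sets)), k):
            if set.intersection(*(sets[i] for i in idx)):
                result.append(idx)
    return result

def closed_under_subsets(simplices):
    """True if every non-empty proper subset of a member is also a member."""
    members = set(simplices)
    return all(face in members
               for s in simplices
               for k in range(1, len(s))
               for face in combinations(s, k))

# Hypothetical example: three sets that meet pairwise but have no common point.
cover = [{1, 2}, {2, 3}, {3, 1}]
simplices = nerve(cover)
print(simplices)                        # three vertices and three edges
print(closed_under_subsets(simplices))  # True
```

In this example the nerve is the boundary of a triangle without the triangle itself, because the triple intersection is empty.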
Fig. 1. Decomposition of the plane by Voronoi cells of a finite set. The points are the locations of trees in the Allerton Park near Monticello, Illinois.
The nerve has been introduced by Alexandrov [1] as a tool to construct simplicial complexes. Observe that $Y \subseteq X$ and $X \in \mathrm{Nrv}\, A$ implies $Y \in \mathrm{Nrv}\, A$; this is the defining property of an abstract simplicial complex. To obtain an embedding we represent each set in $A$ by a point in the Euclidean space of some dimension, $e$, and each collection $X \in \mathrm{Nrv}\, A$ by the convex hull of the corresponding points. Specifically, we find an injective function $\varphi : A \to \mathbb{R}^e$ so that
\[ \mathrm{conv}\, \varphi(X) \cap \mathrm{conv}\, \varphi(Z) = \mathrm{conv}\, \varphi(X \cap Z) \]
holds for all $X, Z \in \mathrm{Nrv}\, A$. In words, $\varphi(X)$, $\varphi(Z)$, and $\varphi(X \cap Z)$ are finite sets of points, their convex hulls are simplices, and the intersection of the first two simplices is the simplex spanned by the intersection of the first two point sets. $K = \{ \mathrm{conv}\, \varphi(X) \mid X \in \mathrm{Nrv}\, A \}$ is a simplicial complex and $K$ together with $\varphi$ is a geometric realization of $\mathrm{Nrv}\, A$. The underlying space of $K$ is the part of $\mathbb{R}^e$ covered by its simplices: $|K| = \bigcup K$. A general position argument shows that there is always a geometric realization in dimension $e \le 2m - 1$, where $m$ is the maximum cardinality of any $X \in \mathrm{Nrv}\, A$, and there are examples that show $e = 2m - 1$ is sometimes necessary [24,40]. For computational purposes it is important to keep $e$ as small as possible, and for $A = V_S$ it turns out that $e = m - 1 = d$ is sufficient.
Delaunay Complex. Recall that $S$ is a finite set of points in $\mathbb{R}^d$ and $V_S$ is the collection of Voronoi cells. We assume general position so that the common intersection of any $k$ Voronoi cells is either empty or a convex polyhedron of dimension $d + 1 - k$. It follows that the collections $X$ in the nerve of $V_S$ have cardinality at most $d + 1$. The Delaunay complex of $S$ is the geometric realization
of $\mathrm{Nrv}\, V_S$ defined by the injection $\varphi : V_S \to \mathbb{R}^d$ that maps every Voronoi cell to its generator, $\varphi(V_p) = p$:
\[ \mathrm{Del}\, S = \{ \mathrm{conv}\, \varphi(X) \mid X \in \mathrm{Nrv}\, V_S \}, \]
see Fig. 2. In other words, if two Voronoi cells share a common $(d-1)$-face then their generating points are connected by an edge, if three cells share a common $(d-2)$-face then their generators are connected by a triangle, etc.
Fig. 2. Delaunay complex corresponding to the decomposition of the plane into Voronoi cells shown in Fig. 1.
General position is a convenient but not a necessary assumption. Without this assumption we get cells that are not simplices. Specifically, the convex hull of $k$ points is a cell in the Delaunay complex iff the corresponding $k$ Voronoi cells have a non-empty common intersection not contained in any other Voronoi cell. In this paper we assume general position, which can be simulated computationally by a symbolic perturbation [20].
Restricting Voronoi Cells and Delaunay Complex. Just as the Voronoi cells decompose $\mathbb{R}^d$, they decompose any topological subspace $X \subseteq \mathbb{R}^d$. We call $V_p \cap X$ the restricted Voronoi cell of $p$ and consider the collection of all such cells: $V_{S,X} = \{ V_p \cap X \mid p \in S \}$. The restricted Delaunay complex is the geometric realization in $\mathbb{R}^d$ of the nerve of the collection of restricted Voronoi cells:
\[ \mathrm{Del}_X S = \{ \mathrm{conv}\, \varphi(Y) \mid Y \in \mathrm{Nrv}\, V_{S,X} \}, \]
see Fig. 3 in the section on alpha shapes. Observe that the restricting space specifies a subcomplex of the Delaunay complex: $\mathrm{Del}_X S \subseteq \mathrm{Del}\, S$.
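In the plane and under general position, three Voronoi cells share a point exactly when the circle through their generators contains no other point of $S$. The Python sketch below uses this empty-circle test to enumerate the Delaunay triangles by brute force. It is a minimal illustration assuming a small planar point set in general position; the helper names `circumcircle`, `delaunay_triangles`, `delaunay_edges` and the tolerance `eps` are hypothetical, and the cubic enumeration is chosen for clarity, not efficiency.

```python
from itertools import combinations

def circumcircle(a, b, c):
    """Center and squared radius of the circle through a, b, c; None if collinear."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy), (ax - ux) ** 2 + (ay - uy) ** 2

def delaunay_triangles(points, eps=1e-9):
    """Index triples whose circumcircle has all other points strictly outside."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        circle = circumcircle(points[i], points[j], points[k])
        if circle is None:
            continue
        (ux, uy), r2 = circle
        if all((px - ux) ** 2 + (py - uy) ** 2 > r2 + eps
               for m, (px, py) in enumerate(points) if m not in (i, j, k)):
            triangles.append((i, j, k))
    return triangles

def delaunay_edges(triangles):
    """Edges of the triangles, as sorted index pairs."""
    return sorted({e for t in triangles for e in combinations(t, 2)})

# Hypothetical example: four points, no four of them cocircular.
pts = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
tris = delaunay_triangles(pts)
print(tris)
print(delaunay_edges(tris))
```

A restricted Delaunay complex keeps only those simplices whose Voronoi intersection also meets the restricting space $X$; that additional test depends on how $X$ is represented and is not attempted here.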
The nerve theorem of Leray [29] implies that if all restricted Voronoi cells are contractible then $X$ and the underlying space of the restricted Delaunay complex, $\mathrm{Del}_X S$, are homotopy equivalent. This means that the two topological spaces are connected the same way: they can be geometrically different but they have the same kind and arrangement of holes. Edelsbrunner and Shah [23] prove that if $X$ is a $k$-manifold with boundary then $X$ and the underlying space of $\mathrm{Del}_X S$ are homeomorphic if the restricted Voronoi cells satisfy the closed ball property: (i) the common intersection of $X$ and any $k + 1 - \ell$ Voronoi cells is either empty or a closed $\ell$-ball, and (ii) the common intersection of the boundary of $X$ and any $k + 1 - \ell$ Voronoi cells is either empty or a closed $(\ell - 1)$-ball. The closed ball property generalizes to a sufficient condition that implies homeomorphic reconstruction for general triangulable spaces $X$.
3 Constructing the Restricting Space
All algorithmic solutions to shape reconstruction surveyed in this paper use restricted Delaunay complexes and only differ in how they arrive at the restricting space and how they treat it computationally. This section discusses solutions that generate the restricting space from the given data points.
Alpha Shapes. In 1983, Edelsbrunner, Kirkpatrick and Seidel introduced the $\alpha$-shape of a set $S \subseteq \mathbb{R}^2$ as the space generated by connecting point pairs that can be touched by an empty disk of radius $\alpha$ [19]. Specifically, points $p, q \in S$ are connected by a straight edge if there is a circle of radius $\alpha$ that passes through $p$ and $q$, and all other points of $S$ lie strictly outside the circle. The collection of edges decomposes $\mathbb{R}^2$ into interior regions that belong to the $\alpha$-shape and exterior regions that constitute the background. The unbounded region is always exterior. An equivalent definition restricts the Delaunay complex of $S$ with open disks of radius $\alpha$ centered at the points:
\[ X = \{ x \in \mathbb{R}^2 \mid \|x - p\| < \alpha \ \text{for some } p \in S \}. \]
The $\alpha$-shape is the underlying space of $K_\alpha = \mathrm{Del}_X S$, see Fig. 3. The $\alpha$-complex, $K_\alpha$, triangulates the interior regions and thus clarifies the distinction between interior and exterior. Each restricted Voronoi cell is the intersection of the original Voronoi cell with the disk of its generating point. Since the cell and the disk are both convex, the restricted cell is convex and therefore contractible. The nerve theorem implies that $X$ and the $\alpha$-shape are homotopy equivalent. This fact is reflected in Fig. 3 where both spaces consist of 2 components, one with 4 holes and the other with 1 hole. The two spaces are not homeomorphic since $K_\alpha$ contains 3 isolated edges, which under any retraction are the pinched images of locally 2-dimensional regions in $X$. Each one of these edges violates condition (ii) of the closed ball property stated in Sect. 2. The fact that restricted Voronoi cells are open, rather than closed as required by the closed ball property, is a minor difficulty that can be remedied by taking slightly smaller closed cells. Bernardini and Bajaj [8] reconstruct 2-dimensional shapes, and they prove that under some density requirements $\alpha$-shapes correctly reconstruct uniformly sampled smooth curves.
Fig. 3. Each Voronoi cell in Fig. 1 is restricted to within the open disk of radius $\alpha$ centered at the generating point. The result is a subcomplex of the Delaunay complex in Fig. 2 that represents the shape at the resolution determined by $\alpha$.
Alpha shapes generalize to $\mathbb{R}^d$ by using open balls in the definition of the restricting space. Edelsbrunner and Mücke [21] discuss this construction in $\mathbb{R}^3$ and Bajaj, Bernardini and Xu [7] use it to reconstruct shapes and surfaces. The 3-dimensional case also has applications to computational biology where molecules are modeled as unions of spherical balls. Such models were introduced by Lee and Richards [28] in 1971 and are commonly used to assess spatial properties of molecules such as volume, surface area, connectivity, shape, etc. Edelsbrunner [17] generalizes alpha shapes to points with weights in order to model molecules made up of atoms of varying size. That paper also contains inclusion-exclusion formulas that compute the volume and surface area of a molecule directly from the $\alpha$-complex, without constructing the union of balls. These formulas have been used to measure molecules and their voids and pockets by Liang and collaborators, see e.g. [30].
Crust. In 1997, Amenta, Bern and Eppstein defined the crust of a set $S \subseteq \mathbb{R}^2$ as the subcomplex of $\mathrm{Del}\,(S \cup U)$ induced by $S$, where $U$ is the set of vertices of the Voronoi cells defined by $S$ [3]. In other words, a simplex in $\mathrm{Del}\,(S \cup U)$ belongs to the crust if all its vertices are points in $S$, see Fig. 4.
Fig. 4. Point set and crust are courtesy of Nina Amenta at the University of Texas in Austin. The crust reconstructs the goose from a collection of points sampled from the outline.
To reformulate the definition we consider the collection of Voronoi cells of $S \cup U$ and use the subset of cells generated by points in $S$ to restrict the Delaunay complex of $S$:
\[ X = \mathrm{int} \Big( \bigcup_{p \in S} V_p \Big). \]
Note that $X$ is the set of points closer to $S$ than to $U$, or equivalently it is the complement of the union of the Voronoi cells generated by points in $U$. The crust is the resulting restricted Delaunay complex: $C = \mathrm{Del}_X S$. No triangle in $\mathrm{Del}\, S$ can be in $C$ because the corresponding intersection of the three Voronoi cells is a point in $U$, which necessarily lies outside $X$.
The crust is suitable for the reconstruction of smooth boundary curves in the plane. The main result in [3] is a fairly modest condition on the sampling density under which the crust is guaranteed to reconstruct a smooth closed curve, $\gamma$. Define the medial axis of $\gamma$ as the set of points $y \in \mathbb{R}^2$ with two or more closest points on $\gamma$, and for a point $x \in \gamma$ let $f(x)$ be the distance to the medial axis. A finite set $S \subseteq \gamma$ is an $\varepsilon$-sample if every point $x \in \gamma$ is within distance $\varepsilon f(x)$ of some point in $S$. For $\varepsilon < 0.252$ the crust is guaranteed to contain an edge connecting points $p, q \in S$ iff they are contiguous along $\gamma$. This result justifies the definition of crust by the observation that the points in $U$ approximate the medial axis of $\gamma$.
It is straightforward to extend the definition of crust to 3 and higher dimensions. However, already in $\mathbb{R}^3$ the Voronoi vertices of points sampled on a smooth surface no longer approximate the medial axis of that surface. The source of the trouble are slivers, which are Delaunay tetrahedra whose 4 vertices are almost cocircular. The center of the circumsphere belongs to $U$ which implies that the sliver does not belong to the crust, but in many cases neither do the 4 triangles of the sliver. As a consequence, the crust develops holes or windows in the reconstructed surface. Amenta and Bern [2] cope with this difficulty by using only a subset of the points in $U$ for the restricting space.
A-shape. In 1997, Melkemi proposed a general family of shapes that includes $\alpha$-shapes and the crust as special cases [34]. Let $S$ be a finite set in $\mathbb{R}^2$. A member in this family is identified with the help of a second finite set $A \subseteq \mathbb{R}^2$. The A-shape of $S$ is generated by drawing an edge connecting points $p, q \in S$ if there is a circle that passes through $p$, $q$, and a point $a \in A$, and all other points of $S \cup A$ lie strictly outside the circle. The crust is the special case where $A = U$ is the collection of Voronoi vertices defined by $S$. The $\alpha$-shape is the special case where $A$ is the collection of points $a$ on Voronoi edges that span empty circles of radius $\alpha$ with points in $S$. The reformulation of the definition is similar to the crust. The restricting space is the set of points closer to $S$ than to $A$:
\[ X = \mathrm{int} \Big( \bigcup_{p \in S} V_p \Big), \]
and the A-shape is the boundary of the underlying space of the Delaunay complex restricted by $X$. The trouble with this definition is the high degree of freedom. $A$ can be anything and it is not clear how to construct sets that bring out the shape of $S$ best. To address this concern, Melkemi suggests a two-parameter family of point sets, $A = A(\alpha, t)$. The first parameter, $\alpha \ge 0$, controls the resolution and the second parameter, $t \in [0, 1]$, interpolates between the unweighted case and the case where points are weighted by the local density. To be specific, let $\rho(p)$ be the minimum Euclidean distance between $p \in S$ and any other point in $S$. For a given $t$, the weighted distance of a point $x \in \mathbb{R}^2$ from $p \in S$ is
\[ \pi_{t,p}(x) = \|x - p\|^2 - t^2 \rho(p)^2, \]
which is the square length of a tangent line segment from $x$ to the circle with center $p$ and radius $t \rho(p)$. A point $a \in \mathbb{R}^2$ belongs to $A(\alpha, t)$ if there are points $p, q \in S$ with $\alpha = \pi_{t,p}(a) = \pi_{t,q}(a) < \pi_{t,r}(a)$ for all $r \in S - \{p, q\}$. The family $A(\alpha, t)$ does in general not contain the particular sets that generate the $\alpha$-shapes, but it would be easy to define a similar family that does.
Wrap Complex. Commercial software produced at Raindrop Geomagic reconstructs the shape of a finite set $S \subseteq \mathbb{R}^3$ through an iteration that refines both the shape and the restricting space [37]. Let $D$ map $X \subseteq \mathbb{R}^d$ to the Delaunay complex restricted by $X$, and let $G$ map a subcomplex $K \subseteq \mathrm{Del}\, S$ to a topological subspace of $\mathbb{R}^d$. The composition maps a subcomplex of $\mathrm{Del}\, S$ to another such subcomplex, and we write $K \to L$ if $L = D(G(K))$. Edelsbrunner constructs $G$ so the relation is acyclic [18]. It follows that the maximal elements in the relation are fixed points of $D \circ G$. Another property implied by the special choice of $G$ is that the union of two fixed points is again a fixed point. This implies there exists a unique largest fixed point, which we call the Wrap complex of $S$, see Fig. 5. It can be obtained by iterating $D \circ G$ starting with $\mathrm{Del}\, S = D(\mathbb{R}^d)$:
\[ \mathrm{Del}\, S = X_0 \to X_1 \to \cdots \to X_j = X_{j+1}, \]
with $X_{i+1} = D(G(X_i))$. The software developed at Raindrop Geomagic offers the user convenient control that permits the transition between different fixed points, each representing a locally reasonable approximation of $S$.
Fig. 5. Wrap complex of a collection of points sampled on the surface of a monkey saddle.
4 Assuming the Restricting Space
This section discusses variants of the Delaunay approach to shape reconstruction that assume the restricting space is given, either implicitly or explicitly. The first two solutions make use of an oracle that returns a small bit of information about the restricting space. The third solution aims at reconstructing the restricted Delaunay complex without any information about the restricting space other than the sampled data points.
Neural Network. In 1994, Martinetz and Schulten designed what they initially called the neural gas algorithm [33]. It constructs a neural network modeled as a 1-dimensional complex of nodes and edges that approximates a target space $X \subseteq \mathbb{R}^d$. The approximation is achieved by sampling points from $X$ and using them to locally adjust the nodes and connect them with edges. For example, if $X$ is the state space of a dynamical system then each sample is a snapshot of that system during its evolution. The algorithm starts with a loose collection of nodes or points distributed in $\mathbb{R}^d$, see Fig. 6. For each sample $x \in X$, the positions of the nearby nodes are adjusted towards $x$ and the age of every edge is increased by one. Furthermore, if the two nodes $p$ and $q$ closest to $x$ are already connected by an edge, $pq$, then $x$ is interpreted as further justification of that edge and its age is set back to 0. If $pq$ is not yet in the network then it is now added with age 0. Edges whose age exceeds a certain threshold are removed from the network.
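The per-sample update just described can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm of [33] in full: it assumes that only the two closest nodes are moved, with hypothetical step sizes `rate` and `rate / 2`, and the names `present_sample`, `nodes`, `edges` and `max_age` are chosen for this sketch. Edges are stored in a dictionary that maps a sorted pair of node indices to the age of the edge.

```python
import math
import random

def present_sample(nodes, edges, x, rate=0.05, max_age=20):
    """Process one sample x: adapt nodes, age all edges, refresh the edge
    between the two nodes closest to x, and drop edges that are too old."""
    order = sorted(range(len(nodes)), key=lambda i: math.dist(nodes[i], x))
    p, q = order[0], order[1]
    # Assumed simplification: move only the two closest nodes towards x.
    for i, step in ((p, rate), (q, rate / 2.0)):
        nodes[i] = tuple(c + step * (xc - c) for c, xc in zip(nodes[i], x))
    for e in edges:
        edges[e] += 1                      # every edge grows older
    edges[(min(p, q), max(p, q))] = 0      # add pq or reset its age to 0
    for e in [e for e, age in edges.items() if age > max_age]:
        del edges[e]                       # remove stale edges

# Hypothetical run: samples drawn from the segment from (0, 0) to (1, 0).
rng = random.Random(0)
nodes = [(rng.random(), rng.random()) for _ in range(8)]
edges = {}
for _ in range(2000):
    present_sample(nodes, edges, (rng.random(), 0.0))
print(sorted(edges))
```

Because an edge is created or refreshed only when its two endpoints are the closest nodes to some sample, the surviving edges are the ones for which samples keep providing evidence, which is the connection to the restricted Delaunay complex drawn in the text.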
Fig. 6. The pictures are courtesy of Klaus Schulten from the University of Illinois at Urbana-Champaign. The target space consists of a 3-dimensional box, a 2-dimensional rectangle and a ring with line segment. The left shows the starting configuration and the right shows the ending configuration after 40,000 steps of the neural gas algorithm.
The connection to restricted Delaunay complexes arises from the fact that a sample point $x$ with closest nodes $p$ and $q$ can be interpreted as evidence that the Voronoi cells of $p$ and $q$ share a common $(d-1)$-dimensional face that has a non-empty intersection with $X$. It is therefore reasonable to expect that the neural gas algorithm approaches the 1-skeleton of the Delaunay complex of the nodes restricted by $X$. The simple strategy of connecting the $k + 1 = 2$ closest nodes does not extend to $k$-simplices for $k > 1$ because already the 3 closest nodes do not, in general, span a triangle in the Delaunay complex.
Surface Triangulation. In 1993, Chew defined the Delaunay triangulation of a finite set $S$ of points on a surface $X$ in $\mathbb{R}^3$ by modifying the Euclidean empty disk criterion [14]. Specifically, the triangle formed by points $p, q, r \in S$ belongs to the surface Delaunay triangulation if there is a sphere $K$ with center on $X$ so that $p$, $q$, $r$ lie on $K$ and all other points lie strictly outside $K$. Chew uses this definition in combination with a point placement mechanism to produce surface triangulations with good quality triangles, see Fig. 7. The connection to restricted Delaunay complexes is direct: the sphere $K$ exists iff the 3-dimensional Voronoi cells of $p$, $q$, and $r$ meet along an edge that has a non-empty common intersection with $X$. In other words, Chew's surface Delaunay triangulation is the same as the 3-dimensional Delaunay complex restricted by that surface. This suggests that the closed ball property of Sect. 2 be used as part of the point placement mechanism to guarantee the restricted complex is homeomorphic to the surface.
Fig. 7. The triangulation is courtesy of Paul Chew at Cornell University. The surface is part of an airplane wing and it is triangulated using the modified empty disk criterion.
Normalized Mesh. In 1997, Attali defined the normalized mesh of a finite set $S \subseteq X$ as a complex that approximates the space $X$ [4]. It contains the convex hull of a subset $T \subseteq S$ as a cell if there is a point $x \in X$ equally far from all points in $T$ and further from all other points: $\|x - p\| = \|x - q\| < \|x - r\|$ for all $p, q \in T$ and $r \in S - T$. Observe that for $X$ a surface in $\mathbb{R}^3$ this is the same as Chew's surface Delaunay triangulation. In general, the normalized mesh is the same as the Delaunay complex restricted by $X$.
The algorithmic problem tackled in [4] is the reconstruction of $X$ through the construction of the normalized mesh. The problem is made difficult by assuming that $X$ is not given at all, other than indirectly through the points in $S$. In two dimensions the strategy is to discriminate edges $pq \in \mathrm{Del}\, S$ by $\theta(pq)$ defined as the sum of angles opposite to $pq$ in the two incident triangles. If only one triangle exists, the other opposite angle is set to 0. Under some smoothness conditions it is possible to prove that the normalized mesh consists exactly of all edges with small value of $\theta$. Call $Y \subseteq \mathbb{R}^2$ an $R$-regular shape if the circle passing through any three boundary points has radius greater than $R$. We assume that $R$ is positive and $X$ is the boundary of $Y$. This implies that the curvature at any point $x \in X$ is smaller than $1/R$. A finite set $S \subseteq X$ is an $\varepsilon$-sample if every point $x \in X$ is within distance $\varepsilon R$ of some point in $S$. The main result in [4] states that if $\varepsilon < \sin(\pi/8)$ then the normalized mesh consists exactly of all edges $pq \in \mathrm{Del}\, S$ with $\theta(pq) < \pi/2$. For similar reasons as mentioned in the discussion of the crust, this result does not generalize to $\mathbb{R}^3$ and Attali presents heuristics that patch up the holes in the partially reconstructed surfaces.
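The quantity used to discriminate edges, the sum of the angles opposite to an edge in its incident triangles, is easy to compute once those triangles are known. The Python sketch below is an illustration only; the function names `opposite_angle` and `angle_sum` and the example coordinates are hypothetical, the apexes of the incident triangles are supplied by the caller, and any threshold is left to the caller as well.

```python
import math

def opposite_angle(p, q, r):
    """Angle at r in the triangle pqr, in radians (the angle opposite to pq)."""
    ax, ay = p[0] - r[0], p[1] - r[1]
    bx, by = q[0] - r[0], q[1] - r[1]
    return math.atan2(abs(ax * by - ay * bx), ax * bx + ay * by)

def angle_sum(p, q, apexes):
    """Sum of the angles opposite to pq over its one or two incident triangles.
    With a single apex the missing angle contributes 0, as in the text."""
    return sum(opposite_angle(p, q, r) for r in apexes)

p, q = (0.0, 0.0), (1.0, 0.0)
# Apexes far from the edge: small opposite angles, as for an edge along a
# densely sampled smooth curve.
print(angle_sum(p, q, [(0.5, 5.0), (0.5, -5.0)]))
# Apexes on the circle with diameter pq: each opposite angle is a right angle.
print(angle_sum(p, q, [(0.5, 0.5), (0.5, -0.5)]))
```

Short edges between neighboring samples see their opposite vertices from far away and receive a small sum, while edges that cut across the shape receive a large one, which is why a threshold on this value can separate the two kinds.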
5 Conclusions
This paper unifies algorithmic solutions to the shape reconstruction problem following the Delaunay approach by identifying a common underlying concept: the restriction of the Delaunay complex by a topological space. The unification succeeds in all cases known to the author at this time, except for the work of Boissonnat [9] who suggests to sculpt a shape from the Delaunay complex by removing simplices from outside in. This idea is closely related to the crust, the wrap complex, and the normalized mesh, but in its general form it is not guided by any restricting space.
The seven solutions to shape reconstruction surveyed in this paper are classified according to their treatment of the restricting space. The four methods in Sect. 3 construct the space from the data points, while the three methods in Sect. 4 assume the space is given and cannot be altered by the algorithm. Another useful classification criterion for shape reconstruction is the one-dimensional scale from narrow to wide. The surface reconstruction methods ought to be classified as narrow, and examples are the crust, the surface Delaunay triangulation, and the normalized mesh. The other four methods show no bias for any particular type of shape and ought to be classified as wide.
References

1. P. S. Alexandrov. Über den allgemeinen Dimensionsbegriff und seine Beziehungen zur elementaren geometrischen Anschauung. Math. Ann. 98 (1928), 617–635.
2. N. Amenta and M. Bern. Surface reconstruction by Voronoi filtering. Manuscript, 1998.
3. N. Amenta, M. Bern and D. Eppstein. The crust and the β-skeleton: combinatorial curve reconstruction. Graphical Models and Image Process., to appear.
4. D. Attali. r-regular shape reconstruction from unorganized points. In Proc. 13th Ann. Sympos. Comput. Geom., 1997, 248–253.
5. F. Aurenhammer. Voronoi diagrams: a study of a fundamental geometric data structure. ACM Comput. Surveys 23 (1991), 345–405.
6. D. Avis and J. Horton. Remarks on the sphere of influence graph. In Proc. Conf. Discrete Geom. Convexity, J. E. Goodman et al. (eds.), Ann. New York Acad. Sci. 440 (1985), 323–327.
7. C. L. Bajaj, F. Bernardini and G. Xu. Automatic reconstruction of surfaces and scalar fields from 3D scans. Comput. Graphics, Proc. SIGGRAPH 1995, 109–118.
8. F. Bernardini and C. L. Bajaj. Sampling and reconstructing manifolds using α-shapes. In Proc. 9th Canadian Conf. Comput. Geom., 1997, 193–198.
9. J.-D. Boissonnat. Geometric structures for three-dimensional shape representation. ACM Trans. Graphics 3 (1984), 266–286.
10. J.-D. Boissonnat and B. Geiger. Three-dimensional reconstruction of complex shapes based on the Delaunay triangulation. In Proc. Biomedical Image Process. Biomed. Visualization, 1993, 964–975.
11. J. Brandt and V. R. Algazi. Continuous skeleton computation by Voronoi diagram. Comput. Vision, Graphics, Image Process. 55 (1992), 329–338.
12. C. Bregler and S. M. Omohundro. Nonlinear manifold learning for visual speech recognition. In Proc. 5th Internat. Conf. Comput. Vision, 1995, 494–499.
13. M. Burns. Automated Fabrication. Improving Productivity in Manufacturing. Prentice Hall, Englewood Cliffs, New Jersey, 1993.
14. L. P. Chew. Guaranteed-quality mesh generation for curved surfaces. In Proc. 9th Ann. Sympos. Comput. Geom., 1993, 274–280.
15. B. Curless and M. Levoy. A volumetric method for building complex models from range images. Comput. Graphics, Proc. SIGGRAPH 1996, 303–312.
16. W. D'Arcy Thompson. Growth and Form. Cambridge Univ. Press, 1917.
17. H. Edelsbrunner. The union of balls and its dual shape. Discrete Comput. Geom. 13 (1995), 415–440.
18. H. Edelsbrunner. Surface reconstruction by wrapping finite sets in space. Rept. rgi-tech-96-001, Raindrop Geomagic, Urbana, Illinois, 1996.
19. H. Edelsbrunner, D. G. Kirkpatrick and R. Seidel. On the shape of a set of points in the plane. IEEE Trans. Inform. Theory 29 (1983), 551–559.
20. H. Edelsbrunner and E. P. Mücke. Simulation of simplicity: a technique to cope with degenerate cases in geometric algorithms. ACM Trans. Graphics 9 (1990), 66–104.
21. H. Edelsbrunner and E. P. Mücke. Three-dimensional alpha shapes. ACM Trans. Graphics 13 (1994), 43–72.
22. H. Edelsbrunner, G. Rote and E. Welzl. Testing the necklace condition for shortest tours and optimal factors in the plane. Theoret. Comput. Sci. 66 (1989), 157–180.
23. H. Edelsbrunner and N. R. Shah. Triangulating topological spaces. Internat. J. Comput. Geom. Appl. 7 (1997), 365–378.
24. A. Flores. Über die Existenz n-dimensionaler Komplexe, die nicht in den $R_{2n}$ topologisch einbettbar sind. Ergeb. Math. Kolloq. 5 (1932/33), 17–24.
25. H. Fuchs, Z. M. Kedem and S. P. Uselton. Optimal surface reconstruction from planar contours. Commun. ACM 20 (1977), 693–702.
26. H. Hoppe, T. de Rose, T. Duchamp, J. McDonald, and W. Stützle. Surface reconstruction from unorganized points. Comput. Graphics, Proc. SIGGRAPH 1992, 71–78.
27. D. G. Kirkpatrick and J. D. Radke. A framework for computational morphology. In Computational Morphology, G. Toussaint (ed.), Elsevier (1985), 217–248.
28. B. Lee and F. M. Richards. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55 (1971), 379–400.
29. J. Leray. Sur la forme des espaces topologiques et sur les points fixes des representations. J. Math. Pures Appl. 24 (1945), 95–167.
30. J. Liang, H. Edelsbrunner and C. Woodward. Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand design. Manuscript, 1997.
31. S. K. Lodha and R. Franke. Scattered data techniques for surfaces. In Scientific Visualization: Methods and Applications, G. M. Nielson, H. Hagen and F. Post (eds.), Springer-Verlag, Heidelberg, to appear.
32. W. E. Lorensen and H. E. Cline. Marching cubes: a high resolution 3D surface construction algorithm. Comput. Graphics 21, Proc. SIGGRAPH 1987, 163–169.
33. T. Martinetz and K. Schulten. Topology representing networks. Neural Networks 7 (1994), 507–522.
34. M. Melkemi. A-shapes of a finite point set. Correspondence in Proc. 13th Ann. Sympos. Comput. Geom., 1997, 367–369.
35. D. Meyers, S. Skinner and K. Sloan. Surfaces from contours. ACM Trans. Graphics 11 (1992), 228–258.
36. Y.-L. O, A. Toet, D. Foster, H. J. A. M. Heijmans and P. Meer (eds.) Shape in Picture. Mathematical Description of Shape in Grey-level Images. NATO ASI Series F: Computer and Systems Sciences 126, Springer-Verlag, Berlin, 1994.
37. Raindrop Geomagic, Inc. www.geomagic.com.
38. G. P. Robinson, A. C. F. Colchester, L. D. Griffin and D. J. Hawkes. Integrated skeleton and boundary shape representation for medical image interpretation. In Proc. European Conf. Comput. Vision, 1992, 725–729.
39. R. Thom. Structural Stability and Morphogenesis. Addison Wesley, Reading, Massachusetts, 1989.
40. E. R. van Kampen. Komplexe in euklidischen Räumen. Abh. Math. Sem. Univ. Hamburg 9 (1933), 72–78.
41. R. C. Veltkamp. Closed Object Boundaries from Scattered Points. Springer-Verlag, Berlin, 1994.
Bases for Non-homogeneous Polynomial $C^k$ Splines on the Sphere

Anamaria Gomide and Jorge Stolfi

Institute of Computing, University of Campinas
Caixa Postal 6176, 13081-970 Campinas SP, Brazil
{anamaria,stolfi}@dcc.unicamp.br
Abstract. We investigate the use of non-homogeneous spherical polynomials for the approximation of functions defined on the sphere $\mathbb{S}^2$. A spherical polynomial is the restriction to $\mathbb{S}^2$ of a polynomial in the three coordinates x, y, z of $\mathbb{R}^3$. Let $\mathcal{P}^d$ be the space of spherical polynomials with degree at most d. We show that $\mathcal{P}^d$ is the direct sum of $\mathcal{H}^d$ and $\mathcal{H}^{d-1}$, where $\mathcal{H}^d$ denotes the space of homogeneous degree-d polynomials in x, y, z. We also generalize this result to splines defined on a geodesic triangulation T of the sphere. Let $\mathcal{P}^d_k[T]$ denote the space of all functions f from $\mathbb{S}^2$ to $\mathbb{R}$ such that (1) the restriction of f to each triangle of T belongs to $\mathcal{P}^d$; and (2) the function f has order-k continuity across the edges of T. Analogously, let $\mathcal{H}^d_k[T]$ denote the subspace of $\mathcal{P}^d_k[T]$ consisting of those functions that are in $\mathcal{H}^d$ within each triangle of T. We show that $\mathcal{P}^d_k[T] = \mathcal{H}^d_k[T] \oplus \mathcal{H}^{d-1}_k[T]$. Combined with results of Alfeld, Neamtu and Schumaker on bases of $\mathcal{H}^d_k[T]$, this decomposition provides an effective construction for a basis of $\mathcal{P}^d_k[T]$. There has been considerable interest recently in the use of the homogeneous spherical splines $\mathcal{H}^d_k[T]$ as approximations for functions defined on $\mathbb{S}^2$. We argue that the non-homogeneous splines $\mathcal{P}^d_k[T]$ would be a more natural choice for that purpose.
1 Introduction

Spherical functions, that is, real functions defined on the sphere $\mathbb{S}^2$, arise in many applications: geophysics, meteorology, computer graphics, robotics, lighting, acoustics, etc. In such applications, one usually represents a point of the sphere by its spherical coordinates, the longitude and the latitude. The function is typically modeled as (or approximated by) a polynomial in the longitude and the latitude, or a sum of spherical harmonics up to a certain order. However, the latitude-longitude representation has some drawbacks. Spherical coordinates have an essential discontinuity at the poles; the geodesic lines have complicated descriptions in terms of longitude and latitude; a regular latitude-longitude grid covers the sphere with non-uniform resolution; and so on. These problems are particularly annoying when implementing advanced numerical methods, such as multiscale integration and approximation by irregular adaptive meshes. As a consequence, there has been a lot of interest recently in the modeling of spherical functions in situ, that is, as functions defined in terms of the three Cartesian coordinates (x, y, z) of $\mathbb{R}^3$, which are then restricted to the sphere.

Recently, Alfeld, Neamtu and Schumaker [1,2,3] have proposed the use of a special class of functions, the homogeneous spherical splines $\mathcal{H}^d_k[T]$, as an approximation space for functions defined on $\mathbb{S}^2$. Here we define an alternative class, the polynomial spherical splines $\mathcal{P}^d_k[T]$. We show that $\mathcal{P}^d_k[T] = \mathcal{H}^d_k[T] \oplus \mathcal{H}^{d-1}_k[T]$, which allows us to obtain a characterization for the bases of the space $\mathcal{P}^d_k[T]$. We argue that $\mathcal{P}^d_k[T]$ is a more natural choice for approximating spherical functions.

This research was partly funded by CNPq grant 301016/92-5.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 133–140, 1998. © Springer-Verlag Berlin Heidelberg 1998
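2 Polynomial Functions on $\mathbb{R}^n$

Let $\mathcal{P}^{d,n}$ be the space of polynomials on n variables of degree at most d, viewed as functions from $\mathbb{R}^n$ to $\mathbb{R}$. A function p belongs to $\mathcal{P}^{d,n}$ if and only if it can be written in the form

$$p(x) = \sum_{0 \le i_1 + i_2 + \cdots + i_n \le d} c_{i_1 i_2 \ldots i_n}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$$

for all $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, where the $c_{i_1 i_2 \ldots i_n}$ are real coefficients. (All indices in this paper are non-negative integers.) The set $\mathcal{P}^{d,n}$ is obviously a vector space, of dimension

$$\dim \mathcal{P}^{d,n} = \binom{d+n}{n}.$$

We say that a function f defined on $\mathbb{R}^n$ is homogeneous of degree d if $f(ax) = a^d f(x)$, for all $a \in \mathbb{R}$ and all $x \in \mathbb{R}^n$. Let $\mathcal{H}^{d,n}$ be the space of the polynomials on $\mathbb{R}^n$ of degree d which are homogeneous of degree d. Obviously $\mathcal{H}^{d,n}$ is a subspace of $\mathcal{P}^{d,n}$. A function h belongs to $\mathcal{H}^{d,n}$ if and only if it can be written in the form

$$h(x) = \sum_{i_1 + i_2 + \cdots + i_n = d} c_{i_1 i_2 \ldots i_n}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$$

for all $x = (x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$, where the $c_{i_1 i_2 \ldots i_n}$ are real coefficients. It follows that

$$\dim \mathcal{H}^{d,n} = \binom{d+n-1}{n-1}.$$

It is easy to see that, if $d \ne d'$, the spaces $\mathcal{H}^{d,n}$ and $\mathcal{H}^{d',n}$ are linearly independent; that is, $\mathcal{H}^{d,n} \cap \mathcal{H}^{d',n} = \{0\}$.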
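The two dimension formulas can be checked by counting monomials directly. The following Python sketch is only an illustration; the helper names `monomials`, `dim_P` and `dim_H` are ours and do not come from the paper.

```python
from itertools import product
from math import comb


def monomials(n, d, homogeneous=False):
    """Exponent tuples (i1, ..., in) with i1 + ... + in <= d, or == d if homogeneous."""
    return [e for e in product(range(d + 1), repeat=n)
            if (sum(e) == d if homogeneous else sum(e) <= d)]


def dim_P(d, n):
    """Dimension of the space of polynomials of degree at most d in n variables."""
    return comb(d + n, n)


def dim_H(d, n):
    """Dimension of the space of homogeneous polynomials of degree d in n variables."""
    return comb(d + n - 1, n - 1)
```

Since the homogeneous spaces of different degrees are linearly independent, `dim_P(d, n)` is also the sum of `dim_H(k, n)` for k from 0 to d.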
3 Polynomial Functions on $\mathbb{S}^2$
If a function f is defined on $\mathbb{R}^n$, and $X \subseteq \mathbb{R}^n$, we denote by $f|X$ the restriction of f to the set X. By extension, we define the restriction of a function space $\mathcal{F}$ to the set X as

$$\mathcal{F}|X = \{\, f|X : f \in \mathcal{F} \,\}.$$

We will use the following notation: if $f|X = g|X$, we write $f \equiv g \pmod{X}$; or just $f \equiv g$, when X is implicit in the context. It is obvious that '$\equiv$' is an equivalence relation.

We are interested in the space $\mathcal{P}^{d,n}|\mathbb{S}^{n-1}$, consisting of the polynomial functions on $\mathbb{R}^n$ of degree at most d, restricted to the sphere $\mathbb{S}^{n-1} = \{\, x \in \mathbb{R}^n : \|x\| = 1 \,\}$. Observe that polynomials which are distinct in $\mathbb{R}^n$ can be identical when restricted to the sphere $\mathbb{S}^{n-1}$. Therefore, the dimension of $\mathcal{P}^{d,n}|\mathbb{S}^{n-1}$ is smaller than that of $\mathcal{P}^{d,n}$. The following Theorem is fundamental for the characterization of $\mathcal{P}^{d,n}|\mathbb{S}^{n-1}$:

Theorem 1. Every polynomial in $\mathcal{P}^{d,n}$, $n \ge 1$, is equivalent (modulo $\mathbb{S}^{n-1}$) to a unique polynomial in $\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}$.

Proof. Existence. Let p be a polynomial in $\mathcal{P}^{d,n}$. We will show that all terms of p with degree at most $d-2$ can be replaced by terms of degree d and $d-1$. Let $c_{i_1 i_2 \ldots i_n}\, x_1^{i_1} x_2^{i_2} \cdots x_n^{i_n}$ be a term of p such that $i_1 + i_2 + \cdots + i_n = k \le d-2$. Since x is a point of $\mathbb{S}^{n-1}$ we have $x_1^2 + x_2^2 + \cdots + x_n^2 = 1$. Then,

$$c_{i_1 \ldots i_n}\, x_1^{i_1} \cdots x_n^{i_n} = c_{i_1 \ldots i_n}\, x_1^{i_1} \cdots x_n^{i_n}\,(x_1^2 + x_2^2 + \cdots + x_n^2).$$

Thus, a term of degree $k \le d-2$ can be replaced by n terms of degree $k+2 \le d$, while maintaining equivalence modulo $\mathbb{S}^{n-1}$. If we keep applying this substitution until there are no more terms of degree at most $d-2$, we obtain a polynomial q with terms of degree d and $d-1$. Then q is a polynomial in $\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}$ such that $p \equiv q \pmod{\mathbb{S}^{n-1}}$.

Unicity. Let p be a polynomial in $\mathcal{P}^{d,n}$. Suppose that there exist $q_1, q_2 \in \mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}$ with $q_1 \equiv p$ and $q_2 \equiv p$. Since $\equiv$ is transitive, we must have $q_1 \equiv q_2$, i.e. $(q_1 - q_2)|\mathbb{S}^{n-1} = 0$. Since $\mathbb{S}^{n-1}$ is an algebraic variety [5,4], the minimal equation of $\mathbb{S}^{n-1}$ must be a factor of $q_1 - q_2$, that is,

$$q_1 - q_2 = R\,(x_1^2 + x_2^2 + \cdots + x_n^2 - 1)$$
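where R is some polynomial on $\mathbb{R}^n$ of degree at most $d-2$. Since the polynomial $x_1^2 + x_2^2 + \cdots + x_n^2 - 1$ has terms whose degrees differ by 2, and, on the other hand,

$$q_1 - q_2 \in \mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n},$$

we can conclude that $R = 0$, i.e. $q_1 = q_2$. Thus, there is at most one polynomial in $\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}$ that is equivalent to p modulo $\mathbb{S}^{n-1}$.

Corollary 1. $\mathcal{H}^{d,n}|\mathbb{S}^{n-1} \subseteq \mathcal{H}^{d+2,n}|\mathbb{S}^{n-1}$.

Corollary 2. If $p, q \in \mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}$ and $p \equiv q \pmod{\mathbb{S}^{n-1}}$, then $p = q$.

Corollary 3. $\mathcal{H}^{d-1,n}|\mathbb{S}^{n-1} \cap \mathcal{H}^{d,n}|\mathbb{S}^{n-1} = \{0\}$.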
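The substitution used in the existence part of the proof of Theorem 1 is effectively an algorithm. A minimal Python sketch follows, assuming a polynomial is stored as a dictionary mapping exponent tuples to coefficients. This representation and the names `reduce_on_sphere` and `evaluate` are our own choices for illustration, not part of the paper.

```python
def reduce_on_sphere(poly, d):
    """Rewrite `poly` (degree <= d) using only terms of degree d and d-1.

    Each term of degree k <= d-2 is multiplied by x1^2 + ... + xn^2, which
    equals 1 on the unit sphere, until its degree reaches d-1 or d.
    """
    if not poly:
        return {}
    n = len(next(iter(poly)))
    work = dict(poly)
    result = {}
    while work:
        exps, coeff = work.popitem()
        k = sum(exps)
        if k > d:
            raise ValueError("polynomial has a term of degree greater than d")
        if k >= d - 1:
            result[exps] = result.get(exps, 0) + coeff
        else:
            # Replace the term by n terms of degree k + 2.
            for j in range(n):
                raised = exps[:j] + (exps[j] + 2,) + exps[j + 1:]
                work[raised] = work.get(raised, 0) + coeff
    return {e: c for e, c in result.items() if c != 0}


def evaluate(poly, x):
    """Evaluate a dictionary-encoded polynomial at the point x."""
    total = 0.0
    for exps, coeff in poly.items():
        term = coeff
        for xi, e in zip(x, exps):
            term *= xi ** e
        total += term
    return total
```

For example, with d = 3 the polynomial 1 + x + 2yz becomes (x² + y² + z²) + x(x² + y² + z²) + 2yz, which agrees with the original at every point of the unit sphere.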
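As a consequence of Thm. 1,

$$\mathcal{P}^{d,n}|\mathbb{S}^{n-1} = (\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n})|\mathbb{S}^{n-1} = \mathcal{H}^{d-1,n}|\mathbb{S}^{n-1} \oplus \mathcal{H}^{d,n}|\mathbb{S}^{n-1}.$$

It follows that

$$\dim\bigl((\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n})|\mathbb{S}^{n-1}\bigr) = \dim(\mathcal{H}^{d-1,n} \oplus \mathcal{H}^{d,n}) = \binom{d+n-2}{n-1} + \binom{d+n-1}{n-1}.$$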
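As a quick sanity check, the dimension of the restricted space can be computed by adding the dimensions of the two homogeneous summands, using the formula for $\dim \mathcal{H}^{d,n}$ from Sect. 2. The sketch below uses function names of our own. For n = 3 the result is $(d+1)^2$, the familiar dimension of the spherical harmonics of order at most d.

```python
from math import comb


def dim_H(d, n):
    """Dimension of the homogeneous polynomials of degree d in n variables (0 if d < 0)."""
    return comb(d + n - 1, n - 1) if d >= 0 else 0


def dim_sphere_poly(d, n):
    """Dimension of the degree-<=d polynomials restricted to the unit sphere of R^n."""
    return dim_H(d - 1, n) + dim_H(d, n)
```

The same number is obtained as the dimension of $\mathcal{P}^{d,n}$ minus that of $\mathcal{P}^{d-2,n}$, which matches the fact that the ambiguity in the representation is a multiple of $x_1^2 + \cdots + x_n^2 - 1$ by a polynomial of degree at most $d-2$.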
4 Derivatives of Spherical Polynomials
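If f is a function from $\mathbb{S}^{n-1}$ to $\mathbb{R}$, we denote by $\nabla f$ its gradient with respect to directions tangent to $\mathbb{S}^{n-1}$. For this paper, we can define $\nabla f$ as a vector of $\mathbb{R}^n$, tangent to $\mathbb{S}^{n-1}$, such that the derivative of f at a point $u \in \mathbb{S}^{n-1}$, in the direction of a unit vector v tangent at u, is $v \cdot (\nabla f(u))$. If $f = F|\mathbb{S}^{n-1}$ for some differentiable function F from $\mathbb{R}^n$ to $\mathbb{R}$, then $\nabla f$ is merely the projection of $\nabla F$ (the ordinary gradient of F) onto the sphere. That is, for any point $u \in \mathbb{S}^{n-1}$,

$$(\nabla f)(u) = \bigl((\nabla F)(u) - ((\nabla F)(u) \cdot u)\,u\bigr)\big|\mathbb{S}^{n-1} \qquad (1)$$

Let us denote by $[v]_j$ the j-th component of a vector $v \in \mathbb{R}^n$.

Theorem 2. If f belongs to $\mathcal{H}^{d,n}|\mathbb{S}^{n-1}$, then $[\nabla f]_j \in \mathcal{H}^{d+1,n}|\mathbb{S}^{n-1}$ for each $j \in \{1, \ldots, n\}$.

Proof. If f belongs to $\mathcal{H}^{d,n}|\mathbb{S}^{n-1}$, for $d \ge 1$, then $f = F|\mathbb{S}^{n-1}$ for some $F \in \mathcal{H}^{d,n}$. It is easy to check that $[\nabla F]_j = \partial F / \partial x_j$ is in $\mathcal{H}^{d-1,n}$. Therefore, the right-hand side of formula (1) lies in $\mathcal{H}^{d-1,n}|\mathbb{S}^{n-1} + \mathcal{H}^{d+1,n}|\mathbb{S}^{n-1}$. By Corollary 1, $\mathcal{H}^{d-1,n}|\mathbb{S}^{n-1}$ is actually a subspace of $\mathcal{H}^{d+1,n}|\mathbb{S}^{n-1}$.

Corollary 4. If f belongs to $\mathcal{P}^{d,n}|\mathbb{S}^{n-1}$, then $[\nabla f]_j \in \mathcal{P}^{d+1,n}|\mathbb{S}^{n-1}$, for each j.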
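A small worked instance of formula (1) may help. The example is our own choice, not taken from the paper: take n = 3 and $F(x,y,z) = x$, so that d = 1 and u = (x, y, z) is a point of the unit sphere.

```latex
\nabla F = (1, 0, 0), \qquad (\nabla F \cdot u)\,u = x\,(x, y, z) = (x^2,\; xy,\; xz),
\\[4pt]
(\nabla f)(u) = (1 - x^2,\; -xy,\; -xz) \equiv (y^2 + z^2,\; -xy,\; -xz) \pmod{\mathbb{S}^2}.
```

The last step uses $x^2 + y^2 + z^2 = 1$ on the sphere. Each component is thus the restriction of a homogeneous polynomial of degree 2 = d + 1, in agreement with Theorem 2.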
5 Piecewise Polynomial Functions on $\mathbb{S}^2$
We now proceed to extend Thm. 1 to the piecewise polynomial functions defined on the sphere $\mathbb{S}^2$. Let T be a decomposition of $\mathbb{R}^3$ into trihedra $T_1, T_2, \ldots, T_n$, with non-empty interiors and a common vertex at the origin of $\mathbb{R}^3$. Observe that the set $T_i \cap \mathbb{S}^2$, for $1 \le i \le n$, is a spherical triangle, whose sides are arcs of great circles. Therefore, T determines a triangulation of $\mathbb{S}^2$, which we denote by $T|\mathbb{S}^2$. For such a trihedral decomposition T, we define the following function spaces from $\mathbb{R}^3$ to $\mathbb{R}$:

$$\mathcal{P}^d[T] = \{\, p : (\forall i)\; p|T_i \in \mathcal{P}^{d,3}|T_i \,\}$$
$$\mathcal{H}^d[T] = \{\, h : (\forall i)\; h|T_i \in \mathcal{H}^{d,3}|T_i \,\}$$

By Thm. 1, we can immediately conclude that

$$\mathcal{P}^d[T]|\mathbb{S}^2 = \mathcal{H}^{d-1}[T]|\mathbb{S}^2 + \mathcal{H}^d[T]|\mathbb{S}^2.$$

Furthermore, we can show the following:

Theorem 3. If p, q are functions in $\mathcal{H}^{d-1}[T] + \mathcal{H}^d[T]$, then $p \equiv q \pmod{\mathbb{S}^2}$ if and only if $p = q$.

Proof. Let $p, q \in \mathcal{H}^{d-1}[T] + \mathcal{H}^d[T]$. For $1 \le i \le n$, let $p_i, q_i$ be functions of $\mathcal{H}^{d-1,3} + \mathcal{H}^{d,3}$ such that $p|T_i = p_i|T_i$ and $q|T_i = q_i|T_i$. Then $p \equiv q$ means that $p_i|(\mathbb{S}^2 \cap T_i) = q_i|(\mathbb{S}^2 \cap T_i)$, and, therefore, $(p_i - q_i)|(\mathbb{S}^2 \cap T_i) = 0$. Since $T_i \cap \mathbb{S}^2$ is a 2-dimensional subset of $\mathbb{S}^2$, and $\mathbb{S}^2$ is an irreducible algebraic variety in $\mathbb{R}^3$, we can conclude that $(p_i - q_i)|\mathbb{S}^2 = 0$. According to Corollary 2, $p_i - q_i = 0$. Since this equality holds for all trihedra $T_i$, we conclude that $p = q$.

This Theorem has the following consequences:

Corollary 5. $\mathcal{H}^{d-1}[T]|\mathbb{S}^2 \cap \mathcal{H}^d[T]|\mathbb{S}^2 = \{0|\mathbb{S}^2\}$.

Corollary 6. $\mathcal{P}^d[T]|\mathbb{S}^2 = \mathcal{H}^{d-1}[T]|\mathbb{S}^2 \oplus \mathcal{H}^d[T]|\mathbb{S}^2$.

6 Continuity Constraints
Finally, we extend Corollary 6 to piecewise polynomial functions, subject to continuity restrictions across the edges of the triangulation. We say that a function from $\mathbb{S}^{n-1}$ to $\mathbb{R}$ is continuous to order zero if it is continuous in the ordinary sense; and is continuous to order k, for $k > 0$, if it is continuous, and each component of its spherical gradient is continuous to order $k-1$. We denote by $C^k(\mathbb{S}^2)$ the set of all functions from $\mathbb{S}^2$ to $\mathbb{R}$ that are continuous to order k.
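For a decomposition T of $\mathbb{R}^3$ into non-degenerate trihedra, as in Sect. 5, we define the function spaces

$$\mathcal{P}^d_k[T] = \{\, p : p \in \mathcal{P}^d[T] \;\wedge\; p|\mathbb{S}^2 \in C^k(\mathbb{S}^2) \,\}$$
$$\mathcal{H}^d_k[T] = \{\, h : h \in \mathcal{H}^d[T] \;\wedge\; h|\mathbb{S}^2 \in C^k(\mathbb{S}^2) \,\}$$

Our goal is to show that $\mathcal{P}^d_k[T]|\mathbb{S}^2$ is the direct sum of $\mathcal{H}^{d-1}_k[T]|\mathbb{S}^2$ and $\mathcal{H}^d_k[T]|\mathbb{S}^2$. In other words, imposing kth-order continuity on $\mathcal{P}^d[T]|\mathbb{S}^2$ is equivalent to independently imposing kth-order continuity on each of the two subspaces $\mathcal{H}^{d-1}[T]|\mathbb{S}^2$ and $\mathcal{H}^d[T]|\mathbb{S}^2$. Let us consider first the case k = 0:

Theorem 4. $\mathcal{P}^d_0[T]|\mathbb{S}^2 = \mathcal{H}^{d-1}_0[T]|\mathbb{S}^2 \oplus \mathcal{H}^d_0[T]|\mathbb{S}^2$.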
Proof. ($\supseteq$): Trivial. ($\subseteq$): Let p be a function in $\mathcal{P}^d_0[T]|\mathbb{S}^2$. Let $T_i$ and $T_j$ be adjacent trihedra from T, and let $p_i$ and $p_j$ be functions of $\mathcal{P}^{d,3}$ such that $p|T_i = p_i|T_i$ and $p|T_j = p_j|T_j$. By Corollary 6,

$$p = h^d + h^{d-1}$$

with $h^d \in \mathcal{H}^d[T]|\mathbb{S}^2$ and $h^{d-1} \in \mathcal{H}^{d-1}[T]|\mathbb{S}^2$. (Here 'd' is an index and not an exponent.) Since $p|T_i = p_i|T_i$, $p_i$ can be written as

$$p_i = h^d_i + h^{d-1}_i$$

where $h^d_i|T_i \in \mathcal{H}^{d,3}|T_i$ and $h^{d-1}_i|T_i \in \mathcal{H}^{d-1,3}|T_i$. Analogously,

$$p_j = h^d_j + h^{d-1}_j$$

where $h^{d-1}_j|T_j \in \mathcal{H}^{d-1,3}|T_j$ and $h^d_j|T_j \in \mathcal{H}^{d,3}|T_j$.

Let w be the arc shared by the spherical triangles $T_i \cap \mathbb{S}^2$ and $T_j \cap \mathbb{S}^2$, and let c be the circle which contains the arc w. Since $p \in C^0(\mathbb{S}^2)$, we have $p|w = p_i|w = p_j|w$, and therefore $(p_i - p_j)|w = 0$. Given that w is a one-dimensional subset of c, and c is an irreducible variety, we conclude that $(p_i - p_j)|c = 0$.

We can assume, without loss of generality, that c is the circle $x^2 + y^2 = 1$, contained in the plane with equation z = 0. Since $(p_i - p_j) \in \mathcal{P}^{d,3}$ and $(p_i - p_j)|c = 0$, we can, by Corollary 5, conclude that

$$(h^d_i - h^d_j)|c = 0$$
$$(h^{d-1}_i - h^{d-1}_j)|c = 0$$

Therefore,

$$h^d_i|c = h^d_j|c$$
$$h^{d-1}_i|c = h^{d-1}_j|c$$

We conclude that $h^d \in \mathcal{H}^d_0[T]$ and $h^{d-1} \in \mathcal{H}^{d-1}_0[T]$.
Let us now prove the general case:

Theorem 5. For any $k \ge 0$, $\mathcal{P}^d_k[T]|\mathbb{S}^2 = \mathcal{H}^{d-1}_k[T]|\mathbb{S}^2 \oplus \mathcal{H}^d_k[T]|\mathbb{S}^2$.
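Proof. ($\supseteq$): Trivial. ($\subseteq$): We prove this part by induction on k. The case k = 0 is Thm. 4, so let us assume $k > 0$. Let p be a function in $\mathcal{P}^d_k[T]|\mathbb{S}^2$. By definition, p is continuous, and $\nabla p$ is continuous of order $k-1$. By Corollary 4, $[\nabla p]_j$ belongs to $\mathcal{P}^{d+1}[T]|\mathbb{S}^2$, and therefore to $\mathcal{P}^{d+1}_{k-1}[T]|\mathbb{S}^2$. By induction,

$$[\nabla p]_j \in \mathcal{H}^{d}_{k-1}[T]|\mathbb{S}^2 \oplus \mathcal{H}^{d+1}_{k-1}[T]|\mathbb{S}^2 \qquad (2)$$

On the other hand, by Thm. 1, $p = h^d + h^{d-1}$, where $h^d \in \mathcal{H}^d[T]|\mathbb{S}^2$ and $h^{d-1} \in \mathcal{H}^{d-1}[T]|\mathbb{S}^2$. In the interior of each triangle of T, the spherical gradient of p is then

$$\nabla p = \nabla h^d + \nabla h^{d-1}$$

By Thm. 2,

$$[\nabla h^d]_j \in \mathcal{H}^{d+1}[T]|\mathbb{S}^2 \qquad (3)$$
$$[\nabla h^{d-1}]_j \in \mathcal{H}^{d}[T]|\mathbb{S}^2 \qquad (4)$$

Comparing equation (2) with equations (3) and (4), we conclude that

$$[\nabla h^d]_j \in \mathcal{H}^{d+1}_{k-1}[T]|\mathbb{S}^2 \qquad (5)$$
$$[\nabla h^{d-1}]_j \in \mathcal{H}^{d}_{k-1}[T]|\mathbb{S}^2 \qquad (6)$$

Since p is continuous, Thm. 4 implies that $h^d$ and $h^{d-1}$ are continuous, too. With equations (5-6), we conclude that $h^d \in \mathcal{H}^d_k[T]$ and $h^{d-1} \in \mathcal{H}^{d-1}_k[T]$.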
7 Bases for Spherical Splines
The results above show that $\mathcal{P}^d_k[T]|\mathbb{S}^2$, the space of piecewise polynomial functions restricted to the sphere with order-k continuity, is the direct sum of the spaces $\mathcal{H}^d_k[T]$ and $\mathcal{H}^{d-1}_k[T]$, restricted to $\mathbb{S}^2$. Alfeld, Neamtu and Schumaker [1,2,3] have recently obtained an explicit basis for the space $\mathcal{H}^d_k[T]$, in terms of Bernstein-Bezier polynomials. In view of Thm. 5, their construction also gives a basis for $\mathcal{P}^d_k[T]|\mathbb{S}^2$.
8 Conclusion
We believe that the space $\mathcal{P}^d_k[T]|\mathbb{S}^2$ is a better choice than $\mathcal{H}^d_k[T]|\mathbb{S}^2$ for function approximation on the sphere. For one thing, $\mathcal{P}^r_k[T] \subseteq \mathcal{P}^d_k[T]$ when $r \le d$, while $\mathcal{H}^r_k[T] \subseteq \mathcal{H}^d_k[T]$ only when $d - r$ is even. In particular, $\mathcal{P}^d_k[T]$ includes the functions which are constant on $\mathbb{S}^2$, for all d; whereas $\mathcal{H}^d_k[T]$ only contains such functions when d is even.
References

1. Peter Alfeld, Marian Neamtu, and Larry L. Schumaker. Bernstein-Bezier polynomials on circle, spheres, and sphere-like surfaces. Computer Aided Geometric Design Journal, 13:333–349, 1996.
2. Peter Alfeld, Marian Neamtu, and Larry L. Schumaker. Dimension and local bases of homogeneous spline spaces. SIAM Journal of Mathematical Analysis, 27(5):1482–1501, September 1996.
3. Peter Alfeld, Marian Neamtu, and Larry L. Schumaker. Fitting scattered data on sphere-like surfaces using spherical splines. Journal of Computational and Applied Mathematics, 73:5–43, 1996.
4. W. Fulton. Algebraic Curves: An Introduction to Algebraic Geometry. W. A. Benjamin, 1969.
5. E. Kunz. Introduction to Commutative Algebra and Algebraic Geometry. Birkhauser, 1993.
The Splitting Number of the 4-Cube*

Luerbio Faria$^{1,3}$, Celina Miraglia Herrera de Figueiredo$^{2,3}$, and Candido Ferreira Xavier de Mendonca Neto$^{4}$

1 Faculdade de Formacao de Professores, UERJ.
2 Instituto de Matematica, UFRJ.
3 COPPE Sistemas e Computacao, UFRJ.
4 Instituto de Computacao, UNICAMP.
{luerbio,celina}@cos.ufrj.br xav
[email protected] camp.br
Abstract. The splitting number of a graph G is the smallest integer $k \ge 0$ such that a planar graph can be obtained from G by k splitting operations. Such an operation replaces a vertex v by two nonadjacent vertices $v_1$ and $v_2$, and attaches the neighbors of v either to $v_1$ or to $v_2$. The n-cube has a distinguished place in Computer Science. Dean and Richter devoted an article to proving that the minimum number of crossings in an optimum drawing of the 4-cube is 8, but no results about the splitting number of other nonplanar n-cubes are known. In this note we give a proof that the splitting number of the 4-cube is 4. In addition, we give the lower bound $2^{n-2}$ for the splitting number of the n-cube. It is known that the splitting number of the n-cube is $O(2^n)$, thus our result implies that the splitting number of the n-cube is $\Theta(2^n)$.
1 Introduction
Applications in Computer Science are frequently modeled with nonplanar graphs. Graph visualization and VLSI projects often require layout techniques. Layout algorithms are limited to special classes of graphs. For instance, there is a wealth of layout algorithms for planar graphs; however, these algorithms are useless for nonplanar graphs. One approach to handling nonplanarity in layout algorithms is to consider another topological invariant of the graph, the splitting number. The splitting number is a graph invariant that is used as a measure of nonplanarity in many applications such as graph drawing. Research on topological properties of the n-cube is important for applications such as Parallel Processing. In this article we prove that the splitting number of the 4-cube is 4. As we shall see, this result implies that the splitting number of the n-cube is $\Theta(2^n)$.

A simple drawing of a graph G is a drawing of G on the plane such that no edge crosses itself, adjacent edges do not cross, crossing edges do so only once, edges do not cross vertices, and no more than two edges cross at a common point. A graph is planar when there is a simple drawing for this graph in the plane such that no edges cross. In what follows, all drawings are assumed to be simple. A drawing of a graph G is optimum when it has the minimum number of crossings among all drawings of G. This number is called the crossing number of G and is denoted by $\mathrm{cr}(G)$. The skewness $\mathrm{sk}(G)$ of G is the smallest integer $k \ge 0$ such that the removal of k edges from G yields a planar graph. The splitting number $\mathrm{sp}(G)$ of a graph G is the smallest integer $k \ge 0$ such that a planar graph can be obtained from G by k vertex splitting operations. A vertex splitting operation, or simply splitting, of a vertex $v \in V(G)$ partitions the set of neighbours of v into two nonempty sets $P_1$ and $P_2$ and adds to $G - v$ two new and nonadjacent vertices $v_1$ and $v_2$, such that $P_1$ is the set of neighbours of $v_1$ and $P_2$ is the set of neighbours of $v_2$. If a graph H is obtained from G by a set of k splittings, we say that H is the resulting graph of this set of k splittings in G. We note that the resulting graph H can be obtained either by splitting only vertices of G, or by splitting vertices of G and vertices created by former splittings.

Two aspects of the study of splitting numbers have been considered recently by Eades and Mendonca [5,6,20]: they established the NP-completeness of a related problem, eligible split set, and they successfully used splitting numbers in layout algorithm design. Research on the splitting number of graphs can also be justified by the interdependency of the following three nonplanarity parameters of a graph: crossing number, skewness and splitting number. Consider an optimum drawing for G with $\mathrm{cr}(G)$ crossings. If in this drawing one of the edges in each crossing is removed from G, then there exists a set with at most $\mathrm{cr}(G)$ edges such that the removal of these edges yields a planar graph. This means that the crossing number $\mathrm{cr}(G)$ is greater than or equal to the skewness $\mathrm{sk}(G)$.

On the other hand, there are $\mathrm{sk}(G)$ edges whose removal yields a planar resulting graph. Suppose that $e = (u, u_1)$ is one of these $\mathrm{sk}(G)$ edges and that the neighborhood of u is given by $\mathrm{Adj}(u) = \{u_1, u_2, \ldots, u_d\}$. Let the splitting of u into the vertices $v_1$ and $v_2$ be such that $\mathrm{Adj}(v_1) = \{u_2, u_3, \ldots, u_d\}$ and $\mathrm{Adj}(v_2) = \{u_1\}$. Note that this splitting removes the crossing that occurred in $e = (u, u_1)$. Hence there is a set of splittings of size at most $\mathrm{sk}(G)$, which yields a planar graph from G. Hence, $\mathrm{sp}(G) \le \mathrm{sk}(G)$. Therefore, the study of the splitting number for a given graph helps to obtain lower bounds both for its skewness and its crossing number.

Very little is known about skewness, crossing numbers or splitting numbers for specific classes of graphs. The corresponding decision problems are all NP-complete [18,11,10]. For fixed k, crossing number turns out to be polynomial [11]. We show in Lemma 1 that the number of all possible splittings for a vertex in a graph G is $\Omega(2^{|V(G)|})$, which suggests that splitting number is not a polynomial problem, even for fixed k. The difficulty of finding the values for these invariants can justify articles in which just one type of graph is considered. For instance, the crossing numbers for the graphs $C_6 \times C_6$ and $C_7 \times C_7$ were recently established [1,2].

* Work partially supported by CNPq, CAPES, FAPERJ and FAPESP, Brazilian research agencies.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 141–150, 1998. © Springer-Verlag Berlin Heidelberg 1998
The knowledge of one of these invariants for the smallest nonplanar element in a class of graphs can help to find the values or bounds for this invariant for every element in the class. For instance, the crossing number $\mathrm{cr}(C_3 \times C_3) = 3$ was proved in [12] and used later in [21] to establish that $\mathrm{cr}(C_3 \times C_n) = n$. The splitting number has been computed for complete graphs [13] and for complete bipartite graphs [14]. For a recent survey on splitting numbers, see [17].

Let $Q_n$ denote the n-cube graph. The vertices of $Q_n$ are all n-tuples of 0's and 1's, of which there are $|V(Q_n)| = 2^n$. Two vertices $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ are adjacent if and only if $x_i \ne y_i$ for exactly one index i.

Much has been done about properties of nonplanarity for the class of n-cubes. Eggleton and Guy [7] conjectured that the crossing number satisfies $\mathrm{cr}(Q_n) \le \frac{5}{32} 4^n - \lfloor \frac{n^2+1}{2} \rfloor 2^{n-2}$. Madej [19] established an upper bound for the crossing number of the n-cube: $\mathrm{cr}(Q_n) \le \frac{1}{6} 4^n - n^2\, 2^{n-3} - 3 \cdot 2^{n-4} + \frac{1}{48} (-2)^n$. Sykora and Vrto [22] used Madej's upper bound to prove that $\mathrm{cr}(Q_n) = \Theta(4^n)$. Faria and Figueiredo [8,9] constructed drawings for the 6-, 7- and 8-cubes with the number of crossings predicted by Eggleton and Guy. Cimikowski [3] showed that the skewness is $\mathrm{sk}(Q_n) = 2^n (n-2) - n\, 2^{n-1} + 4$. Note that $Q_n$ is planar for n = 1, 2, 3, but nonplanar for $n \ge 4$. Figure 1 shows drawings for the 1-, 2-, 3- and 4-cubes. Recently, Dean and Richter [4] devoted an entire article to proving that $\mathrm{cr}(Q_4) = 8$. Their proof consists of two main steps. Firstly, they show that in any optimum drawing of $Q_4$ there exists a $C_4$ with at least 4 crossings. Secondly, they show that the removal of the edges of a $C_4$ in $Q_4$ leaves a subdivision of $C_3 \times C_4$. Using that $\mathrm{cr}(C_3 \times C_4) = 4$, they establish that $\mathrm{cr}(Q_4) = 8$.

Fig. 1. Optimum drawings for $Q_1$, $Q_2$, $Q_3$ and $Q_4$.
In this article, we prove that the splitting number of the 4-cube is 4. We prove in Lemma 2 that the removal of a $C_4$ from $Q_4$ leaves the same graph up to isomorphism, strengthening a result of Dean and Richter [4]. The strategy of the main proof is to show that if it were possible to obtain a planar graph from $Q_4$ with three splittings, then they would have to be done in three different vertices of $Q_4$, with no two of them in the same $C_4$. Adding to this statement the property that for every triple of vertices in $Q_4$ there exists a $C_4$ containing two of them, we have the need of four splittings, as required. We notice that, for $n > 4$, there are $2^{n-4}$ vertex-disjoint subgraphs of $Q_n$ isomorphic to $Q_4$. Hence the splitting number of $Q_4$ gives the following lower bound for the splitting number $\mathrm{sp}(Q_n)$ of the n-cube, for $n \ge 4$: $\mathrm{sp}(Q_n) \ge 4 \cdot 2^{n-4} = 2^{n-2}$.
The skewness of the n-cube is known [3] to be $\mathrm{sk}(Q_n) = 2^n (n-2) - n\, 2^{n-1} + 4$. As noted above, skewness and splitting number are related as follows: $\mathrm{sp}(Q_n) \le \mathrm{sk}(Q_n)$. Therefore, our lower bound implies that the splitting number of the n-cube is in fact $\Theta(2^n)$.
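The counting behind the lower bound can be made concrete with a short Python sketch. The function names are ours, vertices of $Q_n$ are encoded as integers whose binary digits form the n-tuple, and the skewness formula is only evaluated for $n \ge 4$, which is an assumption about its intended range. Grouping vertices by their leading $n-4$ digits gives $2^{n-4}$ vertex-disjoint copies of $Q_4$.

```python
def skewness_cube(n):
    """Skewness formula for the n-cube as quoted in the text (used here for n >= 4)."""
    return 2 ** n * (n - 2) - n * 2 ** (n - 1) + 4


def splitting_lower_bound(n):
    """Lower bound 4 * 2^(n-4) = 2^(n-2) for the splitting number of Q_n, n >= 4."""
    return 2 ** (n - 2)


def q4_blocks(n):
    """Partition the vertices of Q_n (n >= 4) into blocks sharing their top n-4 bits."""
    blocks = {}
    for v in range(2 ** n):
        blocks.setdefault(v >> 4, []).append(v)
    return list(blocks.values())


def induced_edge_count(block, n):
    """Number of edges of Q_n with both endpoints in `block`."""
    members = set(block)
    count = 0
    for v in block:
        for b in range(n):
            w = v ^ (1 << b)  # flip one binary digit
            if w in members and v < w:
                count += 1
    return count
```

Each block has 16 vertices and induces 32 edges, the vertex and edge counts of $Q_4$.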
2 The Result
In this section we prove that the splitting number satisfies $\mathrm{sp}(Q_4) = 4$. Figure 2 exhibits a set of four splittings that obtains a planar resulting graph from $Q_4$. This proves that $\mathrm{sp}(Q_4) \le 4$.

Fig. 2. $\mathrm{sp}(Q_4) \le 4$.
Kuratowski [15] characterized the class of planar graphs by saying that a graph is planar if and only if it does not contain a subdivision of $K_5$ or $K_{3,3}$ as a subgraph.
Fig. 3. All possible splittings in a vertex v of degree 4.

We recall that $Q_4$ is a 4-regular graph. A vertex v of degree 4 can be split into vertices $v_1$ and $v_2$ in seven different ways as shown in Fig. 3. We can generalize this observation with the following lemma.

Lemma 1. If v is a vertex of degree d in G, then there are exactly $2^{d-1} - 1$ possible different splittings that can be done in v.
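The count in Lemma 1 can be checked by brute-force enumeration. The following Python sketch, with a function name of our own, lists every unordered partition of a neighbour set into two nonempty parts. Pinning the smallest neighbour to the first part avoids counting each partition twice.

```python
from itertools import combinations


def splittings(neighbours):
    """All unordered partitions {P1, P2} of `neighbours` into two nonempty sets."""
    nbrs = sorted(neighbours)
    if not nbrs:
        return []
    first, rest = nbrs[0], nbrs[1:]
    result = []
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            p1 = {first, *extra}
            p2 = set(nbrs) - p1
            if p2:  # the partition with an empty part is not a splitting
                result.append((p1, p2))
    return result
```

For a vertex of degree 4 this yields the seven splittings of Fig. 3.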
Proof. Let $\mathrm{Adj}(v) = \{u_1, u_2, \ldots, u_d\}$ be the set of vertices adjacent to v. Let H be the resulting graph obtained from G by splitting v into vertices $v_1$ and $v_2$. We shall show that this splitting can be done in $2^{d-1} - 1$ different ways, by considering the possibilities for partitioning $\mathrm{Adj}(v)$ into two sets. For each $u \in \mathrm{Adj}(v)$ we have to decide whether this vertex is adjacent to $v_1$ or not. We have $2^d$ possibilities for this decision. Let $P_1, P_2$ be a partition of $\mathrm{Adj}(v)$. The assignment of the set $P_1$ to be the neighborhood of $v_1$ and of $P_2$ to be the neighborhood of $v_2$ gives the same graph as the assignment of the set $P_1$ to be the neighborhood of $v_2$ and of $P_2$ to be the neighborhood of $v_1$. Thus we must divide $2^d$ by 2 in order to obtain non-isomorphic graphs. Finally, as the assignment that has the empty set is not allowed, we must subtract 1 from $2^{d-1}$, which gives the result.

An automorphism of G is a bijective function $\varphi : V(G) \to V(G)$, such that $(u, v) \in E(G)$ if and only if $(\varphi(u), \varphi(v)) \in E(G)$. Given a graph G and a subgraph S of G, we say that G is S-transitive if for each pair F, H of subgraphs of G, where F and H are isomorphic to S, there is an automorphism $\varphi$ of G such that if $v \in V(F)$, then $\varphi(v) \in V(H)$.

Lemma 2 proves that the n-cube $Q_n$, for $n \ge 2$, is $C_4$-transitive, that is, a $C_4$ can be chosen with no loss of generality among all the subgraphs $C_4$ of $Q_n$.

Lemma 2. If $Q_n$, for $n \ge 2$, is considered without the labels of its vertices, then any $C_4$ can be selected in $Q_n$ with no loss of generality among all the subgraphs $C_4$ of $Q_n$.

Proof. Given S and W, two $C_4$'s of $Q_n$, we shall exhibit an automorphism of $Q_n$ carrying S to W. Because an automorphism is a bijective function, it has an inverse and it admits composition with another automorphism. Thus given T, a fixed $C_4$ of $Q_n$, it is enough to define for each $C_4$ of $Q_n$ an automorphism $\varphi$ of $Q_n$ carrying this $C_4$ to T.

To this end, we first show a key property needed for the definition of $\varphi$: in the four binary n-tuples of the vertices of a fixed $C_4$ of $Q_n$ there are precisely $(n-2)$ fixed digits.

Consider $Q_n$, with $n \ge 2$, and a $C_4$ induced by vertices $v_1, v_2, v_3, v_4$, where $(v_i, v_{(i \bmod 4)+1}) \in E(Q_n)$. Let $v_1 = (a_1, a_2, \ldots, a_n)$ be the n-tuple of $v_1$. As $v_1$ is adjacent to $v_2$, by definition of the n-cube, there is $k \in \{1, 2, \ldots, n\}$ such that $v_2 = (a_1, a_2, \ldots, a_{k-1}, \bar{a}_k, a_{k+1}, \ldots, a_n)$, where $\bar{a}_k$ denotes the binary complement of $a_k$, i.e., $\bar{a}_k = 0$ if and only if $a_k = 1$.

As $v_1 \ne v_3$ and $(v_2, v_3) \in E(Q_n)$ there is $j \ne k$ such that $v_3 = (a_1, a_2, \ldots, a_{j-1}, \bar{a}_j, a_{j+1}, \ldots, a_{k-1}, \bar{a}_k, a_{k+1}, \ldots, a_n)$. We assume $j < k$ with no loss of generality.

As $v_4 \ne v_2$ and $(v_1, v_4) \in E(Q_n)$, we have that the n-tuple of $v_4$ must be given by $v_4 = (a_1, a_2, \ldots, a_{j-1}, \bar{a}_j, a_{j+1}, \ldots, a_{k-1}, a_k, a_{k+1}, \ldots, a_n)$.

Hence, the n-tuples of the four vertices of a $C_4$ of $Q_n$ have $(n-2)$ fixed digits.
Luerbio Faria, Celina M. H. de Figueiredo, C. F. Xavier de Mendonça Neto
In this way, we can define S, a fixed generic $C_4$ of $Q_n$, by writing $S = (I\; s_{|I|+1}\; II\; s_{|I|+|II|+2}\; III) = (s_1 s_2 \ldots s_n)$, where $s_{|I|+1}$ and $s_{|I|+|II|+2}$ assume values in $\{0, 1\}$, and $(I\; II\; III)$ is the fixed $(n - 2)$-tuple with respect to the four vertices defining S. Let us fix T to be the $C_4$ of $Q_n$ given by $T = (IV\; t_{|IV|+1}\; t_{|IV|+2})$, where IV is the $(n - 2)$-tuple consisting of 0's only. We define an automorphism $\phi : Q_n \to Q_n$ carrying S to T by setting

$\phi(x_1 x_2 \ldots x_{|I|-1}\, x_{|I|}\, x_{|I|+1} \ldots x_{|I|+|II|}\, x_{|I|+|II|+1}\, x_{|I|+|II|+2} \ldots x_n) = (y_1 y_2 \ldots y_{|I|}\, y_{|I|+2}\, y_{|I|+3} \ldots y_{|I|+|II|+1}\, y_{|I|+|II|+3}\, y_{|I|+|II|+4} \ldots y_n\, x_{|I|+1}\, x_{|I|+|II|+2})$,

where, for $i \notin \{|I|+1, |I|+|II|+2\}$, $y_i = x_i$ if and only if $s_i = 0$. Trivially, $\phi$ carries S to T. For each n-tuple of $Q_n$ the map $\phi$ binary complements a set of digits defined by the fixed part of S. It follows that $\phi$ is an automorphism.

Note that an argument similar to that used in the proof of Lemma 2 shows that the n-cube is vertex transitive.

Lemma 3. For every set of three vertices in a $Q_4$, there is a $C_4$ containing two of them.

Proof. Consider, with no loss of generality, the black vertex of $Q_4$ depicted in Fig. 4a. The vertices in $Q_4$ that are not in the same $C_4$ with respect to this vertex are the black vertices depicted in Fig. 4b. Since each pair of these vertices share a $C_4$, the result is obtained.
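The fixed-digit property used in the proof of Lemma 2 and the statement of Lemma 3 are small enough to check by exhaustive search. The following Python sketch is not taken from the paper: it assumes that the vertices of $Q_n$ are encoded as n-bit integers with adjacency meaning a difference in exactly one bit, and the helper names `four_cycles`, `fixed_digits` and `lemma3_holds` are hypothetical.

```python
from itertools import combinations


def neighbors(v, n):
    """Vertices of Q_n adjacent to v (labels are n-bit integers)."""
    return [v ^ (1 << b) for b in range(n)]


def four_cycles(n):
    """All 4-cycles of Q_n, found from adjacency alone, as frozensets of vertices."""
    cycles = set()
    for a in range(1 << n):
        for b, d in combinations(neighbors(a, n), 2):
            # c closes the cycle a-b-c-d when it is a common neighbor
            # of b and d other than a itself.
            for c in set(neighbors(b, n)) & set(neighbors(d, n)):
                if c != a:
                    cycles.add(frozenset((a, b, c, d)))
    return cycles


def fixed_digits(cycle, n):
    """Number of bit positions on which all vertices of the cycle agree."""
    return sum(1 for b in range(n) if len({(v >> b) & 1 for v in cycle}) == 1)


def lemma3_holds(n=4):
    """True if every three vertices of Q_n include two that lie on a common C4."""
    on_common_c4 = set()
    for cyc in four_cycles(n):
        for u, v in combinations(sorted(cyc), 2):
            on_common_c4.add((u, v))
    return all(
        any(pair in on_common_c4 for pair in combinations(triple, 2))
        for triple in combinations(range(1 << n), 3)
    )
```

Under these assumptions, every cycle returned by `four_cycles(n)` should report `fixed_digits(cycle, n) == n - 2`, and `lemma3_holds(4)` should return `True`. The check is a sanity test of the statements, not a replacement for the proofs.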
Fig. 4 (panels a and b). For every set of three vertices in a $Q_4$ there is a $C_4$ containing two of them.

Next we show that a graph obtained from $Q_4$ by three splittings such that one of the splittings is not done in a vertex of $Q_4$ is nonplanar.

Lemma 4. If G is obtained from $Q_4$ by two splittings, such that the first splits a vertex v of $Q_4$ into u and w, and the other splits u, then $\sigma(G) \ge 2$.
The Splitting Number of the 4-Cube
Proof. We establish that $\sigma(G) \ge 2$ by considering the subgraph F of G defined by the drawing D(F) in Fig. 5. Because F is a subgraph of G, we have $\sigma(G) \ge \sigma(F)$. So it is enough to show that $\sigma(F) \ge 2$. To this end, we show that splitting an arbitrary vertex of F yields a graph containing a subdivision of $K_{3,3}$. In Fig. 5 we also show ten auxiliary copies of D(F). We partition the vertex set of F into two sets: black vertices and striped vertices. We show first that the removal of any black vertex always produces a graph containing a subdivision of $K_{3,3}$. Although the removal of a striped vertex produces a planar graph, we show that the splitting of such a vertex produces a graph containing a subdivision of $K_{3,3}$.
Fig. 5. Auxiliary graph F with splitting number at least 2 (for Lemma 4).

Consider first the top rightmost copy of D(F). It contains two black vertices and a subgraph of F that is a subdivision of $K_{3,3}$ having partitions labeled
respectively with 1 and 2. This means that the removal of any of those two black vertices yields a graph that still has a subdivision of $K_{3,3}$. An analogous argument shows that this is the case for any black vertex in the other nine copies of D(F). As a black vertex can be removed without producing a planar graph, the splitting of a black vertex cannot produce a planar graph either.

The key property we use with respect to the splitting of a striped vertex v is that v is a vertex of degree four, and so any splitting of v into vertices $v_1$ and $v_2$ is such that at least one of them has degree at least 2. We list three copies of D(F) for each one of the two striped vertices with the corresponding three subdivisions of $K_{3,3}$. Each subdivision uses two edges incident to each striped vertex. For the convenience of the reader, we list beside each of the last six drawings all possible splittings for a striped vertex and the corresponding subdivision of $K_{3,3}$ in the resulting graph.

Now we show that three splittings, done in three vertices of $Q_4$, two of these vertices sharing the same $C_4$, are not enough to obtain a planar resulting graph from $Q_4$.

Lemma 5. If G is obtained from $Q_4$ by two splittings in the same $C_4$, then $\sigma(G) \ge 2$.

Proof. We shall show that G contains the auxiliary graph F of Lemma 4 as a subgraph, which implies $\sigma(G) \ge 2$. The definition of G fixes two vertices u and v in the same $C_4$ of $Q_4$ with corresponding splittings. We define $S_G$ as the graph obtained from $Q_4$ by removing v and by splitting u in the same way u is split to obtain G. Note that $S_G$ is a subgraph of G. We show that F is a subgraph of $S_G$ by considering all possibilities for $S_G$. We consider in Fig. 6 two cases according to u and v being adjacent in $Q_4$ or not:

Case 1. Vertices u and v are adjacent in $Q_4$. For the convenience of the reader, we label the three possibilities for $S_G$ in order to show that these three graphs are isomorphic. We note that $S_G \setminus \{a, b\}$ is in turn isomorphic to F, as required.

Case 2. Vertices u and v are not adjacent in $Q_4$. The seven possibilities for $S_G$ are also shown in Fig. 6. In each case, it is easy to find F as a subgraph. This completes the proof of the lemma.

Finally, we state and prove the main theorem.

Theorem 1. The splitting number of $Q_4$ is 4.

Proof. Figure 2 shows it is enough to establish $\sigma(Q_4) \ge 4$. It follows from Lemma 3, Lemma 4 and Lemma 5 that there is no set of three splittings that obtains from $Q_4$ a planar resulting graph. Thus, the splitting number of $Q_4$ is at least 4, which implies the equality $\sigma(Q_4) = 4$.
Fig. 6. The possibilities for the subgraph $S_G$, where u and v are the vertices in $Q_4$ that are split to obtain G (for Lemma 5). Case 1: u and v are adjacent (three isomorphic graphs $S_G$). Case 2: u and v are not adjacent.
Short and Smooth Polygonal Paths
James Abello (1) and Emden Gansner (2)
(1) Communication Information Systems Research, AT&T Labs-Research, USA, [email protected] .com
(2) Information Analysis and Display Research, AT&T Labs-Research, USA, [email protected] .com
Abstract. Automatic graph drawers need to compute paths among vertices of a simple polygon which, besides remaining in the interior, need to exhibit certain aesthetic properties. Some of these require the incorporation of some information about the polygonal shape without being too far from the actual shortest path. We present an algorithm to compute a locally convex region that contains the shortest Euclidean path between two vertices of a simple polygon. The region has a boundary shape that follows the shortest path shape. A cubic Bézier spline in the region interior provides a short and smooth collision free curve between the two given vertices. The obtained results appear to be aesthetically pleasant and the methods used may be of independent interest. They are elementary and implementable. Figure 7 is a sample output produced by our current implementation.
1 Introduction
The problem of finding collision free paths has been studied in robotics, VLSI layout and computational geometry. A host of shortest path based methods have been proposed in the literature. In some cases, curvature constraints are imposed and in others the physical constraints of robot cars or manipulators are incorporated ([5], [6], [10], [11], [12], [16], [18]). A different flavor of path routing is required by automatic graph drawers, especially in applications where the nodes are represented by single connected shapes. In this case, once nodes are positioned the edges need to be placed, and the layouts are forced to use some form of curved edges to avoid collision with non-incident nodes ([7], [17]). The described environment can be modeled as a simple polygon P containing a collection of disjoint simple polygonal holes, corresponding to the node obstacles. The general edge placement problem consists in drawing natural-looking curves between vertices in this environment. Arguably, natural curves avoid obstacles, stay close to a shortest path, do not turn too sharply, and avoid unnecessary inflections [4]. Robotics physical constraints such as robot size, mass, acceleration, or turning radius do not seem to have a clear interpretation in the context of natural-looking curves in graph drawings.

Recently, a heuristic has been proposed that produces curves which satisfy some of the criteria mentioned above [4]. Unfortunately, the obtained curves are forced to touch the obstacles that lie on the shortest path and are not guaranteed to be completely contained in the available free space. The authors of [4] comment on the difficulty of producing an implementable solution that uses the available free space efficiently while keeping the curve in the proximity of the shortest path. We offer an algorithmic solution to both of these problems for the case of simple polygons without holes. Namely, for a given pair of vertices x, y of a simple polygon P, the algorithm produces a smooth curve that is completely contained in the interior of P and that lies in the proximity of the shortest path from x to y. The methods used are visibility based and may be of independent interest. They are elementary and implementable. Figure 7 is a sample output produced by our current implementation of the algorithm.

Finally, it is important to notice that the apparently more general edge placement problem that considers polygons with holes in their interior can be handled using methods similar to the ones presented here for simple polygons without holes. We explain the overall idea in Sect. 4. The second section of the paper contains the relevant definitions. The third section is the bulk of the paper. It contains a description of the algorithm and the main arguments justifying its correctness. The general problem, closing remarks and further exploration avenues are presented in Sect. 4.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 151-162, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Problem Statement

2.1 Definitions

We consider simple polygons on the plane with an implicit counterclockwise labeling of the vertices from 1 to n. Given a pair of non-consecutive vertices x and y on the boundary of P, consider their counterclockwise boundary successors next(x) and next(y) respectively. The boundary is thus divided into two closed chains [next(x), y] and [next(y), x]. Denote by CSP[x, y] the ordered shortest path from x to y in the counterclockwise direction. Notice that CSP[x, y] is not necessarily the shortest interior Euclidean path from x to y, which we denote by SP[x, y]. It should be clear then that any two nonconsecutive vertices x and y determine a subpolygon R[x, y] whose counterclockwise boundary is x, CSP[next(x), y], next(y), CSP[next(y), x]. The shortest Euclidean path SP[x, y] is nowhere exterior to R[x, y] (see Fig. 1). There are however other subpolygons similar to R[x, y] that not only contain SP[x, y] but that provide in its proximity as much space as allowed by the boundary of the input polygon. Our objective is then to find one such subpolygon on which we can draw a smooth curve that is close to SP[x, y]. We refer to this subpolygon as the drawing region.

2.2 Visibility Notions
Two polygon vertices are called visible if the open line segment between them is completely contained in the interior of the polygon or if they are consecutive on the polygon boundary. The set of vertices visible from a vertex x is denoted by N(x). The visibility graph of a polygon is the graph whose vertices correspond to the vertices of the polygon and whose edges correspond to visible pairs of polygon vertices. It can be computed optimally [8]. A very large subclass of these graphs has been studied in ([1], [2]) and a good overview of related research can be found in [13].

Fig. 1. The shortest Euclidean path SP[x, y] is nowhere exterior to the subpolygon R[x, y] determined by two nonconsecutive points x and y. (Labeled vertices: x, next(x), y, next(y).)

If x and y are two visible vertices (x smaller than y in the implicit counterclockwise order), let fr(x, y) denote the first vertex to the right of y (if any) that is visible from x. Similarly, let fl(x, y) be the first vertex to the left of y (if any) that is visible from x. The computation of fr(x, y) and fl(x, y) is a fundamental operation in many visibility based algorithms. It is useful then to let T(x, y) denote the time taken to compute either fr(x, y) or fl(x, y) for a given pair of visible vertices x and y in a polygon P. With this in mind, for a subset S of vertices of a polygon P, let T(S) = max{T(x, y) : both x and y are visible vertices of S}. One can define similar notions in terms of other computational resources but we will not discuss these issues any further here.
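The definition of visibility and of the set N(x) can be illustrated with a brute-force test. The Python sketch below is only an illustration and is not the optimal algorithm of [8]: it assumes the polygon is given as a counterclockwise list of coordinate pairs with no three vertices collinear, and the names `visible` and `visible_set` are hypothetical. It does not attempt fr(x, y) or fl(x, y), whose left and right conventions depend on details not restated here.

```python
def orient(p, q, r):
    """Twice the signed area of triangle pqr (positive for a left turn)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def properly_cross(p, q, r, s):
    """True if segments pq and rs cross at a single point interior to both."""
    d1, d2 = orient(p, q, r), orient(p, q, s)
    d3, d4 = orient(r, s, p), orient(r, s, q)
    return d1 * d2 < 0 and d3 * d4 < 0


def inside(poly, pt):
    """Ray-casting point-in-polygon test; points on the boundary are not treated specially."""
    x, y = pt
    hit = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge meets the horizontal line through pt
            xc = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if xc > x:
                hit = not hit
    return hit


def visible(poly, a, b):
    """Visibility of vertices with indices a and b (assumes general position)."""
    n = len(poly)
    if a == b:
        return False
    if (a + 1) % n == b or (b + 1) % n == a:
        return True  # consecutive on the boundary
    p, q = poly[a], poly[b]
    for i in range(n):
        if properly_cross(p, q, poly[i], poly[(i + 1) % n]):
            return False
    # With no proper crossing, the open segment lies entirely inside
    # or entirely outside, so the midpoint decides.
    return inside(poly, ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2))


def visible_set(poly, a):
    """N(a): indices of the vertices visible from vertex a."""
    return [b for b in range(len(poly)) if visible(poly, a, b)]
```

For example, with the notched square `[(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]`, the reflex vertex at index 3 sees every other vertex, while the two top corners (indices 2 and 4) do not see each other.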
3 The Algorithm

3.1 Algorithm Overview

Our general method can be viewed in three stages, with the first two interleaved for efficiency purposes, as follows.
Stage I. Find Local Pentagons

For every ordered triple of consecutive vertices (i, j, k) in SP[x, y] where i and k are both different from x and y do {
    Compute an auxiliary point on each of the directed rays r(i, j) and r(k, j) and call them a(i, j) and a(k, j) respectively.
    The interior of the pentagon with vertices i, j, k, a(i, j), a(k, j) must be contained in P.
}

Stage II. Find the Boundary of the Drawing Region

For every four consecutive vertices (i, j, k, l) in SP[x, y] where i and l are both different from x and y do {
    Use the local pentagons obtained in the previous stage to obtain from them local heptagons and octagons that surround the shortest path.
}

Stage III. Find a curve from x to y in the interior of the drawing region that approximates SP[x, y].

3.2 Stage I: How to Find the Local Pentagons?
The first thing to notice is that if i, j and k are three consecutive vertices on SP[a, b] then, by the Jordan closed curve theorem, the sequence (i, j, k, [k, i]) forms a subpolygon of P. This guarantees the existence of points on the directed rays r(i, j) and r(k, j) within this subpolygon. The main idea is to find, in an interleaved fashion, a sequence of points fe(k, j) on the ray r(k, j) which are visible from i and a corresponding sequence fe(i, j) on the ray r(i, j) which are visible from k. These two sequences satisfy the additional requirement of being mutually visible. The farthest points in these two sequences are called the auxiliary points a(k, j) and a(i, j), and they form with j, i and k a pentagon whose interior is contained in the interior of P. The sequences are determined by intersections of the rays r(k, j) and r(i, j) with either visibility rays emanating from the vertices i and k or with certain special polygon boundary edges. We describe next the major steps involved in these computations.

For clarity of exposition we present just the procedure rauxpoint(P, i, j, k) in charge of the computation of a(k, j). Switching the roles of i and k and substituting fl(k, j) for fr(i, j) we obtain the symmetric procedure lauxpoint(P, i, j, k) that computes a(i, j). As previously indicated, interleaving these two procedures we obtain a method to compute the auxiliary points. As a final notational convenience we let closest(j, S) denote those vertices in a set S that are closest in Euclidean distance to a vertex j.
Computation of the Right Auxiliary Points. Assume that (i, j, k) are three consecutive vertices on SP[x, y] which form a left turn and recall the definition of fr(i, j) and fl(k, j) given in Sect. 2. It is a well known fact, proved using standard visibility arguments, that there exists a unique boundary edge of P which serves as the base of a funnel subpolygon with apex i that passes through the vertices j and fr(i, j). We denote such boundary edge by fbase(i, j, fr(i, j)) (see Fig. 2). Similarly, there is a funnel subpolygon associated with k, j and fl(k, j). The computation depends on the position of fr(i, j) and fl(k, j) with respect to the lines l(k, j) and l(i, j) respectively. In one case, the intersections of the funnel bases fbase(i, j, fr(i, j)) and fbase(k, j, fl(k, j)) with the rays r(k, j) and r(i, j) are used in determining the auxiliary points. In the other, the procedure is called recursively to look for other suitable funnel bases on a smaller polygon P' whose interior is contained in P. Figs. 2, 3 and 4 illustrate the main cases that need to be considered by the procedure rauxpoint(P, i, j, k).

Fig. 2. The ray r(i, fr(i, j)) intersects the ray r(k, j), and i and fr(i, j) lie on the same side of the line l(k, j).

Fig. 3. The ray r(i, fr(i, j)) intersects the ray r(k, j), but i and fr(i, j) do not lie on the same side of the line l(k, j).

procedure rauxpoint(P, i, j, k)
    If (r(i, fr(i, j)) intersects the ray r(k, j))
        then label such intersection f(k, j)
        else f(k, j) := infinity;
    If (i and fr(i, j) are not on different sides of the line l(k, j))
    then {
        [a, b] := fbase(i, j, fr(i, j));
        p1 := [a, b] intersection with r(k, j);
        a(k, j) := closest(j, {p1, f(k, j)})
    }
    else {
        Let P' be the polygon with boundary (i, f(k, j), j, fr(i, j), clockwisechain[fr(i, j), i]);
        a(k, j) := rauxpoint(P', i, f(k, j), j);
    }
Lemma 1. The auxiliary points a(i, j) and a(k, j) are computed by interleaving the procedures rauxpoint(P, i, j, k) and lauxpoint(P, i, j, k). These points together with i, j and k form a pentagon whose interior is completely contained in the interior of P (Fig. 5).

Proof. In the case that i and fr(i, j) do not lie on different sides of the line l(k, j), straightforward properties of their associated funnel guarantee that the procedure rauxpoint computes a point a(k, j) on the ray r(k, j) which is visible from i. In the remaining case, the interior of the polygon P' with boundary (i, f(k, j), j, fr(i, j), clockwisechain[fr(i, j), i]) is contained in the interior of P and the shortest path in P' from i to j goes through f(k, j). This allows us to apply the procedure recursively to P'. A similar analysis holds for the procedure lauxpoint when the points involved are k and fl(k, j) and the computed point is a(i, j) on the ray r(i, j). Finally, the invariant maintained by the interleaved execution of these two procedures ensures that the interior of the pentagon with vertices i, j, k, a(i, j) and a(k, j) is completely contained in the interior of P.

Fig. 4. When the ray r(i, fr(i, j)) does not intersect the ray r(k, j), the funnel base intersected with the ray r(k, j) determines the auxiliary point a(k, j).
Complexity of Finding the Auxiliary Points. In order to speed up the computation of the auxiliary points, and assuming that we are interested in answering a series of shortest path queries, it is worthwhile to precompute, besides the visibility graph of P, an auxiliary bipartite graph AV(Vertices of P, Boundary edges of P). AV gives for every vertex v of P the ordered sequence of boundary edges seen by v. This graph was implicitly defined in [2] and an efficient algorithm for its computation has been recently proposed in [14]. It is of interest to notice that the visibility graph of P determines completely the auxiliary graph AV but not conversely. In any case the complexity is still O(Visibility graph of P). With these two graphs at hand, it follows from the proof of the previous lemma that the auxiliary points a(i, j) and a(k, j) can be found in O(|N(k)|) and O(|N(i)|) respectively.

3.3 Stage II. How to Find the Boundary of the Drawing Region?
Given four consecutive vertices (i, j, k, l) in SP[x, y] for which the auxiliary points a(i, j), a(k, j), a(j, k) and a(l, k) have been computed, it is necessary to check whether they are compatible with each other in the sense of determining a locally convex ordered set of points. Ensuring that this compatibility requirement is satisfied amounts to a case analysis that depends on how the triangles (j, a(i, j), a(k, j)) and (k, a(j, k), a(l, k)) are positioned with respect to each other. The proofs of the following two lemmas can be turned into a procedure findregion(i, j, k, l) that computes the desired locally convex regions.

Fig. 5. The local pentagons around (i, j, k) and (j, k, l) described in Lemma 1 and the heptagon mentioned in one case of the proof of Lemma 2. (Labeled points: i, j, k, l, a(i, j), a(k, j), a(j, k), a(l, k), c(i, j), c(l, k), lt.)

A Useful Property of Successive Quadruples of Points in Shortest Euclidean Paths.

Lemma 2 (The concave case). Let (i, j, k, l) denote four consecutive points in a shortest Euclidean path between two vertices of a simple polygon. Assume also that the path from i to j to k to l is concave. Under these conditions the auxiliary points computed by the procedures of the previous section satisfy the following property: there exists a point lt (called hereafter the local top), contained in the union of the two triangles with vertices (k, j, a(i, j)) and (k, j, a(l, k)), which forms with the vertices i, j, k, l, a(j, k) and a(k, j) a heptagon whose interior is completely contained in the interior of P (Fig. 5).

Proof. (Sketch.) The interiors of the triangles (j, k, a(i, j)) and (j, k, a(l, k)) cannot contain vertices of P because that would contradict the way that a(i, j) and a(l, k) were chosen by the procedures of the previous section. The result depends completely on the relative position of these two triangles and on which side of the directed line l(k, j) the intersection of the lines l(i, j) and l(l, k) lies. In some cases, the local top is defined to be either the intersection of the segment [j, a(l, k)] with the segment [k, a(i, j)] or the intersection of the segment [j, a(i, j)] with the segment [k, a(l, k)]. In the remaining cases, the local top is defined to be either the intersection of the ray r(a(j, k), a(l, k)) with the segment [a(k, j), a(i, j)] or the intersection of the ray r(a(k, j), a(i, j)) with the segment [a(j, k), a(l, k)]. The described choice of the local top together with visibility considerations determines the desired heptagon (Fig. 5).
We point out that in many cases, depending on the local geometry, the local heptagons can be enlarged without increasing the overall complexity. These local optimizations will be explained in detail elsewhere. An analysis similar to the one presented in the previous lemma gives us the following result.

Lemma 3 (The non-concave case). If (i, j, k, l) are four consecutive vertices on a shortest Euclidean path between two vertices of a polygon P which do not constitute a concave path, then they together with the auxiliary vertices form an octagon whose interior is completely contained in the interior of P (Fig. 6).

Fig. 6. The octagon associated with a non-concave subpath i, j, k, l as described in Lemma 3.
Complexity of Finding the Locally Convex Region. The computations described in the previous two lemmas are all local. We repeat them sliding a window of size four over SP[x, y]. Some care is necessary in preserving the local convexity during the incremental region computation. This amounts to the maintenance of the region boundary in a data structure that allows us to answer ray shooting queries efficiently. At most a logarithmic cost is incurred here. Figure 7 is a sample output of our current implementation.
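The sliding window over SP[x, y] and the split into the cases of Lemma 2 and Lemma 3 can be sketched as follows. This Python fragment is an illustration only: it assumes that a quadruple (i, j, k, l) is "concave" when the bends at j and k turn in the same direction, which is one plausible reading of the lemma rather than a definition stated in the text, and the names `turn`, `windows` and `classify_quadruples` are hypothetical. It only classifies windows and does not build the heptagons or octagons.

```python
def turn(p, q, r):
    """Sign of the turn p -> q -> r: +1 for left, -1 for right, 0 for collinear."""
    c = (q[0] - p[0]) * (r[1] - q[1]) - (q[1] - p[1]) * (r[0] - q[0])
    return (c > 0) - (c < 0)


def windows(path, size):
    """All runs of `size` consecutive points of the path, in order."""
    return [tuple(path[t:t + size]) for t in range(len(path) - size + 1)]


def classify_quadruples(path):
    """Label each window (i, j, k, l) of the path as 'concave' or 'non-concave'.

    Assumption: 'concave' means the turns at j and at k have the same nonzero sign.
    """
    labels = []
    for i, j, k, l in windows(path, 4):
        s1, s2 = turn(i, j, k), turn(j, k, l)
        labels.append("concave" if s1 == s2 and s1 != 0 else "non-concave")
    return labels
```

Under this reading, a window labeled "concave" would be handled as in Lemma 2 (heptagon with a local top) and any other window as in Lemma 3 (octagon).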
3.4 Stage III. Embed a Smooth Curve Within the Computed Drawing Region
The collection of points computed in the previous stage defines a subpolygon that contains the shortest Euclidean path from x to y. This subpolygon is the region on which the shortest path approximating curve will be drawn. As one approach to embed a smooth curve in its interior we can limit ourselves to cubic Bézier splines [3]. This family of curves is general enough to give pleasing results, is computationally simple, and most importantly for our purposes, satisfies the convex hull property, i.e., a Bézier curve determined by four points lies within the convex hull of those four points. To exploit this property we observe that the drawing region computed in the previous stage can be viewed as a chain of triangles and convex polygons. Each convex polygon contains two vertices on the shortest path where bends occur. We can pick these points as the first and last control points p0 and p3. The control point p1 is chosen to lie on the ray based at p0 that bisects the two edges of the polygon meeting at p0, and in the interior of the polygon. The control point p2 is chosen similarly with respect to p3. For aesthetic reasons, it is desirable that p1 and p2 roughly equally divide the distance between p0 and p3, and that the path p0, p1, p2, p3 mimic any change in curvature of the shortest path from p0 to p3. For example, if the bends on the shortest path at p0 and p3 have the same sign of curvature, we expect p2 not to cross the ray (p0, p1) and p1 not to cross the ray (p3, p2).

Fig. 7. Sample output produced by the current implementation. The drawing region is highlighted.

This solution is adequate, but causes the curve to coincide with the shortest path on turns and makes no use of the additional space provided by the triangles anchored at the bends. We can improve the situation by generalizing this technique. Namely, using the triangle at each bend, we pick p0 and p3 to lie on the two sides of the triangle meeting at the bend, each one-third along the side away from the bend. The change in angle at the bend can be divided proportionally between p0 and p3, defining rays based at these points on which we can pick p1 and p2 inside the polygons and satisfying the needed local convexity or concavity properties.
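The convex hull property that this stage relies on can be seen in a small sketch. The Python code below evaluates a cubic Bézier segment with de Casteljau's construction and tests sampled points against a convex polygon; it is an illustration, not the implementation used for Fig. 7. The function names and the example control points are hypothetical, and the convex polygon is assumed to be given in counterclockwise order.

```python
def lerp(a, b, t):
    """Point at parameter t on the segment from a to b."""
    return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))


def bezier_point(p0, p1, p2, p3, t):
    """Cubic Bezier point at parameter t via de Casteljau's repeated interpolation."""
    a, b, c = lerp(p0, p1, t), lerp(p1, p2, t), lerp(p2, p3, t)
    d, e = lerp(a, b, t), lerp(b, c, t)
    return lerp(d, e, t)


def sample_bezier(p0, p1, p2, p3, steps=16):
    """steps + 1 points along the curve, from p0 to p3."""
    return [bezier_point(p0, p1, p2, p3, s / steps) for s in range(steps + 1)]


def in_convex_polygon(poly, pt, eps=1e-9):
    """True if pt lies in the convex polygon poly (vertices counterclockwise)."""
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        cross = (b[0] - a[0]) * (pt[1] - a[1]) - (b[1] - a[1]) * (pt[0] - a[0])
        if cross < -eps:
            return False  # pt is strictly to the right of edge ab
    return True


# Example control points in convex position (hypothetical values).
p0, p1, p2, p3 = (0, 0), (1, 2), (3, 2), (4, 0)
hull = [p0, p3, p2, p1]  # counterclockwise order of the four control points
curve = sample_bezier(p0, p1, p2, p3)
all_inside = all(in_convex_polygon(hull, pt) for pt in curve)
```

Because every de Casteljau step is a convex combination, each sampled point stays inside the hull of the control points, so choosing p0, p1, p2, p3 inside one convex piece of the drawing region keeps that curve segment inside the region.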
Another approach is to use subdivision methods like Chaikin's algorithm [3]. We are currently experimenting with these techniques and the obtained results will be reported in the final paper version. For now, we let O(SP[x, y], Drawing Region) denote the complexity of embedding a smooth curve in a drawing region.
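For readers unfamiliar with subdivision methods, a minimal sketch of Chaikin's corner cutting on an open polyline is given below. It is a generic illustration, not the authors' experimental code: the variant shown keeps the two endpoints fixed, which is an assumption made here so that the curve still joins x and y, and the function name `chaikin` is hypothetical.

```python
def chaikin(points, iterations=1):
    """Chaikin corner cutting on an open polyline, keeping both endpoints.

    Each segment (a, b) is replaced by the points at 1/4 and 3/4 along it.
    """
    pts = [(float(x), float(y)) for x, y in points]
    for _ in range(iterations):
        new = [pts[0]]
        for a, b in zip(pts, pts[1:]):
            new.append((0.75 * a[0] + 0.25 * b[0], 0.75 * a[1] + 0.25 * b[1]))
            new.append((0.25 * a[0] + 0.75 * b[0], 0.25 * a[1] + 0.75 * b[1]))
        new.append(pts[-1])
        pts = new
    return pts
```

Every new point lies on a segment of the previous polyline, so each cut corner stays inside the triangle spanned by the corner and its two neighbors; this is the property that would make such a scheme compatible with a locally convex drawing region.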
4 Overall Complexity and Concluding Remarks
A polygonal hole is a simple planar polygon together with its interior. In the general edge placement problem, the polygonal holes in the collection H are disjoint and we assume that they are completely contained in the interior of a large bounding simple polygon P. The forbidden space forb(H) is the union of the interiors of the polygons in H and the free configuration space free(P, H) is equal to P \ forb(H). Given two points x and y in free(P, H), any Euclidean shortest path from x to y is a polygonal path whose inner vertices are vertices of H. This path can be constructed in O(h log(h)) time, where h is the total number of boundary edges of the polygons in H ([9]). Assuming that we are answering a collection of shortest path queries, it is more effective to precompute the visibility graph of P ∪ H, where two vertices are considered visible if the segment joining them is completely contained in free(P, H). This computation can be done in O(h' log(h') + k), where h' is the number of vertices in P ∪ H and k is the number of edges in its visibility graph ([8]). When x and y are not vertices of P ∪ H we consider the extended visibility graph on the vertices of H ∪ P together with x and y. In either case, after having a shortest path from x to y we can proceed to compute in its vicinity a locally convex region on which a spline curve will be drawn. The methods discussed here are adaptable to this more general case but their correctness proofs become more intricate. Due to space limitations we defer the details to the journal version of this report.

The overall complexity of the proposed approach is O(Visibility graph of vertices in free space) + O(SP[x, y], Drawing Region). The second term depends on the quality of the desired approximation and we believe the first term is optimal in the amortized sense.

An interesting related issue for further exploration is how to deal with the presence of collinearities. Currently, we can prove that, under the real arithmetic model, most of the steps to determine the drawing region are insensitive to the presence of collinearities. However, in certain situations where the shortest path contains a subsequence of almost collinear points, the resulting drawing region may exhibit subregions which are locally very small. To overcome this difficulty we have developed a parametrized version of the algorithm. The basic idea is to reapply the algorithm to those internal convex segments of the obtained boundary region as long as they satisfy a prespecified curvature constraint. This version produces better results in the sense that it depends more heavily on the topology of the input polygon than on the presence of collinearities.
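The query setting described earlier in this section, in which a visibility graph is precomputed once and shortest paths are then extracted from it, can be sketched with a standard Dijkstra search. This is a generic illustration and not the O(h log h) method of [9]: it assumes the visibility graph is already available as a list of index pairs, with Euclidean edge lengths, and the function name `shortest_path` is hypothetical.

```python
import heapq
from math import dist


def shortest_path(points, edges, src, dst):
    """Dijkstra over a visibility graph with Euclidean edge weights.

    points: list of (x, y); edges: pairs of indices of mutually visible points.
    Returns (list of vertex indices, length), or None if dst is unreachable.
    """
    adj = {v: [] for v in range(len(points))}
    for u, v in edges:
        w = dist(points[u], points[v])
        adj[u].append((v, w))
        adj[v].append((u, w))
    best = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < best.get(v, float("inf")):
                best[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1], best[dst]
```

The inner vertices of the returned path are the bends around which the locally convex drawing region would then be built.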
Acknowledgements. Thanks to Peter Eades and to Sandra Sudarsky for suggestions that helped to improve the readability of the paper.
References

1. J. Abello, O. Egecioglu, K. Kumar, Visibility Graphs of Staircase Polygons and the Weak Bruhat Order I: From Polygons to Maximal Chains, Discrete and Computational Geometry, Vol. 14, No. 3, 1995, pp. 331-358.
2. J. Abello, K. Kumar, Visibility Graphs and Oriented Matroids, In R. Tamassia, I. Tollis, editors, Symposium on Graph Drawing GD'94, Princeton, Lecture Notes in Computer Science, Vol. 894, 1994, pp. 147-158.
3. R. Bartels, J. Beatty, B. Barsky, An Introduction to Splines for Use in Computer Graphics and Geometric Modeling, Morgan Kaufmann, Los Altos, California, 1987.
4. D. P. Dobkin, E. Gansner, E. Koutsofios, S. C. North, Implementing a general-purpose edge router, To appear in Proceedings of the Symposium on Graph Drawing GD'97, Rome, Sept. 1997.
5. L. E. Dubins, On curves of minimal length with a constraint on average curvature and with prescribed initial and terminal positions and tangents, Amer. J. Math., 79:497-516, 1957.
6. S. Fortune and G. Wilfong, Planning constrained motion, In Proc. 20th Annu. ACM Sympos. Theory Comput., pp. 445-459, 1988.
7. E. R. Gansner, E. Koutsofios, S. C. North and K. P. Vo, A technique for drawing directed graphs, IEEE Transactions on Software Engineering, March 1993.
8. S. K. Ghosh and D. M. Mount, An output-sensitive algorithm for computing visibility graphs, SIAM J. Computing, 20(5):888-910, 1991.
9. J. Hershberger and S. Suri, Efficient computation of Euclidean shortest paths in the plane, In Proc. 34th Annu. IEEE Sympos. Found. Comput. Sci., pp. 508-517, 1993.
10. Y. Kanayama and B. I. Hartman, Smooth local path planning for autonomous vehicles, In Proc. IEEE Intl. Conf. on Robotics and Automation, Vol. 3, pp. 1265-1270, 1989.
11. J. C. Latombe, Robot Motion Planning, Kluwer Academic Publishers, Boston, 1991.
12. J. P. Laumond, Finding collision free smooth trajectories for a non-holonomic mobile robot, In Proc. IEEE Intl. Joint Conf. on Artificial Intelligence, pp. 1120-1123, 1987.
13. J. O'Rourke, The Computational Geometry Column, SIGACT News, 1992.
14. J. O'Rourke, I. Streinu, Pseudo Visibility Graphs, Proc. ACM Symposium on Computational Geometry, June 1997, France.
15. F. Preparata, M. Shamos, Computational Geometry: An Introduction, Springer Verlag, NY, 1995.
16. J. A. Reeds and L. A. Shepp, Optimal paths for a car that goes both forward and backwards, Pacific Journal of Mathematics, 145(2), 1990.
17. G. Sander, M. Alt, A. Ferdinand, and R. Wilhelm, Clax, a visualized Compiler, In F. J. Brandenburg, editor, Symposium on Graph Drawing GD'95, Vol. 1027 of Lecture Notes in Computer Science, pp. 459-462, 1996.
18. J. T. Schwartz and M. Sharir, Algorithmic motion planning in robotics, In J. van Leeuwen, editor, Algorithms and Complexity, Vol. A of Handbook of Theoretical Computer Science, pp. 391-430. Elsevier, Amsterdam, 1990.
Quantum Cryptanalysis of Hash and Claw-Free Functions (Invited Paper)

Gilles Brassard¹, Peter Høyer², and Alain Tapp¹

¹ Université de Montréal, Département IRO, C.P. 6128, succursale centre-ville, Montréal (Québec), Canada H3C 3J7. {brassard,tappa}@iro.umontreal.ca
² Odense University, Department of Mathematics and Computer Science, Campusvej 55, DK-5230 Odense M, Denmark. u2pi@imada.ou.dk
Abstract. We give a quantum algorithm that finds collisions in arbitrary r-to-one functions after only $O(\sqrt[3]{N/r})$ expected evaluations of the function, where $N$ is the cardinality of the domain. Assuming the function is given by a black box, this is more efficient than the best possible classical algorithm, even allowing probabilism. We also give a similar algorithm for finding claws in pairs of functions. Further, we exhibit a space-time tradeoff for our technique. Our approach uses Grover's quantum searching algorithm in a novel way.
1 Introduction
A collision for function $F : X \to Y$ consists of two distinct elements $x_0, x_1 \in X$ such that $F(x_0) = F(x_1)$. The collision problem is to find a collision in $F$ under the promise that there is one. This problem is of particular interest for cryptology because some functions known as hash functions are used in various cryptographic protocols. The security of these protocols depends crucially on the presumed difficulty of finding collisions in such functions. A related question is to find so-called claws in pairs of functions; our quantum algorithm extends to this task. In particular, this has consequences for the security of classical signature and bit commitment schemes.

A function $F$ is said to be r-to-one if every element in its image has exactly $r$ distinct preimages. We assume throughout this note that function $F$ is given as a black box, so that it is not possible to obtain knowledge about it by any other means than evaluating it on points in its domain. When $F$ is two-to-one, the most efficient classical algorithm possible for the collision problem requires an expected $\Theta(\sqrt{N})$ evaluations of $F$, where $N = |X|$ denotes the cardinality of the domain. This classical algorithm, which uses a principle reminiscent of the birthday paradox, is reviewed in the next section. Recently, at a talk held at AT&T, Eric Rains [8] asked if it is possible to do better on a quantum computer. In this note, we give a positive answer to this question by providing a quantum algorithm that finds a collision in an arbitrary two-to-one function $F$ after only $\Theta(\sqrt[3]{N})$ expected evaluations.

Earlier, Simon [9] addressed the xor-mask problem defined as follows. Consider a positive integer $n$. We are given a function $F : \{0,1\}^n \to \{0,1\}^n$ and promised that either $F$ is one-to-one or it is two-to-one and there exists an $s \in \{0,1\}^n$ such that $F(x_0) = F(x_1)$ if and only if $x_0 \oplus x_1 = s$, for all distinct $x_0, x_1 \in \{0,1\}^n$, where $\oplus$ denotes the bitwise exclusive-or. Simon's problem is to decide which of these two conditions holds, and to find $s$ in the latter case. Note that finding $s$ is equivalent to finding a collision in the case that $F$ is two-to-one. Simon gave a quantum algorithm to solve his problem in expected time polynomial in $n$ and in the time required to compute $F$. The running time required for this task on a quantum computer was recently improved to being polynomial in the worst case (rather than in the expected case), thanks to a more sophisticated algorithm [3]. Simon's algorithm is interesting from a theoretical point of view because any classical algorithm that uses only sub-exponentially (in $n$) many evaluations of $F$ cannot hope to distinguish between the two types of functions significantly better than simply by tossing a coin, assuming equal a priori probabilities [9,3]. Unfortunately, the xor-mask constraint when $F$ is two-to-one is so restrictive that Simon's algorithm has not yet found a practical application.

More recently, Grover [6,7] discovered a quantum algorithm for a different searching problem. We are given a function $F : X \to \{0,1\}$ with the promise that there exists a unique $x_0 \in X$ so that $F(x_0) = 1$, and we are asked to find $x_0$. Provided the domain of the function is of cardinality a power of two ($N = 2^n$), Grover gave a quantum algorithm that finds the unknown $x_0$ with probability at least 1/2 after only $\Theta(\sqrt{N})$ evaluations of $F$.

A natural generalization of this searching problem occurs when $F : X \to Y$ is an arbitrary function. Given some $y_0 \in Y$, we are asked to find an $x \in X$ such that $F(x) = y_0$, provided such an $x$ exists. If $t = |\{x \in X \mid F(x) = y_0\}|$ denotes the number of different solutions, Grover's algorithm can be generalized [1] to find a solution whenever it exists ($t \ge 1$) after an expected number of $\Theta(\sqrt{N/t})$ evaluations of $F$. Although the algorithm does not need to know the value of $t$ ahead of time, it is more efficient (in terms of the hidden constant in the O notation) when $t$ is known, which will be the case for most algorithms given here. From now on, we refer to this generalization of Grover's algorithm as Grover($F$, $y_0$). Note that the number of evaluations of $F$ is not polynomially bounded in $\log N$ when $t \ll N$; nevertheless Grover's algorithm is considerably more efficient than classical brute-force searching.

In the next section, we give our new quantum algorithm for solving the collision problem for two-to-one functions. We then discuss a straightforward generalization to r-to-one functions and even to arbitrary functions whose image is sufficiently smaller than their domain. A natural space-time tradeoff emerges for our technique. Finally, we give applications to finding claws in pairs of functions.

* Supported in part by Canada's NSERC, Québec's FCAR, and the Canada Council.
** Supported in part by the ESPRIT Long Term Research Programme of the EU under project number 20244 (ALCOM-IT). Research carried out while this author was at the Université de Montréal.
*** Supported in part by postgraduate fellowships from NSERC and FCAR.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 163–169, 1998. © Springer-Verlag Berlin Heidelberg 1998
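To give a feel for the square-root behaviour of the generalized Grover search recalled above, the following minimal sketch evaluates the standard closed form from the analysis in [1]: with $t$ solutions among $N$ items and $\sin^2\theta = t/N$, the success probability after $j$ iterations is $\sin^2((2j+1)\theta)$. This formula is not spelled out in this note, so it is stated here as an assumption, and the function names are hypothetical. The code only computes numbers classically; it does not simulate a quantum computer.

```python
import math


def grover_success_probability(n_items, n_solutions, iterations):
    """Success probability after `iterations` Grover iterations.

    Assumes the closed form sin^2((2j+1)*theta) with sin(theta) = sqrt(t/N),
    taken from the analysis cited as [1]; it is not derived in this note.
    """
    theta = math.asin(math.sqrt(n_solutions / n_items))
    return math.sin((2 * iterations + 1) * theta) ** 2


def grover_optimal_iterations(n_items, n_solutions):
    """Iteration count floor(pi / (4*theta)), which grows like sqrt(N/t)."""
    theta = math.asin(math.sqrt(n_solutions / n_items))
    return math.floor(math.pi / (4 * theta))


if __name__ == "__main__":
    n = 2 ** 20
    for t in (1, 4, 16):
        j = grover_optimal_iterations(n, t)
        p = grover_success_probability(n, t, j)
        print(f"t={t}: {j} iterations, success probability {p:.6f}")
```

Multiplying the number of solutions by four roughly halves the iteration count, which is the $\sqrt{N/t}$ scaling that the algorithms in the next section exploit.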
2 Algorithms for the Collision Problem
We first state two simple algorithms for the collision problem, one classical and one quantum. Both of these algorithms use an expected number of $\Theta(\sqrt{N})$ evaluations of the given function, but the quantum algorithm is more space efficient. We derive our improved algorithm from these two simple solutions.

The first solution is a well-known classical probabilistic algorithm, here stated in slightly different terms than traditionally. The algorithm consists of three steps. First, it selects a random subset $K \subseteq X$ of cardinality $k = c\sqrt{N}$ for an appropriate constant $c$. Then, it computes the pair $(x, F(x))$ for each $x \in K$ and sorts these pairs according to the second entry. Finally, it outputs a collision in $K$ if there is one, and otherwise reports that none has been found. Based on the birthday paradox, it is not difficult to show that if $F$ is two-to-one then this algorithm returns a collision with probability at least 1/2 provided $c$ is sufficiently large ($c \approx 1.18$ will do). If we take a pair $(x, F(x))$ as unit of space then the algorithm can be implemented in space $\Theta(\sqrt{N})$, and $\Theta(\sqrt{N})$ evaluations of $F$ suffice to succeed with probability 1/2.

If we care about running time rather than simply the number of evaluations of $F$, it may be preferable to resort to universal hashing [4] rather than sorting to find a collision in $K$. This would avoid spending $\Theta(\sqrt{N} \log N)$ time sorting the table, making possible a $\Theta(\sqrt{N})$ overall expected running time if we assume that each evaluation of $F$ takes constant time. We stick to the sorting paradigm for simplicity and because it is not clear if the benefits of universal hashing carry over to quantum parallelism situations such as ours. We come back to this issue in Sect. 3.

The simple quantum algorithm for two-to-one functions also consists of three steps. First, it picks an arbitrary element $x_0 \in X$. Then, the algorithm computes $x_1$ = Grover($H$, 1) where $H : X \to \{0,1\}$ denotes the function defined by $H(x) = 1$ if and only if $F(x) = F(x_0)$ but $x \neq x_0$. Finally, it outputs the collision $\{x_0, x_1\}$.
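The two simple algorithms above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the names `birthday_collision` and `make_predicate` are hypothetical, the domain is assumed to be a Python sequence such as `range(N)`, and the default constant `c = 2.0` is chosen for convenience. The second function only builds the predicate $H$ that would be handed to Grover's algorithm; the quantum search itself is not simulated.

```python
import math
import random


def birthday_collision(F, domain, c=2.0, rng=None):
    """Classical three-step search: sample, sort by F-value, scan neighbours.

    Returns a pair (x0, x1) with F(x0) == F(x1) and x0 != x1, or None if the
    sampled subset K happens to contain no collision.
    """
    rng = rng or random.Random()
    n = len(domain)
    k = min(n, math.ceil(c * math.sqrt(n)))
    subset = rng.sample(domain, k)                 # random K of cardinality k
    table = sorted(((x, F(x)) for x in subset), key=lambda pair: pair[1])
    for (x0, y0), (x1, y1) in zip(table, table[1:]):
        if y0 == y1:                               # equal values are adjacent
            return x0, x1
    return None


def make_predicate(F, x0):
    """Predicate H of the simple quantum algorithm: H(x) = 1 iff x collides with x0."""
    target = F(x0)
    return lambda x: 1 if (F(x) == target and x != x0) else 0


if __name__ == "__main__":
    F = lambda x: x // 2                           # a toy two-to-one function
    print(birthday_collision(F, range(1024), rng=random.Random(1)))
    H = make_predicate(F, 10)
    print([x for x in range(1024) if H(x) == 1])
```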
There is exactly one $x \in X$ that satisfies $H(x) = 1$, so $t = 1$, and thus the expected number of evaluations of $F$ is also $\Theta(\sqrt{N})$, still to succeed with probability 1/2, but constant space suffices.

Our new algorithm, denoted Collision and given below, can be thought of as a logical union of the two algorithms above. The main idea is to select a subset $K$ of $X$ and then use Grover to find a collision $\{x_0, x_1\}$ with $x_0 \in K$ and $x_1 \in X \setminus K$. The expected number of evaluations of $F$ and the space used by the algorithm are determined by the parameter $k = |K|$, the cardinality of $K$.

Collision($F$, $k$)
(1) Pick an arbitrary subset $K \subseteq X$ of cardinality $k$. Construct a table $L$ of size $k$ where each item in $L$ holds a distinct pair $(x, F(x))$ with $x \in K$.
(2) Sort $L$ according to the second entry in each item of $L$.
(3) Check if $L$ contains a collision, that is, check if there exist distinct elements $(x_0, F(x_0)), (x_1, F(x_1)) \in L$ for which $F(x_0) = F(x_1)$. If so, proceed to step (6).
(4) Compute $x_1$ = Grover($H$, 1) where $H : X \to \{0,1\}$ denotes the function defined by $H(x) = 1$ if and only if there exists $x_0 \in K$ so that $(x_0, F(x)) \in L$ but $x \neq x_0$.
(5) Find $(x_0, F(x_1)) \in L$.
(6) Output the collision $\{x_0, x_1\}$.

Theorem 1. Given a two-to-one function $F : X \to Y$ with $N = |X|$ and an integer $1 \le k \le N$, algorithm Collision($F$, $k$) returns a collision after an expected number of $\Theta(k + \sqrt{N/k})$ evaluations of $F$ and uses space $\Theta(k)$. In particular, when $k = \sqrt[3]{N}$ then Collision($F$, $k$) evaluates $F$ an expected number of $\Theta(\sqrt[3]{N})$ times and uses space $\Theta(\sqrt[3]{N})$.

Proof. The correctness of the algorithm follows easily from the definition of $H$ and the construction of Grover($H$, 1). We now count the number of evaluations of $F$. In the first step, the algorithm uses $k$ such evaluations. Let $p$ be the probability that a collision is found at step (3). If it is not found, set $t = |\{x \in X \mid H(x) = 1\}|$. By the previous section, subroutine Grover in step (4) uses an expected number of $\Theta(\sqrt{N/t})$ evaluations of the function $H$ to find one of the solutions. Note that each evaluation of $H$ requires a single evaluation of $F$, and $t = k$ because $F$ is two-to-one. Finally, our algorithm evaluates $F$ once in step (5), giving a total expected number of $k + (1 - p)(\Theta(\sqrt{N/k}) + 1)$ evaluations of $F$. Provided $N$ is sufficiently large, $p$ is negligible when $k \le \sqrt[3]{N}$ whereas $\sqrt{N/k} < k$ otherwise. In either case, the expected number of evaluations of $F$ is $\Theta(k + \sqrt{N/k})$ as claimed. The second part of the theorem is immediate.

In a nutshell, the improvement of our algorithm over the simple quantum algorithm is achieved by trading time for space. Suppose the cardinality of set $K$ is large. Then the expected number of evaluations of $H$ used by subroutine Grover($H$, 1) is small, but on the other hand more space is needed to store table $L$. Analogously, the space requirements are less but also Grover($H$, 1) runs slower if $K$ is small.
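The control flow of Collision($F$, $k$) can be traced with a purely classical emulation. In the sketch below, the call to Grover($H$, 1) in step (4) is replaced by a linear scan, which is only a stand-in and has none of the quantum speed-up. The names `collision` and `grover_stand_in` are hypothetical, the "arbitrary" subset $K$ is taken to be the first $k$ elements of the domain, and the domain is assumed to be a sliceable sequence such as `range(N)`.

```python
from bisect import bisect_left


def grover_stand_in(H, domain):
    """Classical placeholder for Grover(H, 1): linear scan for an x with H(x) = 1."""
    return next(x for x in domain if H(x) == 1)


def collision(F, domain, k):
    """Classical emulation of Collision(F, k), following steps (1) to (6)."""
    # Steps (1)-(2): table L of pairs (x, F(x)) for x in K, sorted by F(x).
    L = sorted(((x, F(x)) for x in domain[:k]), key=lambda pair: pair[1])
    values = [y for _, y in L]

    # Step (3): a collision inside K shows up as two adjacent equal values.
    for (x0, y0), (x1, y1) in zip(L, L[1:]):
        if y0 == y1:
            return x0, x1

    def lookup(y):
        """Binary search in L for a pair whose second entry is y."""
        i = bisect_left(values, y)
        return L[i][0] if i < len(L) and values[i] == y else None

    # Step (4): H(x) = 1 iff some x0 in K has F(x0) = F(x) and x != x0.
    def H(x):
        x0 = lookup(F(x))
        return 1 if (x0 is not None and x0 != x) else 0

    x1 = grover_stand_in(H, domain)
    # Steps (5)-(6): recover the partner x0 from the table and output the pair.
    return lookup(F(x1)), x1


if __name__ == "__main__":
    F = lambda x: x % 512              # toy two-to-one function on range(1024)
    print(collision(F, range(1024), 10))
```

Each evaluation of `H` costs one evaluation of `F` plus a binary search in `L`, which matches the accounting used in the proof of Theorem 1.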
Suppose now that we apply algorithm Collision, not necessarily on a two-to-one function, but on an arbitrary r-to-one function where $r \ge 2$. Then we have the following theorem, whose proof is essentially the same as that of Thm. 1.

Theorem 2. Given an r-to-one function $F : X \to Y$ with $r \ge 2$ and an integer $1 \le k \le N = |X|$, algorithm Collision($F$, $k$) returns a collision after an expected number of $\Theta(k + \sqrt{N/(rk)})$ evaluations of $F$ and uses space $\Theta(k)$. In particular, when $k = \sqrt[3]{N/r}$ then Collision($F$, $k$) uses an expected number of $\Theta(\sqrt[3]{N/r})$ evaluations of $F$ and space $\Theta(\sqrt[3]{N/r})$.
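The choice $k = \sqrt[3]{N/r}$ in Thm. 2 can be checked by a short calculation. The following worked derivation is an illustration added here, not part of the original proof; it ignores the constants hidden in the $\Theta$ notation and treats $k$ as a real variable.

```latex
\[
  f(k) = k + \sqrt{\frac{N}{rk}}, \qquad
  f'(k) = 1 - \frac{1}{2}\sqrt{\frac{N}{r}}\, k^{-3/2} .
\]
Setting $f'(k) = 0$ gives
\[
  k^{3/2} = \frac{1}{2}\sqrt{\frac{N}{r}}
  \quad\Longrightarrow\quad
  k^{*} = \Bigl(\frac{N}{4r}\Bigr)^{1/3} = \Theta\bigl(\sqrt[3]{N/r}\bigr),
\]
and at this point $\sqrt{N/(rk^{*})} = 2k^{*}$, so that
\[
  f(k^{*}) = 3k^{*} = \Theta\bigl(\sqrt[3]{N/r}\bigr).
\]
```

Both terms of the bound are therefore of the same order at the optimum, which is why the table size and the number of evaluations coincide up to constants.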
Note that Collision($F$, $k$) can also be applied on an arbitrary function $F : X \to Y$ for which $|X| \ge r|Y|$ for some $r > 1$, even if $F$ is not r-to-one. However, the algorithm must be modified in two ways for the general case. First of all, the subset $K \subseteq X$ of cardinality $k$ must be picked at random, rather than arbitrarily, at step (1). Furthermore, because the number of solutions for Grover($H$, 1) is no longer known in advance to be exactly $t = (r - 1)k$, the fully generalized version of Grover's algorithm given in [1] must be used at step (4).

By varying $k$ in Thm. 2, the following space-time tradeoff emerges.

Corollary 1. There exists a quantum algorithm that can find a collision in an arbitrary r-to-one function $F : X \to Y$, for any $r \ge 2$, using space $S$ and an expected number of $\Theta(T)$ evaluations of $F$ for every $1 \le S \le T$ subject to

$$ST^2 \ge |F(X)|$$

where $F(X)$ denotes the image of $F$.

Consider now two functions $F : X \to Z$ and $G : Y \to Z$ that have the same codomain. By definition, a claw is a pair $x \in X$, $y \in Y$ such that $F(x) = G(y)$. Many cryptographic protocols are based on the assumption that there are efficiently-computable functions $F$ and $G$ for which claws cannot be found efficiently even though they exist in large number. The simplest case arises when both $F$ and $G$ are bijections, which is the usual situation when such functions are used to create classical unconditionally-concealing bit commitment schemes [2] and strong digital signature schemes [5]. If $N = |X| = |Y| = |Z|$, algorithm Collision is easily modified as follows.

Claw($F$, $G$, $k$)
(1) Pick an arbitrary subset $K \subseteq X$ of cardinality $k$. Construct a table $L$ of size $k$ where each item in $L$ holds a distinct pair $(x, F(x))$ with $x \in K$.
(2) Sort $L$ according to the second entry in each item of $L$.
(3) Compute $y_0$ = Grover($H$, 1) where $H : Y \to \{0,1\}$ denotes the function defined by $H(y) = 1$ if and only if a pair $(x, G(y))$ appears in $L$ for some arbitrary $x \in K$.
(4) Find $(x_0, G(y_0)) \in L$.
(5) Output the claw $(x_0, y_0)$.

Theorem 3. Given two one-to-one functions $F : X \to Z$ and $G : Y \to Z$ with $N = |X| = |Y| = |Z|$ and an integer $1 \le k \le N$, algorithm Claw($F$, $G$, $k$) returns a claw after $k$ evaluations of $F$ and an expected number of $\Theta(\sqrt{N/k})$ evaluations of $G$, and uses space $\Theta(k)$. In particular, when $k = \sqrt[3]{N}$ then Claw($F$, $G$, $k$) evaluates $F$ and $G$ an expected number of $\Theta(\sqrt[3]{N})$ times and uses space $\Theta(\sqrt[3]{N})$.

Proof. Similar to the proof of Thm. 1.
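The structure of Claw($F$, $G$, $k$) can be followed in the classical emulation below. As an explicit assumption, the Grover call in step (3) is again replaced by a linear scan over $Y$, so the sketch shows only the bookkeeping, not the quantum cost. The name `claw` and the toy bijections in the usage example are hypothetical, and both domains are assumed to be sliceable sequences such as `range(N)`.

```python
from bisect import bisect_left


def claw(F, G, X, Y, k):
    """Classical emulation of Claw(F, G, k) for bijections F and G."""
    # Steps (1)-(2): table L of pairs (x, F(x)) for the first k elements of X,
    # sorted by the second entry.
    L = sorted(((x, F(x)) for x in X[:k]), key=lambda pair: pair[1])
    values = [z for _, z in L]

    def lookup(z):
        """Binary search in L for a pair whose second entry is z."""
        i = bisect_left(values, z)
        return L[i][0] if i < len(L) and values[i] == z else None

    # Step (3): H(y) = 1 iff (x, G(y)) appears in L for some x in K.
    # The linear scan stands in for Grover(H, 1).
    y0 = next(y for y in Y if lookup(G(y)) is not None)
    # Steps (4)-(5): recover x0 and output the claw.
    return lookup(G(y0)), y0


if __name__ == "__main__":
    N = 101                                  # toy size; both maps are bijections mod 101
    F = lambda x: (3 * x) % N
    G = lambda y: (5 * y + 7) % N
    x0, y0 = claw(F, G, range(N), range(N), 8)
    print(x0, y0, F(x0), G(y0))
```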
The case in which both $F$ and $G$ are r-to-one for some $r \ge 2$ and $N = |X| = |Y| = r|Z|$ is handled similarly. However, it becomes necessary in step (1) of algorithm Claw to select the elements of $K$ so that no two of them are mapped to the same point by $F$. This will ensure that the call on Grover($H$, 1) at step (3) has exactly $kr$ solutions to choose from. The simplest way to choose $K$ is to pick random elements in $X$ until $|F(K)| = k$. As long as $k \le |Z|/2$, this requires trying less than $2k$ random elements of $X$, except with vanishing probability. The proof of the following theorem is again essentially as before.

Theorem 4. Given two r-to-one functions $F : X \to Z$ and $G : Y \to Z$ with $N = |X| = |Y| = r|Z|$ and an integer $1 \le k \le N/(2r)$, modified algorithm Claw($F$, $G$, $k$) returns a claw after an expected number of $\Theta(k)$ evaluations of $F$ and $\Theta(\sqrt{N/(rk)})$ evaluations of $G$, and uses space $\Theta(k)$. In particular, when $k = \sqrt[3]{N/r}$ then Claw($F$, $G$, $k$) evaluates $F$ and $G$ an expected number of $\Theta(\sqrt[3]{N/r})$ times and uses space $\Theta(\sqrt[3]{N/r})$.
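The modified step (1), which picks random elements of $X$ until $|F(K)| = k$, might look as follows. This is a small sketch under stated assumptions: the function name `pick_distinct_image_subset` is hypothetical, the domain is a Python sequence, and the caller is responsible for ensuring that $k$ does not exceed the size of the image, since otherwise the loop would not terminate.

```python
import random


def pick_distinct_image_subset(F, domain, k, rng=None):
    """Sample random elements until k distinct F-values have been seen.

    Returns (K, trials), where no two elements of K share an image under F
    and `trials` counts the random elements (and evaluations of F) used.
    Assumes k <= |F(domain)|.
    """
    rng = rng or random.Random()
    representative = {}                   # image value -> first x seen with that value
    trials = 0
    while len(representative) < k:
        x = rng.choice(domain)
        trials += 1
        representative.setdefault(F(x), x)   # keep x only if its image is new
    return list(representative.values()), trials


if __name__ == "__main__":
    F = lambda x: x % 100                 # toy 4-to-one function on range(400)
    K, trials = pick_distinct_image_subset(F, range(400), 20, random.Random(7))
    print(len(K), trials)
```

The returned trial count makes it easy to observe the bound of fewer than $2k$ trials mentioned above for $k \le |Z|/2$.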
3 Discussion
When we say that our quantum algorithms require $\Theta(k)$ space to hold table $L$, this corresponds unfortunately to the amount of quantum memory, a rather scarce resource with current technology. Note however that this table is built classically in the initial steps of algorithms Collision and Claw, and it contains classical information. Even though it has to be accessible in quantum superposition of addresses, it may be that the classical nature of the information it contains would make it easier to implement than a memory that can be used to store and retrieve quantum information. At the very least, as Eric Rains pointed out to us, it may require a simpler error-correction process than a general quantum memory would.

We considered only the number of evaluations of $F$ in the analysis of algorithm Collision. The time spent sorting $L$ and doing binary search in $L$ should also be taken into account if we wanted to analyze the running time of our algorithm. If we assume that it takes time $T$ to compute the function (rather than assuming that it is given as a black box), then it is straightforward to show that the algorithm considered in Thm. 2 runs in expected time

$$O\bigl((k + \sqrt{N/(rk)})(T + \log k)\bigr).$$

Thus, the time spent sorting is negligible only if it takes $\Omega(\log k)$ time to compute $F$. Similar considerations apply to algorithm Claw.

It is tempting to try using universal hashing [4] to bypass the need for sorting, as in the simple classical algorithm, but it is not clear that this approach saves time here because our use of quantum parallelism when we apply Grover's algorithm will take a time that is given by the maximum time taken for all requests to the table, which is unlikely to be constant even though the expected average time is constant.
References
1. Michel Boyer, Gilles Brassard, Peter Høyer and Alain Tapp, "Tight bounds on quantum searching", Proceedings of Fourth Workshop on Physics and Computation, PhysComp '96, November 1996, pp. 36–43. Final version to appear in Fortschritte Der Physik.
2. Gilles Brassard, David Chaum and Claude Crépeau, "Minimum disclosure proofs of knowledge", Journal of Computer and System Sciences, Vol. 37, no. 2, October 1988, pp. 156–189.
3. Gilles Brassard and Peter Høyer, "An exact quantum polynomial-time algorithm for Simon's problem", Proceedings of Fifth Israeli Symposium on Theory of Computing and Systems, ISTCS '97, June 1997, IEEE Computer Society Press, pp. 12–23.
4. J. Larry Carter and Mark N. Wegman, "Universal classes of hash functions", Journal of Computer and System Sciences, Vol. 18, no. 2, 1979, pp. 143–154.
5. Shafi Goldwasser, Silvio Micali and Ronald L. Rivest, "A digital signature scheme secure against adaptive chosen-message attacks", SIAM Journal on Computing, Vol. 17, 1988, pp. 281–308.
6. Lov K. Grover, "A fast quantum mechanical algorithm for database search", Proceedings of the 28th Annual ACM Symposium on Theory of Computing, 1996, pp. 212–219.
7. Lov K. Grover, "Quantum mechanics helps in searching for a needle in a haystack", Physical Review Letters, Vol. 79, no. 2, 14 July 1997, pp. 325–328.
8. Eric Rains, talk given at AT&T, Murray Hill, New Jersey, 12 March 1997.
9. Daniel R. Simon, "On the power of quantum computation", SIAM Journal on Computing, Vol. 26, no. 5, October 1997, pp. 1474–1483.
Batch Verification with Applications to Cryptography and Checking (Invited Paper)

Mihir Bellare¹, Juan A. Garay², and Tal Rabin²

¹ Department of Computer Science & Engineering, Mail Code 0114, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA. mih[email protected] http://www-cse.ucsd.edu/users/mihir
² IBM T.J. Watson Research Center, PO Box 704, Yorktown Heights, New York 10598, USA. {garay,talr}@watson.ibm.com http://www.research.ibm.com/security
Abstract. Let $R(\cdot)$ be a polynomial time-computable boolean relation. Suppose we are given a sequence $inst_1, \ldots, inst_n$ of instances and asked whether it is the case that $R(inst_i) = 1$ for all $i = 1, \ldots, n$. The naive way to figure out the answer is to compute $R(inst_i)$ for each $i$ and check that we get 1 each time. But this takes $n$ computations of $R$. Can one do any better? The above is the batch verification problem. We initiate a broad investigation of it. We look at the possibility of designing probabilistic batch verifiers, or tests, for basic mathematical relations $R$. Our main results are for modular exponentiation, an expensive operation in terms of number of multiplications: here $g$ is some fixed element of a group $G$ and $R(x, y) = 1$ iff $g^x = y$. We find surprisingly fast batch verifiers for this relation. We also find efficient batch verifiers for the degrees of polynomials. The first application is to cryptography, where modular exponentiation is a common component of a large number of protocols, including digital signatures, bit commitment, and zero knowledge. Similarly, the problem of verifying the degrees of polynomials underlies (verifiable) secret sharing, which in turn underlies many secure distributed protocols. The second application is to program checking. We can use batch verification to provide faster batch checkers, in the sense of [20], for modular exponentiation. These checkers also have stronger properties than standard ones, and illustrate how batch verification can not only speed up how we do old things, but also enable us to do new things.
Work supported in part by NSF CAREER Award CCR-9624439 and a 1996 Packard Foundation Fellowship in Science and Engineering.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 170–191, 1998. © Springer-Verlag Berlin Heidelberg 1998
1 Introduction
We suggest the notion of batch verification. Based on this we suggest and implement a new paradigm for program checking [7]. We also suggest applications in cryptography. Motivated by this we design batch verifiers for some particular functions of interest in these domains.

1.1 Batch Verification
Let $R$ be a (polynomial time-computable, boolean) relation. The verification problem for $R$ is: given an instance $inst$, check whether $R(inst) = 1$. In the batch verification problem we are given a sequence $inst_1, \ldots, inst_n$ of instances and asked to verify that for all $i = 1, \ldots, n$ we have $R(inst_i) = 1$. The naive way is to compute $R(inst_i)$, and check it is 1, for all $i = 1, \ldots, n$. We want to do it faster. To do this, we allow probabilism and an error probability.

A batch verifier (also called a test) is a probabilistic algorithm $V$ which takes $inst_1, \ldots, inst_n$ and produces a bit as output. We ask that when $R(inst_i) = 1$ for all $i = 1, \ldots, n$, this output be 1. On the other hand, if there is even a single $i$ for which $R(inst_i) = 0$ then we want that $V(inst_1, \ldots, inst_n) = 1$ with very low probability. Specifically, we let $l$ be a security parameter and ask that this probability be at most $2^{-l}$. We stress that if even a single one of the $n$ instances is wrong the verifier should detect it, except with probability $2^{-l}$. Yet we want this verifier to run faster than the time to do $n$ computations of $R$.

1.2 Application Domains
Batch verification will be useful in any algorithmic setting where there are repetitive tasks. Before presenting our results and the particular applications that ensue, let us briefly discuss two concrete application domains that have motivated our work.

Cryptography. It is a consequence of the adversarial nature of cryptography that many of its computational tasks are for the purpose of verifying some property or computation. A setting where batch verification is useful is in the verification of digital signatures. For example, the validity of a sequence of electronic coins needs to be verified by checking the bank's signature on each coin. When there are lots of coins, batch verification will help. Similarly one may receive many certificates, containing public keys signed by a certification authority, and one can check all the signatures simultaneously.

Beyond this, batch verification is useful for a large number of standard cryptographic protocols. These protocols typically involve repetition of some operation, such as a committal, done for example via the discrete exponentiation function $x \mapsto g^x$ in a group with generator $g$, so that a party commits to $x$ by providing $y = g^x$, and later de-commits by revealing $x$. At this point, someone must check that indeed $y = g^x$. In a zero-knowledge protocol, thousands or more committals are being performed simultaneously, and batch verification will be useful. The same is true for other standard cut-and-choose type protocols, for example for key escrow. We also provide fast batch verification methods for degrees of polynomials, which have applications in verifiable secret sharing and other robust distributed tasks. We elaborate on more specific applications of our results in this domain in Sect. 1.5.

Program checking. The notion of batch verification has on the face of it nothing to do with program checking: as Sect. 1.1 indicates, there is no program in the picture that one is trying to check. Nonetheless, we apply this notion to do program checking in a novel way. Our approach, called batch program instance checking, has the following benefits: it permits fast checking; and it permits instance checking, not just program checking, in the sense that a correct result is not rejected just because the program might be wrong on some other instance. (The last is in contrast to standard program checking.) We can do batch program instance checking for any function $f$ whose corresponding graph (the relation $R_f(x, y) = 1$ iff $f(x) = y$) has efficient batch verifiers, so that the main technical problem is the construction of batch verifiers. We do not elaborate here, but Sect. 3 presents in more detail both the approach and the background, including explanations of how this differs from other notions like batch program checking [20]. Now we move on to the design of batch verifiers.

1.3 Batch Verifiers for Modular Exponentiation
We have been able to design some surprisingly efficient batch verifiers for modular exponentiation. By the approach of Sect. 3, these translate into fast batch program instance checkers. In particular the amortized (per instance) cost of our checkers is significantly lower than that of [1].

Let $g$ be a generator of a (cyclic) group $G$, and let $q$ denote the order of $G$. The modular exponentiation function is $x \mapsto g^x$, where $x \in Z_q$. Define the exponentiation relation $EXP_{G,g}(x, y) = 1$ iff $g^x = y$, for $x \in Z_q$ and $y \in G$. We design batch verifiers for this relation. As per the above, such a verifier is given a sequence $(x_1, y_1), \ldots, (x_n, y_n)$ and wants to verify that $EXP_{G,g}(x_i, y_i) = 1$ for all $i = 1, \ldots, n$. The naive test is to compute $g^{x_i}$ and test that it equals $y_i$, for all $i = 1, \ldots, n$, having cost $n$ exponentiations. We want to do better; multiplication (the group operation) will be our basic operation by which we shall compute costs.

Folklore techniques yield a first, basic test that we include for completeness, calling it the Random Subset Test. Our main results are two better tests, the Small Exponents Test and the Bucket Test. They are presented, with analysis of correctness, in Sect. 5. Their performance is summarized in Fig. 1, with the naive test listed for comparison. We explain the notation used in the figure: $k_1 = \lg(|G|)$; $ExpCost_G(k_1)$ is the number of multiplications required to compute an exponentiation $a^b$ for $a \in G$ and $b$ an integer of $k_1$ bits; and $ExpCost^s_G(k_1)$ is the cost of computing $s$ different such exponentiations. (Obviously $ExpCost^s_G(k_1) \le s \cdot ExpCost_G(k_1)$, but there are ways to make it strictly less [10,17,9], which is why it is a separate parameter. Under the normal square-and-multiply method, $ExpCost_G(k_1) \approx 1.5 k_1$ multiplications in the group, but again, it could be less [10,17,9]. See Sect. 4 for more information.)

Test                    No. of multiplications
Naive                   $ExpCost^n_G(k_1)$
Random Subset (RS)      $nl/2 + ExpCost^l_G(k_1)$
Small Exponents (SE)    $l + nl/2 + ExpCost_G(k_1)$
Bucket                  $\min_{m \ge 2} \lceil l/(m-1) \rceil \cdot (n + m + 2^{m-1} m + ExpCost_G(k_1))$

Fig. 1. Performance of algorithms for batch verification of modular exponentiation. We indicate the number of multiplications each method uses to get error $2^{-l}$. Here $n$ is the number of instances to be checked, $k_1 = \lg(|G|)$, $l$ is the security parameter, $ExpCost_G(k_1)$ is the number of multiplications required to perform a single exponentiation with a $k_1$-bit exponent, and $ExpCost^s_G(k_1)$ is the number of multiplications to perform $s$ such exponentiations. See text for explanations.

We treat costs of basic operations like exponentiation as a parameter to stress that our tests can make use of any method for the task. In particular, this explains why standard methods of speeding up modular exponentiation such as those mentioned above are not competitors of our schemes; rather, our batch verifiers will always do better by using these methods as subroutines.

A figure in Sect. 5.4 looks at some example parameter values and computes the speed-ups. We see where the crossover points in performance are: for small values of $n$ the Small Exponents Test is better, while for larger values, the Bucket Test wins. Notice that even for quite small values of $n$ we start getting appreciable speed-ups over the naive method, meaning the benefits of batching kick in even when the number of instances to batch is quite small. Asymptotically more efficient tests can be constructed by recursively applying the tests we have presented, but the gains kick in at values of $n$ that seem too high to be useful, so we don't discuss this.

Exponentiation with common exponent. Above, we consider exponentiation to a fixed base $g$. Another version of the problem is when the exponent is fixed, and the relation becomes $BASE_{G,v}(x, y) = 1$ iff $x^v = y$ in $G$, where $G$ is some appropriate underlying group. (This is the kind of verification that is needed for the RSA function [19], which has the form $x \mapsto x^v$ with $G = Z_N^*$.) The results discussed above do not apply to this version. (Actually the tests are easily adapted, but the correctness is a different story. It turns out the natural adaptations don't work.) In the full paper we suggest a different notion of batch verification for this case which we call screening.
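The tests themselves are specified in Sect. 5, which lies outside this excerpt. To make the idea of a batch verifier for the exponentiation relation concrete, the following is a minimal sketch of a test in the small-exponents style as it is commonly described: choose a random $l$-bit value $s_i$ per instance, form $x = \sum_i x_i s_i \bmod q$ and $y = \prod_i y_i^{s_i}$, and accept iff $g^x = y$. The sketch rests on several assumptions: the group is the order-$q$ subgroup of $Z_p^*$ for the toy safe prime $p = 2039$ with $q = 1019$ and $g = 4$, each $y_i$ is already known to lie in that subgroup, and $q$ is prime with $2^l < q$. The parameters and function names are hypothetical and far too small for real use.

```python
import secrets

# Toy parameters (assumptions for illustration only): P = 2*Q + 1 is a safe
# prime and BASE = 4 generates the subgroup of prime order Q in Z_P^*.
P, Q, BASE = 2039, 1019, 4


def naive_verify(pairs):
    """Naive test: one full exponentiation per instance (x, y)."""
    return all(pow(BASE, x, P) == y for x, y in pairs)


def small_exponents_test(pairs, l=8):
    """Small-exponents style batch test with security parameter l.

    Always accepts a fully correct batch. For a batch with a wrong instance
    whose y lies in the subgroup, it accepts with probability at most 2**-l,
    under the assumptions stated above.
    """
    x_acc, y_acc = 0, 1
    for x, y in pairs:
        s = secrets.randbits(l)                 # short random exponent s_i
        x_acc = (x_acc + x * s) % Q             # x = sum of x_i * s_i mod q
        y_acc = (y_acc * pow(y, s, P)) % P      # y = product of y_i ** s_i
    return pow(BASE, x_acc, P) == y_acc         # a single full exponentiation


if __name__ == "__main__":
    good = [(x, pow(BASE, x, P)) for x in range(1, 50)]
    bad = list(good)
    bad[7] = (bad[7][0], (bad[7][1] * BASE) % P)   # corrupt one instance
    print(naive_verify(good), small_exponents_test(good))
    print(naive_verify(bad), small_exponents_test(bad))
```

The only full-length exponentiation is the final `pow(BASE, x_acc, P)`; the per-instance exponentiations use $l$-bit exponents, which is where the saving over the naive test comes from.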
1.4 Batch Verification of Degrees of Polynomials
Roughly, the problem of checking the degree of a polynomial is as follows: Given a set of points, determine whether there exists a polynomial of a certain degree, which passes through all these points. More formally, let $S \stackrel{\mathrm{def}}{=} (\alpha_1, \ldots, \alpha_m)$ denote a set of points. We define the relation $DEG_{F,t,(\beta_1,\ldots,\beta_m)}(S) = 1$ iff there exists a polynomial $f(x)$ such that the degree of $f(x)$ is at most $t$, and $\forall i \in \{1, \ldots, m\}$, $f(\beta_i) = \alpha_i$, assuming that all the computations are carried out in the finite field $F$.

A single verification of the degree of one polynomial requires one polynomial interpolation. Hence, the naive verifier for the batch instance would be very expensive. The batch verifier which we present allows for the verification of multiple (exponentially many in $k$, for a field of size $2^k$) polynomials at the same cost as a single polynomial interpolation.

The general idea underlying the batch verifier is to compute a random linear combination of the shares corresponding to the various polynomials. This in turn generates a new single instance of the problem. The correlation is such that, with high probability, if the single instance is correct then so is the batch instance.

1.5 Applications
Our results for modular exponentiation immediately apply to any discrete logbased protocol in which discrete exponentiation needs to be veri ed. In some cases, we need to tweak the techniques. DSS signatures [15] are a particularly attractive target for batch veri cation because signing is fast and veri cation is slow. Naccache et al. [1 ] were able to give some batch veri cation algorithms for a slight variant of DSS. In the full paper we show how to adapt our tests to apply to this variant, and get faster batch veri cation algorithms than the ones in [1 ]. Many popular zero knowledge or witness-hiding proofs are based on discrete logarithms. For example, discrete exponentiation may be used to implement bit commitment, and such protocols typically involve a lot of bit commitments. Verifying the de-commitments corresponds to veri cation of modular exponentiation, and the use of our batch veri ers can speed up this process. We also improve the discrete log-based n-party signature/identi cation protocols of Brickel et al. [11]. One of the applications of these protocols is teleconferencing, where all the participants are connected to a central facility called a br dge. The bridge receives signals from the participants, operates on these signal in an appropriate way, and then broadcasts the result back to the participants. The problem of checking the degrees of polynomials has wide applications in the elds of fault-tolerant and secure distributed computation, where some of the participants may be (maliciously) faulty. Roughly, the ability of the good players to verify the existence of a valid interpolating polynomial through points that are distributed among the participants, is a basic building block for Veriable Secret Sharing (VSS) [12]. VSS, in turn, enables fundamental distributed primitives such as shared coins, Byzantine agreement, broadcast channel, and
secret balloting and voting. In [3] we use the techniques for the batch verification of this relation to construct a very efficient shared coin tossing scheme.

1.6 Related Work
There has been a lot of previous work on speeding up the modular exponentiation operation itself, for example by pre-processing (Brickell et al. [10], Lim and Lee [17] and others) or addition chain heuristics (Bos and Coster [9], Sauerbrey and Dietel [24]). These works provide faster ways to do modular exponentiations. Our point is that performing modular exponentiation is only one way to perform verification, and if the interest is verification, one can do better than any of these ways. In particular, our batch verifiers will perform better than the naive re-computation based verifier, even when the latter uses the best known exponentiation methods. In fact, better exponentiation methods only make our batch verifiers even faster, because we use these methods as subroutines.

The idea of batching in cryptography is of course not new. Some previous instances are Fiat's batch RSA [14], Naccache et al.'s batch verification for a variant of DSS [1 ], and Beller and Yacobi's batch Diffie-Hellman key agreement [4]. However, there seems to have been no previous systematic look at the general problem of batch verification for modular exponentiation, and our first set of results indicates that by putting oneself above specific applications one can actually find general speed-up tools that apply to them; in particular, we improve some of the mentioned works.

In the context of program checking, batch program checking was introduced by Rubinfeld [20]. Here the checker gets many instances $x_1, \ldots, x_n$. Again, if $P$ is entirely correct the checker must accept, and if $P(x_i) \neq f(x_i)$ for some $i$ the checker must reject with high probability. Rubinfeld provides batch verifiers for linear functions (specifically, the mod function). A similar notion is used by Blum et al. [6] to check programs that handle data structures. In this paper we introduce the notion of batch instance checking and show how to achieve it using batch verification.

1.7 Organization of the Paper
The remainder of the paper is organized as follows. In Sect. 2 we formalize the notion of batch verification. Sect. 3 is devoted to our approach to program checking; this section is somewhat independent from the rest of the paper, so a reader only interested in the algorithmic techniques can directly proceed to Sect. 4, where we discuss the costs of multiplication and exponentiation. In Sect. 5 we present our batch verifiers for modular exponentiation, while in Sect. 6 we treat the batch verification of degrees of polynomials.
2 The Notion of Batch Verification

Here we provide a formal definition of the notion, extending the discussion in Sect. 1.1. Let $R(\cdot)$ be a boolean relation, meaning $R(\cdot) \in \{0, 1\}$. An instance for
the relation is an input $\mathit{inst}$ on which the relation is evaluated. A batch instance for relation $R$ is a sequence $\mathit{inst}_1, \ldots, \mathit{inst}_n$ of instances for $R$. We say that the batch instance is correct if $R(\mathit{inst}_i) = 1$ for all $i = 1, \ldots, n$, and incorrect if there is some $i \in \{1, \ldots, n\}$ for which $R(\mathit{inst}_i) = 0$.

Definition 1. A batch verifier for $R$ is a probabilistic algorithm $V$ that takes as input (possibly a description of $R$), a batch instance $X = (\mathit{inst}_1, \ldots, \mathit{inst}_n)$ for $R$, and a security parameter $l$ provided in unary. It satisfies:
(1) If $X$ is correct then $V$ outputs 1.
(2) If $X$ is incorrect then the probability that $V$ outputs 1 is at most $2^{-l}$.
The probability is over the coin tosses of $V$ only.

Obvious extensions can be made, such as allowing a slight error in the first case. We stress that if there is even a single $i$ for which $R(\mathit{inst}_i) = 0$, the verifier must reject, except with probability $2^{-l}$.

The naive batch verifier, or naive test, consists of computing $R(\mathit{inst}_i)$ for each $i = 1, \ldots, n$, and checking that each of these $n$ values is 1. But this takes $n$ computations of $R$. We want to do better. The goal is to design batch verifiers for various relations which are efficient compared to the naive verifier. We will always seek to have a low error of $2^{-l}$, controlled by a security parameter $l$. In practice, setting $l$ to be about 60 will suffice.

The above is a worst-case notion. Sometimes we might be interested in a more average-case version. For example, say $R = R_f$ is the graph of some function $f$, meaning $R_f(x, y) = 1$ iff $f(x) = y$. We might be in a setting where in each instance $\mathit{inst}_i = (x_i, y_i)$ we know that $x_i$ is uniformly distributed. We still want to check that indeed $y_i = f(x_i)$. The batch verifier need only work for instances drawn from a distribution where each $x_i$ is chosen independently and uniformly. This can happen in a cryptographic protocol where one party chooses $x_1, \ldots, x_n$ at random, another party computes $y_1, \ldots, y_n$, and the first party must check that $f(x_i) = y_i$ for all $i = 1, \ldots, n$. For example, say $f(x) = g^x$ is exponentiation in some group of which $g$ is a generator; this situation does arise in zero-knowledge protocols.

In the above it is impossible to fool the batch verifier except with low probability. We are also interested in a weaker notion under which it is possible, in principle, to fool the batch verifier, but computationally infeasible to find instances that do so. This notion, called computational batch verification, is again useful in cryptographic settings where we might not be able to design full-fledged batch verifiers but are able to do so under the assumption that the underlying cryptosystem cannot be broken.

Let $R = \{R_d\}_{d \in D}$ be a family of relations over an index set $D$. Associated to $D$ is some probability distribution. The batch verifier $V(\cdot)$ gets input $d$, a batch instance $X$ for $R_d$, and a security parameter $l$. It outputs a bit $V(d, X, l)$. We consider an algorithm $A$ that, given $d, l$, tries to produce batch instances that fool $V$. Let

$$\mathrm{Pass}(A, V, R, l) = \Pr[\, d \leftarrow D;\ X \leftarrow A(d, l) \,:\, X \text{ is incorrect but } V(d, X, l) = 1 \,]$$
be the probability that $V$ accepts even though the instance is incorrect. The probability is over the coins of both $A$ and the test $V$. We want to say this probability is small as long as $A$ is not allowed too much computing time.

Definition 2. A computational batch verifier for relation family $\{R_d\}_{d \in D}$ is a probabilistic algorithm $V$ that takes as input $d$, a batch instance $X = (\mathit{inst}_1, \ldots, \mathit{inst}_n)$ for $R_d$, and a security parameter $l$ provided in unary. $V$ is said to be $(t, \epsilon)$-reliable if the following are true:
(1) If $X$ is correct then $V$ outputs 1.
(2) $\mathrm{Pass}(A, V, R, l) \le \epsilon$ for any algorithm $A$ running in time at most $t$.
Here $t, \epsilon$ may be functions of $d, l$.

3 Batch Program Instance Checking
In this section we introduce the notion of batch instance checking and show how to achieve it using batch verification. We begin with some background and motivation, present the approach, and conclude with the formal definition of the notion.

3.1 Program Checking: Background and Issues
Let $f$ be a function and $P$ a program that supposedly computes it. A program checker, as introduced by Blum and Kannan [7], is a machine $C$ which takes input $x$ and has oracle access to $P$. It calls the program not just on $x$ but also on other points. If $P$ is correct, meaning it correctly computes $f$ at all points, then $C$ must accept $x$, but if $P(x) \neq f(x)$ then $C$ must reject $x$ with high probability. Program checking has been extensively investigated, and checkers are now known for many problems [7,1,6,16, ,21,22,13]. Checking has also proven very useful in the design of probabilistic proofs [23,2].

Batch program checking was introduced by Rubinfeld [20]. Here the checker gets many instances $x_1, \ldots, x_n$. Again, if $P$ is entirely correct the checker must accept, and if $P(x_i) \neq f(x_i)$ for some $i$ the checker must reject with high probability. Rubinfeld provides batch verifiers for linear functions (specifically, the mod function). A similar notion is used by Blum et al. [6] to check programs that handle data structures.

The little-oh constraint. To make checking meaningful, it is required that the checker be different from the program. Blum captured this by asking that the checker run faster than any algorithm to compute $f$, formally in time little-oh of the time of any algorithm for $f$. We will see that with our approach, we will use a slow program as a tool to check a fast one. Nonetheless, the checker will run faster than any program for $f$, so that Blum's constraint will be met.
Problems with checking. Program checking is a very attractive notion, and some very elegant and useful checkers have been designed. Still, the notion, or some current implementations, have some drawbacks that we would like to address:

Good results can be rejected: Suppose $P$ is correct on some instances and wrong on others. In such a case, even if $P(x)$ is correct, the checker is allowed to (and might) reject on input $x$. This is not a desirable property. It appears quite plausible, even likely, that we have some heuristic program that is correct on some but not all of the instances. We would like the checker to accept whenever $P(x)$ is correct and to reject otherwise. (As usual, it is to be understood that in such statements we mean with high probability in both cases.) This is to some extent addressed by self-correction [ ], but that only works for problems which have a nice algebraic structure, and needs assumptions about the fraction of correct instances for a program.

Checking is slow: Even the best known checkers are relatively costly. For example, just calling the program twice to check one instance is costly in any real application, yet checkers typically call it a constant number of times to get just a constant error probability, meaning that to get error probability $2^{-l}$ the program might be invoked $\Omega(l)$ times. Batch checking improves on this to some extent, but, even here, to get error $2^{-l}$, the mod function checker of [20] calls the program $\Omega(nl)$ times for $n$ instances, so that the amortized cost per instance is $\Omega(l)$ calls to the program, plus overhead.

What to check? We remain interested in designing checkers for the kinds of functions for which checkers have been designed in the past, for example linear functions. The approach discussed below applies to any function, but to be concrete we think of $f$ as the modular exponentiation function. This is a particularly interesting function because of its wide usage in cryptography, so that fast checkers would be particularly welcome.

3.2 Checking Fast Programs with Slow Ones
Our approach. To introduce our approach let us go back to the basic question. Let $f$ be the function we want to check, say modular exponentiation. Why do we want to check a program $P$ for $f$? Why can't we just put the burden on the programmer to get it right? After all, modular exponentiation is not that complicated to code if you use the usual (simple, cubic time) algorithm. It should not be too hard to get it right. The issue is that we probably do NOT want to use the usual algorithm. We want to design a program $P$ that is faster. To achieve this speed it will try to optimize and cut corners in many ways. For example, it would try heuristics. These might be complex. Alternatively, it might be implemented in hardware. Now, we are well justified in being doubtful that the program is right, and asking about checking.
Thus, we conclude that it is reasonable to assume that it is not hard to design a reliable but slow program $P_{\mathrm{slow}}$ that correctly computes $f$ on all instances. Our problem is that we have a fast but possibly unreliable program $P$ that claims to compute $f$, and we want to check it. Thus, a natural thought is to use $P_{\mathrm{slow}}$ to check $P$. That is, if $P(x)$ returns $y$, check that $P_{\mathrm{slow}}(x)$ also returns $y$. Of course this makes no sense. If we were willing to invest the time to run $P_{\mathrm{slow}}$ on each instance, we would not need $P$ anyway. Formally, we have violated the little-oh property: our checker is not faster than all programs for $f$, since it is not faster than $P_{\mathrm{slow}}$.

However, what we want is to essentially do the above in a meaningful way. The answer is batching. However, we will not do batch program checking in the sense of [20]. Instead we will be batch-verifying the outputs of $P$, using $P_{\mathrm{slow}}$, and without invoking $P$ at all. More precisely, define the relation $R$, for any $\mathit{inst} = (x, y)$, by $R(x, y) = 1$ iff $f(x) = y$. Let us assume we could design a batch verifier $V$ for $R$, in the sense of Sect. 1.1. (Typically, as in our later designs, $V$ will make some number of calls to $P_{\mathrm{slow}}$, but far fewer than $n$ calls, since its running time is less than $n$ times the time to compute $R$.) Our program checker is for a batch instance $x_1, \ldots, x_n$. Say we have the outputs $y_1 = P(x_1), \ldots, y_n = P(x_n)$ of the program, and want to know if they are correct. We simply run $V$ on the batch instance $(x_1, y_1), \ldots, (x_n, y_n)$ and accept if $V$ returns one. The properties of a batch verifier as defined in Sect. 1.1 tell us the following. If $P$ is correct on all the instances $x_1, \ldots, x_n$, then we accept. If $P$ is wrong on any one of these instances then we reject. Thus, we have a guarantee similar to that of batch program checking (but a little stronger, as we will explain) and at lower cost. Since $V$ makes some use of $P_{\mathrm{slow}}$ we view this as using a slow program to check a fast one.

Features of our approach. We highlight the following benefits of our batch program checking approach:

Instance correctness: In our approach, as long as $P$ is correct on the specific instances $x_1, \ldots, x_n$ on which we want results, we accept, even if $P$ is wrong on other instances. (Recall from the above that usual checkers can reject even when the program is correct on the instance in question, because it is wrong somewhere else, and this is a drawback.) In this sense we have more a notion of program instance checking.

Speed: In our approach, the program is called only on the original instances, so the number of program calls, amortized, is just one! Thus, we only need to worry about the overhead. However, with good batch verifiers (such as we will later design), this can be significantly smaller than the total running time of the program on the $n$ calls. Thus the amortized additional cost of our checker is like $o(1)$ program calls, and this is to achieve low error, not just constant error. This is very fast.

Off-line checking: Our checking can be done off-line as in [6]. Thus, for example, we can use (slow) software to check (fast) hardware.
Of course batching carries with it some issues too. When an error is detected in a batch instance $(x_1, y_1), \ldots, (x_n, y_n)$ we know that some $(x_i, y_i)$ is incorrect but we do not know which. There are several ways to compensate for this. First, we expect to be in settings where errors are rare. (As bugs are discovered they are fixed, so we expect the quality of $P$ to keep improving.) In some cases it is reasonable to discard the entire batch instance. (In cryptographic settings, we are often just trying to exponentiate random numbers, and can throw away one batch and try another.) Alternatively, one can figure out the bad instance off-line; this may be acceptable if it does not have to be done too often.

3.3 Definition
We conclude by summarizing the formal definition of our notion of batch program instance checking. Similarly to relations, a batch instance for a (not necessarily boolean) function $f$ is simply a sequence $X = x_1, \ldots, x_n$ of points in its domain. A program $P$ is correct on $X$ if $P(x_i) = f(x_i)$ for all $i = 1, \ldots, n$, and incorrect if there is some $i \in \{1, \ldots, n\}$ such that $P(x_i) \neq f(x_i)$. If $f$ is a function we let $R_f$ be its graph, namely the relation $R_f(x, y) = 1$ if $f(x) = y$, and 0 otherwise. Notice that $P$ is correct on $X$ iff $(x_1, P(x_1)), \ldots, (x_n, P(x_n))$ is a correct instance of the batch verification problem for $R_f$.

Definition 3. A batch program instance checker for $f$ is a probabilistic oracle algorithm $C^P$ that takes as input (possibly a description of $f$), a batch instance $X = (x_1, \ldots, x_n)$ for $f$, and a security parameter $l$ provided in unary. It satisfies:
(1) If $P$ is correct on $X$ then $C^P$ outputs 1.
(2) If $P$ is incorrect on $X$ then the probability that $C^P$ outputs 1 is at most $2^{-l}$.

We wish to design such batch program instance checkers which have a very low complexity and make only marginally more than $n$ oracle calls to the program. As indicated above, this is easily done for a function $f$ if we have available batch verifiers for $R_f$, so we concentrate on getting efficient batch verifiers.
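As a rough illustration of how a batch program instance checker can be assembled from a batch verifier for $R_f$, the sketch below calls the program exactly once per instance and then hands the resulting pairs to a verifier. Everything named here is an assumption made for illustration: the toy modulus and base, the deliberately buggy `fast_program`, and the placeholder `naive_verifier`, which simply recomputes $f$ on every instance and would in practice be replaced by an efficient batch verifier.

```python
# Toy parameters (assumed for illustration only): f(x) = 4^x mod 2039.
P_MOD, BASE = 2039, 4

def f_slow(x):
    """Reliable but slow reference for f (plain square-and-multiply)."""
    result, base = 1, BASE
    while x > 0:
        if x & 1:
            result = result * base % P_MOD
        base = base * base % P_MOD
        x >>= 1
    return result

def fast_program(x):
    """Hypothetical fast program P: correct everywhere except at x = 7."""
    return 1 if x == 7 else pow(BASE, x, P_MOD)

def naive_verifier(instances, l):
    """Placeholder batch verifier for R_f: recomputes f on every instance.
    The security parameter l is unused because this verifier never errs."""
    return all(f_slow(x) == y for x, y in instances)

def batch_instance_checker(program, xs, verifier, l=60):
    """Call the program once per instance, then batch-verify its outputs."""
    ys = [program(x) for x in xs]          # exactly n oracle calls to P
    return verifier(list(zip(xs, ys)), l), ys

ok_good, _ = batch_instance_checker(fast_program, [1, 2, 3, 10], naive_verifier)
ok_bad, _ = batch_instance_checker(fast_program, [1, 7, 3], naive_verifier)
print(ok_good, ok_bad)                     # prints: True False
```

The first batch is accepted even though `fast_program` is wrong on an input outside the batch, which is the instance-correctness behaviour described in Sect. 3.2; the second batch is rejected because it contains the faulty input.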
4 Costs of Multiplication and Exponentiation
Let $G$ be a (multiplicative) group. Many of our algorithms are in cryptographic groups like $Z_N^*$ or subgroups thereof ($N$ could be composite or prime). We measure cost in terms of the number of group operations, here multiplications.

Cost of one exponentiation. Given $a \in G$ and an integer $b$, the standard square-and-multiply method computes $a^b \in G$ at a cost of $1.5|b|$ multiplications on the average. However, there are methods to do better. For example, using the windowing method based on addition chains [9,24], the cost can be reduced to about $1.2|b|$; pre-computation methods have been proposed to reduce the number of multiplications further at the expense of storage for the pre-computed values [10,17] (a range of values can be obtained here; we give some numerical
examples in Sect. 5.4). Accordingly it is best to treat the cost of exponentiation as a parameter. We let $\mathrm{ExpCost}_G(k_1)$ denote the time to compute $a^b$ in group $G$ when $k_1 = |b|$, and express the costs of our algorithms in terms of this.

Multiple exponentiations. Suppose we need to compute $a^{b_1}, \ldots, a^{b_n}$, exponentiations in a common base $a$ but with changing exponents. Say each exponent is $t$ bits long. We can certainly do this with $n \cdot \mathrm{ExpCost}_G(t)$ multiplications. However, it is possible to do better, via the techniques of [10,17], because in this case the pre-computation can be done on-line and still yield an overall savings. Accordingly, we treat the cost of this operation as a parameter too, denoting it $\mathrm{ExpCost}^n_G(t)$.

Computing the product of powers. We now present a general algorithm we will use in Sect. 5 as a subroutine. Suppose $a_1, \ldots, a_n \in G$. Suppose $b_1, \ldots, b_n$ are integers in the range $0, \ldots, 2^t - 1$, where $2^t - 1 < |G|$. We write them all as strings of some length $t$, so that $b_i = b_i[t] \ldots b_i[1]$. The problem is to compute the product $a = \prod_{i=1}^{n} a_i^{b_i}$, the operations being in $G$. The naive way to do this is to compute $c_i = a_i^{b_i}$ for $i = 1, \ldots, n$ and then compute $a = \prod_{i=1}^{n} c_i$. This takes $n \cdot \mathrm{ExpCost}_G(t) + n - 1$ multiplications, where $k_2$ is the size of the representation of an element of $G$. (Using square-and-multiply exponentiation, for example, this works out to $3ntk_2/2 + n - 1$ multiplications; with a faster exponentiation it may be a bit less.) However, drawing on some ideas from [10], we can do better, as shown in Fig. 2. This algorithm performs $t$ multiplications in the outer loop and $nt/2$ multiplications on the average for the inner loop. Hence, for computing $a$ we get a total of $t + nt/2$ multiplications.
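Given: $a_1, \ldots, a_n \in G$; $b_1, \ldots, b_n$ integers in the range $0, \ldots, 2^t - 1 < |G|$.
Compute: $a = \prod_{i=1}^{n} a_i^{b_i}$.

Algorithm FastMult($(a_1, b_1), \ldots, (a_n, b_n)$)
    $a := 1$;
    for $j = t$ downto $1$ do
        $a := a^2$;
        for $i = 1$ to $n$ do
            if $b_i[j] = 1$ then $a := a \cdot a_i$;
    return $a$

Fig. 2. Fast algorithm for computing the product of powers. (The squaring is performed at the start of each round, so that the contribution of the lowest-order bits $b_i[1]$ is not squared after it has been multiplied in.)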
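The product-of-powers routine can be sketched in a few lines for the special case of integers modulo a fixed modulus. This is only an illustration under assumptions: the function name `fast_mult`, the explicit `modulus` argument, and the sample values are not from the paper. The sketch squares at the start of each round, which is one way to make sure the lowest-order bits are not squared after they have been multiplied in.

```python
def fast_mult(pairs, t, modulus):
    """Return the product of a_i ** b_i modulo `modulus`, for 0 <= b_i < 2**t."""
    a = 1
    for j in range(t - 1, -1, -1):         # bit positions, most significant first
        a = a * a % modulus                # one squaring per bit position
        for a_i, b_i in pairs:
            if (b_i >> j) & 1:             # bit j of exponent b_i is set
                a = a * a_i % modulus      # one multiplication per set bit
    return a

# Compare against n separate exponentiations on a small example.
pairs = [(3, 5), (7, 6), (10, 1)]
expected = 1
for a_i, b_i in pairs:
    expected = expected * pow(a_i, b_i, 2039) % 2039
result = fast_mult(pairs, 3, 2039)
print(result == expected)                  # prints: True
```

The loop structure makes the cost estimate visible: one squaring per bit position, plus one multiplication for every set bit among the $n$ exponents, which is about $nt/2$ for random $t$-bit exponents.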
5 Batch Verification for Modular Exponentiation
Let $G$ be a group, and let $q = |G|$ be the order of $G$. Let $g$ be a primitive element of $G$. Hence, for each $y \in G$ there is a unique $i \in Z_q$ such that $y = g^i$. This is
the discrete logarithm of $y$ to the base $g$ and is denoted $\log_g(y)$. Define relation $\mathrm{EXP}_{G,g}(x, y)$ to be true iff $g^x = y$. (Equivalently, $x = \log_g(y)$.) We let $k_1$ denote the length (number of bits) of $q$, and $k_2$ the length of $g$. We are interested in groups arising in cryptography for which the discrete log problem (computing $\log_g(y)$ given $y$) is hard. This is not an assumption needed for our results (in particular we do not use any hardness assumptions); rather, it is the motivation. In this category what is important is that $k_2$ is quite large, about $k_2 = 1024$. In comparing complexities we think of $k_2$ as about this much.

With $G, g$ fixed we want to construct fast batch verifiers for the relation $\mathrm{EXP}_{G,g}$. We begin with a simple test which, although better than the naive method, is not so efficient.

5.1 Random Subset Test
The first thing that one might think of is to compute $x = \sum_{i=1}^{n} x_i \bmod q$ and $y = \prod_{i=1}^{n} y_i$ (the multiplications are in $G$) and check that $g^x = y$. However, it is easy to see this does not work: for example, the batch instance $(x + \alpha, g^x), (x - \alpha, g^x)$ passes the test for any $\alpha \in Z_q$, but is clearly not a correct instance when $\alpha \neq 0$.

A natural fix that comes to mind is to do the above test on a random subset of the instances: pick a random subset $S$ of $\{1, \ldots, n\}$, compute $x = \sum_{i \in S} x_i \bmod q$ and $y = \prod_{i \in S} y_i$, and check that $g^x = y$. (The idea is that randomizing splits any bad pairs such as those of the example above.) We call this the Atomic Random Subset Test. This test seems simple enough that it might be viewed as folklore. Its analysis is quite simple, and we skip the proofs but state the results, for comparison with our later, better methods.

Lemma 1. Given a group $G$ and a generator $g$ of $G$, suppose $(x_1, y_1), \ldots, (x_n, y_n)$ is an incorrect batch instance of the batch verification problem for $\mathrm{EXP}_{G,g}(\cdot, \cdot)$. Then the Atomic Random Subset Test accepts $(x_1, y_1), \ldots, (x_n, y_n)$ with probability at most $1/2$.

This lemma tells us that the test does work, but not too well, in the sense that the error is not small, but a constant, namely $1/2$. (Moreover, one can show that this analysis is best possible.) So to lower the error to the desired $2^{-l}$ we must repeat the atomic test independently $l$ times. We call this the Random Subset Test. See Fig. 3. However, the repetition is costly: the total cost is now $nl/2 + \mathrm{ExpCost}^l_G(k_1)$ multiplications. This is not so good, and in many practical instances may even be worse than the naive test, for example if $n \le l$. (Since $l$ should be at least 60 this is not unlikely.) The conclusion is that repeating many times some atomic test which itself has constant error can be costly even if the atomic test is efficient. Thus, in what follows we will look for ways to directly get low error. The results just discussed are summarized in Theorem 1.
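The plain sum test, the atomic test and its $l$-fold repetition can be sketched as follows. This is a minimal illustration under assumptions: the toy group (the order-1019 subgroup of $Z_{2039}^*$ generated by 4), the function names, and the optional `rng` argument are all chosen here for the example and are far too small for real use.

```python
import random

# Toy group (assumed for illustration): p = 2q + 1 with p, q prime,
# and g = 4 generates the subgroup of Z_p^* of prime order q.
P, Q, G = 2039, 1019, 4

def sum_test(instances):
    """Check g^(sum of x_i) == product of y_i over the given instances."""
    x = sum(x_i for x_i, _ in instances) % Q
    y = 1
    for _, y_i in instances:
        y = y * y_i % P
    return pow(G, x, P) == y

def atomic_random_subset_test(instances, rng):
    """Apply the sum test to a uniformly random subset of the instances."""
    subset = [inst for inst in instances if rng.getrandbits(1)]
    return sum_test(subset)

def random_subset_test(instances, l, rng=random):
    """Accept iff l independent atomic tests all accept."""
    return all(atomic_random_subset_test(instances, rng) for _ in range(l))

good = [(x, pow(G, x, P)) for x in (5, 17, 300)]
# Incorrect batch whose errors cancel in the plain sum test: offsets +3 and -3.
bad = [(5 + 3, pow(G, 5, P)), (5 - 3, pow(G, 5, P))]

print(sum_test(bad))                       # prints: True (the plain test is fooled)
print(random_subset_test(good, 40))        # prints: True (correct batches always pass)
print(random_subset_test(bad, 40))         # False except with probability 2^-40
```

For this particular bad batch the atomic test accepts exactly when the random subset is empty or contains both pairs, so its error is exactly $1/2$, matching the bound of Lemma 1.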
Theorem 1. Given a group $G$ and a generator $g$ of $G$, the Random Subset Test is a batch verifier for the relation $\mathrm{EXP}_{G,g}(\cdot, \cdot)$ with cost $nl/2 + \mathrm{ExpCost}^l_G(k_1)$ multiplications, where $k_1 = \lg(|G|)$.

5.2 The Small Exponents Test
We can view the Atomic Random Subset Test in a different way. Namely, pick bits $s_1, \ldots, s_n \in \{0, 1\}$ at random, let $x = \sum_{i=1}^{n} s_i x_i$ and $y = \prod_{i=1}^{n} y_i^{s_i}$, and check that $g^x = y$. (This corresponds to choosing the set $S = \{ i : s_i = 1 \}$.) We know this test has error $1/2$. The idea to get lower error is to choose $s_1, \ldots, s_n$ from a larger domain, say $t$-bit strings for some $t > 1$. There are now two things to ask: whether this does help lower the error faster, and, if so, at what rate as a function of $t$; and then, as we increase $t$, how performance is impacted.

Let us look at the latter first. If we can keep $t$ small, then we have only a single exponentiation to a large (i.e., $k_1$-bit) exponent, as compared to $l$ of them in the random subset test. That is where we expect the main performance gain. But now we have added $n$ new exponentiations, although to a smaller exponent. Thus, the question is how large $t$ has to be to get the desired error of $2^{-l}$.

We use some group theory to show that the tradeoff between the length $t$ of the $s_i$'s and the error is about as good as we could hope: setting $t = l$ yields the desired error $2^{-l}$. The corresponding test is the Small Exponents (SE) Test and is depicted in Fig. 3. The following theorem summarizes its properties and provides the analysis proving our claim about the error.

Theorem 2. Given a group $G$ of prime order $q$ and a generator $g$ of $G$, the Small Exponents Test is a batch verifier for the relation $\mathrm{EXP}_{G,g}(\cdot, \cdot)$ with cost $l + n(1 + l/2) + \mathrm{ExpCost}_G(k_1)$ multiplications, where $k_1 = |q|$.

Proof. First let us see how to get the claim about the performance. Instead of computing $y_i^{s_i}$ individually for each value of $i$ and then multiplying these $n$ values, we compute the product $y = \prod_{i=1}^{n} y_i^{s_i}$ directly and more efficiently as $y = \mathrm{FastMult}((y_1, s_1), \ldots, (y_n, s_n))$, the algorithm being that of Sect. 4. Since $s_1, \ldots, s_n$ were random $l$-bit strings the cost is $l + nl/2$ multiplications on the average. Computing $x$ takes $n$ multiplications. Finally, there is a single exponentiation to the power $x$, giving the total number of multiplications stated in the theorem.

That the test always accepts when the input is correct is clear. Now we prove the soundness. Let the input $(x_1, y_1), \ldots, (x_n, y_n)$ be incorrect. Let $x'_i = \log_g(y_i)$ for $i = 1, \ldots, n$. For $i = 1, \ldots, n$ let $\beta_i = x_i - x'_i$. Since the input is incorrect there is an $i$ such that $\beta_i \neq 0$. For notational simplicity we may assume (wlog) that this is true for $i = 1$. (Note: This does not mean we are assuming $\beta_j = 0$ for $j > 1$. There may be many $j > 1$ for which $\beta_j \neq 0$.) Now suppose the test accepts on a particular choice of $s_1, \ldots, s_n$. Then

$$g^{s_1 x_1 + \cdots + s_n x_n} = y_1^{s_1} \cdots y_n^{s_n}. \tag{1}$$
Given: $g$ a generator of $G$, and $(x_1, y_1), \ldots, (x_n, y_n)$ with $x_i \in Z_q$ and $y_i \in G$. Also a security parameter $l$.
Check: That $\forall i \in \{1, \ldots, n\} : y_i = g^{x_i}$.

Random Subset (RS) Test: Repeat the following atomic test, independently $l$ times, and accept iff all sub-tests accept:
Atomic Random Subset Test:
(1) For each $i = 1, \ldots, n$ pick $b_i \in \{0, 1\}$ at random
(2) Let $S = \{ i : b_i = 1 \}$
(3) Compute $x = \sum_{i \in S} x_i \bmod q$, and $y = \prod_{i \in S} y_i$
(4) If $g^x = y$ then accept, else reject.

Small Exponents (SE) Test:
(1) Pick $s_1, \ldots, s_n \in \{0, 1\}^l$ at random
(2) Compute $x = \sum_{i=1}^{n} x_i s_i \bmod q$, and $y = \prod_{i=1}^{n} y_i^{s_i}$
(3) If $g^x = y$ then accept, else reject.

Bucket Test: Takes an additional parameter $m \ge 2$. Set $M = 2^m$. Repeat the following atomic test, independently $\lceil l/(m-1) \rceil$ times, and accept iff all sub-tests accept:
Atomic Bucket Test:
(1) For each $i = 1, \ldots, n$ pick $t_i \in \{1, \ldots, M\}$ at random
(2) For each $j = 1, \ldots, M$ let $B_j = \{ i : t_i = j \}$
(3) For each $j = 1, \ldots, M$ let $c_j = \sum_{i \in B_j} x_i \bmod q$, and $d_j = \prod_{i \in B_j} y_i$
(4) Run, on the instance $(c_1, d_1), \ldots, (c_M, d_M)$, the Small Exponents Test with security parameter set to $m$.

Fig. 3. Batch verification algorithms for exponentiation with a common base.
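But the right hand side is also equal to $g^{s_1 x'_1 + \cdots + s_n x'_n}$, where $x'_i = \log_g(y_i)$. Therefore, we get $g^{s_1 x_1 + \cdots + s_n x_n} = g^{s_1 x'_1 + \cdots + s_n x'_n}$, or, writing $\beta_i = x_i - x'_i$, $g^{s_1 \beta_1 + \cdots + s_n \beta_n} = 1$. Since $g$ is a primitive element of the group, it must be that $s_1 \beta_1 + \cdots + s_n \beta_n \equiv 0 \pmod{q}$. But $\beta_1 \neq 0$. Since $q$ is prime, $\beta_1$ has an inverse $\gamma_1$ satisfying $\beta_1 \gamma_1 \equiv 1 \pmod{q}$. Thus, we can write

$$s_1 \equiv -\gamma_1 (s_2 \beta_2 + \cdots + s_n \beta_n) \pmod{q}. \tag{2}$$

This means that for any fixed $s_2, \ldots, s_n$, there is exactly one (and hence at most one) choice of $s_1 \in \{0, 1\}^l$ (namely that of Equation 2) for which Equation 1 is true. So for fixed $s_2, \ldots, s_n$, if we draw $s_1$ at random the probability that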
Equation 1 is true is at most $2^{-l}$. Hence the same is true if we draw all of $s_1, \ldots, s_n$ independently at random. So the probability that the test accepts is at most $2^{-l}$.
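The Small Exponents Test and its error bound can be tried out on a toy group. The following is a sketch under assumptions rather than a reference implementation: the parameters (the order-1019 subgroup of $Z_{2039}^*$ generated by 4), the function name, and the choice $l = 9$ (so that $2^l$ stays below the toy group order) are made up for the example, and the product of powers is computed with Python's built-in `pow` instead of the batched routine of Sect. 4.

```python
import random

# Toy group (assumed for illustration): p = 2q + 1 with p, q prime,
# and g = 4 generates the subgroup of Z_p^* of prime order q.
P, Q, G = 2039, 1019, 4

def small_exponents_test(instances, l, rng=random):
    """One-shot test with random l-bit exponents; intended for 2**l < Q."""
    s = [rng.getrandbits(l) for _ in instances]
    x = sum(x_i * s_i for (x_i, _), s_i in zip(instances, s)) % Q
    y = 1
    for (_, y_i), s_i in zip(instances, s):
        y = y * pow(y_i, s_i, P) % P       # the paper batches this product
    return pow(G, x, P) == y

good = [(x, pow(G, x, P)) for x in (5, 17, 300, 1000)]
bad = list(good)
bad[2] = (300, pow(G, 301, P))             # exactly one wrong pair

l = 9                                      # 2**9 = 512 < Q = 1019
accepts_good = sum(small_exponents_test(good, l) for _ in range(300))
accepts_bad = sum(small_exponents_test(bad, l) for _ in range(300))
print(accepts_good, accepts_bad)           # 300, and a count close to 300 / 512
```

With a single wrong pair, the test accepts only when the corresponding exponent is drawn as zero, which happens with probability $2^{-9}$ here; in a realistic setting one would take $l$ around 60 and a group of cryptographic size.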
Remark 1. We stress that this result holds in a group of prime order. We are not working in $Z_q^*$ (which has order $q - 1$) but in a group $G$ which has order $q$, a prime. (If the group does not have prime order, it is easy to find examples to show that our tests do not work.) In practice this is not really a restriction. As is standard in many schemes, we can work in an appropriate subgroup of $Z_p^*$ where $p$ is a prime such that $q$ divides $p - 1$. In fact, prime order groups seem superior to plain integers modulo a prime in many ways. The discrete logarithm problem seems harder there, and they also have nice algebraic properties which many schemes exploit to their advantage.

5.3 The Bucket Test
We saw that the Small Exponents Test was quite efficient, especially for an $n$ that was not too large. We now present another test that does even better for large $n$. Our Bucket Test, shown in Fig. 3, repeats $\lceil l/(m-1) \rceil$ times an Atomic Bucket Test for some parameter $m$ to be determined. In its first stage, which is steps (1) to (3) of the description, the atomic test forms $M$ buckets $B_1, \ldots, B_M$. For each $i$ it picks at random one of the $M$ buckets, and puts the pair $(x_i, y_i)$ in this bucket. (The value $t_i$ in the test description chooses the bucket for $i$.) The $x_i$ values of pairs falling in a particular bucket are added while the corresponding $y_i$ values are multiplied; this yields the values $c_j, d_j$ for $j = 1, \ldots, M$ specified in the description. The first part of the analysis below shows that if there had been some $i$ for which $g^{x_i} \neq y_i$ then, except with quite small probability ($2^{-m}$), there is a bad bucket, namely one for which $g^{c_j} \neq d_j$.

Thus we are reduced to another instance of the same batch verification problem with a smaller instance size $M$. Namely, given $(c_1, d_1), \ldots, (c_M, d_M)$ we need to check that $g^{c_j} = d_j$ for all $j = 1, \ldots, M$. The desired error is $2^{-m}$. We can use the SE test to solve the smaller problem, as has been described in Fig. 3. (Alternatively, we could recursively apply the bucket test, bottoming out the recursion with a use of the SE test after a while. This seems to help, but only for $n$ so large that it does not really matter in practice. Thus, we shall continue our analysis under the assumption that the smaller sized problem is solved using SE.) This yields a test depending on a parameter $m$. Finally, we would optimize to choose the best value of $m$. Note that until these choices are made we do not have a concrete test but rather a framework which can yield many possible tests. To enable us to make the best choices we now provide the analysis of the Atomic Bucket Test and Bucket Test with a given value of the parameter $m$, and evaluate the performance as a function of the performance of the inner test, which is SE. Later we can optimize.
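The bucketing step and the reduction to a smaller Small Exponents instance can be sketched as below. As before, this is an illustration under assumptions: the toy group, the function names, the zero-based bucket indices, and the rounding up of $l/(m-1)$ to an integer number of repetitions are choices made for the example.

```python
import random

# Toy group (assumed for illustration): p = 2q + 1 with p, q prime,
# and g = 4 generates the subgroup of Z_p^* of prime order q.
P, Q, G = 2039, 1019, 4

def small_exponents_test(instances, l, rng):
    """Inner test: random l-bit exponents, one final exponentiation."""
    s = [rng.getrandbits(l) for _ in instances]
    x = sum(x_i * s_i for (x_i, _), s_i in zip(instances, s)) % Q
    y = 1
    for (_, y_i), s_i in zip(instances, s):
        y = y * pow(y_i, s_i, P) % P
    return pow(G, x, P) == y

def atomic_bucket_test(instances, m, rng):
    """Throw the pairs into M = 2**m buckets, then run SE on the buckets."""
    M = 2 ** m
    c = [0] * M                            # sums of the x values per bucket
    d = [1] * M                            # products of the y values per bucket
    for x_i, y_i in instances:
        j = rng.randrange(M)               # bucket chosen uniformly at random
        c[j] = (c[j] + x_i) % Q
        d[j] = d[j] * y_i % P
    return small_exponents_test(list(zip(c, d)), m, rng)

def bucket_test(instances, l, m, rng=random):
    """Accept iff ceil(l / (m - 1)) independent atomic tests all accept."""
    repetitions = -(-l // (m - 1))         # integer ceiling, an assumption here
    return all(atomic_bucket_test(instances, m, rng) for _ in range(repetitions))

good = [(x, pow(G, x, P)) for x in range(1, 201)]
bad = list(good)
bad[57] = (58, pow(G, 59, P))              # exactly one wrong pair
print(bucket_test(good, 40, 3), bucket_test(bad, 40, 3))
```

Empty buckets contribute the pair $(0, 1)$, which is a correct instance, so they never cause a rejection. Each atomic test costs $n$ multiplications for the bucketing plus one small SE instance of size $2^m$, which is where the savings for large $n$ come from.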
Lemma 2. Suppose $G$ is a group of prime order $q$, and $g$ is a generator of $G$. Suppose $(x_1, y_1), \ldots, (x_n, y_n)$ is an incorrect batch instance of the batch verification problem for $\mathrm{EXP}_{G,g}(\cdot, \cdot)$. Then the Atomic Bucket Test with parameter $m$ accepts $(x_1, y_1), \ldots, (x_n, y_n)$ with probability at most $2^{-(m-1)}$.

Proof. As in the proof of Thm. 2, let $x'_i = \log_g(y_i)$ and $\beta_i = x_i - x'_i$ for $i = 1, \ldots, n$. We may assume $\beta_1 \neq 0$. Say that a bucket $B_j$ is good ($1 \le j \le M$) if $g^{c_j} = d_j$. Let $r$ be the probability, over the choice of $t_1, \ldots, t_n$, that all buckets $B_1, \ldots, B_M$ are good. We claim that $r \le 1/M = 2^{-m}$.

To see this, first note that if a bucket $B_j$ is good then $\sum_{i \in B_j} \beta_i \equiv 0 \pmod{q}$. Now assume $t_2, \ldots, t_n$ have been chosen, so that $(x_2, y_2), \ldots, (x_n, y_n)$ have been allotted their buckets. Let $B'_j = \{ i > 1 : t_i = j \}$; these are the current buckets. Say $B'_j$ is good if $\sum_{i \in B'_j} \beta_i \equiv 0 \pmod{q}$. If all of $B'_1, \ldots, B'_M$ are good, then after $x_1$ is assigned there is at least one bad bucket, because $\beta_1 \neq 0$. Otherwise, there exists a $j$ such that $B'_j$ is bad. (This does not mean it is the only one, but if there are more bad buckets the test will fail. Thus we can assume that there is a single $j$.) The probability that $B_1, \ldots, B_M$ are good after $x_1$ is thrown in is at most the probability that $x_1$ falls in bucket $j$, which is $1/M$. So $r \le 1/M$. By assumption the test in Step (4) has error at most $2^{-m}$, so the total error of the atomic bucket test is $2 \cdot 2^{-m} = 2^{-(m-1)}$.

Regarding performance, it takes $n$ multiplications to generate the buckets and the smaller instance. To evaluate the smaller instance using SE with parameters $2^m, m, |q|, k_2$ takes $m + 2^m m/2 + 2^m + \mathrm{ExpCost}_G(|q|)$ multiplications by Thm. 2. This process is repeated $\lceil l/(m-1) \rceil$ times. When we run the test, we choose the optimal value of $m$, meaning the one that minimizes the cost. Thus we have the following.

Theorem 3. Given a group $G$ of prime order $q$, and a generator $g$ of $G$, the Bucket Test (with $m$ set to the optimal value) is a batch verifier for the relation $\mathrm{EXP}_{G,g}(\cdot, \cdot)$ with cost

$$\min_{m \ge 2} \left\lceil \frac{l}{m-1} \right\rceil \cdot \left( n + m + 2^{m-1}(m + 2) + \mathrm{ExpCost}_G(k_1) \right)$$

multiplications, where $k_1 = |q|$.

To minimize analytically we would set $m \approx \log(n + k_1) - \log\log(n + k_1)$, but in practice it is better to work with the above formula and find the best value of $m$ by search. This is what is done in the next section.

5.4 Performance Analysis
We look at the actual performance of the batch verification tests of Fig. 3. For a given value of n (the number of instances we are simultaneously verifying), exactly how much work does each test need, and which is the best? In particular we don't want to end up with results that are purely asymptotic, i.e., where the improvement is only for very large n. For n = 5 or n = 10, what happens? And how does it grow? To measure this, we count exactly the number of ($k_2$-bit) multiplications used by each test. These numbers are also tabulated in Fig. 1.

Let us fix some reasonable values for $k_1$ and the security parameter l: set $k_1 = 1024$, and l = 60. (Meaning the exponentiation is for 1024 bit moduli, and the error probability will be $2^{-60}$.) For various values of the number n of terms in the batch instance, we compare the number of multiplications each test takes. We compare it to the results of [10,17] as they seem harder to beat.¹ These results are tabulated in Fig. 4. We stress that these savings occur also if other methods for computing exponentiations are used.

¹ Lim and Lee [17] present different configurations to perform exponentiation with precomputation that trade off the number of multiplications with storage. The estimate of 200 multiplications corresponds to an intermediate configuration, with an acceptable storage requirement (300 pre-computed values). Their fastest configuration, with a considerable storage blow-up, uses 100 multiplications. Still in this case our tests perform consistently better.

No. of multiplications used by different tests

    n       Naive     Random Subset   Small Exponents   Bucket
    5       1 K       12 K            0.4 K *           4.3 K
    10      2 K       12.5 K          0.6 K *           4.4 K
    50      10 K      13.5 K          1.8 K *           5 K
    100     20 K      15 K            3.2 K *           5.7 K
    200     40 K      18 K            6.2 K *           7.1 K
    500     100 K     27 K            15.2 K            10.7 K *
    1,000   200 K     42 K            30.2 K            16.5 K *
    5,000   1000 K    162 K           150 K             56 K *

Fig. 4. Example: For increasing values of n, we list the number of multiplications (in thousands, rounded up) for 1024-bit exponents for each method to verify n exponentiations with error probability $2^{-60}$. We assume that a single exponentiation requires 200 multiplications [17]. The lowest number for each n is marked with an asterisk: notice how it is not always via the same test!

We find that the speedups provided by our tests are real. First, observe that even for small values of n, we can do much better than naive: at n = 5 the SE test is a factor of 2 better than naive. Also observe that which test is better depends on the value of n. (In the figure, we mark the best for each value of n.) As we expected, the RS test is actually worse than naive for small n. Until n is about 200, the SmallExponentsTest is the best. From then on, the BucketTest performs better. Note that the factor of improvement increases: at n = 200 we can do about 6 times better than naive (using SE); at n = 5000, about 17 times better (using Bucket).

Another relevant value for $k_1$ is 160. Suppose n = 40, and one would be happy with l = 40. Using the methods of [17] would require about 1700 multiplications. On the other hand, SmallExponentsTest uses 10 0, and using plain square-and-multiply. Combining SmallExponentsTest with pre-processing with reasonable storage brings the number of multiplications below 900. In other words, using these tests can bring sizeable speedups in any setting where we need to perform over five modular exponentiations simultaneously, and as n increases the savings get even larger.
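The search over m suggested after Theorem 3 can be sketched as follows. This is a minimal illustration that evaluates only the cost expression stated in Theorem 3; the function names, the search bound `m_max`, and the sample inputs (n = 5, l = 60, 200 multiplications per exponentiation) are assumptions made for the example. Its output is not claimed to reproduce the Bucket column of Fig. 4, whose exact accounting is not restated in this section.

```python
def bucket_test_cost(n, l, exp_cost, m_max=32):
    """Minimize ceil(l/(m-1)) * (n + m + 2^(m-1)*(m+2) + exp_cost) over 2 <= m <= m_max.

    Returns the pair (cost, m) for the best m found (hypothetical helper).
    """
    best = None
    for m in range(2, m_max + 1):
        repetitions = -(-l // (m - 1))  # integer ceiling of l / (m - 1)
        per_run = n + m + 2 ** (m - 1) * (m + 2) + exp_cost
        cost = repetitions * per_run
        if best is None or cost < best[0]:
            best = (cost, m)
    return best


def naive_cost(n, exp_cost):
    """Cost of verifying each of the n instances by one exponentiation."""
    return n * exp_cost


if __name__ == "__main__":
    for n in (5, 10, 50, 100, 200, 500, 1000, 5000):
        cost, m = bucket_test_cost(n, l=60, exp_cost=200)
        print(n, naive_cost(n, 200), cost, m)
```

Searching a small range of m suffices because the term $2^{m-1}(m+2)$ grows quickly and soon dominates the saving from fewer repetitions.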
6 Batch Verification of Degree of Polynomials
The problem of checking the degree of a polynomial is as follows: Given a set of points, determine whether there exists a polynomial of a certain degree which passes through all these points. More formally, let $S \stackrel{\mathrm{def}}{=} (\alpha_1, \ldots, \alpha_m)$ denote a set of points. We define the relation $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}(S) = 1$ iff there exists a polynomial f(x) such that the degree of f(x) is at most t, and $\forall i$, $1 \le i \le m$, $f(\beta_i) = \alpha_i$, assuming that all the computations are carried out in the finite field F. Let the batch instance of this problem be $S_1, \ldots, S_n$, where $S_i = (\alpha_{i,1}, \ldots, \alpha_{i,m})$. The batch instance is correct if $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}(S_i) = 1$ for all $i = 1, \ldots, n$; incorrect otherwise.

The relation DEG can be evaluated by taking t + 1 values from the set and interpolating a polynomial f(x) through them. This defines a polynomial of degree at most t. Then verify that all the remaining points are on the graph of this polynomial. Thus, a single verification of the degree requires a polynomial interpolation. Hence, the naive verifier for the batch instance would be highly expensive. The batch verifier which we present here carries out a single interpolation in a field of size $|F|$, and achieves a probability of error less than $n/|F|$.

The general idea is that a random linear combination of the shares will be computed. This in return will generate a new single instance of DEG. The correlation will be such that, with high probability, if the single instance is correct then so is the batch instance. Hence, we can solve the batch instance computing a single polynomial interpolation, contrasting $O(m^2 n)$ multiplications with $O(mn)$ multiplications.

We will be working over a finite field F whose size will be denoted by p (not necessarily a prime).² We will be measuring the computational effort of the players executing a protocol by the number of multiplications that they are required to perform. Note that the size of the field is of relevance, as the naive multiplication in a field of size $2^k$ takes $O(k^2)$ steps. We note that the fields in which the computations are carried out can be specially constructed in order to multiply faster. The test (protocol), which we call RandomLinearCombinationTest, appears in Fig. 5.

² At this point we shall assume that the instances are computed in the same field F as the new instance that we generate. Later we shall show how to dispense with this assumption.

Given: $S_1, \ldots, S_n$ where $S_i = (\alpha_{i,1}, \ldots, \alpha_{i,m})$; security parameter l; value t.
Check: That $\forall i \in \{1, \ldots, n\}$: $\exists f_i(x)$ such that $\deg(f_i) \le t$, and $f_i(\beta_1) = \alpha_{i,1}, \ldots, f_i(\beta_m) = \alpha_{i,m}$.
Random Linear Combination Test:
(1) Pick $r \in_R F$.
(2) Compute $\gamma_k \stackrel{\mathrm{def}}{=} r^n \alpha_{n,k} + \cdots + r\,\alpha_{1,k}$ for $k = 1, \ldots, m$. (This can be efficiently computed as $(\cdots((r\,\alpha_{n,k} + \alpha_{(n-1),k})\,r + \alpha_{(n-2),k})\cdots + \alpha_{1,k})\,r$.)
(3) If $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}(\gamma_1, \ldots, \gamma_m) = 1$, then output correct, else output incorrect.

Fig. 5. Batch verification algorithm for checking the degree of polynomials.

Theorem 4. Assume $\exists j$ such that for all polynomials $f_j(x)$ which satisfy that $\forall i$, $1 \le i \le m$, $f_j(\beta_i) = \alpha_{j,i}$, it holds that the degree of $f_j(x)$ is greater than t. Then RandomLinearCombinationTest is a batch verifier for the relation $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}(\cdot)$ which runs in time $O(mn)$ and has an error probability of at most $n/p$.

Notation. Given a polynomial $f(x) = a_m x^m + \cdots + a_1 x + a_0$, where $a_m \neq 0$, let $f(x)|_{t+1} \stackrel{\mathrm{def}}{=} a_m x^m + \cdots + a_{t+1} x^{t+1}$. If $m \le t$, then $f(x)|_{t+1} = 0$.

Proof. In order for RandomLinearCombinationTest to output correct, it must be the case that $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}(\gamma_1, \ldots, \gamma_m) = 1$. Namely, there exists a polynomial F(x) of degree at most t which satisfies all the values in S. Let $f_i(x)$ be the polynomial interpolated by the set $S_i$; it might be that $\deg(f_i) > t$. By definition, the polynomial $F(x) = \sum_{i=1}^{n} r^i f_i(x)$. As $\deg(F) \le t$, it holds that $\sum_{i=1}^{n} r^i f_i(x)|_{t+1}$ must be equal to 0. This is an equation of degree n in r and hence has at most n roots. In order for RandomLinearCombinationTest to fail, namely, to output correct when in fact the instance is incorrect, r must be one of the roots of the equation. However, this can happen with probability at most $n/p$. Computing the linear combinations of the shares requires $O(mn)$ multiplications, and the final interpolation requires $O(m^2)$ multiplications.
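The test of Fig. 5 can be sketched in a few lines. This is an illustrative sketch only, under the following assumptions that are not in the text: the field is taken to be a prime field GF(p) with the hypothetical choice $p = 2^{61} - 1$, the relation DEG is evaluated by Lagrange interpolation through the first t + 1 points, and the names `deg_at_most` and `random_linear_combination_test` are made up for the example. The optional argument `r` exists only so that a run can be made deterministic.

```python
import random

P = 2 ** 61 - 1  # illustrative prime field size (assumption, not from the paper)


def deg_at_most(t, betas, values, p=P):
    """Return True iff some polynomial of degree <= t over GF(p) maps betas[i] to values[i]."""
    xs, ys = betas[:t + 1], values[:t + 1]

    def interpolate_at(x):
        # Lagrange interpolation through the first t+1 points, evaluated at x.
        total = 0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            num, den = 1, 1
            for j, xj in enumerate(xs):
                if j != i:
                    num = num * (x - xj) % p
                    den = den * (xi - xj) % p
            total = (total + yi * num * pow(den, -1, p)) % p  # needs Python 3.8+
        return total

    # All remaining points must lie on the interpolated polynomial.
    return all(interpolate_at(b) == v % p
               for b, v in zip(betas[t + 1:], values[t + 1:]))


def random_linear_combination_test(t, betas, instances, p=P, r=None):
    """Batch-check that every instance (a list of m values) has degree <= t."""
    if r is None:
        r = random.randrange(p)  # step (1): pick r at random in the field
    gammas = []
    for k in range(len(betas)):
        acc = 0
        for inst in reversed(instances):  # step (2), Horner form
            acc = (acc + inst[k]) * r % p
        gammas.append(acc)
    return deg_at_most(t, betas, gammas, p)  # step (3): a single interpolation


if __name__ == "__main__":
    betas = list(range(1, 7))
    good = [[(3 * b * b + 2 * b + 1) % P for b in betas],
            [(b + 7) % P for b in betas]]
    bad = good + [[pow(b, 3, P) for b in betas]]
    print(random_linear_combination_test(2, betas, good))
    print(random_linear_combination_test(2, betas, bad))
```

A correct batch is accepted for every r, while an incorrect batch is accepted only when r happens to be a root of the equation in the proof, which is why the second call is rejected except with probability at most n/p.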
Batch verification of partial definition of polynomials. Consider the following variant of the $\mathrm{DEG}_{F,t,(\beta_1,\ldots,\beta_m)}$ problem: Given the set S as above and a value t, there is an additional value s, and the requirement is that there exists a polynomial f(x) of degree at most t such that for all but s of the values $f(\beta_i) = \alpha_i$. As this is in essence an error correcting scheme, some limitations exist on the value of r. The best known practical solution to this variation is given by Berlekamp and Welch [5]. It requires solving a linear equation system of size m. Hence, again, using a naive batch verifier to check a batch instance would be highly inefficient. RandomLinearCombinationTest can be modified to solve this variant efficiently as well.

Different fields. It might be the case that the original instances were all computed in a field F of size p. Yet, $1/p$ is not deemed a small-enough probability of error. Therefore, we create an extension field $F'$ of the original field, containing F as a subfield. For example, view F as the base field and let $F' = F[x]/\langle r(x) \rangle$ for some irreducible polynomial r(x) of the right degree (namely, of a degree big enough to make $F'$ of the size we want). Thus, if $F = GF(2^k)$ we will get $F' = GF(2^{k'})$, for some $k' > k$, and the former is a subfield of the latter. It must be noted that if the extension field is considerably larger than the original field, then the computations in the extension field are more expensive. Thus, in this case there is a trade-off between using the sophisticated batch verifier and using the naive verifier.
References

1. L. Adleman and K. Kompella. Fast Checkers for Cryptography. In A. J. Menezes and S. Vanstone, editors, Advances in Cryptology - Crypto '90, pages 515-529, Berlin, 1990. Springer-Verlag. Lecture Notes in Computer Science No. 537.
2. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy. Proof verification and hardness of approximation problems. In Proc. 33rd Annual Symposium on Foundations of Computer Science, pages 14-23. IEEE, 1992.
3. M. Bellare, J. Garay, and T. Rabin. Distributed Pseudo-Random Bit Generators - A New Way to Speed-Up Shared Coin Tossing. In Proceedings Fifteenth Annual Symposium on Principles of Distributed Computing, pages 191-200. ACM, 1996.
4. M. Beller and Y. Yacobi. Batch Diffie-Hellman Key Agreement Systems and their Application to Portable Communications. In R. Rueppel, editor, Advances in Cryptology - Eurocrypt '92, pages 208-220, Berlin, 1992. Springer-Verlag. Lecture Notes in Computer Science No. 658.
5. E. Berlekamp and L. Welch. Error correction of algebraic block codes. US Patent 4,633,470.
6. M. Blum, W. Evans, P. Gemmell, S. Kannan, and M. Naor. Checking the Correctness of Memories. In Proceedings 32nd Annual Symposium on the Foundations of Computer Science, pages 90-99. IEEE, 1991.
7. M. Blum and S. Kannan. Designing Programs that Check their Work. In Proceedings 21st Annual Symposium on the Theory of Computing, pages 86-97. ACM, 1989.
8. M. Blum, M. Luby, and R. Rubinfeld. Self-Testing/Correcting with Applications to Numerical Problems. Journal of Computer and System Sciences, 47:549-595, 1993.
9. J. Bos and M. Coster. Addition Chain Heuristics. In Advances in Cryptology - Proceedings of Crypto 89, Lecture Notes in Computer Science Vol. 658, pages 400-407. Springer-Verlag, 1989.
10. E. Brickell, D. Gordon, K. McCurley, and D. Wilson. Fast Exponentiation with Precomputation. In R. Rueppel, editor, Advances in Cryptology - Eurocrypt '92, pages 200-207, Berlin, 1992. Springer-Verlag. Lecture Notes in Computer Science No. 658.
11. E. Brickell, P. Lee, and Y. Yacobi. Secure Audio Teleconference. In Advances in Cryptology - Proceedings of Crypto 87, Lecture Notes in Computer Science Vol. 293, C. Pomerance editor, pages 418-426. Springer-Verlag, 1987.
12. B. Chor, S. Goldwasser, S. Micali, and B. Awerbuch. Verifiable Secret Sharing and Achieving Simultaneity in the Presence of Faults. In Proceedings 26th Annual Symposium on the Foundations of Computer Science, pages 383-395. IEEE, 1985.
13. F. Ergun, S. Ravi Kumar, and R. Rubinfeld. Approximate Checking of Polynomials and Functional Equations. In Proc. 37th Annual Symposium on Foundations of Computer Science, pages 592-601. IEEE, 1996.
14. A. Fiat. Batch RSA. Journal of Cryptology, 10(2):75-88, 1997.
15. National Institute for Standards and Technology. Digital Signature Standard (DSS). Technical Report 169, August 30 1991.
16. P. Gemmell, R. Lipton, R. Rubinfeld, M. Sudan, and A. Wigderson. Self-testing/correcting for polynomials and for approximate functions. In Proc. Twenty Third Annual ACM Symposium on Theory of Computing, pages 32-42. ACM, 1991.
17. C.H. Lim and P.J. Lee. More Flexible Exponentiation with Precomputation. In Y. Desmedt, editor, Advances in Cryptology - Crypto '94, pages 95-107, Berlin, 1994. Springer-Verlag. Lecture Notes in Computer Science No. 839.
18. D. Naccache, D. M'Raïhi, S. Vaudenay, and D. Raphaeli. Can D.S.A be improved? Complexity trade-offs with the digital signature standard. In A. De Santis, editor, Advances in Cryptology - Eurocrypt '94, pages 77-85, Berlin, 1994. Springer-Verlag. Lecture Notes in Computer Science No. 950.
19. R. Rivest, A. Shamir, and L. Adleman. A Method for Obtaining Digital Signatures and Public-Key Cryptosystems. Communications of the ACM, 21:120-126, 1978.
20. R. Rubinfeld. Batch Checking with Applications to Linear Functions. Information Processing Letters, 42:77-80, 1992.
21. R. Rubinfeld. On the Robustness of Functional Equations. In Proc. 35th Annual Symposium on Foundations of Computer Science, pages 2-13. IEEE, 1994.
22. R. Rubinfeld. Designing Checkers for Programs that Run in Parallel. Algorithmica, 15(4):287-301, 1996.
23. R. Rubinfeld and M. Sudan. Robust Characterizations of Polynomials with Applications to Program Testing. SIAM Journal on Computing, 25(2):252-271, 1996.
24. J. Sauerbrey and A. Dietel. Resource requirements for the application of addition chains modulo exponentiation. In Advances in Cryptology - Eurocrypt '92, Lecture Notes in Computer Science Vol. 658. Springer-Verlag, 1992.
Strength of Two Data Encryption Standard Implementations Under Timing Attacks

Alejandro Hevia¹ and Marcos Kiwi²

¹ Dept. de Ciencias de la Computación, Facultad de Ciencias Físicas y Matemáticas, U. de Chile.
² Dept. de Ingeniería Matemática, Facultad de Ciencias Físicas y Matemáticas, U. de Chile.
{ahevia@dcc,mkiwi@dim}.uchile.cl
Abstract. We study the vulnerability of several implementations of the Data Encryption Standard (DES) cryptosystem under a timing attack. A timing attack is a method designed to break cryptographic systems that was recently proposed by Paul Kocher. It exploits the engineering aspects involved in the implementation of cryptosystems and might succeed even against cryptosystems that remain impervious to sophisticated cryptanalytic techniques. A timing attack is, essentially, a way of obtaining some user's private information by carefully measuring the time it takes the user to carry out cryptographic operations. In this work we analyze two implementations of DES. We show that a timing attack yields the Hamming weight of the key used by both DES implementations. Moreover, the attack is computationally inexpensive. We also show that all the design characteristics of the target system, necessary to carry out the timing attack, can be inferred from timing measurements. To the best of our knowledge this work is the first one that shows that symmetric cryptosystems are vulnerable to timing attacks.
1 Introduction
A new type of cryptanalytic attack was introduced by Kocher in [11]. This new attack is called timing attack. It exploits the fact that cryptosystems often take slightly different amounts of time on different inputs. Kocher gave several possible explanations for this behavior, among these: branching and conditional statements, RAM cache hits, processor instructions that run in non-fixed time, etc. Kocher's most significant contribution was to show that running time differentials can be exploited in order to find some of a target system's private information. Indeed, in [11] it is shown how to cryptanalyze a simple modular exponentiator. Modular exponentiation is a key operation in Diffie-Hellman's key exchange protocol [7] and the RSA cryptosystem [17]. A modular exponentiator is a procedure that on inputs $k, n \in \mathbb{N}$, $n \neq 0$, and $y \in \mathbb{Z}_n$ computes $(y^k \bmod n)$. In the cryptographic protocols mentioned above, n is public and k is private. Kocher reports that if a passive eavesdropper can measure the time it takes a target system to compute $(y^k \bmod n)$ for several inputs y, then he can recover the secret exponent k. Moreover, the overall computational effort involved in the attack is proportional to the amount of work done by the victim.

Partially supported by FONDECYT No. 1960849, Fundación Andes, and FONDAP in Applied Mathematics 1997.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 192-205, 1998. © Springer-Verlag Berlin Heidelberg 1998

For concreteness sake and clarity of exposition we now describe the essence of Kocher's method for recovering the secret exponent of the fixed exponent modular exponentiator shown in Fig. 1.

Input: $y \in \mathbb{Z}_n$
Code:  z = 1
       let $k_l \ldots k_0$ be k in binary
       for i = l downto 0 do
           $z = z^2 \bmod n$
           if $k_i = 1$ then $z = z \cdot y \bmod n$
Output: z.

Fig. 1. Modular exponentiator.

The attack allows someone who knows $k_l \ldots k_t$ to recover $k_{t-1}$. (To obtain the entire exponent the attacker starts with t = l + 1 and repeats the attack until t = 1.) The attacker first computes l − t + 1 iterations of the for loop. The next iteration requires the first unknown bit $k_{t-1}$. If the bit is set, the operation $(z = z \cdot y \bmod n)$ is performed, otherwise it is skipped. Thus, the running time of the modular exponentiator is longer when $k_{t-1}$ is set. If an eavesdropper can measure the amount of time it takes the modular exponentiator to respond to each input y, then it can infer whether $k_{t-1}$ is set or not. Indeed, if modular multiplication was a very slow operation, and for some input y the overall running time of the modular exponentiator was small, this would imply that $k_{t-1}$ must be zero. (See [11] for an explanation of how the above described idea is further refined.)

The preceding discussion establishes that, in theory, timing attacks can yield some of a target system's private information. The feasibility of carrying out a timing attack in a real setting remains to be seen. It is not clear in which environments it is possible to carry out the very precise timing measurements required by a timing attack. Moreover, in order to successfully mount a timing attack on a remote cryptosystem, random network delays may force the attacker to collect a prohibitively large number of timing measurements in order to compensate for their increased uncertainty. Nevertheless, there are some situations where we feel it is realistic to mount a timing attack. We now describe one of them.

Challenge response protocols are used to establish whether two entities involved in communication are indeed genuine entities and can thus be allowed to continue communication with each other. In these protocols, one entity challenges the other with a random number on which a predetermined calculation must be performed, often including a secret key. In order to generate the correct result for the computation the other device must possess the correct secret key and therefore can be assumed to be authentic. Many smart cards, in particular dynamic password generators¹ (tokens) and electronic wallet cards, implement challenge response protocols (e.g. the message authentication code generated according to the X.9.9 [19, page 456] standard).

It is expected that extensive use will be made of smart cards based on general purpose programmable integrated circuit chips. Thus, the specific functionality of each smart card will be achieved through programming. The security of these smart cards will be provided using tamper proof technology and cryptographic techniques. The above described scenario is an ideal setting in which to carry out a timing attack. The widespread availability of a particular type of card will make it easy and inexpensive to determine the timing characteristics of the system on which to mount the attack. Later, the obtaining of precise timing measurements (e.g. by monitoring or altering a card reader, or gaining possession of a card) could be used to retrieve some of the secret information stored in the card by means of a timing attack. Thus, cards that implement challenge response protocols where master keys are involved could give rise to a security problem.

New unanticipated strains of timing attacks might arise. Hence, timing attacks should be given some serious consideration. This work contributes, ultimately, to furthering our understanding of the strengths of the recently introduced timing attack technique, the weaknesses it exploits, and the ways of eliminating the possibility of it becoming practical.

Kocher implemented the attack against the Diffie-Hellman key exchange protocol. He also observed that timing attacks could potentially be used against other cryptosystems, in particular against the data encryption standard (DES). The latter claim is the motivation for this work.
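The fixed-exponent modular exponentiator of Fig. 1, and the key-dependent extra work that Kocher's attack exploits, can be sketched as follows. The function name and the multiplication counter are additions made for illustration; the counter merely stands in for the running time an eavesdropper would measure, and the sketch assumes k ≥ 0 and n ≥ 1.

```python
def mod_exp_with_count(y, k, n):
    """Square-and-multiply as in Fig. 1; also returns the number of conditional multiplications."""
    z = 1
    conditional_mults = 0
    for bit in bin(k)[2:]:        # k_l ... k_0, most significant bit first
        z = z * z % n             # always executed
        if bit == "1":
            z = z * y % n         # executed only when the key bit is set
            conditional_mults += 1
    return z, conditional_mults


if __name__ == "__main__":
    print(mod_exp_with_count(7, 13, 101))  # 13 = 1101 in binary, so three conditional multiplications
```

The number of conditional multiplications equals the number of bits of k that are set, which is the source of the running time differential discussed above.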
2 Summary of Results and Organization
We study the vulnerability of one of the most widely used cryptosystems in the world, DES, against a timing attack. The starting point of this work is the observation of Kocher [11] that in DES's key schedule generation process the time needed to shift around nonzero bits could be a source of non-fixed encryption running times. Hence, he conjectured that a timing attack against DES could reveal the Hamming weight of the key.² We show that although Kocher's observation is incorrect (for the DES implementations that we analyzed), his conjecture is true. But, we do more.

¹ Dynamic password generators are becoming widely used as a means of allowing authorized users to access remote computing systems.
² The Hamming weight of a bitstring equals the number of its bits that are nonzero.

In Sect. 3 we give a brief description of DES. In Sect. 4.1 we describe a timing attack against DES that assumes the attacker knows the target system's design characteristics. We first discuss experimental results that show that a computationally inexpensive timing attack against two implementations of DES would yield enough information to recover the Hamming weight of the DES key being used. Hence, assuming the DES keys are randomly chosen, an attacker can recover approximately 3.95 bits of key information.³ To the best of our knowledge, this is the first implementation of a timing attack against a symmetric cryptosystem.

We identify the sources of non-fixed running time in the implementations of DES that we studied. These were mostly due to conditional statements. We also establish a surprising fact. In both DES implementations that we analyzed the encryption time T is roughly equal to a linear function of the key's Hamming weight X plus some normally distributed noise e. Since a DES key is a bitstring of length 56, and keys are chosen uniformly at random in the key space, we have that $X \sim \mathrm{Binom}(56, 1/2)$.⁴ Thus, for some $\alpha$, $\beta$, and $\sigma$,

$$T = \alpha X + \beta + e, \qquad X \sim \mathrm{Binom}(56, 1/2), \qquad e \sim \mathrm{Norm}(0, \sigma^2).$$

³ The latter statement is a little bit too optimistic. See Remark 1 in Sect. 4.1 for a more thorough discussion.
⁴ Recall that the distribution $\mathrm{Binom}(N, p)$ corresponds to the distribution of the sum of N independent identically distributed {0, 1}-random variables with expectation p.

In Sect. 4.2 we show that it is not necessary, in order to perform a timing attack of DES, to assume that the design characteristics of the target system are known. Indeed, we propose two statistical methods whereby a passive eavesdropper can infer from timing measurements all the target system's design information required to successfully mount a timing attack against DES. To the best of our knowledge this is the first proof that it is possible to infer a target system's design characteristic through timing measurements.

We would like to stress that all of the timing attacks described in this work only require precise measurements of encryption times, but no knowledge of the encrypted plaintexts or produced ciphertexts.

In Sect. 5 we propose a blinding technique that can be used to eliminate almost all of the execution time differentials in the analyzed DES implementations. This blinding technique makes both DES implementations that we study impervious to the sort of timing attack we describe in this work. Finally, we discuss under which conditions all, and not only the Hamming weight of, a DES key might be recovered through a timing attack.

2.1 Related Work

Modern cryptography advocates the design of cryptosystems based on sound mathematical principles. Thus, many of the cryptosystems designed over the last two decades can be proved to resist many sophisticated, mathematically based, cryptanalytic techniques (provided one is willing to accept some reasonable assumptions). Traditionally, the cryptanalytic techniques used to attack such cryptosystems exploit the algorithmic design weaknesses of the cryptosystem. On the other hand, timing attacks take advantage of the decisions made when implementing the cryptosystems (specially those that produce non-fixed running times). But, timing attacks are not the only type of attacks that exploit the engineering aspects involved in the implementation of cryptosystems. Indeed, recently Boneh, Lipton, and DeMillo [5] introduced the concept of fault tolerant attacks. These attacks take advantage of (possibly induced) hardware faults. Boneh et al. point out that their attack shows the danger that hardware faults pose to various cryptographic protocols. They conclude that even sophisticated cryptographic schemes sealed inside tamper resistant devices might leak information about secret keys.

A new strain of fault tolerant attacks, differential fault analysis (DFA), was proposed by Biham and Shamir [4]. Their attack is applicable to almost any secret key cryptosystem proposed so far in the open literature. DFA works under various fault models and uses cryptanalytic techniques to recover the cryptographic secret information stored in tamper resistant devices. In particular, Biham and Shamir show that under the same hardware fault model considered by Boneh et al., the full DES key can be extracted from a sealed tamper resistant DES encryptor by analyzing between 40 and 200 ciphertexts generated from unknown but related plaintexts. Furthermore, in [4] techniques are developed to identify the keys of completely unknown ciphers sealed in tamper resistant devices.

The new type of attacks described above have received widespread attention (see for example [8,13]).
3 The Data Encryption Standard
DES is the most widely used cryptosystem in the world, specially among financial institutions. It was developed at IBM and adopted as a standard in 1977. It has been reviewed every five years since its adoption. DES has held up remarkably well against years of cryptanalysis. But, faster and cheaper processors allow, using current technology, to build a special purpose machine which at a reasonable price can recover a DES key within hours [20, pp. 82-83]. It is expected that in 1998 DES will cease to be a standard.

For concreteness's sake, we provide below a brief description of DES. A detailed description of DES is given in [1]. More easily accessible descriptions of DES can be found in [19,20].

DES is a symmetric or private key cryptosystem, i.e. it is a cryptosystem where the parties that wish to use it must agree in advance on a common secret key which must be kept private. DES encrypts a message (plaintext) bitstring of length 64 using a bitstring key of length 56 and obtains a ciphertext bitstring of length 64. It has three main stages. In the first stage, the bits of the plaintext are permuted according to a fixed initial permutation. In the second stage, 16 iterations of a certain function are successively applied to the bitstring resulting from the first stage. In the final stage, the inverse of the initial permutation is applied to the bitstring obtained in the second stage.

The strength of DES resides in the function that is iterated during the encryption process. For the discussion that follows it is only necessary to have a rough understanding of the iteration process. We now give a brief description of this iteration process. The input to iteration i is the output bitstring of iteration i − 1 and a 48 bit long string, $K_i$, computed as a function of the DES key. Actually, each $K_i$ is a permuted selection of bits from the DES key. The strings $K_1, \ldots, K_{16}$ comprise what is called the key schedule. During each iteration a 64 bit long output string is computed by applying a fixed rule to the two input strings. The encryption process is depicted in Fig. 2.

[Figure: Plaintext, Perm., encryption iterations 1, 2, ..., 16 fed by the key schedule derived from the key, Perm., Ciphertext.]

Fig. 2. DES encryption process.

Decryption is done using the same algorithm as encryption but using the key schedule in reverse order $K_{16}, \ldots, K_1$.

Several attacks on DES have been proposed in the literature. Among these are Biham and Shamir's differential cryptanalysis [2,3] and Matsui's linear cryptanalysis [14,15]. Differential cryptanalysis is, primarily, a chosen plaintext attack; linear cryptanalysis is a known plaintext attack. In a chosen plaintext attack the adversary chooses plaintext and obtains the corresponding ciphertext. In a known plaintext attack the adversary has a quantity of plaintext and corresponding ciphertext. In both of these attacks the adversary subsequently uses any information deduced in order to recover plaintext corresponding to previously unseen ciphertext. Since they require enormous numbers of chosen (known) plaintext pairs, differential and linear cryptanalysis are not, however, considered a threat to DES in practical environments (see [16, pp. 258-259]).
4 Timing Attack of DES
We now consider the problem of recovering the Hamming weight of the DES key of a target system by means of a timing attack. We first address the problem, in Sect. 4.1, assuming the attacker knows the design of the target system. We then show, in Sect. 4.2, that the latter assumption can be removed.
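To make the setting concrete, the following sketch simulates the linear timing model $T = \alpha X + \beta + e$ stated in Sect. 2 and recovers the Hamming weight X from averaged timings when $\alpha$ and $\beta$ are known. It is a simulation only: the parameter values (α = 4.0, β = 200.0, σ = 1.5, in arbitrary time units), the number of measurements, and the function names are hypothetical and are not the values measured for the implementations studied in this work.

```python
import random


def simulate_time(weight, alpha, beta, sigma, rng):
    """One simulated encryption time under T = alpha*X + beta + e, with e ~ Norm(0, sigma^2)."""
    return alpha * weight + beta + rng.gauss(0.0, sigma)


def estimate_weight(times, alpha, beta, key_bits=56):
    """Invert the linear model on the mean of the measurements and round to a valid weight."""
    mean_time = sum(times) / len(times)
    weight = round((mean_time - beta) / alpha)
    return max(0, min(key_bits, weight))


if __name__ == "__main__":
    rng = random.Random(1)                 # fixed seed, for repeatability of the illustration
    alpha, beta, sigma = 4.0, 200.0, 1.5   # hypothetical model parameters
    true_weight = 28
    times = [simulate_time(true_weight, alpha, beta, sigma, rng) for _ in range(64)]
    print(estimate_weight(times, alpha, beta))
```

Averaging repeated measurements reduces the standard deviation of the noise by the square root of the number of measurements, so the estimate is reliable once the averaged noise is small compared with α/2.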
4.1 Timing Characteristics of Two Implementations of DES
We studied the timing characteristics of two implementations of DES. The first one was obtained from the RSAEuro cryptographic toolkit [10], henceforth referred to as RSA-DES. The other implementation of DES that we looked at was one due to Louko [12], henceforth referred to as L-DES. We studied both implementations on a 120-MHz Pentium™ computer running MSDOS™. The advantage of working on an MSDOS™ environment is that it is a single process operating system. This facilitates carrying out timing measurements since there are no other interfering processes running and there are fewer operating system maintenance tasks being performed. We measured time in microseconds (μs). We were able to measure running times with an accuracy of up to 0.8 μs.

In our first experiment we fixed the input message to be the bitstring of length 64 all of whose bits are set to 0. For each $i \in \{0, \ldots, 56\}$, we randomly chose 32 keys of Hamming weight i. For each selected key we encrypted the message a total of 16 times. During each encryption we measured the time it took to generate the key schedule and the total time it took to encrypt the message. The plots, for each of the implementations that we looked at, of the average (for each key) encryption and key schedule generation times are shown in Fig. 3 and Fig. 4. Only obvious outliers were eliminated. In fact the only outliers that we noticed appeared at fixed intervals of $2^{16}$ clock ticks. These outliers were caused by system maintenance tasks. A randomly chosen DES key has a Hamming weight between 23 and 33 with probability approximately 0.86. Thus, the most relevant data points among the ones shown in Fig. 3 and Fig. 4 are those close to the middle of the plots.
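Two figures quoted in the text follow directly from the distribution $\mathrm{Binom}(56, 1/2)$ of the Hamming weight of a uniformly chosen key: the probability of about 0.86 that the weight lies between 23 and 33, and the roughly 3.95 bits of key information mentioned in Sect. 2, read here as the Shannon entropy of the weight (this reading is an assumption). The short check below uses only the standard library; the function names are made up for the example.

```python
from math import comb, log2


def weight_pmf(k, bits=56):
    """Probability that a uniformly random key of the given length has Hamming weight k."""
    return comb(bits, k) / 2 ** bits


def prob_weight_between(lo, hi, bits=56):
    """Probability that the Hamming weight lies in the closed range [lo, hi]."""
    return sum(weight_pmf(k, bits) for k in range(lo, hi + 1))


def weight_entropy(bits=56):
    """Shannon entropy, in bits, of the Hamming weight of a uniformly random key."""
    return -sum(p * log2(p) for p in (weight_pmf(k, bits) for k in range(bits + 1)))


if __name__ == "__main__":
    print(round(prob_weight_between(23, 33), 3))
    print(round(weight_entropy(), 3))
```

Both printed values should agree with the approximations quoted in the text to the precision given there.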
[Figure: Time plotted against Hamming weight of key, with two curves labelled Encryption and Key Schedule.]
Fig. 3. RSA-DES.
[Figure: Time plotted against Hamming weight of key, with two curves labelled Encryption and Key Schedule.]
Fig. 4. L-DES.
For various keys chosen at random, we performed $2^{16}$ time measurements (for each key) of the encryption and key schedule generation times. After discarding obvious outliers we graphed the empirical frequency distributions of the collected data. The empirical distributions we observed were roughly symmetric and concentrated in three different values. This concentration in three different values is due to the fact that we were only able to perform time measurements with an accuracy of up to 0.8 μs. The above suggests, as one would expect, that the variations in the running time observed, when the same process is executed many times over the same input, are due to the effect of normally distributed random noise.

For different values of $i \in \{8, \ldots, 48\}$ we randomly chose $2^{8}$ keys of Hamming weight $i$. After throwing away outliers we graphed the empirical frequency distributions of the collected data. The empirical frequencies observed looked like normal distributions with small deviations (typically 1.2 μs for L-DES and 1.8 μs for RSA-DES). We conclude that the variations in the encryption and key schedule generation times are mostly due to the total number of bits of the key that are set. Thus, the effect of which bits are set, among keys of the same Hamming weight, is negligible.

All the experiments described in this section were repeated, but, instead of leaving the input message fixed, a new randomly selected message was chosen at the start of each encryption process. All the results reported above remained (essentially) unchanged. There was only a negligible increase in the measured deviations.

The main sources of non-fixed running time, in both L-DES and RSA-DES, were conditional statements. In fact, in both DES implementations there
is a piece of code that is executed a number of times which is in direct proportion to the number of bits of the key that are set.

Assuming that the attacker knows the design of the target system, he can build on his own a table of the average encryption time versus the Hamming weight of the key. The clear monotonically increasing relation between the encryption time and the Hamming weight of the key elicited by our experiments is a significant implementation flaw. It allows an attacker to determine the Hamming weight of the DES key. Indeed, the attacker has to obtain a few encryption time measurements and look in the table he has built to determine the key's Hamming weight from which such time measurements could have come. Thus, the attacker can recover $H(wt(K)) \approx 3.95$ bits of key information ($H$ denotes the binary entropy function).

Remark 1. The exact Hamming weight of the DES key can be recovered by means of a timing attack if two conditions hold. First, accurate time measurements can be obtained. Second, the variations in the encryption and key schedule generation time produced by different keys with identical Hamming weight are small compared to the time variations produced by keys with one more or one less set bit. We have noticed that the latter condition approximately holds. An exact estimation of the Hamming weight of the DES key can be achieved if the attacker can accurately perform time measurements of several encryptions of the same plaintext. But, this implies a more active role on the part of the attacker.

More remarkable than the established monotonically increasing relation between the encryption times and the Hamming weight of the key is the surprisingly clear linear dependency that exists between the two measured quantities. The correlation factors for the data shown in Fig. 3 and Fig. 4 are 0.9760 and 0.9999 respectively.
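The table lookup and the 3.95-bit figure can be illustrated as follows. This is a minimal sketch under stated assumptions: $H(wt(K))$ is read as the Shannon entropy of the Hamming weight of a uniformly random 56-bit key, the timing table is made up (a linear relation with invented coefficients), and the function names are hypothetical.

```python
from math import comb, log2

KEY_BITS = 56


def weight_entropy():
    """Shannon entropy, in bits, of the Hamming weight of a uniform 56-bit key."""
    probs = [comb(KEY_BITS, w) / 2 ** KEY_BITS for w in range(KEY_BITS + 1)]
    return -sum(p * log2(p) for p in probs)


def guess_weight(measurements, table):
    """Return the weight whose tabulated mean time is closest to the observed mean.

    `table` maps Hamming weight -> average encryption time, as an attacker
    could build it on a copy of the target system (hypothetical layout).
    """
    observed = sum(measurements) / len(measurements)
    return min(table, key=lambda w: abs(table[w] - observed))


# Invented table for illustration only: time = 300 + 4 * weight.
table = {w: 300.0 + 4.0 * w for w in range(KEY_BITS + 1)}
guess = guess_weight([411.2, 412.9, 411.6], table)
entropy_bits = weight_entropy()
```

With the invented table, the three sample timings average to about 411.9 and the nearest tabulated entry is the one for weight 28.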
The sharp linear dependency between encryption times and Hamming weight allows an attacker to infer the information about the target system which is required to carry out the attack described above. This topic is discussed in the next section.
4.2
Derivation of the Timing Characteristics of the Target System
As discussed in Sect. 4.1, in both DES implementations that we studied the encryption time was roughly equal to a linear function of the key's Hamming weight plus some normally distributed random noise. In this section we exploit this fact in order to derive all the necessary information needed to perform a timing attack that reveals the Hamming weight of the target system's DES key.

First we need to introduce some notation. Assume we have $m$ measurements on the time it takes the target system to perform a DES encryption. The time measurements might correspond to encryptions performed under different DES keys. For $1 \le i \le k$, denote by $K_i$ the $i$-th key that is used by the target system during the period that timing measurements are performed. We make the realistic assumption that $K_1, \ldots, K_k$ are chosen at random in $\{0,1\}^{56}$ and independent of each other. Let $X^{(i)}$ denote the Hamming weight of key $K_i$. Thus
the distribution of $X^{(i)}$ is a $\mathrm{Binom}(56, 1/2)$. Since we are assuming that the $K_i$'s are chosen independently we have that $X^{(1)}, \ldots, X^{(k)}$ are independent random variables.

Note that successive time measurements can correspond to encryptions of the message under the same key. For $1 \le i \le k$, let $\lambda_i \in \{1, \ldots, m\}$ be the index of the last measurement corresponding to the encryption of the message performed with key $K_i$. For convenience's sake, let $\lambda_0 = 0$. Hence, $0 = \lambda_0 < \lambda_1 < \cdots < \lambda_{k-1} < \lambda_k = m$. Denote by $I_i$ the set of indices that correspond to time measurements under key $K_i$, i.e. for $1 \le i \le k$, let $I_i \stackrel{\mathrm{def}}{=} \{ n \in \mathbb{N} : \lambda_{i-1} < n \le \lambda_i \}$. For $1 \le i \le k$ and $j \in I_i$, let $T_j^{(i)}$ be the random variable representing the time it takes the target system to perform the $j$-th encryption of the message with key $K_i$. Finally, let $e_j^{(i)}$ be a random variable representing the effect of random noise on the $j$-th encryption, $j \in I_i$, of the message performed with key $K_i$. Thus, the $e_j^{(i)}$'s represent measurement inaccuracies and the target system's running time fluctuations.

We now have all the notation necessary to formally state the problem we want to address. Indeed, the linear dependency between the encryption time and the Hamming weight of the key in both DES implementations that we studied implies that there exist $\alpha$, $\beta$, and $\sigma$, such that for all $1 \le i \le k$ and $j \in I_i$

$$T_j^{(i)} = \alpha X^{(i)} + \beta + e_j^{(i)}, \qquad X^{(i)} \sim \mathrm{Binom}(56, 1/2), \qquad e_j^{(i)} \sim \mathrm{Norm}(0, \sigma^2). \qquad (1)$$
Our problem is to infer, from timing measurements, the parameters $\alpha$, $\beta$, and $\sigma$ for which (1) holds. We address two variations of this problem, depending on whether the $\lambda_i$'s (the indices of the last measurement taken under each key) are known. In the final version of this work we show how to handle the case where the $\lambda_i$'s are unknown. In the next section we show how to address the problem when the $\lambda_i$'s are known. The latter case is the most realistic one. Indeed, a standard cryptanalytic assumption is that the attacker knows the key management procedure of the target system.

Known $\lambda_i$'s. We propose two alternative statistical methods for deducing the parameters $\alpha$, $\beta$, and $\sigma$ for which (1) holds. One method is based on maximum likelihood estimators and the other one on asymptotically unbiased estimators. Since the following discussion heavily relies on standard concepts and results from probability and statistics we refer the reader unfamiliar with these subjects to [9,18,21] for background material and terminology.
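Maximum Likelihood Estimators: Let $X = ( X^{(i)} )_{i=1}^{k}$, $T^{(i)} = ( T_j^{(i)} )_{j \in I_i}$, and $T = ( T^{(i)} )_{i=1}^{k}$. Thus, $X$, $T^{(1)}, \ldots, T^{(k)}$, and $T$ denote random variables. Furthermore, let $x = ( x_i )_{i=1}^{k}$, $t^{(i)} = ( t_j^{(i)} )_{j \in I_i}$, and $t = ( t^{(i)} )_{i=1}^{k}$ be the actual values taken by $X$, $T^{(i)}$, and $T$ respectively. Note that the joint density function of $X$ and $T$ given $\alpha$, $\beta$, and $\sigma$ is

$$f_{X,T}(x, t \mid \alpha, \beta, \sigma) = \left( \frac{1}{2\pi\sigma^2} \right)^{m/2} \prod_{i=1}^{k} \frac{1}{2^{56}} \binom{56}{x_i} \exp\left( -\frac{1}{2\sigma^2} \sum_{j \in I_i} \left( t_j^{(i)} - (\alpha x_i + \beta) \right)^2 \right).$$

Since the $X^{(i)}$'s are independent random variables distributed according to a $\mathrm{Binom}(56, 1/2)$, the marginal distribution of $T$ given $\alpha$, $\beta$, and $\sigma$ is

$$f_{T}(t \mid \alpha, \beta, \sigma) = \left( \frac{1}{2\pi\sigma^2} \right)^{m/2} \prod_{i=1}^{k} E_{X^{(i)}}\left[ \exp\left( -\frac{1}{2\sigma^2} \sum_{j \in I_i} \left( t_j^{(i)} - (\alpha X^{(i)} + \beta) \right)^2 \right) \right]. \qquad (2)$$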
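The marginal likelihood (2) is easy to evaluate numerically even though it is hard to maximize in closed form. The sketch below, which is an illustration and not the authors' procedure, computes the logarithm of (2) by summing over the 57 possible Hamming weights of each key. All function names and the simulated parameter values are assumptions made for the example.

```python
import math
import random

KEY_BITS = 56
# Logarithm of the Binom(56, 1/2) probability mass function, precomputed once.
LOG_PMF = [math.log(math.comb(KEY_BITS, x)) - KEY_BITS * math.log(2)
           for x in range(KEY_BITS + 1)]


def log_likelihood(groups, alpha, beta, sigma):
    """Logarithm of (2); groups[i] holds the timings measured under key K_i."""
    m = sum(len(times) for times in groups)
    total = -0.5 * m * math.log(2 * math.pi * sigma ** 2)
    for times in groups:
        # One term of the expectation over X^(i) per candidate weight x.
        terms = [LOG_PMF[x]
                 - sum((t - (alpha * x + beta)) ** 2 for t in times) / (2 * sigma ** 2)
                 for x in range(KEY_BITS + 1)]
        peak = max(terms)  # log-sum-exp trick to avoid underflow
        total += peak + math.log(sum(math.exp(v - peak) for v in terms))
    return total


def simulate(k, n, alpha, beta, sigma, rng):
    """Draw n noisy timings for each of k random keys, following model (1)."""
    groups = []
    for _ in range(k):
        weight = bin(rng.getrandbits(KEY_BITS)).count("1")
        groups.append([alpha * weight + beta + rng.gauss(0, sigma) for _ in range(n)])
    return groups


rng = random.Random(1)
data = simulate(k=50, n=4, alpha=5.0, beta=200.0, sigma=1.5, rng=rng)
ll_true = log_likelihood(data, 5.0, 200.0, 1.5)  # at the generating parameters
ll_off = log_likelihood(data, 6.0, 200.0, 1.5)   # with a wrong slope
```

A generic numerical optimizer could be applied to `log_likelihood` to search for the maximizing parameters; on the simulated data the generating parameters score higher than the perturbed ones.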
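For a fixed collection of time measurements $t$, the values of $\alpha$, $\beta$, and $\sigma$ that maximize (2) are the maximum likelihood estimators sought. As is often the case when dealing with maximum likelihood estimators, it is difficult to solve explicitly for them. See [21, Ch. 5, §2] for a discussion of computational routines that can be used to calculate maximum likelihood estimators.

The advantage of the above described approach for determining the parameters relevant for carrying out the timing attack is that it uses all the available timing measurements. But, it does not allow us to determine how many measurements are sufficient in order to obtain accurate estimations of the parameters sought. The alternative approach described below solves this problem.

Asymptotic Estimators: Our goal is to find good estimators $\hat\alpha$, $\hat\beta$, and $\hat\sigma$ for $\alpha$, $\beta$, and $\sigma$. Moreover, we are interested in determining the asymptotic (on the number of timing measurements) behavior of such estimators. In particular, their asymptotic distributions, their limiting values, and their rate of convergence.

We start with a key observation. Since the expectation and variance of a $\mathrm{Binom}(56, 1/2)$ are 28 and 14 respectively, taking the expectation and variance in (1) yields that for all $1 \le i \le k$ and $j \in I_i$

$$\mu_T \stackrel{\mathrm{def}}{=} E\left[ T_j^{(i)} \right] = 28\alpha + \beta, \qquad \sigma_T^2 \stackrel{\mathrm{def}}{=} V\left[ T_j^{(i)} \right] = 14\alpha^2 + \sigma^2.$$

Hence, the natural candidates for $\hat\alpha$ and $\hat\beta$, given good estimators $\hat\mu_T$, $\hat\sigma_T^2$, and $\hat\sigma^2$ for $\mu_T$, $\sigma_T^2$, and $\sigma^2$ respectively, are

$$\hat\alpha \stackrel{\mathrm{def}}{=} \frac{1}{\sqrt{14}} \left( \hat\sigma_T^2 - \hat\sigma^2 \right)^{1/2}, \qquad \hat\beta \stackrel{\mathrm{def}}{=} \hat\mu_T - 28\hat\alpha.$$

We now provide candidates for $\hat\mu_T$, $\hat\sigma_T^2$, and $\hat\sigma^2$. First we need to introduce additional notation. Let $\bar T^{(i)} \stackrel{\mathrm{def}}{=} \left( \sum_{j \in I_i} T_j^{(i)} \right) / |I_i|$, let $\bar T \stackrel{\mathrm{def}}{=} \left( \sum_{i=1}^{k} \bar T^{(i)} \right) / k$, and let $\bar e^{(i)} \stackrel{\mathrm{def}}{=} \left( \sum_{j \in I_i} e_j^{(i)} \right) / |I_i|$. Define $\hat\mu_T \stackrel{\mathrm{def}}{=} \bar T$, and

$$\hat\sigma_T^2 \stackrel{\mathrm{def}}{=} \frac{1}{k} \sum_{i=1}^{k} \frac{1}{|I_i|} \sum_{j \in I_i} \left( T_j^{(i)} - \bar T \right)^2, \qquad \hat\sigma^2 \stackrel{\mathrm{def}}{=} \frac{1}{k} \sum_{i=1}^{k} \frac{1}{|I_i|} \sum_{j \in I_i} \left( T_j^{(i)} - \bar T^{(i)} \right)^2.$$

Note that

$$\hat\sigma_T^2 - \hat\sigma^2 = \frac{1}{k} \sum_{i=1}^{k} \left( \bar T^{(i)} - \bar T \right)^2 \ge 0. \qquad (3)$$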
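The estimators just defined translate directly into code. The following sketch is an illustration under assumptions: the timings are simulated from model (1) with invented parameter values (slope 5, intercept 200, noise deviation 1.5), and the function name is hypothetical.

```python
import math
import random

KEY_BITS = 56


def estimate_parameters(groups):
    """Return (alpha_hat, beta_hat, sigma_hat); groups[i] holds timings under key K_i."""
    k = len(groups)
    group_means = [sum(g) / len(g) for g in groups]      # per-key means
    grand_mean = sum(group_means) / k                    # estimator of mu_T
    var_total = sum(sum((t - grand_mean) ** 2 for t in g) / len(g)
                    for g in groups) / k                 # estimator of sigma_T^2
    var_noise = sum(sum((t - mean) ** 2 for t in g) / len(g)
                    for g, mean in zip(groups, group_means)) / k  # estimator of sigma^2
    # By (3) the difference is non-negative; max() only guards against rounding.
    alpha_hat = math.sqrt(max(0.0, var_total - var_noise) / 14)
    beta_hat = grand_mean - 28 * alpha_hat
    return alpha_hat, beta_hat, math.sqrt(var_noise)


# Simulated measurements: 2000 random keys, 8 timings per key (made-up values).
rng = random.Random(7)
groups = []
for _ in range(2000):
    weight = bin(rng.getrandbits(KEY_BITS)).count("1")
    groups.append([5.0 * weight + 200.0 + rng.gauss(0, 1.5) for _ in range(8)])

alpha_hat, beta_hat, sigma_hat = estimate_parameters(groups)
```

Because the noise estimator divides by $|I_i|$ rather than $|I_i| - 1$, it is slightly biased downwards for small groups, which is visible in `sigma_hat` when only a few timings per key are available.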
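We henceforth assume that $|I_1| = \cdots = |I_k| = n$. Note that $\bar T^{(i)} = \alpha X^{(i)} + \beta + \bar e^{(i)}$. Since $X^{(i)} \sim \mathrm{Binom}(56, 1/2)$ and $\bar e^{(i)} \sim \mathrm{Norm}(0, \sigma^2/n)$, it follows that $\bar T^{(i)} \approx \mathrm{Norm}\left( \mu_T, 14\alpha^2 + \frac{\sigma^2}{n} \right)$ and $\bar T \approx \mathrm{Norm}\left( \mu_T, \frac{14\alpha^2}{k} + \frac{\sigma^2}{kn} \right)$. Recall that the sum of the squares of $l$ independent identically distributed normal variables with zero mean and variance equal to 1 is distributed according to a $\chi^2_l$. Hence, a classical result from statistics and (3) imply that

$$\hat\sigma_T^2 - \hat\sigma^2 \approx \frac{1}{k} \left( 14\alpha^2 + \frac{\sigma^2}{n} \right) \chi^2_{k-1}.$$

It follows that for $n$ and $k$ sufficiently large, if $W_1, \ldots, W_{k-1} \sim \mathrm{Norm}(0, 1)$ then

$$\sqrt{k} \left( \frac{\hat\sigma_T^2 - \hat\sigma^2}{14\alpha^2} - 1 \right) \approx \frac{1}{\sqrt{k-1}} \sum_{i=1}^{k-1} \left( W_i^2 - 1 \right).$$

Since $V\left[ W_1^2 \right] = 3$, for $n$ and $k$ sufficiently large the central limit theorem implies that $\sqrt{k} \left( \hat\sigma_T^2 - \hat\sigma^2 - 14\alpha^2 \right) \approx \mathrm{Norm}\left( 0, 3(14\alpha^2)^2 \right)$. Equivalently, $\hat\alpha^2 - \alpha^2 \approx \mathrm{Norm}\left( 0, \frac{3\alpha^4}{k} \right)$. Moreover, by the law of large numbers, almost surely

$$\hat\alpha = \frac{1}{\sqrt{14}} \left( \hat\sigma_T^2 - \hat\sigma^2 \right)^{1/2} \longrightarrow \frac{1}{\sqrt{14}} \left( \sigma_T^2 - \sigma^2 \right)^{1/2} = \alpha.$$

The latter two facts, and since $\hat\alpha - \alpha = (\hat\alpha^2 - \alpha^2) / (\hat\alpha + \alpha)$, imply that $\hat\alpha - \alpha \approx \mathrm{Norm}\left( 0, \frac{3\alpha^2}{4k} \right)$. Hence, $\hat\beta \approx \mathrm{Norm}\left( \beta, \frac{\alpha^2}{k} (3 \cdot 14^2 + 14) \right)$.

Summarizing, if $|I_1| = \cdots = |I_k| = n$ and $k$ are sufficiently large,

$$\hat\alpha \approx \mathrm{Norm}\left( \alpha, \frac{3\alpha^2}{4k} \right), \qquad \hat\beta \approx \mathrm{Norm}\left( \beta, \frac{602\,\alpha^2}{k} \right).$$

Hence, $O(1/\varepsilon^2)$ time measurements suffice to approximate $\alpha$ and $\beta$ to within $\varepsilon$.

5
Final Comments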
In [11] a blinding technique, similar to that used for blind signatures [6], is proposed in order to prevent a timing attack against a modular exponentiator. For both DES implementations we studied, blinding techniques can be adapted to produce (almost) fixed running time for the key schedule generation processes.⁵ We now describe this adaptation. Let $K$ be the DES key whose key schedule we want to generate. Let $K'$ be a bitstring of length 56 generated as follows: randomly choose half of the bits of $K$ which are set to 1 (resp. 0) and set the corresponding bits of $K'$ to 0 (resp. 1). Note that the Hamming weights of $K'$ and $K \oplus K'$ (the bitwise xor of $K$ and $K'$) are 28. Modify the key schedule generation processes so they first generate the key schedules for keys $K'$ and $K \oplus K'$. Let $K'_1, \ldots, K'_{16}$ and $K''_1, \ldots, K''_{16}$ be the key schedules obtained for $K'$ and $K \oplus K'$ respectively. Recall that $K'_i$ (resp. $K''_i$) is a permuted selection of bits from the key $K'$ (resp. $K \oplus K'$). Thus, the key schedule of $K$ is $K'_1 \oplus K''_1, \ldots, K'_{16} \oplus K''_{16}$. Figure 5 plots the encryption times of RSA-DES and its previously explained modification. Note the very clear reduction in time differentials. The reduction is achieved at the expense of increasing the encryption time by a factor of approximately 1.6.

⁵ Recall that the key schedule generation process was responsible for all of the observed differentials of RSA-DES and most of the time differentials of L-DES.
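The key-splitting step can be sketched as follows. This is an illustration, not the modified RSA-DES code: keys are 56-bit integers, bits of the key that are not selected for flipping are assumed to be copied unchanged into the blinding string (the text leaves this implicit), and `select_bits` uses an arbitrary list of positions rather than the real DES permuted-choice tables.

```python
import random

KEY_BITS = 56


def blind_key(key, rng=random):
    """Split `key` into (k_prime, key ^ k_prime), both of Hamming weight about 28.

    Half of the set bits of `key` are cleared and half of its clear bits are
    set; the remaining bits are copied unchanged (an assumption of this sketch).
    """
    ones = [i for i in range(KEY_BITS) if (key >> i) & 1]
    zeros = [i for i in range(KEY_BITS) if not (key >> i) & 1]
    flipped = rng.sample(ones, len(ones) // 2) + rng.sample(zeros, len(zeros) // 2)
    k_prime = key
    for i in flipped:
        k_prime ^= 1 << i
    return k_prime, key ^ k_prime


def select_bits(key, positions):
    """Permuted selection: output bit j is input bit positions[j]."""
    out = 0
    for j, pos in enumerate(positions):
        out |= ((key >> pos) & 1) << j
    return out


rng = random.Random(3)
key = (1 << 10) - 1                            # example key of Hamming weight 10
k_prime, k_xor = blind_key(key, rng)
positions = rng.sample(range(KEY_BITS), 48)    # hypothetical selection, not PC-2
round_key = select_bits(k_prime, positions) ^ select_bits(k_xor, positions)
```

Since a permuted selection is linear over xor, `round_key` equals `select_bits(key, positions)`, while the two selections actually computed always operate on strings of weight 28 when the weight of the key is even (28 and 27 when it is odd).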
[Figure: Time plotted against Hamming weight of key, with two curves labelled Blinded RSA-DES and RSA-DES.]
Fig. 5. RSA-DES and modified RSA-DES encryption times.
Finally, we address the question of whether a timing attack can find all of the DES key and not only its Hamming weight. Although we did not succeed in tuning the timing attack technique in order to recover all the bits of a DES key, we identified, in L-DES, a source of non-fixed running time that is not due to the key generation process. Indeed, the difference in the slopes of the curves plotted in Fig. 4 shows that the encryption time, not counting the key generation process, depends on the key used. This fact is a weakness that could (potentially) be exploited in order to recover all of the DES key. It opens the possibility that the time it takes to encrypt a message $M$ with a key $K$ is a non-linear function of both $M$ and $K$, e.g. it is a monotonically increasing function in the Hamming weight of $M \oplus K$. The latter would allow a timing attack to recover a DES key by carefully choosing the messages to be encrypted. We were not able to identify clear sources of non-linear dependencies between time differentials and the inputs to the DES encryption process in any of the DES implementations that we studied. Nevertheless, we feel that the partial information leaked by both implementations of DES that we analyzed suggests that care must be taken in the implementation of DES; otherwise, all of the key could be compromised through a timing attack.
Acknowledgments. We are grateful to Shang-Hua Teng for calling to our attention the work of Kocher. We thank Raul Gouet, Luis Mateu, Alejandro Murua, and Jaime San Martin for helpful discussions. We also thank Paul Kocher for advice on how to measure running times accurately in an MSDOS™ environment.
References
1. Data encryption standard (DES), 1977. National Bureau of Standards FIPS Publication 46.
2. E. Biham and A. Shamir. Differential cryptanalysis of DES-like cryptosystems. Journal of Cryptology, 4:3-72, 1991.
3. E. Biham and A. Shamir. Differential cryptanalysis of the full 16-round DES. In E. F. Brickell, editor, Advances in Cryptology - CRYPTO'92, number 740 in Lecture Notes in Computer Science, pages 494-502, Santa Barbara, California, 1993. Springer-Verlag.
4. E. Biham and A. Shamir. Differential fault analysis of secret key cryptosystems. Technical Report CS0910, Technion, Computer Science Department, 1997.
5. D. Boneh, R. A. Demillo, and R. J. Lipton. On the importance of checking cryptographic protocols for faults. In Advances in Cryptology - EUROCRYPT'97, Lecture Notes in Computer Science, pages 37-51. Springer-Verlag, 1997.
6. D. Chaum. Blind signatures for untraceable payments. In D. Chaum, R. L. Rivest, and A. T. Sherman, editors, Advances in Cryptology - CRYPTO'82, pages 199-203, Santa Barbara, California, 1983. Plenum Press.
7. W. Diffie and M. E. Hellman. New directions in cryptography. IEEE Transactions on Information Theory, IT-22(6):644-654, Nov 1976.
8. E. English and S. Hamilton. Network security under siege. The timing attack. Computer, 30(5), March 1996.
9. W. Feller. An introduction to probability theory and its applications, volume I & II. John Wiley & Sons, Inc., 1966. Second printing.
10. J. S. A. Kapp. RSAEuro: A cryptographic toolkit, 1996. Version 1.04 Internet Release Distribution.
11. P. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS, and other systems. In N. Koblitz, editor, Advances in Cryptology - CRYPTO'96, number 1109 in Lecture Notes in Computer Science, pages 104-113, Santa Barbara, California, 1996. Springer-Verlag.
12. A. Louko. DES package, 1992. Version 2.1 (available via FTP from kampi.hut.fi).
13. J. Markoff. Potential flaw seen in cash card security, September 26 1996. New York Times.
14. M. Matsui. The first experimental cryptanalysis of the data encryption standard. In Y. G. Desmedt, editor, Advances in Cryptology - CRYPTO'94, number 839 in Lecture Notes in Computer Science, pages 1-11, Santa Barbara, California, 1994. Springer-Verlag.
15. M. Matsui. Linear cryptanalysis method for DES cipher. In T. Helleseth, editor, Advances in Cryptology - EUROCRYPT'93, number 765 in Lecture Notes in Computer Science, pages 386-397, Lofthus, Norway, 1994. Springer-Verlag.
16. A. J. Menezes, P. C. van Oorschot, and S. A. Vanstone. Handbook of Applied Cryptography. CRC Press, first edition, 1997.
17. R. L. Rivest, A. Shamir, and L. M. Adleman. A method for obtaining digital signatures and public key cryptosystems. Comm. of the ACM, 21:120-126, 1978.
18. S. Ross. A first course in probability. Macmillan Pub. Comp., third edition, 1988.
19. B. Schneier. Applied Cryptography: Protocols, algorithms and source code in C. John Wiley & Sons, Inc., second edition, 1996.
20. D. R. Stinson. Cryptography, Theory and Practice. CRC Press, first edition, 1995.
21. S. Zacks. The theory of statistical inference. John Wiley & Sons, Inc., 1971.
Spectral Techniques in Graph Algorithms (Invited Paper)
Noga Alon
Department of Mathematics, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv, Israel.
[email protected]. l
Abstract. The existence of efficient algorithms to compute the eigenvectors and eigenvalues of graphs supplies a useful tool for the design of various graph algorithms. In this survey we describe several algorithms based on spectral techniques focusing on their performance for randomly generated input graphs.
1
Introduction
Graph bisection, graph coloring and finding the independence number of a graph are three well studied algorithmic problems. All of them are NP-hard, and even the task of solving any of them approximately cannot be done in polynomial time under the common assumptions in Complexity Theory. It is possible, however, to develop efficient algorithms that solve these problems for almost all graphs in appropriately defined classes. Such algorithms are desirable, since all three problems arise often in practice, where one might hope that the input instances are not necessarily worst case examples. Spectral techniques, based on the eigenvalues and the eigenvectors of the adjacency or the Laplace matrices of graphs, appear to be very successful in the design of such algorithms, and can provably solve the above problems for various classes of randomly generated graphs, where all previous techniques failed. The analysis of the performance of algorithms for random graphs has gained popularity recently (see [26] and its many references), and it seems to provide a useful measure for the behaviour of algorithmic techniques.

In this paper we describe the relevance of spectral techniques to the above mentioned problems, discuss the algorithms and study their performance. This is mostly a survey paper and hence the focus here is on the underlying ideas and not on the detailed proofs which can be found in the relevant references.

The adjacency matrix of a graph $G = (V, E)$ is the matrix $A = (a_{uv})_{u,v \in V}$, in which $a_{uv} = 1$ if $uv \in E$ and $a_{uv} = 0$ otherwise. The Laplace matrix of $G$ is $Q = D - A$, where $D = (d_{uv})_{u,v \in V}$ is the diagonal matrix in which $d_{uu}$ is the degree $d(u)$ of $u$ in $G$ and $d_{uv} = 0$ for all $u \ne v$. Both matrices above are symmetric and hence have real eigenvalues and an orthonormal basis of eigenvectors. The Laplace matrix is easily seen to be positive semi-definite and hence its eigenvalues are nonnegative. It is well known that there is a tight relation between the eigenvalues of $A$ and $Q$ and several structural properties of the graph $G$, and it is natural to expect that this fact, and the fact that all eigenvalues and eigenvectors can be computed efficiently (see, e.g., [44]), may be useful in the design of efficient algorithms. In the following sections we demonstrate this approach in the study of several algorithmic questions.

Invited Lecturer. Research supported in part by a USA-Israeli BSF grant, by a grant from the Israel Science Foundation and by the Hermann Minkowski Minerva Center for Geometry at Tel Aviv University.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 206-215, 1998. © Springer-Verlag Berlin Heidelberg 1998
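The two matrices are simple to build explicitly. The sketch below, with hypothetical function names and a small made-up graph, constructs $A$ and $Q = D - A$ and checks the identity $x^{T} Q x = \sum_{uv \in E} (x_u - x_v)^2$, which is the standard reason the Laplace matrix is positive semi-definite.

```python
def adjacency_matrix(n, edges):
    """Adjacency matrix of a simple graph on vertices 0..n-1."""
    a = [[0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1
    return a


def laplace_matrix(a):
    """Laplace matrix Q = D - A, where D is the diagonal matrix of degrees."""
    n = len(a)
    return [[(sum(a[u]) if u == v else 0) - a[u][v] for v in range(n)]
            for u in range(n)]


def quadratic_form(q, x):
    """Return x^T Q x."""
    n = len(q)
    return sum(x[u] * q[u][v] * x[v] for u in range(n) for v in range(n))


edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]   # made-up example graph
q = laplace_matrix(adjacency_matrix(4, edges))
x = [1.0, -2.0, 0.5, 3.0]
lhs = quadratic_form(q, x)
rhs = sum((x[u] - x[v]) ** 2 for u, v in edges)
```

Every row of `q` sums to zero, so the all-ones vector is an eigenvector with eigenvalue 0, and `lhs` equals `rhs` for any choice of `x`.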
2
Expansion and Bisection
The vertex expansion $c_V(G)$ of a graph $G = (V, E)$ on $n$ vertices is

$$c_V(G) = \min_{X \subset V,\ |X| \le n/2} \frac{|N(X) - X|}{|X|},$$

where $N(X) = \{ y \in V : xy \in E \text{ for some } x \in X \}$ is the set of all neighbors of $X$ in $G$. The edge expansion $c_E(G)$ of $G$ is defined by

$$c_E(G) = \min_{X \subset V,\ |X| \le n/2} \frac{e(X, V - X)}{|X|},$$

where $e(X, V - X)$ is the number of edges with one end in $X$ and another end in its complement $V - X$. A bisection of $G = (V, E)$ is a partition of its set of vertices into two equal parts. The size of the bisection is the number of edges with one end in each part.

The problem of computing the vertex expansion or edge expansion of a graph and that of finding the minimum size of a bisection in it are useful for tackling various problems in VLSI design, and thus received a considerable amount of attention. The task of finding an optimal solution to any of these problems is NP-hard, and there is no known polynomial time algorithm that approximates any of these quantities up to a constant factor. In fact, it is known that if, as is widely believed, the complexity classes P and NP differ, then even the problems of approximating quantities related to the above three are NP-hard (see [12]). Leighton and Rao [38] designed a polynomial time algorithm that approximates $c_E(G)$, for any $n$-vertex graph $G$, up to a multiplicative factor of $\log n$.

The known results concerning the tight relation between the expansion properties of a graph and its spectral properties provide some (very rough) efficient approximation algorithms for vertex expansion and edge expansion, which are based on eigenvalue bounds. In particular, by the results in [21], [6] and [1], for every graph $G$ with maximum degree $d$ in which $\lambda$ is the second smallest Laplace eigenvalue,

$$\Omega\left( \frac{\lambda}{d} \right) \le c_V(G) \le O\left( \sqrt{\lambda} \right).$$
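For very small graphs the two expansion parameters defined above can be computed by exhaustive search, which makes the definitions concrete. The sketch below is a brute-force illustration only (exponential in $n$); the function name and the example graph are assumptions, and the minimum is taken over nonempty sets $X$.

```python
from itertools import combinations


def expansions(n, edges):
    """Return (c_V, c_E) by brute force over all X with 1 <= |X| <= n/2."""
    neighbors = {v: set() for v in range(n)}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    c_v = c_e = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            x = set(subset)
            boundary = set().union(*(neighbors[v] for v in x)) - x   # N(X) - X
            cut = sum(1 for u, v in edges if (u in x) != (v in x))   # e(X, V - X)
            c_v = min(c_v, len(boundary) / size)
            c_e = min(c_e, cut / size)
    return c_v, c_e


# Example: the cycle on 6 vertices.
cycle_edges = [(i, (i + 1) % 6) for i in range(6)]
c_v, c_e = expansions(6, cycle_edges)
```

On the 6-cycle both minima are attained by three consecutive vertices, which have two outside neighbors and two leaving edges, so both parameters equal 2/3.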
Similarly, by the results of [21], [6] and [48], for every graph $G$ as above, with $\lambda$ denoting its second smallest Laplace eigenvalue and $d$ its maximum degree,

$$\frac{\lambda}{2} \le c_E(G) \le \sqrt{2 d \lambda}.$$

Donath and Hoffman [19] were the first to suggest using spectral techniques for graph partitioning. Their work, that of Fiedler [21], and substantial experimental work (see, e.g., [42], [47]) demonstrated that this is indeed a very good heuristic. See also [17] for some related results. The basic idea in the Donath-Hoffman algorithm as well as in its more recent variants is that the eigenvector corresponding to the second smallest eigenvalue of the Laplace matrix of the graph provides some information that can be used to find a good partition of the graph. Indeed, in the extreme case that the graph consists of two connected components of equal size, this eigenvector is constant on each component, and since we always may assume that the eigenvector of the smallest eigenvalue is the all 1 vector, and that the eigenvectors are orthogonal, this means that in the above extreme case the sign of the coordinates of the second eigenvector provides the desired partition. It is thus natural to expect that even in less trivial cases some information about a good bisection can be deduced from the coordinates of the second eigenvector.

Random graphs were initiated by Erdős and Rényi [20], and their extensive study (see, e.g., the comprehensive book of Bollobás [14] and its many references) motivated the investigation of the performance of heuristic algorithms for input graphs generated randomly. Boppana [16] showed that a variant of the basic spectral technique finds, with high probability, the minimum bisection in a random graph with $n$ vertices, $m$ edges and bisection of size $b$, provided

$$0 \le b \le \frac{m}{2} - 5 \sqrt{m n \log n}.$$

Therefore, these techniques work provably well on appropriately randomly generated input graphs.
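The basic sign-based heuristic described above can be sketched in a few lines. This is a toy illustration, not the Donath-Hoffman algorithm or Boppana's variant: the second eigenvector is approximated by plain power iteration instead of a proper eigensolver, the function name is hypothetical, and the example graph (two 4-cliques joined by one edge) is made up. The start vector is assumed not to be orthogonal to the sought eigenvector.

```python
def second_eigenvector(n, edges, iterations=500):
    """Approximate the eigenvector of the second smallest Laplace eigenvalue.

    Runs power iteration on shift*I - Q while projecting out the all-ones
    vector, so the dominant remaining direction is the one sought.
    """
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    shift = 2 * max(degree)                       # at least the largest eigenvalue of Q
    x = [i - (n - 1) / 2 for i in range(n)]       # start vector with zero sum
    for _ in range(iterations):
        qx = [degree[u] * x[u] for u in range(n)]
        for u, v in edges:
            qx[u] -= x[v]
            qx[v] -= x[u]
        y = [shift * x[u] - qx[u] for u in range(n)]
        mean = sum(y) / n
        y = [value - mean for value in y]         # stay orthogonal to the all-ones vector
        norm = sum(value * value for value in y) ** 0.5
        x = [value / norm for value in y]
    return x


def clique(vertices):
    return [(u, v) for i, u in enumerate(vertices) for v in vertices[i + 1:]]


# Two 4-cliques joined by the single edge (3, 4).
graph_edges = clique([0, 1, 2, 3]) + clique([4, 5, 6, 7]) + [(3, 4)]
vec = second_eigenvector(8, graph_edges)
positive_side = {v for v in range(8) if vec[v] > 0}
```

On this example the signs of the coordinates separate the two cliques, which is also the minimum bisection (a single crossing edge).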
More recently, Spielman and Teng showed in [49] that by partitioning the vertices of a planar, bounded-degree graph on $n$ vertices according to the coordinates of the eigenvector of its second smallest Laplace eigenvalue, one obtains a cut for which the ratio between the number of edges and the number of vertices in the smaller side is $O(1/\sqrt{n})$. They also obtained similar results for other classes of graphs. It is well known that every bounded degree planar graph has such a separator, by the Lipton-Tarjan separator theorem [39], which also provides a linear time algorithm for finding such a cut. Although the results of [49] do not supply a better algorithm for planar graphs, they do provide insight into the behaviour of spectral techniques in partitioning algorithms and show that in some cases these techniques are provably useful.
3
Coloring
The chromatic number $\chi(G)$ of a graph $G$ is the minimum number of colors needed to color the vertices of $G$ so that adjacent vertices have distinct colors. The problem of determining or estimating this parameter has received a considerable amount of attention in Combinatorics and in Theoretical Computer Science, as several scheduling problems are naturally formulated as graph
coloring problems. It is well known (see [31,27]) that the problem of properly coloring a graph of chromatic number $k$ with $k$ colors is NP-hard, even for any fixed $k \ge 3$. Moreover, even the problem of approximating the chromatic number of an $n$ vertex graph up to an $n^{c}$ multiplicative factor, for an appropriate positive $c$, is NP-hard, as shown by Lund and Yannakakis [40]. In fact, if NP does not have efficient randomized algorithms, then there is no polynomial time algorithm for approximating the chromatic number of an $n$ vertex graph up to a factor of $n^{1-\epsilon}$, for any fixed $\epsilon > 0$, as proved in [24], using the result of [28]. In addition, from the results in [2] it follows that the chromatic number of an $n$ vertex graph cannot be approximated up to a factor of $n/(\log n)^{7}$ by a monotone polynomial size circuit.

Several sophisticated polynomial time algorithms provide some very rough estimates for the chromatic number of a graph. Despite a lot of effort (see [35], [13] and their references) there is no known polynomial time algorithm that finds a proper coloring of a 3-colorable graph on $n$ vertices by less than, say, $n^{1/5}$ colors. There are, however, several results that indicate that the spectral properties of a graph provide some information on its chromatic number. If $G$ is a graph on $n$ vertices with (adjacency matrix) eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n$ then, as proved by Hoffman [29], $\chi(G) \ge 1 - \lambda_1/\lambda_n$. Similarly, as proved by Wilf (cf., e.g., [37]), $\chi(G) \le 1 + \lambda_1$.

Despite the difficulties in coloring graphs efficiently in the worst case, various researchers noticed that random $k$-colorable graphs are usually easy to color optimally. Polynomial time algorithms that optimally color random $k$-colorable graphs for every fixed $k$ with high probability have been developed by Kučera [33], by Turner [50] and by Dyer and Frieze [18], where the last paper provides an algorithm whose average running time over all $k$-colorable graphs on $n$ vertices is polynomial.
Note, however, that most $k$-colorable graphs are quite dense, and hence easy to color. In fact, in a typical $k$-colorable graph, the number of common neighbors of any pair of vertices with the same color exceeds considerably that of any pair of vertices of distinct colors, and hence a simple coloring algorithm based on this fact already works with high probability. It is more difficult to color sparser random $k$-colorable graphs. A precise model for generating sparse random $k$-colorable graphs is described next, and the sparsity in it is governed by a parameter $p$ that specifies the edge probability. There are, in fact, several possible models, but since most of them are equivalent for our purpose here we focus on one. Here is its description. Let $V$ be a fixed set of $kn$ labelled vertices. For a real $p = p(n)$, let $G_{kn,p,k}$ be the random graph on the set of vertices $V$ obtained as follows: first, split the vertices of $V$ arbitrarily into $k$ color classes $W_1, \ldots, W_k$, each of cardinality $n$. Next, for each $u$ and $v$ that lie in distinct color classes, choose $uv$ to be an edge, randomly and independently, with probability $p$. The input to the coloring algorithm is now a graph $G_{kn,p,k}$ obtained as above, and the algorithm succeeds in coloring it if it finds a proper $k$-coloring. The interesting case is a fixed value of $k \ge 3$ and large $n$, and since the case $k = 3$ is similar to the more general one of
arbitrarily fixed $k$ we discuss mainly this special case. We say that an algorithm colors $G_{kn,p,k}$ almost surely if the probability that a randomly chosen graph as above is properly colored by the algorithm tends to one as $n$ tends to infinity. Petford and Welsh [43] suggested a randomized heuristic for 3-coloring random 3-colorable graphs and supplied experimental evidence that it works for most edge probabilities. Blum and Spencer [15] (see also [11] for some related results) designed a polynomial algorithm and proved that it colors optimally, with high probability, random 3-colorable graphs on $n$ vertices with edge probability $p$ provided $p \ge n^{\epsilon}/n$, for some arbitrarily small but fixed $\epsilon > 0$. Their algorithm is based on a path counting technique, and can be viewed as a natural generalization of the simple algorithm based on counting common neighbors (that counts paths of length 2), mentioned above.

In [3] the authors designed a polynomial time algorithm that works for sparser random 3-colorable graphs, which is based on spectral techniques. If the edge probability $p$ satisfies $p \ge c/n$, where $c$ is a sufficiently large absolute constant, the algorithm colors optimally the corresponding random 3-colorable graph with high probability. This settled a problem of Blum and Spencer [15], who asked if one can design an algorithm that works almost surely for $p \ge \mathrm{polylog}(n)/n$. (Here, and in what follows, almost surely always means: with probability that approaches 1 as $n$ tends to infinity.) The algorithm is based on the fact that almost surely a rather accurate approximation of the color classes can be read from the eigenvectors corresponding to the smallest two eigenvalues of the adjacency matrix of a large subgraph. This approximation can then be improved to yield a proper coloring. As is the case with the bisection heuristics, the intuition here is rather simple.
If, in the perfectly symmetric case, there is a constant $d$ such that every vertex has precisely $d$ neighbors in each of the three color classes but its own, then $-d$ is an eigenvalue of the adjacency matrix of the graph and the corresponding two dimensional eigenspace consists of all vectors whose sum of coordinates is 0 that attain the same value on each color class. If, in addition, the other eigenvalues of the graph behave like those of random regular graphs (see [25]), then $-d$ is the smallest eigenvalue. Therefore, even if the situation is not that symmetric, enough information about the color classes can still be deduced from the two eigenvectors of the smallest eigenvalues, and with some effort this information can be used to obtain a full proper 3-coloring. The algorithm can be easily extended to the case of $k$-colorable graphs, for any fixed $k$, and to various models of random regular 3-colorable graphs.
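The perfectly symmetric case can be verified directly on a small example. The sketch below uses the complete 3-partite graph with classes of size 4 as a stand-in (so every vertex has exactly $d = 4$ neighbors in each other class); the function names are hypothetical, and the grouping step only illustrates how the color classes can be read off exact eigenvectors, not the algorithm of [3] for random graphs.

```python
def complete_tripartite(n):
    """Adjacency lists of K_{n,n,n}; vertex v lies in color class v // n."""
    size = 3 * n
    return [[u for u in range(size) if u // n != v // n] for v in range(size)]


def apply_adjacency(adj, x):
    """Return A x for the adjacency matrix A given by adjacency lists."""
    return [sum(x[u] for u in neighbors) for neighbors in adj]


def classes_from_vectors(x1, x2):
    """Group vertices by their pair of coordinates in the two vectors."""
    groups = {}
    for v, key in enumerate(zip(x1, x2)):
        groups.setdefault(key, []).append(v)
    return sorted(groups.values())


n = 4
d = n                       # neighbors of each vertex in each other class
adj = complete_tripartite(n)
# Two independent vectors, constant on color classes, with coordinate sum 0.
x1 = [1] * n + [-1] * n + [0] * n
x2 = [1] * n + [0] * n + [-1] * n
is_eigen1 = apply_adjacency(adj, x1) == [-d * value for value in x1]
is_eigen2 = apply_adjacency(adj, x2) == [-d * value for value in x2]
recovered = classes_from_vectors(x1, x2)
```

Both vectors are eigenvectors with eigenvalue $-d$, and vertices with identical coordinate pairs form exactly the three color classes.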
4
Cliques and Independent Sets
A clique in a graph $G$ is a set of vertices any two of which are connected by an edge. Let $w(G)$ denote the maximum number of vertices in a clique of $G$. The problem of determining or estimating $w(G)$ and that of finding a clique of maximum size in $G$ are fundamental problems in Theoretical Computer Science. The problem of computing $w(G)$ is well known to be NP-hard [31]. The best
known approximation algorithm for this quantity, designed by Boppana and Halldórsson [10], has a performance guarantee of $O(n/(\log n)^2)$, where $n$ is the number of vertices in the graph. When the graph contains a large clique, there are better algorithms, and the best one, given in [4], shows that if $w(G)$ exceeds $n/k + m$, where $k$ is a fixed integer and $m > 0$, then one can find a clique of size $\tilde{\Omega}(m^{3/(k+1)})$ in polynomial time, where here the notation $g(n) = \tilde{\Omega}(f(n))$ means, as usual, that $g(n) \ge \Omega(f(n)/(\log n)^c)$ for some constant $c$ independent of $n$. On the negative side, it is known, by the work of [8] following [22] and [9], that for some $\epsilon > 0$ it is impossible to approximate $w(G)$ in polynomial time for a graph on $n$ vertices within a factor of $n^{\epsilon}$, assuming $P \ne NP$. The exponent $\epsilon$ has since been improved in various papers and recently it has been shown by Håstad [28] that it is in fact larger than $(1 - \delta)$ for every positive $\delta$, assuming $NP$ does not have polynomial time randomized algorithms. Another negative result, proved in [2] following [45], shows that it is impossible to approximate $w(G)$ for an $n$ vertex graph within a factor of $n/\log^7 n$ by a polynomial size monotone circuit. These facts suggest that the problem of finding the largest clique in a general graph is intractable. It is thus natural to study this problem for appropriate classes of randomly generated input graphs. Let $G(n, 1/2)$ denote the random graph on $n$ labeled vertices obtained by choosing, randomly and independently, every pair $ij$ of vertices to be an edge with probability $1/2$. By the results in [41], improved by several researchers, it is known that almost surely (that is, with probability that approaches 1 as $n$ tends to infinity), the value of $w(G)$ is either $\lfloor r(n) \rfloor$ or $\lceil r(n) \rceil$, for a certain function $r(n) = (2 + o(1)) \log_2 n$ which can be written explicitly (cf., e.g., [7]).
Several simple polynomial time algorithms find, almost surely, a clique of size $(1 + o(1)) \log_2 n$ in $G(n, 1/2)$, that is, a clique of roughly half the size of the largest one. However, there is no known polynomial time algorithm that finds, almost surely, a clique of size at least $(1 + \epsilon) \log_2 n$ for any fixed $\epsilon > 0$. The problem of finding such an algorithm was suggested by Karp [32]. His results, as well as more recent ones of Jerrum [30], implied that several natural algorithms do not achieve this goal, and it seems plausible to conjecture (see [30]) that in fact there is no polynomial time algorithm that finds, with probability more than a half, say, a clique of size bigger than $(1 + \epsilon) \log_2 n$. The situation may become better in a random model in which the biggest clique is larger. Following [30], let $G(n, 1/2, k)$ denote the probability space whose members are generated by choosing a random graph $G(n, 1/2)$ and then by placing randomly a clique of size $k$ in it. As observed by Kučera [34], if $k$ is bigger than $c\sqrt{n \log n}$ for an appropriate constant $c$, the vertices of the clique would almost surely be the ones with the largest degrees in $G$, and hence it is easy to find them efficiently. Can one design an algorithm that finds the biggest clique almost surely if $k$ is $o(\sqrt{n \log n})$? This problem was mentioned in [34], and has recently been solved in [5] by showing that for every $\epsilon > 0$ there is a polynomial time algorithm, based on spectral techniques, that finds, almost surely, the unique
largest clique of size $k$ in $G(n, 1/2, k)$, provided $k \ge \epsilon n^{1/2}$. Although this beats the trivial algorithm based on the degrees only by a logarithmic factor, it seems that the spectral technique is crucial for the algorithm. The relevance of graph eigenvalues to cliques or independent sets in the graphs is well known and can be traced back to the old result that the independence number of any regular graph on $n$ vertices is at most $-n\lambda_n/(\lambda_1 - \lambda_n)$, where $\lambda_1$ and $\lambda_n$ are the largest and smallest eigenvalues of its adjacency matrix, and the related results on the connection between the Shannon capacity of a graph and its eigenvalues (see [36]). The spectral algorithm of [5] is based on the fact that in the random model considered above one can almost surely extract a big portion of the hidden clique using the eigenvector of the second largest eigenvalue of the adjacency matrix of the graph, provided $k$ is at least, say, $10\sqrt{n}$. Using this portion, it is not too difficult to recover the whole clique. Using some extra tricks this can be extended to yield an algorithm for $k = \epsilon\sqrt{n}$ as well. The basic idea behind the algorithm is that the largest (adjacency matrix) eigenvalue of a random graph in the model $G(n, 1/2, k)$ is likely to be $(\frac{1}{2} + o(1))n$, all its eigenvalues besides the largest two are likely to be at most $(1 + o(1))\sqrt{n}$, by a result of Füredi and Komlós [23], and the second largest eigenvalue is likely to be close to $k/2$, where the corresponding eigenvector is nearly a constant on all vertices of the largest clique and nearly another one on all other vertices. With some effort this can be formalized and proved (for $k > 10\sqrt{n}$), supplying the desired algorithm. More details appear in [5].
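For contrast with the spectral method, the trivial degree-based approach attributed above to Kučera [34] can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the algorithm of [5]: the function names, the seed, and the parameters $n = 400$, $k = 200$ are hypothetical, and $k$ is deliberately chosen far above $\sqrt{n \log n}$ so that the degree gap is large.

```python
import random


def planted_clique_graph(n, k, seed=0):
    """Sample G(n, 1/2), then plant a clique on k randomly chosen vertices."""
    rng = random.Random(seed)
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < 0.5:
                adj[u].add(v)
                adj[v].add(u)
    clique = set(rng.sample(range(n), k))
    for u in clique:
        for v in clique:
            if u != v:
                adj[u].add(v)
    return adj, clique


def top_degree_vertices(adj, k):
    """Return the k vertices of largest degree (the degree-based guess)."""
    order = sorted(range(len(adj)), key=lambda v: len(adj[v]), reverse=True)
    return set(order[:k])


def is_clique(adj, vertices):
    """Check that every two distinct vertices in the set are adjacent."""
    return all(v in adj[u] for u in vertices for v in vertices if u != v)


n, k = 400, 200  # k is well above sqrt(n log n), which is roughly 50 here
adj, clique = planted_clique_graph(n, k, seed=1)
guess = top_degree_vertices(adj, k)
print(guess == clique, is_clique(adj, guess))
```

With these parameters a clique vertex has expected degree about 299 and any other vertex about 200, so sorting by degree separates the two groups. For $k$ of order $\sqrt{n}$ the two degree distributions overlap, which is where the spectral technique is needed.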
5
Open Problems
It would be interesting to extend the spectral algorithms for coloring and for finding the biggest clique to a wider class of randomly generated graphs. In particular, it would be interesting to design a coloring algorithm that finds, almost surely, a proper three-coloring of $G_{3n,p,3}$ for all possible values of $p$. (Note that the spectral algorithm works for all $p \ge C/n$ where $C$ is a large absolute constant, whereas a trivial algorithm based on omitting repeatedly vertices of low degree works for $p \le c/n$ if $c$ is an absolute (small) positive constant. Hence, only the case $c/n \le p \le C/n$ remains.) Similarly, the obvious challenge concerning the problem of finding a large clique in input graphs generated according to the distribution $G(n, 1/2, k)$ that remains open is to design efficient algorithms that work, almost surely, for smaller values of $k$. If $k = n^{1/2 - \epsilon}$ for some fixed $\epsilon > 0$, even the problem of finding a clique of size at least $(1 + \epsilon) \log_2 n$ in $G(n, 1/2, k)$, suggested in [30], is open and seems to require new ideas. Another interesting version of this problem was suggested by Saks [46]. Suppose $G$ is a graph on $n$ vertices which has been generated either according to the distribution $G(n, 1/2)$ or according to the distribution $G(n, 1/2, k)$ for, say, $k = n^{0.49}$. It is then obvious that an all powerful prover can convince a polynomial time verifier deterministically that, almost surely, $G$ has been generated according to the distribution $G(n, 1/2, k)$ (if indeed that was the case). To do so,
he simply presents the clique to the verifier. However, suppose $G$ has been generated according to the distribution $G(n, 1/2)$. Can the prover convince the verifier (without using randomness, of course) that this is the case, almost surely? At the moment we cannot design such a protocol if $k = o(\sqrt{n})$ (while for $k \ge \Omega(\sqrt{n})$ the verifier can clearly convince himself, using the algorithm in [5]). The spectral properties of a graph encode some detailed structural information about it. The ability to compute the eigenvectors and eigenvalues of a graph in polynomial time provides a powerful algorithmic tool, which has already found several applications and may well have additional algorithmic applications in the future.
References
1. N. Alon, Eigenvalues and expanders, Combinatorica 6 (1986), 83-96.
2. N. Alon and R. B. Boppana, The monotone circuit complexity of Boolean functions, Combinatorica 7 (1987), 1-22.
3. N. Alon and N. Kahale, A spectral technique for coloring random 3-colorable graphs, Proc. of the 26th ACM STOC, ACM Press (1994), 346-355. Also: SIAM J. Comput., in press.
4. N. Alon and N. Kahale, Approximating the independence number via the $\vartheta$-function, Math. Programming, in press.
5. N. Alon, M. Krivelevich and B. Sudakov, Finding a large hidden clique in a random graph, Proc. of the Ninth Annual ACM-SIAM SODA, ACM Press (1998), to appear.
6. N. Alon and V. D. Milman, Eigenvalues, expanders and superconcentrators, Proc. 25th Annual Symp. on Foundations of Computer Science, Singer Island, Florida, IEEE (1984), 320-322. (Also: $\lambda_1$, isoperimetric inequalities for graphs and superconcentrators, J. Combinatorial Theory, Ser. B 38 (1985), 73-88.)
7. N. Alon and J. H. Spencer, The Probabilistic Method, Wiley, New York, 1992.
8. S. Arora, C. Lund, R. Motwani, M. Sudan, and M. Szegedy, Proof verification and intractability of approximation problems, Proc. of the 33rd IEEE FOCS, IEEE (1992), 14-23.
9. S. Arora and S. Safra, Probabilistic checking of proofs; a new characterization of NP, Proc. of the 33rd IEEE FOCS, IEEE (1992), 2-13.
10. R. Boppana and M. M. Halldórsson, Approximating maximum independent sets by excluding subgraphs, BIT 32 (1992), 180-196.
11. A. Blum, Some tools for approximate 3-coloring, Proc. 31st IEEE FOCS, IEEE (1990), 554-562.
12. T. N. Bui and C. Jones, Finding good approximate vertex and edge partitions is NP-hard, Infor. Proc. Letters 42 (1992), 153-159.
13. A. Blum and D. Karger, An $\tilde{O}(n^{3/14})$-coloring algorithm for 3-colorable graphs, IPL 61 (1997), 49-53.
14. B. Bollobás, Random Graphs, Academic Press, London, 1985.
15. A. Blum and J. H. Spencer, Coloring random and semi-random k-colorable graphs, Journal of Algorithms 19 (1995), 204-234.
16. R. Boppana, Eigenvalues and graph bisection: An average case analysis, Proc. 28th IEEE FOCS, IEEE (1987), 280-285.
17. F. Chung and S. T. Yau, Eigenvalues, flows and separators of graphs, to appear.
18. M. E. Dyer and A. M. Frieze, The solution of some random NP-Hard problems in polynomial expected time, Journal of Algorithms 10 (1989), 451-489.
19. W. E. Donath and A. J. Hoffman, Lower bounds for the partitioning of graphs, IBM J. Res. Develop. 17 (1973), 420-425.
20. P. Erdős and A. Rényi, On the evolution of random graphs, Publ. Math. Inst. Hungar. Acad. Sci. 5 (1960), 17-61.
21. M. Fiedler, Algebraic connectivity of graphs, Czechoslovak Math. J. 23(98) (1973), 298-305.
22. U. Feige, S. Goldwasser, L. Lovász, S. Safra and M. Szegedy, Approximating Clique is almost NP-complete, Proc. of the 32nd IEEE FOCS, IEEE (1991), 2-12.
23. Z. Füredi and J. Komlós, The eigenvalues of random symmetric matrices, Combinatorica 1 (1981), 233-241.
24. U. Feige and J. Kilian, Zero knowledge and the chromatic number, Proc. 11th Annual IEEE Conf. on Computational Complexity, 1996.
25. J. Friedman, J. Kahn and E. Szemerédi, On the second eigenvalue in random regular graphs, Proc. 21st ACM STOC, ACM Press (1989), 587-598.
26. A. Frieze and C. McDiarmid, Algorithmic theory of random graphs, Random Structures and Algorithms 10 (1997), 5-42.
27. M. R. Garey and D. S. Johnson, Computers and intractability: a guide to the theory of NP-completeness, Freeman and Company, 1979.
28. J. Håstad, Clique is hard to approximate within $n^{1-\epsilon}$, Proc. 37th IEEE FOCS, IEEE (1996), 627-636.
29. A. J. Hoffman, On eigenvalues and colorings of graphs, in: B. Harris Ed., Graph Theory and its Applications, Academic, New York and London, 1970, 79-91.
30. M. Jerrum, Large cliques elude the Metropolis process, Random Structures and Algorithms 3 (1992), 347-359.
31. R. M. Karp, Reducibility among combinatorial problems, In: Complexity of computer computations, R. E. Miller and J. W. Thatcher (eds.), Plenum Press, New York, 1972, pp. 85-103.
32. R. M. Karp, Probabilistic analysis of some combinatorial search problems, In: Algorithms and Complexity: New Directions and Recent Results, J. F. Traub, ed., Academic Press, New York, 1976, pp. 1-19.
33. L. Kučera, Expected behavior of graph colouring algorithms, In Lecture Notes in Computer Science No. 56, Springer-Verlag, 1977, pp. 447-451.
34. L. Kučera, Expected complexity of graph partitioning problems, Discrete Applied Math. 57 (1995), 193-212.
35. D. Karger, R. Motwani, and M. Sudan, Approximate graph coloring by semidefinite programming, In 35th Symposium on Foundations of Computer Science, pages 2-13. IEEE Computer Society Press, 1994.
36. L. Lovász, On the Shannon capacity of a graph, IEEE Transactions on Information Theory IT-25 (1979), 1-7.
37. L. Lovász, Combinatorial Problems and Exercises, North Holland, Amsterdam, 1979, Chapter 11.
38. F. T. Leighton and S. Rao, An approximate max-flow min-cut theorem for uniform multicommodity flow problems with applications to approximation algorithms, Proc. 29th annual FOCS (1988), 422-431.
39. R. J. Lipton and R. E. Tarjan, A separator theorem for planar graphs, SIAM J. Appl. Math. 36 (1979), 177-189.
40. C. Lund and M. Yannakakis, On the hardness of approximating minimization problems, Proc. 25th ACM STOC (1993), 286-293.
41. D. W. Matula, On the complete subgraph of a random graph, Combinatory Mathematics and its Applications, Chapel Hill, North Carolina (1970), 356-369.
42. A. Pothen, H. D. Simon and K.-P. Liou, Partitioning sparse matrices with eigenvectors of graphs, SIAM J. Matrix Anal. Appl. 11 (1990), 430-452.
43. A. D. Petford and D. J. A. Welsh, A randomised 3-colouring algorithm, Discrete Mathematics 74 (1989), 253-261.
44. A. Ralston, A First Course in Numerical Analysis, McGraw-Hill, 1985, Section 10.4.
45. A. A. Razborov, Lower bounds for the monotone complexity of some Boolean functions, Dokl. Ak. Nauk. SSSR 281 (1985), 798-801 (in Russian). English translation in: Sov. Math. Dokl. 31 (1985), 354-357.
46. M. Saks, Private communication.
47. H. D. Simon, Partitioning of unstructured problems for parallel processing, Computing Systems in Engineering 2(2/3) (1991), 135-148.
48. A. Sinclair and M. R. Jerrum, Approximate counting, uniform generation and rapidly mixing Markov chains, Information and Computation 82 (1989), 93-133.
49. D. A. Spielman and S.-H. Teng, Spectral partitioning works: planar graphs and finite element meshes, to appear.
50. J. S. Turner, Almost all k-colorable graphs are easy to color, Journal of Algorithms 9 (1988), 63-82.
Colouring Graphs Whose Chromatic Number Is Almost Their Maximum Degree*
Michael Molloy¹ and Bruce Reed²
¹ Dept. of Computer Science, University of Toronto, Toronto, Canada
[email protected]
² CNRS, Paris, France and IME, USP, São Paulo, Brazil
[email protected]
Abstract. We present efficient algorithms for determining if the chromatic number of an input graph is close to $\Delta$. Our results are obtained via the probabilistic method.
1
The Result
Determining the minimum number of colours required to colour the vertices of a given graph so that every pair of adjacent vertices receives different colours is a notoriously difficult problem. In fact, determining whether three colours will suffice was one of the original six problems that Karp reduced Satisfiability to in his seminal 1972 paper [6]. However, it is easy to see that if the maximum degree of $G$ is $\Delta$, then $\chi(G)$ is at most $\Delta + 1$. In fact there is a simple recursive greedy procedure for colouring a graph $G$ with $\Delta(G) + 1$ colours. Having coloured $G - x$ we simply colour $x$ with a colour not appearing on its (at most $\Delta$) neighbours in $G - x$. Of course, this colouring may not be optimal. In fact, in 1941, Brooks [3] proved that $\chi(G) = \Delta + 1$ precisely if some component of $G$ is a $(\Delta + 1)$-clique or $\Delta = 2$ and $G$ is not bipartite. As these two conditions are easy to test, we see that we can determine if $G$ has chromatic number $\Delta + 1$ in polynomial (in fact, linear) time. It is natural to wonder if one can develop polynomial time algorithms for determining if $G$ has chromatic number $k$ for $k$ close to $\Delta + 1$. One negative result in this area was provided by Maffray and Preissmann [7], who proved that determining if $\chi(G)$ is $\Delta$ is NP-complete, even for graphs with $\Delta = 4$ which are triangle-free. Furthermore, Emden-Weinert, Hougardy, and Kreuter [4] have shown that determining if a graph $G$ with maximum degree $\Delta$ has a colouring using $k = \Delta - \sqrt{\Delta} + 1$ colours is NP-complete provided $k \ge 3$ (i.e. $\Delta \ge 4$). In this paper we present two companion positive results:
* This work was supported by NATO Collaborative Research Grant #CRG950235. The work of the first author is supported by an NSERC Research Grant. The work of the second author was partially supported by a FAPESP grant.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 216-225, 1998. © Springer-Verlag Berlin Heidelberg 1998
Theorem 1. There is a positive $\epsilon$ such that for each fixed $\Delta$, and each $k$ between $\Delta + 1 - \epsilon\sqrt{\Delta}$ and $\Delta + 1$, we can determine if $\chi(G) = k$ in polynomial time. Furthermore, if $\chi$ is between these two bounds then we can find an optimal colouring in polynomial time.
If we allow $\Delta$ to vary with $|V(G)|$ then we obtain a similar result, however the algorithm is no longer deterministic.
Theorem 2. There is a positive $\epsilon$ such that for any graph $G$ of maximum degree $\Delta$ and each $k$ between $\Delta + 1 - \epsilon \log \Delta$ and $\Delta + 1$, we can determine if $\chi(G) = k$ in polynomial time. Furthermore, there is a randomized algorithm to find an optimal colouring with polynomial expected running time on all inputs whose chromatic number lies between the two given bounds.
We remark that these theorems are only relevant for graphs for which $\Delta$ is at least $\epsilon^{-2}$, which is why they do not contradict the result of Maffray and Preissmann above (and in fact they show that determining if $\chi = \Delta$ is difficult only for small $\Delta$). We note that as a consequence of our technique, we obtain the following interesting non-algorithmic result.
Theorem 3. There is a positive $\epsilon$ such that if $G$ is a graph of maximum degree $\Delta$ and chromatic number at least $\Delta + 1 - \epsilon\sqrt{\Delta}$ then the chromatic number of $G$ is the maximum of the chromatic numbers of its subgraphs with at most $\Delta + \sqrt{\Delta}$ vertices.
Our approach is to examine a naive colouring procedure which consists of choosing a random colour for each vertex with each possibility equally likely, uncolouring those involved in a conflict, and then completing the colouring, essentially by greedily extending the partial colouring. We shall see that, surprisingly, analyzing a slight variant of this procedure using simple but powerful probabilistic tools allows us to develop an algorithm which performs as specified above. See [8], [9], [10], [12], [13], [14] for related results; in particular [12] contains a survey of a large variety of problems solved using this and similar techniques.
We note that our result requires the application of a recent result of the authors [11], that enables us to develop algorithms from applications of the Lovász Local Lemma ([5], see also [1]). It also requires the use of Azuma's inequality [2] and a relatively new concentration inequality due to Talagrand [15]. However, in this short summary, we do not discuss these aspects. Rather we focus on a structural decomposition used in the proof and point out why $\Delta - \sqrt{\Delta}$ is (approximately) the boundary at which things become difficult.
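The greedy procedure mentioned at the start of this section, which always succeeds with $\Delta + 1$ colours, can be sketched as follows. This is a minimal illustration; the dictionary-of-sets graph representation and the vertex order are assumptions made for the example.

```python
def greedy_colouring(adj):
    """Colour the vertices one at a time, giving each the smallest colour
    not already used on one of its neighbours."""
    colour = {}
    for x in adj:
        used = {colour[y] for y in adj[x] if y in colour}
        c = 0
        while c in used:
            c += 1
        colour[x] = c
    return colour


# A 5-cycle: maximum degree 2, not bipartite, so 3 = Delta + 1 colours are needed.
cycle = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
col = greedy_colouring(cycle)
max_degree = max(len(neighbours) for neighbours in cycle.values())

assert all(col[u] != col[v] for u in cycle for v in cycle[u])  # proper colouring
assert max(col.values()) + 1 <= max_degree + 1                 # at most Delta + 1 colours
```

A vertex has at most $\Delta$ neighbours, so at most $\Delta$ colours are ever excluded and the loop never needs a colour beyond index $\Delta$.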
2
A Proof Sketch
One simple result which is crucial to the algorithm is the following:
Lemma 1. Any partial $\Delta - r$ colouring of $G$, such that for every vertex $v$ there are at least $r + 1$ colours appearing twice in the neighbourhood of $v$, can be extended to a $\Delta - r$ colouring of $G$ in linear time.
Proof. Simply colour the uncoloured vertices one at a time. When we come to colour $v$, it has at most $\Delta - r - 1$ colours appearing in its neighbourhood and hence we can use some colour not appearing on its neighbourhood.
Now, we will attempt to find such a partial colouring with $\Delta - r$ at most the maximum of $\Delta - c\sqrt{\Delta}$ and the chromatic number of $G$ by analyzing the probabilistic procedure described above. That is, by considering a partial colouring obtained by colouring each vertex with a uniformly independently chosen colour and then uncolouring vertices which are involved in conflicts. Of course, our technique will not always yield repeated colours in every neighbourhood. For example if the neighbourhood of a vertex is a clique then it will contain no repeated colours. More strongly, if there are very few non-edges in the neighbourhood of a vertex, then we cannot expect to say very much. It turns out that if every vertex has enough non-edges in its neighbourhood then $G$ has a partial $\Delta - \epsilon\sqrt{\Delta}$ colouring in which there are $\epsilon\sqrt{\Delta}$ repeated colours in each neighbourhood. Furthermore, we can find such a colouring easily using the probabilistic method sketched above. To begin, we consider such sparse graphs. To ease our exposition during these introductory remarks, we assume that $G$ is $\Delta$ regular. We also assume that each vertex of $G$ has at least $e^6\epsilon\Delta^{3/2}$ non-edges in its neighbourhood and analyze a uniformly chosen random partial $\frac{\Delta}{2}$ colouring $C$ of the vertices of $G$. Our first step is to examine the number of repeated colours we expect in $N(v)$. We let $Z_v$ be the number of pairs of vertices of $N(v)$ which receive the same colour and such that this colour is assigned to no other vertex of $N(v)$ (this second condition is to avoid double counting).
There are at least $e^6\epsilon\Delta^{3/2}$ pairs of non-adjacent vertices in $N(v)$. For any particular such pair, $u, w$, both vertices receive the same colour with probability $\frac{2}{\Delta}$. If $u, w$ both receive the same colour, then they will form one of the pairs counted by $Z_v$ provided they both retain that colour in $C$ and no other vertex in $N(v)$ is assigned that colour, i.e. no other vertex in $(N(v) - u - w) \cup N(u) \cup N(w)$ receives that colour. The probability that no such vertex receives that colour is $(1 - \frac{2}{\Delta})^{|(N(v) - u - w) \cup N(u) \cup N(w)|} \ge (1 - \frac{2}{\Delta})^{3\Delta} \approx e^{-6}$. It follows that $\mathrm{Exp}(Z_v)$ is approximately at least $e^6\epsilon\Delta^{3/2} \times \frac{2}{\Delta} \times e^{-6} = 2\epsilon\sqrt{\Delta}$. Thus, for each vertex, the expected number of colours which appear twice in its neighbourhood in $C$ is high, and so it seems hopeful that with positive probability, $C$ will have the desired property. To prove this requires two more steps: First, we would like to show that for each $v$, $Z_v$ is highly concentrated, i.e. that the probability that $Z_v$ differs from its expected value by a significant amount is small. This will establish that for any one particular $v$, with high
probability $N(v)$ will contain many repeated colours. We could do this using a concentration inequality due to Talagrand. The second step is to strengthen this statement, showing that with positive probability every vertex will have many such pairs in its neighbourhood. This second step is a straightforward application of the Lovász Local Lemma. However, we will not prove this result about sparse graphs; we have just sketched it to indicate the main lines of our approach. It shows that the crux of the problem is to deal with those vertices with dense neighbourhoods.
We call a vertex dense if there are more than $\binom{\Delta}{2} - e^4\epsilon\Delta^{3/2}$ edges in the subgraph induced by its neighbourhood. We note that this implies that there are at most $2e^4\epsilon\Delta^{3/2}$ edges between $N(v)$ and $V - N(v)$. We shall show that we can partition the vertex set of $G$ into dense sets $S_1, \ldots, S_l$ and a set $L$ (for Light vertices) such that:
every dense vertex is in some $S_i$,
each dense set contains at most $\Delta + 4e^4\epsilon\sqrt{\Delta}$ vertices,
there are at most $12e^4\epsilon\Delta^{3/2}$ edges between $S_i$ and $V - S_i$, and
for each $S_i$, every vertex in $S_i$ has at least $\frac{3}{4}\Delta$ neighbours in $S_i$ and every vertex outside $S_i$ has fewer than $\frac{3}{4}\Delta$ neighbours in $S_i$.
Now, the light vertices are easy to colour using $\Delta - \epsilon\sqrt{\Delta}$ colours, as discussed above. Furthermore, each $S_i$ can essentially be dealt with individually, as the conditions above ensure that it is isolated from the rest of the graph. Thus, we can produce an optimal colouring of $S_i$. Combining these colourings, we obtain a $\max(\Delta - \epsilon\sqrt{\Delta}, \max\{\chi(S_i) \mid 1 \le i \le l\})$ colouring of $G$. Thus, this colouring is a $\max(\Delta - \epsilon\sqrt{\Delta}, \chi(G))$ colouring of $G$. We remark that if $\Delta$ is bounded, then we can efficiently find the colourings of each $S_i$ because they are of bounded size. As we shall see, if $\chi(S_i)$ is within $b$ of $\Delta$ then to find an optimal colouring of each $S_i$, we need only find an optimal colouring for some set $X$ of at most $50b$ vertices of $S_i$. This will permit us to find an optimal colouring provided $b \le \epsilon \log \Delta$. Of course, we cannot colour each $S_i$ completely independently of the rest of the graph.
Consider, for example, the situation in which the dense set $S_j$ with maximum chromatic number is a $\Delta - 2$ clique each of whose elements has 3 neighbours outside $S_j$. If all these neighbours receive colour 1 then we will have to use $\Delta - 2$ new colours in colouring $S_j$, and thereby obtain a colouring using one more colour than desired. To avoid such a problem, we need to consider colouring all the $S_i$ at once at the same time as we perform the partial colouring of the light vertices. Essentially, we fix some optimal colouring of each $S_i$ and then randomly permute the names of the colour classes (the permutations for distinct $S_i$ are chosen independently). As with the earlier random colouring we have to uncolour vertices involved in conflicts. However, because there are only a few edges out of each dense set, there will be very few uncoloured vertices in each dense set. Carefully choosing the original optimal colourings of the $S_i$, and the order in which we colour the uncoloured vertices, will allow us to complete the colouring.
In order to do so, we need to specify how we construct the dense sets, examine their structure, and discuss how we choose the original colour classes for each $S_i$. To begin, we discuss constructing the dense sets. We present a slightly more general result than that needed here, with a view to future applications (see [13] for a precursor of this result).
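The naive procedure analysed in this section (random colours, uncolouring of conflicts, greedy completion as in Lemma 1) can be sketched as follows. This is an illustrative sketch only, not the refined variant used in the proof: the function names, the example graph and the choice of $\Delta + 1$ colours for the completion step are assumptions made so that the example always terminates with a full colouring.

```python
import random


def random_partial_colouring(adj, num_colours, rng):
    """Give every vertex a uniform random colour, then uncolour every vertex
    that shares its colour with a neighbour."""
    first = {v: rng.randrange(num_colours) for v in adj}
    return {v: c for v, c in first.items()
            if all(first[u] != c for u in adj[v])}


def repeated_colours(adj, colouring, v):
    """Number of colours that appear on at least two neighbours of v."""
    counts = {}
    for u in adj[v]:
        if u in colouring:
            counts[colouring[u]] = counts.get(colouring[u], 0) + 1
    return sum(1 for c in counts.values() if c >= 2)


def greedy_extend(adj, colouring, num_colours):
    """Colour the remaining vertices one at a time, as in the proof of Lemma 1.
    Returns None if some vertex has no free colour."""
    colouring = dict(colouring)
    for v in adj:
        if v not in colouring:
            used = {colouring[u] for u in adj[v] if u in colouring}
            free = [c for c in range(num_colours) if c not in used]
            if not free:
                return None
            colouring[v] = free[0]
    return colouring


# Complete bipartite graph K_{6,6}: every neighbourhood is a stable set,
# so neighbourhoods contain many non-edges.
left = [("a", i) for i in range(6)]
right = [("b", i) for i in range(6)]
adj = {v: set(right) for v in left}
adj.update({v: set(left) for v in right})
delta = 6

partial = random_partial_colouring(adj, delta + 1, random.Random(0))
full = greedy_extend(adj, partial, delta + 1)
print(len(partial), min(repeated_colours(adj, partial, v) for v in adj))
```

With fewer than $\Delta + 1$ colours the greedy completion can fail, and `greedy_extend` then returns `None`. Lemma 1 says that it cannot fail with $\Delta - r$ colours once every neighbourhood has at least $r + 1$ repeated colours, which is the quantity `repeated_colours` measures.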
3
A Structural Decomposition
Consider any integer $p$. For any graph $G$ of maximum degree $\Delta$ at least $100p$, we say that a vertex $v$ of $G$ is $p$-sparse if the subgraph induced by its neighbourhood contains fewer than $\binom{\Delta}{2} - p\Delta$ edges (note $v$ can have a clique as a neighbourhood and still be $p$-sparse, provided this clique has fewer than $\Delta - p$ vertices). Otherwise, $v$ is $p$-dense. Note that if $v$ is $p$-dense then there are at most $2p\Delta$ edges between $N(v)$ and $V - N(v)$. We now define a $p$-dense decomposition of $G$. For each $p$-dense vertex $v$ of $G$, we define a set $S_v$ as follows.
Step 0. Set $S_v = N(v) + v$.
Step 1. While there is a vertex $y$ in $S_v$ with $|N(y) \cap S_v| < \frac{3}{4}\Delta$, delete some such vertex from $S_v$.
Step 2. While there is a vertex $y$ outside of $S_v$ with $|N(y) \cap S_v| \ge \frac{3}{4}\Delta$, add some such vertex to $S_v$.
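Steps 0 to 2 translate almost directly into code. The following is a minimal sketch under stated assumptions: the graph is given as a dictionary mapping each vertex to its set of neighbours, the function name `dense_set` is hypothetical, and no attempt is made at efficiency.

```python
def dense_set(adj, v, max_degree):
    """Build S_v by Steps 0-2; adj maps each vertex to its set of neighbours."""
    threshold = 0.75 * max_degree
    s = set(adj[v]) | {v}                      # Step 0: S_v = N(v) + v

    def inside(y):
        """Number of neighbours of y currently in S_v."""
        return len(adj[y] & s)

    while True:                                # Step 1: delete weakly attached vertices
        low = [y for y in s if inside(y) < threshold]
        if not low:
            break
        s.remove(low[0])
    while True:                                # Step 2: add strongly attached outsiders
        high = [y for y in adj if y not in s and inside(y) >= threshold]
        if not high:
            break
        s.add(high[0])
    return s


# A 5-clique {0,...,4} with one pendant vertex 5 attached to vertex 0,
# so the maximum degree is 5 and the threshold is 3.75.
graph = {i: {j for j in range(5) if j != i} for i in range(5)}
graph[0].add(5)
graph[5] = {0}

assert dense_set(graph, 1, 5) == {0, 1, 2, 3, 4}
assert dense_set(graph, 0, 5) == {0, 1, 2, 3, 4}   # the pendant vertex is deleted in Step 1
```

Step 1 only shrinks the set and Step 2 only grows it, so both loops terminate.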
It is easy to see that the set $S_v$ is uniquely defined. We note that:
(1) Every vertex in $S_v$ has at least $\frac{3}{4}\Delta$ neighbours in $S_v$.
(2) Every vertex outside $S_v$ has less than $\frac{3}{4}\Delta$ neighbours in $S_v$.
(3) The neighbourhood of any $p$-dense vertex has at least $\Delta - 2p$ elements.
(4) Since $|E(N(v))| \ge \binom{\Delta}{2} - p\Delta$, fewer than $10p$ vertices of $N(v)$ are deleted from $S_v$ in Step 1, and $|E(S_v, G - S_v)| \le 12p\Delta$ (note that in Step 2, the number of edges out of $S_v$ always decreases).
(5) Since $|E(N(v), G - N(v))| \le 2p\Delta$, fewer than $4p$ vertices of $G - N(v)$ are added to $S_v$ in Step 2.
(6) Thus, we have: $\Delta - 12p \le |S_v| \le \Delta + 4p$.
Note that (4), (3) and (2) together imply that $v \in S_v$. Because $S_v$ is so close to $N(v)$, we can prove:
(7) If $x$ and $y$ are special vertices and $S_x$ intersects $S_y$ then $y$ is in $S_x$ and $x$ is in $S_y$.
By (7), for some integer $l$, we can construct a sequence of special vertices $x_1, \ldots, x_l$ and corresponding disjoint sets $S_1, \ldots, S_l$ with $S_i = S_{x_i}$ such that every special vertex is in $\bigcup_{i=1}^{l} S_i$. We shall call $S_1, \ldots, S_l$ $p$-dense sets. We call $S_1, \ldots, S_l$, $L = V - \bigcup_{i=1}^{l} S_i$ a $p$-dense decomposition. Recapping, we have:
Proposition 1. If $S_1, \ldots, S_l, L$ is a $p$-dense decomposition then:
every dense set has between $\Delta - 12p$ and $\Delta + 4p$ vertices,
there are at most $12p\Delta$ edges between $S_i$ and $V - S_i$,
a vertex is adjacent to at least $\frac{3}{4}\Delta$ vertices of $S_i$ if and only if it is in $S_i$, and
every vertex in $L$ is $p$-sparse.
Remark 1. There are many variants of this structural decomposition. In particular, one could choose the dense set $S_i$ to be $S_v$ for the $p$-dense vertex $v$ in $V - \bigcup_{j=1}^{i-1} S_j$ with the most edges in its neighbourhood. Furthermore, for small $p$ the constants 4 and 12 can be improved.
4
Partitioning the Dense Sets
We turn now to the proof of Thm. 3. An analysis of this proof provides the algorithms mentioned in Thms. 1 and 2. We shall show that there is some $\Delta_0$ such that Thm. 3 holds for $\Delta \ge \Delta_0$ setting $\epsilon = \frac{1}{10^6}$. The general result follows by setting $\epsilon = \min(\frac{1}{\Delta_0 + 1}, \frac{1}{10^6})$. We shall not compute $\Delta_0$ precisely, we only insist that it satisfies certain inequalities scattered throughout the paper. So now let $G$ be a graph of maximum degree $\Delta \ge \Delta_0$, let $\epsilon = \frac{1}{10^6}$, and let $p = e^6\epsilon\sqrt{\Delta}$. Let $S_1, \ldots, S_l, L$ be a $p$-dense decomposition of $G$.
We say a dense set is matchable if there is a matching with $6p$ edges in its complement. We call a dense set which is not matchable a near clique. We note that:
Proposition 2. A near clique $S_i$ has chromatic number at least $\Delta - 30p$. Furthermore, it has an optimal colouring in which the singleton colour classes form a clique with at least $\Delta - 36p$ elements, all of whose vertices see all of every colour class which has more than two elements.
Proof. Since $S_i$ has at least $\Delta - 12p$ elements, in any $\Delta - 30p$ colouring of $S_i$, there must be at least $18p$ vertices in non-singleton colour classes. But each non-singleton colour class $U$ contains a matching in the complement of $S_i$ with $\lfloor |U|/2 \rfloor$ edges. The first statement follows. Since $S_i$ is a near-clique, at most $6p$ of the colour classes in an optimal colouring are non-singleton. Furthermore, in an optimal colouring of any graph, the singleton colour classes form a clique. By choosing an optimal colouring with the maximal number of two element colour classes, we can ensure that there is no edge of the complement of $S_i$ joining a singleton colour class and a colour class with more than two elements. Thus, the second statement also holds.
Now, for each matchable $S_i$, we let $k_i = |S_i| - 6p < \Delta - \epsilon\sqrt{\Delta}$ and we take a $k_i$ colouring of $S_i$: $U_1^i, \ldots, U_{k_i}^i$ consisting of $6p$ colour classes with 2 elements and $|S_i| - 12p$ singleton colour classes. For each near clique $S_i$, we let $k_i$ be the chromatic number of $S_i$, and pick a $k_i$-colouring of $S_i$: $U_1^i, \ldots, U_{k_i}^i$ with the properties specified in Proposition 2. We say a colour class is compound if it is not a
singleton. We may refer to a vertex forming a singleton colour class as a singleton vertex. We let $k = \max(\Delta - \epsilon\sqrt{\Delta}, \max\{k_i \mid 1 \le i \le l\})$.
The core of $S_i$ is the set of vertices of $S_i$ with fewer than $p$ neighbours outside of $S_i$. A vertex in the core of a dense set $S_i$ is a core vertex. Every vertex in a dense set which is not a core vertex is a peripheral vertex. We let $E_j^i$ be the union of the external neighbourhoods of the vertices of $U_j^i$. We let $F_j^i$ be the union of the external neighbourhoods of the core vertices of $U_j^i$. We can prove the following easily.
Proposition 3. For a colouring of a near-clique as in Proposition 2, every $E_j^i$ has at most $\frac{\Delta}{2}$ elements and every $F_j^i$ has at most $\frac{\Delta}{100}$ elements.
We say that a vertex is highly adjacent to $S_i$ if it is a vertex with at least $\log^3 \Delta$ neighbours in $S_i$ or if it is a core vertex in some compound colour class $U_j^r$ such that $F_j^r$ has more than $\log^3 \Delta$ elements in $S_i$. We let $HA_i$ be the vertices highly adjacent to $S_i$.
5
Why Is the Boundary at the Square Root of Delta?
We pause to indicate that it is Prop. 3 which explains why $k = \Delta - \sqrt{\Delta} + 1$ is (approximately) the boundary at which $k$ colouring becomes difficult. To see this consider $\Delta = r^2$, and consider a dense set $S$ which is the union of a clique with $\Delta - r$ vertices and a stable set $U$ with $r + 1$ elements all joined to every vertex of the clique. Then, in any $k = \Delta - r + 1$ colouring of $G$, all the vertices of $U$ must receive the same colour. Note that each vertex in $U$ may have up to $r$ neighbours outside $S$. Thus, $U$ can be thought of as a pseudo vertex whose degree may be $r^2 + r = \Delta + r$. So, we can determine if a graph $H$ with maximum degree $\Delta + r$ has a $k$ colouring by replacing each vertex $v$ of $H$ by a dense set $S(v)$ and corresponding $U(v)$. Repeated applications of this procedure allow us to widen the gap between $k$ and the maximum degree of the graph we are trying to colour even further. This permits us to reduce $k$ colouring to $\Delta - \sqrt{\Delta} + 1$ colouring. However, as Prop. 3 shows, this is as close to $\Delta$ as we can get by such reductions.
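The degree counts in this construction are easy to verify mechanically. The sketch below builds the dense set $S$ for a given $r$ (the function name and vertex labels are hypothetical) and checks that every clique vertex already has degree $\Delta$, while the vertices of $U$ together have room for exactly $r^2 + r = \Delta + r$ external edges.

```python
from itertools import combinations


def boundary_gadget(r):
    """Clique on Delta - r vertices joined completely to a stable set U
    of r + 1 vertices, where Delta = r * r."""
    delta = r * r
    clique = [("c", i) for i in range(delta - r)]
    stable = [("u", i) for i in range(r + 1)]
    adj = {v: set() for v in clique + stable}
    for a, b in combinations(clique, 2):
        adj[a].add(b)
        adj[b].add(a)
    for c in clique:
        for u in stable:
            adj[c].add(u)
            adj[u].add(c)
    return delta, clique, stable, adj


r = 4
delta, clique, stable, adj = boundary_gadget(r)

# Clique vertices are saturated: (Delta - r - 1) + (r + 1) = Delta neighbours.
assert all(len(adj[c]) == delta for c in clique)
# Each vertex of U has Delta - r neighbours in S, leaving r free slots outside S.
spare = sum(delta - len(adj[u]) for u in stable)
assert spare == r * r + r == delta + r
```

Since the clique uses $\Delta - r$ colours and every vertex of $U$ sees the whole clique, a $(\Delta - r + 1)$-colouring leaves a single colour for all of $U$, which is why $U$ behaves like one vertex of degree up to $\Delta + r$.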
6
The Random Partial Colouring
We shall consider the random colouring process in which we:
assign each vertex in $L$ a random uniformly chosen colour between 1 and $k$, with all these choices independent,
choose, for each dense set $S_i$, a permutation of 1 to $k$, where these choices are all independent of each other and the colours assigned to the light vertices, then for $1 \le j \le k_i$, we give $U_j^i$ the $j$th colour in the permutation for $S_i$,
uncolour any vertex which is assigned the same colour as one of its neighbours,
uncolour any vertex in $HA_i$ which is coloured with a colour that no longer appears on any singleton vertex of $S_i$ which has less than $\log^2 \Delta$ external neighbours (either because it was not assigned to any such vertex or because the vertex it was assigned to was uncoloured in the last step; this is to avoid having a colour which appears on an external neighbour of every vertex of $S_i$),
uncolour any vertex in a compound colour class in which we have uncoloured a core vertex.
By analyzing this procedure, we can show that:
Proposition 4. There is a partial $k$ colouring of $G$ such that:
(P1) For every $U_j^i$, either all its vertices receive the same colour, none of its vertices are coloured, or all of its core vertices receive the same colour and all of its peripheral vertices are coloured with this colour or uncoloured.
(P2) For every vertex $v$ in $L$, at least $1.5\epsilon\sqrt{\Delta}$ colours appear on exactly two neighbours of $v$.
(P3) For every peripheral vertex $v$, at least $1.5\epsilon\sqrt{\Delta}$ colours appear on exactly two neighbours of $v$.
(P4) At most $\log \Delta$ compound colour classes of any $S_i$ contain an uncoloured core vertex.
(P5) At most $13p$ vertices of any $S_i$ which are singleton colour classes in the partition of $S_i$ are uncoloured.
(P6) There are at least $\frac{2\Delta}{5}$ vertices $v$ of each $S_i$ satisfying: every colour appearing in the external neighbourhood of $v$ also appears in $S_i$.
We show now that if there is only one dense set $S_1$, these properties ensure that we can extend our partial $k$-colouring to a colouring of $G$. Properties (P2) and (P3) indicate that we need really only worry about the core vertices. To begin we complete the colouring of the core vertices of the compound colour classes of $S_1$. Recall that, by (P1), if a compound colour class contains an uncoloured core vertex then all its core vertices are uncoloured.
We shall choose a colour for each of the at most log compound colour classes which contain an uncoloured core vertex and assign it to all of the uncoloured core vertices in the colour class. When choosing a colour for such a $U_j^1$, we forbid all the colours which appear on $E_j^1$, and all colours which appear on a compound colour class of $S_1$. We may however choose a colour which appears on some singleton vertex in $S_1$ and if we do so, we uncolour this vertex. We do not permit ourselves to colour the uncoloured core of some compound $U_j$ with a colour appearing on the external neighbourhood of more than 50p vertices of $S_1$. Now, we know that every $E_j$ has at most 100 elements. Also, there are fewer than 18p vertices in the union of the compound colour classes, so we forbid at most 18p colours because they appear in some compound colour class. Furthermore, since there are at most 12p edges out of $S_1$, we forbid fewer than 4 colours because they appear
in the external neighbourhood of too many vertices of $S_1$. It follows that there are at least 3 possible colours to assign to the core of any uncoloured compound colour class. Thus, we can choose a distinct colour satisfying our conditions for each of the at most log compound colour classes with uncoloured cores. We note that after colouring the uncoloured cores of compound colour classes with these colours, (P1) still holds. Furthermore, (P2) and (P3) hold if we replace the by 1 5 − log . Finally, (P6) holds if we replace 25 by 925 . 15

We now colour the core vertices not in compound colour classes. If $S_1$ is matchable then this is easy. Every core vertex sees at most p vertices outside of $S_1$. If it has fewer than − p2 < k neighbours then obviously we can choose a colour for it which appears on none of its neighbours and hence with which it can be coloured. Otherwise, it must see at least − 3p 2 of the at most + 4p vertices of $S_1$ and hence sees both vertices of at least p2 of the 6p compound colour classes. Thus, there is a colour available with which to colour it.

If $S_1$ is a near clique, things are more difficult. We will choose a set of 3p vertices satisfying the conditions of (P6) and uncolour them. We want to do so in such a way that there are still at least + 1 repeated colours in the neighbourhood of each peripheral or light vertex and so that a set $R$ of at least p of the 3p vertices continue to satisfy the condition of (P6) (i.e. none of their external neighbours were coloured with a colour used on one of the 3p vertices we have just uncoloured). To help to ensure that the second condition holds, we only pick from the at least 425 of our 925 possibilities which are coloured with a colour appearing in the external neighbourhoods of fewer than 60p vertices of $S_1$ (here we again use the fact that there are at most 12p edges out of the dense set).
If we make such a restricted choice then it is easy to check that with high probability, the two conditions we want to satisfy will hold (we will not do this simple computation; we note only that each vertex is uncoloured with probability less than p1 so we can see that we do not expect this step to have huge effects). Now, we colour every core singleton vertex not in $R$, which we can do because such a vertex has p uncoloured neighbours in $R$. Next, we colour all the vertices of $R$, which we can do because, as (P1) holds, if there is an uncoloured vertex $v$ left in $R$ then there are fewer than k1 − 1 k − 1 colours appearing in $S_1$. Hence, because every colour appearing in the neighbourhood of $v$ also appears in $S_1$, we know that there is a colour available with which to colour $v$. Having coloured the core vertices, we can greedily finish off the peripheral and light vertices because for each such vertex there are at least + 1 colours which are repeated in its neighbourhood.

Now, if there is more than one dense set, things get a bit more complicated. We cannot colour them sequentially because completing one dense set will affect what happens in the others. So, we will colour all the uncoloured cores of compound colour classes at the same time, using a random procedure. We then choose a set $R_i$ of each near-clique $S_i$ which will form the last p core vertices of $S_i$ coloured. Again, we need to choose these $R_i$ randomly and all at once. Having performed these two steps, we can then finish off the colouring in a simple greedy fashion. The details get a little hairier, and the technicalities are quite lengthy. We save them for a longer version of this paper. We hope that this short account has enabled the reader to grasp how our structural decomposition can be used in such proofs and why the complexity boundary is at − .
References

1. N. Alon and J. Spencer, The Probabilistic Method. Wiley (1992).
2. K. Azuma, Weighted sums of certain dependent random variables. Tohoku Math. Journal 19 (1967), 357-367.
3. R.L. Brooks, On colouring the nodes of a network, Proc. Cambridge Phil. Soc. 37 (1941), 194-197.
4. T. Emden-Weinert, S. Hougardy, and B. Kreuter, Uniquely colourable graphs and the hardness of colouring graphs with large girth, Probability, Combinatorics, and Computing, to appear.
5. P. Erdős and L. Lovász, Problems and results on 3-chromatic hypergraphs and some related questions, in: Infinite and Finite Sets (A. Hajnal et al. Eds), Colloq. Math. Soc. J. Bolyai 11, North Holland, Amsterdam, 1975, 609-627.
6. R. Karp, Reducibility among combinatorial problems, in: Complexity of Computer Computations, Plenum Press (1972), 85-103.
7. F. Maffray and M. Preissmann, On the NP-completeness of the k-colourability problem for triangle-free graphs, Discrete Mathematics 162 (1996), 313-317.
8. M. Molloy and B. Reed, A bound on the strong chromatic index of a graph, Journal of Combinatorial Theory (B), to appear.
9. M. Molloy and B. Reed, A bound on the total chromatic number, submitted.
10. M. Molloy and B. Reed, Asymptotically better list colourings, manuscript.
11. M. Molloy and B. Reed, An algorithmic version of the Lovász Local Lemma, in preparation.
12. M. Molloy and B. Reed, Graph Colouring via the Probabilistic Method, to appear in a book edited by A. Gyárfás and L. Lovász.
13. B. Reed, $\omega$, $\Delta$ and $\chi$, Journal of Graph Theory, to appear.
14. B. Reed, A strengthening of Brooks' Theorem, in preparation.
15. M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Institut des Hautes Études Scientifiques, Publications Mathématiques 81 (1995), 73-205.
Circuit Covers in Series-Parallel Mixed Graphs

Orlando Lee and Yoshiko Wakabayashi

Instituto de Matemática e Estatística, Universidade de São Paulo
Rua do Matão, 1010, 0550 -900 São Paulo, SP, Brazil
{lee,yw}@ime.usp.br
Abstract. A mixed graph is a graph that contains both edges and arcs. Given a nonnegative integer weight function $p$ on the edges and arcs of a mixed graph $M$, we wish to decide whether $(M, p)$ has a circuit cover, that is, whether there is a list of circuits in $M$ such that every edge (arc) $e$ is contained in exactly $p(e)$ circuits in the list. When $M$ is a directed graph or an undirected graph with no Petersen graph as a minor, good necessary and sufficient conditions are known for the existence of a circuit cover. For general mixed graphs this problem is known to be NP-complete. We provide necessary and sufficient conditions for the existence of a circuit cover of $(M, p)$ when $M$ is a series-parallel mixed graph, that is, the underlying graph of $M$ does not have $K_4$ as a minor. We also describe a polynomial-time algorithm to find such a circuit cover, when it exists. Further, we show that $p$ can be written as a nonnegative integer linear combination of at most $m$ incidence vectors of circuits of $M$, where $m$ is the number of edges and arcs. We also present a polynomial-time algorithm to find a minimum circuit in a series-parallel mixed graph with arbitrary weights. Other results on the fractional circuit cover and the circuit double cover problem are discussed.
1 Introduction
A mixed graph $M = (V, E, A)$ is a graph that contains both (undirected) edges and arcs. We denote by $(M, p)$ a mixed graph $M$ together with a weight function $p : E \cup A \to \mathbb{Q}_+$. We say that $(M, p)$ has a circuit cover (or $M$ has a circuit $p$-cover) if there is a vector of nonnegative integer coefficients $(\lambda_C : C \in \mathcal{C})$, where $\mathcal{C} := \mathcal{C}(M)$ denotes the set of circuits of $M$, such that $p = \sum_{C \in \mathcal{C}} \lambda_C \chi^C$. Here, for any subgraph $H$ of $M$, $\chi^H$ denotes the $\{0,1\}$-incidence vector of the {edge,arc}-set of $H$. We also say that $(M, p)$ has a fractional circuit cover if there is a vector of nonnegative rational coefficients $(\lambda_C : C \in \mathcal{C})$ such that $p = \sum_{C \in \mathcal{C}} \lambda_C \chi^C$.

We are interested in the following problems. Given $(M, p)$, where $p$ is a nonnegative integer (respectively, rational) weight function, decide whether $(M, p)$ has a circuit cover (respectively, fractional circuit cover).
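As a small illustration of the definition, the following Python sketch checks the covering identity $p = \sum_C \lambda_C \chi^C$ for a given list of circuits. The representation (elements named by strings, a circuit given as the set of its elements) and the function name `is_circuit_cover` are assumptions made for this example, not notation from the paper. The sketch only counts occurrences; it does not verify that each listed set really is a circuit of $M$.

```python
from collections import Counter

def is_circuit_cover(p, circuits):
    """Return True if every element e lies in exactly p[e] circuits of the list.

    p        -- dict mapping each edge/arc name to a nonnegative integer weight
    circuits -- list (repetitions allowed) of circuits, each given as an
                iterable of edge/arc names; assumed to be genuine circuits of M
    """
    counts = Counter()
    for circuit in circuits:
        counts.update(set(circuit))  # a circuit uses each element at most once
    # Every weighted element is covered exactly, and no unknown element is used.
    return all(counts[e] == p[e] for e in p) and set(counts) <= set(p)

# Hypothetical example: vertices a, b, c with arcs ab, bc and an edge ca.
# The only circuit is C = {ab, bc, ca}; taking it twice covers the weight p = 2.
C = {"ab", "bc", "ca"}
print(is_circuit_cover({"ab": 2, "bc": 2, "ca": 2}, [C, C]))  # True
print(is_circuit_cover({"ab": 2, "bc": 2, "ca": 1}, [C, C]))  # False
```

A list with repetitions corresponds to the coefficient vector $\lambda$ by letting $\lambda_C$ be the number of times $C$ occurs in the list.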
This work has been partially supported by FAPESP (Proc. 96/04505-2), CNPq (Proc. 304527/ 9-0), CAPES (Proc. 3302006-0) and PRONEX (Project 107/97).
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 226-238, 1998. © Springer-Verlag Berlin Heidelberg 1998
These problems have been extensively investigated when $M$ is an undirected graph. In this case, the fractional version was solved by Seymour [16]. The integral version has attracted great interest as it is related to problems involving minimum circuit covers, the Chinese Postman Problem, perfect matchings and others (see [1] for further references). For directed graphs, these problems are easy and they are special cases of a more general problem solved by Hoffman [12]. For general mixed graphs, both problems are known to be NP-complete [3,4]. In this note we shall be concerned with these problems and related ones in the special case where $M$ is a series-parallel graph.

This note is organized as follows. In Sect. 2 we establish the notation we shall use. In Sect. 3, we show a good characterization for the existence of a (fractional) circuit cover of $(M, p)$ when $M$ is series-parallel. Section 4 is devoted to the study of Hilbert bases of circuits in series-parallel mixed graphs. There, we also describe a polynomial-time algorithm (relying on the ellipsoid method) that finds a circuit cover of $(M, p)$, if it exists, when $M$ is series-parallel.

In Sect. 5, we discuss an interesting special case of the circuit cover problem: the case where $p$ is an integral constant vector. If $M$ is a directed graph, the problem is easy as this is equivalent to characterizing Eulerian directed graphs (for any constant vector $p$). For undirected graphs, there is a celebrated open problem. When $p = 1$ (or any odd number) the problem is easy: a graph has a circuit $p$-cover if, and only if, it is Eulerian. When $p = 2$ we have the well-known Circuit Double Cover Conjecture. The cases $p = 4$ and $p = 6$ were solved by Jaeger [5,13] and Fan [9]. These two results together settle the case where $p = 2k$ for any $k \ge 2$. For mixed graphs, the problem is settled only when $p$ is odd [7,10]. Using the result of Sect. 3, we give a solution for the case $p = 2$ (and hence for any even number) when $M$ is series-parallel. Finally, in Sect. 6 we describe a polynomial-time algorithm to find a minimum circuit in a weighted series-parallel mixed graph.
2 Basic Terminology
Most of the concepts defined for undirected and directed graphs (see [6]) can be extended in a natural way to mixed graphs. We assume the reader is familiar with them.

A mixed graph is a triplet $M = (V, E, A)$ where $V$ is a finite set of vertices, $E$ is a finite set of edges and $A$ is a finite set of arcs. When $E = \emptyset$ we say that $M$ is a directed graph, and when $A = \emptyset$ we say that $M$ is an undirected graph. We use the capital letters $D$, $G$ and $M$ for directed, undirected and mixed graphs, respectively. Sometimes, it is useful to see $M$ as the union of a directed graph and an undirected graph. We indicate this by $M = D \cup G$. In a context referring to a mixed graph $M$, we may denote by $V(M)$, $E(M)$ and $A(M)$ the set of vertices, edges and arcs of $M$, respectively.

For a directed graph $D$, we denote by $\delta_D^+(X)$ the set of arcs with tail in $X$ and head in $V - X$, and we write $\delta_D^-(X) := \delta_D^+(V - X)$. For an undirected graph $G$, we denote by $\delta_G(X)$ the set of edges with an endnode in $X$ and the other in $V - X$. For a mixed graph $M = D \cup G$, we write $\delta_M(X) := \delta_G(X) \cup \delta_D^+(X) \cup \delta_D^-(X)$. We often omit the subscript if this does not cause any confusion; for example, $\delta(X) := \delta_M(X)$. We may also denote a unitary set $\{e\}$ simply by $e$.

A nonempty subset $F \subseteq E(M) \cup A(M)$ is a cut if there is a subset $S \subseteq V(M)$ such that $F = \delta(S)$. A bridge is a cut of cardinality one. A mixed graph with no bridges is called bridgeless.

A path is a sequence $(v_1, e_1, v_2, \dots, v_k, e_k, v_{k+1})$ where $v_1, \dots, v_{k+1}$ are pairwise distinct vertices, and for each $i$ ($1 \le i \le k$), either $e_i$ is an edge with endnodes $v_i, v_{i+1}$ or $e_i$ is an arc with tail $v_i$ and head $v_{i+1}$. A circuit is a sequence with the same properties, except that $v_1 = v_{k+1}$. A mixed graph is strongly connected if for each pair $u, v$ of vertices there is a path from $u$ to $v$. A mixed graph is Eulerian if its {edge,arc}-set can be partitioned into circuits. Two arcs are anti-parallel if they form a circuit of length two.

If $M = (V, E, A)$ is a mixed graph and $e \in E \cup A$ then $M - e$ (respectively, $M/e$) denotes the graph obtained from $M$ by deleting (respectively, contracting) $e$. A graph $H$ is a minor of a graph $G$ (or $G$ has an $H$-minor) if a graph isomorphic to $H$ can be obtained from a subgraph of $G$ by a sequence of edge contractions. If $H$ is a cubic graph then $H$ is a minor of $G$ if and only if some subgraph of $G$ is homeomorphic to $H$. We say that $M$ is series-parallel if the underlying graph of $M$ does not have a $K_4$-minor.

We recall that a mixed graph $M = (V, E, A)$ together with a weight function $p : E \cup A \to \mathbb{Q}_+$ is denoted by $(M, p)$. We use the convention that $p(F)$ means $\sum_{e \in F} p(e)$, for any $F \subseteq E \cup A$.
3 Circuit Covers in Mixed Graphs
Let us recall the two problems of our interest:

Circuit Cover Problem: Given a nonnegative integer weighted mixed graph $(M, p)$, decide whether $(M, p)$ has a circuit cover.

Fractional Circuit Cover Problem: Given a nonnegative rational weighted mixed graph $(M, p)$, decide whether $(M, p)$ has a fractional circuit cover.

For directed graphs both problems are equivalent in the sense that if $(D, p)$ (with $p$ integral) has a fractional circuit cover then it also has a circuit cover. This is an easy particular case of a more general result, due to Hoffman [12], related to circulations in directed graphs. A vector $p : A(D) \to \mathbb{Q}_+$ is called a circulation in $D$ if $p(\delta^-(X)) = p(\delta^+(X))$ for every $X \subseteq V(D)$. Clearly, $p$ is a circulation in $D$ if and only if $(D, p)$ has a fractional circuit cover. If, in addition, $p$ is integral, then $(D, p)$ also has a circuit cover.

Theorem 1 (Hoffman, 1960). Let $D$ be a directed graph with capacities $l, u : A(D) \to \mathbb{Q}_+$ such that $l \le u$. Then there is a circulation $p$ in $D$ such that $l \le p \le u$ if and only if $l(\delta^-(X)) \le u(\delta^+(X))$ for all $X \subseteq V(D)$. Furthermore, we can choose $p$ integral if $l, u$ are integral.
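The remark that an integral circulation always admits a circuit cover can be made concrete by the usual peeling argument: follow arcs of positive weight until a vertex repeats, remove the directed circuit found, and repeat. The Python sketch below illustrates that argument under stated assumptions; the function name `decompose_circulation` and the dictionary representation are hypothetical, parallel arcs are not supported, and the input is assumed (not checked) to be an integral circulation.

```python
from collections import defaultdict

def decompose_circulation(p):
    """Split an integral circulation into directed circuits with multiplicities.

    p -- dict mapping arcs (tail, head) to nonnegative integers; assumed to
         satisfy flow conservation at every vertex (this is not verified).
    Returns a list of (arcs_of_circuit, multiplicity) pairs.
    """
    residual = {a: w for a, w in p.items() if w > 0}
    out = defaultdict(list)
    for (u, v) in residual:
        out[u].append(v)
    cover = []
    while residual:
        start = next(iter(residual))[0]      # tail of some arc still carrying weight
        path, position = [start], {start: 0}
        while True:
            u = path[-1]
            # Conservation guarantees a positive-weight arc leaving u.
            v = next(h for h in out[u] if residual.get((u, h), 0) > 0)
            if v in position:                # closed a directed circuit
                cycle = path[position[v]:]
                break
            position[v] = len(path)
            path.append(v)
        arcs = [(cycle[i], cycle[(i + 1) % len(cycle)]) for i in range(len(cycle))]
        mult = min(residual[a] for a in arcs)
        for a in arcs:                       # peel the circuit off mult times
            residual[a] -= mult
            if residual[a] == 0:
                del residual[a]
        cover.append((arcs, mult))
    return cover

# Hypothetical example: a triangle 0->1->2->0 sharing arcs with a 4-circuit.
p = {(0, 1): 2, (1, 2): 2, (2, 0): 1, (2, 3): 1, (3, 0): 1}
cover = decompose_circulation(p)
for arcs, mult in cover:
    print(mult, arcs)
```

Each round removes at least one arc from the residual weight, so the number of circuits produced is at most the number of arcs.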
For undirected graphs, the fractional circuit cover problem was solved by Seymour [16].

Theorem 2 (Seymour, 1979). Let $G$ be a graph and $p : E(G) \to \mathbb{Q}_+$. Then $(G, p)$ has a fractional circuit cover if and only if $p(\delta(X) - e) \ge p(e)$ for all $X \subseteq V(G)$, $e \in \delta(X)$.
In the undirected case the fractional version is quite different from the integral version. Clearly, the additional condition that $p(\delta(X))$ is even for every $X \subseteq V(G)$ is necessary for the existence of a circuit cover of $(G, p)$. But it is not always sufficient, as the following well-known counterexample shows: let $G = P_{10}$ be the Petersen graph and $F$ a perfect matching of $P_{10}$; take $p_{10}(e) = 2$ for $e \in F$, and $p_{10}(e) = 1$ otherwise. Seymour [16] proved that this additional parity condition is sufficient for the existence of a circuit cover if $G$ is planar. Alspach, Goddyn and Zhang [1] extended his result showing that, in a certain sense, $(P_{10}, p_{10})$ is a minimal counterexample.

Theorem 3 (Alspach, Goddyn and Zhang, 1994). Let $G$ be a graph with no $P_{10}$-minor and $p : E(G) \to \mathbb{Z}_+$. Then $(G, p)$ has a circuit cover if and only if $p(\delta(X) - e) \ge p(e)$ for all $X \subseteq V(G)$, $e \in \delta(X)$, and $p(\delta(X))$ is even for all $X \subseteq V(G)$.

We turn now to the study of the (fractional) circuit cover problem in mixed graphs. In this case, as the following result due to Arkin and Papadimitriou [3,4] shows, these problems are much harder than in the undirected or the directed case.

Theorem 4 (Arkin and Papadimitriou, 1986). For mixed graphs $M$, the fractional circuit cover problem and the circuit cover problem are NP-complete, even if $p(e) = 2$ for $e \in E(M)$ and $p(a) = 1$ for $a \in A(M)$.

In view of this result, it is unlikely that we can find nice necessary and sufficient conditions for the existence of a circuit $p$-cover in an arbitrary mixed graph. We show that for series-parallel mixed graphs such a nice characterization exists.

Given a mixed graph $M = (V, E, A)$, the following conditions are clearly necessary for the existence of a fractional circuit cover of $(M, p)$:

(a) For every $X \subseteq V$, $f_p(X) := p(\delta_G(X)) - |p(\delta^-(X)) - p(\delta^+(X))| \ge 0$;
(b) For every cut $\delta(X)$ and $e \in \delta(X)$, $p(\delta(X) - e) \ge p(e)$.

Further, if $(M, p)$ is to have a circuit cover then it is necessary that $p$ is nonnegative integral and the following holds:

(c) For every cut $\delta(X)$, $p(\delta(X))$ is even.
We say that $(M, p)$ is balanced if it satisfies (b); Eulerian if it satisfies conditions (a) and (c); and fractionally admissible if it satisfies (a) and (b). If $p$ is integral, we say that $(M, p)$ is admissible if it is Eulerian and balanced
(or, it is fractionally admissible and satisfies (c)). Sometimes, we prefer to say simply that $p$ is balanced, Eulerian, fractionally admissible or admissible (with respect to $M$).

The following simple example shows that (fractional) admissibility is not always a sufficient condition for the existence of a (fractional) circuit cover. Take the mixed graph $M_{K_4}$ whose underlying graph is isomorphic to $K_4$: the directed part consists of a directed circuit $D$ of length 4 and the undirected part consists of a 1-factor $F$; define $p_4(a) = 1$ for $a \in D$ and $p_4(e) = 2$ for $e \in F$. It is easy to see that $(M_{K_4}, p_4)$ is (fractionally) admissible and does not have a (fractional) circuit cover.

In the undirected case, the Petersen graph shows that natural conditions for the existence of a circuit cover do not always suffice. However, by Theorem 3, if it is excluded, these conditions become sufficient. We follow a similar approach. The main result of this section shows that the circuit cover problem has a solution when $M$ is series-parallel. More precisely, we show that if $M$ is series-parallel then $(M, p)$ has a circuit cover if and only if $p$ is admissible. Note that, due to the admissibility of $(M_{K_4}, p_4)$, this gives a complete characterization of mixed graphs for which the admissibility of $(M, p)$ is equivalent to the existence of a circuit cover of $(M, p)$.

To prove the main result we use the following two lemmas. The proofs are left to the reader.

Lemma 1. Let $G$ be a series-parallel undirected graph, $B$ a minimal cut of $G$ and $C$ a circuit of $G$. Then $|B \cap C| \le 2$.

Lemma 2. Let $(M, p)$ be an admissible pair. Consider a subset $X \subseteq V(M)$ and an edge $e = uv \in \delta(X)$ such that $p(e) > 0$, $f_p(X) = 0$, $p(\delta^-(X)) - p(\delta^+(X)) > 0$, and $u \in X$. Let $M'$ be the graph obtained from $M$ by replacing the edge $e$ by an arc $(u, v)$; and let $p'$ be the weight function on $M'$ obtained from $p$ by setting $p'(uv) := p(e)$. Then the resulting pair $(M', p')$ is admissible.
The proof of the main theorem is inspired by the proof given by Seymour [16] in the special case of undirected planar graphs, and the proof of Theorem 3 of Alspach et al.

Theorem 5. If $M$ is series-parallel then $(M, p)$ has a circuit cover if and only if $p$ is admissible.

Proof. (Outline) Let $M$ be a series-parallel mixed graph. Clearly, if $(M, p)$ has a circuit cover then $p$ is admissible. Now suppose $p$ is admissible and let us prove that $(M, p)$ has a circuit cover. We can assume that $p(e) > 0$ for every element $e \in E \cup A$ since otherwise we could delete $e$ from $M$. Applying Lemma 2 we can also assume that if $f_p(X) = 0$ then $\delta_G(X) = \emptyset$.

The proof is by induction on $p(E)$. If $p(E) = 0$ then $M$ is a directed graph and $p$ is a circulation in $M$. In this case, the result follows from Theorem 1. Suppose $p(E) > 0$. If there is a cut $\delta(X)$ such that $|\delta(X)| = |\delta_G(X)| = 2$, then
we can contract one of the edges of this cut and apply the induction hypothesis. Thus, we can assume there are no such cuts.

If there is an edge $e_0$ with $p(e_0) = 1$ then we can assign an arbitrary orientation to $e_0$. Since $f_p(X) \ge 2$ for every $\emptyset \ne X \subset V$ such that $e_0 \in \delta(X)$, it follows that the resulting pair $(M', p')$ is admissible. Using the induction hypothesis the result follows.

Let $e_0 = xy$ be an edge such that $p(e_0)$ is maximum. We can assume that $p(e_0) \ge 2$. Let $p' := p - 2\chi^{e_0}$. We claim that $(M, p')$ is admissible (the proof is left to the reader). Since $(M, p')$ is admissible and $p'(E) = p(E) - 2$, by the induction hypothesis $(M, p')$ has a circuit cover. Then there is a list $L'$ (with possible repetitions) of circuits such that $p' = \sum_{C \in L'} \chi^C$.

Our aim is to consider appropriate circuits in $L'$ that do not contain $e_0$ and partition them into two paths, so that each path can be extended to a circuit containing the edge $e_0$. For that, define an auxiliary undirected graph $H$ as follows. Take $V(H) := V$ and for each circuit $C$ in $L'$ that does not contain $e_0$ construct in $H$ a circuit that is the underlying circuit of $C$. Label the edges of this circuit with $C$. We claim there is a path from $x$ to $y$ in $H$. In fact, it is not difficult to prove, using Lemma 1, that if this does not hold we have a contradiction to the admissibility of $(M, p)$.

Take a shortest path from $x$ to $y$ in $H$. For each section of this path corresponding to edges with the same label, take only one representative, and consider the sequence of such labels, say $(C_1, \dots, C_k)$. Clearly, there are no repeated circuits and $V(C_i) \cap V(C_j) \ne \emptyset$ if and only if $|i - j| = 1$. We claim that $|V(C_i) \cap V(C_{i+1})| = 1$ for $i = 1, \dots, k-1$. To prove this, one can show that if this does not happen then the graph induced by the edge $e_0$ together with the edges in these circuits contains a subgraph homeomorphic to $K_4$, a contradiction.
Considering how the circuits $C_i$ ($1 \le i \le k$) intersect each other, it is easy to see that $C_1 \cup \dots \cup C_k$ can be partitioned into a path from $x$ to $y$ and a path from $y$ to $x$, say, $P'$ and $Q'$. Let $P$ (respectively, $Q$) be the circuit obtained from $P'$ (respectively, $Q'$) by adding the edge $e_0$. Now let $L := (L' - \{C_1, \dots, C_k\}) \cup \{P, Q\}$. By construction, $L$ contains only circuits of $M$, and furthermore $p = \sum_{C \in L} \chi^C$. This shows that $(M, p)$ has a circuit cover.
Corollary 1. If $M$ is series-parallel then $(M, p)$ has a fractional circuit cover if and only if $p$ is fractionally admissible.

To conclude this section we note that admissibility (that is, conditions (a), (b) and (c)) is checkable in polynomial time. The reader can verify this. We only mention that checking condition (a) reduces to checking whether an appropriate directed graph $D$ with capacities $l, u$ satisfies Hoffman's condition in Theorem 1. This can be checked by a single max-flow computation (see [2], Sec. 6.7).
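For very small instances the three conditions can also be tested by brute force over all vertex subsets, which makes the $K_4$-based example of this section easy to examine. The Python sketch below is only an exponential-time illustration, not the polynomial check mentioned above. All names are hypothetical, condition (a) is read as $p(\delta_G(X)) \ge |p(\delta^-(X)) - p(\delta^+(X))|$ (an assumption about the intended formula), and the five circuits of the example are listed by hand.

```python
from itertools import combinations, product

def is_admissible(vertices, edges, arcs, p):
    """Brute-force test of conditions (a), (b), (c) over all proper subsets X.

    edges -- dict name -> (u, v), undirected
    arcs  -- dict name -> (tail, head)
    p     -- dict name -> nonnegative integer weight
    """
    def weight(names):
        return sum(p[n] for n in names)

    for r in range(1, len(vertices)):
        for X in map(set, combinations(vertices, r)):
            und = [e for e, (u, v) in edges.items() if (u in X) != (v in X)]
            out = [a for a, (t, h) in arcs.items() if t in X and h not in X]
            inn = [a for a, (t, h) in arcs.items() if t not in X and h in X]
            cut = und + out + inn
            total = weight(cut)
            if weight(und) < abs(weight(inn) - weight(out)):  # (a), as assumed
                return False
            if any(total - p[e] < p[e] for e in cut):         # (b) balanced
                return False
            if total % 2 != 0:                                # (c) parity
                return False
    return True

def has_circuit_cover(circuits, p):
    """Try every multiplicity vector with entries in 0..max(p); tiny inputs only."""
    bound = max(p.values())
    for lam in product(range(bound + 1), repeat=len(circuits)):
        if all(sum(l for l, C in zip(lam, circuits) if e in C) == p[e] for e in p):
            return True
    return False

# The example on K4: directed 4-circuit 0->1->2->3->0 plus the 1-factor {02, 13}.
V = [0, 1, 2, 3]
arcs = {"a01": (0, 1), "a12": (1, 2), "a23": (2, 3), "a30": (3, 0)}
edges = {"e02": (0, 2), "e13": (1, 3)}
# Its circuits, enumerated by hand: the directed 4-circuit and four triangles.
circuits = [{"a01", "a12", "a23", "a30"}, {"a01", "a12", "e02"},
            {"a23", "a30", "e02"}, {"a12", "a23", "e13"}, {"a30", "a01", "e13"}]
p4 = {**{a: 1 for a in arcs}, **{e: 2 for e in edges}}

print(is_admissible(V, edges, arcs, p4))  # True
print(has_circuit_cover(circuits, p4))    # False
```

The negative answer also follows by hand: each edge lies in exactly two of the triangles, so the triangle multiplicities sum to 4, while adding the four arc equations gives $4\lambda_D + 2 \cdot 4 = 4$, forcing a negative coefficient on the directed circuit $D$. The same computation rules out a fractional cover.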
4 Hilbert Bases and Algorithmic Aspects
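Let $H$ be a finite subset of $\mathbb{Q}^n$. Consider the following sets generated by $H$.

$$\mathrm{Cone}(H) = \Big\{ \sum_{h \in H} \lambda_h h \;:\; \lambda_h \in \mathbb{Q}_+ \Big\}$$

$$\mathrm{Lat}(H) = \Big\{ \sum_{h \in H} \lambda_h h \;:\; \lambda_h \in \mathbb{Z} \Big\}$$

$$\mathrm{IntCone}(H) = \Big\{ \sum_{h \in H} \lambda_h h \;:\; \lambda_h \in \mathbb{Z}_+ \Big\}$$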
They are called the cone, lattice and integer cone of $H$, respectively. Clearly, $\mathrm{IntCone}(H) \subseteq \mathrm{Cone}(H) \cap \mathrm{Lat}(H)$. We say that a set $H$ is a Hilbert basis if $\mathrm{IntCone}(H) = \mathrm{Cone}(H) \cap \mathrm{Lat}(H)$ (see [14,15] for a more detailed account of this subject).

Here we are interested in the case where $H$ is the set of (incidence vectors of) circuits of a mixed graph $M$. We denote the above sets by $\mathrm{Cone}(\mathcal{C})$, $\mathrm{Lat}(\mathcal{C})$ and $\mathrm{IntCone}(\mathcal{C})$. Using this terminology, the fractional circuit cover problem is equivalent to asking whether a given vector $p$ belongs to the cone of the circuits of $M$; and the circuit cover problem asks whether $p$ belongs to the integer cone.

Corollary 1 says that if $M = (V, E, A)$ is series-parallel then $p \in \mathrm{Cone}(\mathcal{C})$ if and only if $p$ is fractionally admissible. Furthermore, since for every $p \in \mathrm{Cone}(\mathcal{C}) \cap \mathrm{Lat}(\mathcal{C})$ the vector $p$ must be admissible, it follows from Theorem 5 that the circuits of a series-parallel mixed graph $M$ form a Hilbert basis ($p \in \mathrm{IntCone}(\mathcal{C})$ if and only if $p$ is admissible). In view of Theorem 4, it is unlikely that we can find a complete characterization of $\mathrm{Cone}(\mathcal{C})$ and $\mathrm{IntCone}(\mathcal{C})$ for arbitrary mixed graphs.

A nice result in the theory of Hilbert bases is the following theorem of Sebő [15].

Theorem 6 (Sebő, 1990). Let $H \subseteq \mathbb{Q}^n$ be a Hilbert basis. Then every vector of $\mathrm{IntCone}(H)$ can be written as a nonnegative integer linear combination of at most $2n - 2$ vectors of $H$.

It has been conjectured that $2n - 2$ can be replaced by $n$ in the above theorem (note that this bound would be sharp). For Hilbert bases related to some well-known combinatorial problems this is true [15]. This is also the case for Hilbert bases of circuits in a series-parallel mixed graph, as we show in the next theorem.

Theorem 7. Let $M = (V, E, A)$ be a series-parallel mixed graph and $\mathcal{C}$ the set of circuits of $M$. Then every vector $p$ of $\mathrm{IntCone}(\mathcal{C})$ can be written as a nonnegative integer linear combination of at most $m = |E \cup A|$ circuits.
The proof of the theorem is by induction on the dimension of a minimal face of $\mathrm{Cone}(\mathcal{C})$ that contains $p$. We leave it to the reader.

We turn now to the algorithmic aspect of the problem of finding a circuit cover of an admissible pair $(M, p)$, where $M$ is series-parallel. Unfortunately, our
proof of Theorem 7 does not directly give a polynomial-time algorithm to find such a circuit cover. However, if we do not require that the circuit cover uses at most $m$ circuits (but a polynomially bounded number of them) then we can design a polynomial-time algorithm that relies on the ellipsoid method. For that, we need a polynomial separation algorithm that is interesting in its own right: finding a minimum circuit in a series-parallel mixed graph.

We describe two algorithms to find a circuit cover. The first algorithm is based on the proof of Theorem 5 but it has pseudo-polynomial running time. The second algorithm is an elegant polynomial procedure that uses the first one and is based on the algorithm presented in [1].

CirCov1 Algorithm
Input: An admissible pair $(M, p)$, where $M = (V, E, A)$ is series-parallel.
Output: A circuit cover $L'$ of $(M, p)$.
1. Delete arcs and edges of weight 0. Contract any edge that is in a cut $\delta(X)$ such that $|\delta(X)| = |\delta_G(X)| = 2$.
2. If there is an edge $e$ in a cut $\delta(X)$ such that $f_p(X) = 0$ then assign an orientation to $e$ according to Lemma 2.
3. If $p(E) = 0$ then return a circuit cover $L'$ (for directed graphs this is trivial) and halt.
4. If there is an edge with weight 1 then assign an arbitrary orientation to it. Call CirCov1 recursively to find a circuit cover $L'$ of the new graph, return $L'$ and halt.
5. Let $e_0 = xy$ be an edge with maximum weight.
6. Call CirCov1 recursively to find a circuit cover $L'$ of $(M, p - 2\chi^{e_0})$.
7. As in the proof of Theorem 5, find a shortest $(x, y)$-path in the auxiliary undirected graph $H$. Let $C_1, \dots, C_k$ be the arc labels along this path.
8. Decompose $C_1 \cup \dots \cup C_k$ into an $(x, y)$-path $P'$ and a $(y, x)$-path $Q'$. Let $P := P' \cdot (y, e_0, x)$ and $Q := Q' \cdot (x, e_0, y)$. Return the circuit cover $L := (L' - \{C_1, \dots, C_k\}) \cup \{P, Q\}$ and halt.

Steps 1 and 2 require $O(|E|)$ max-flow min-cut computations. Step 3 can be done in $O(|A||V|)$ time. The total number of calls of CirCov1 is bounded by $p(E)/2$ as the total weight of the edges in each successive pair $(M, p)$ is reduced by 2.
So CirCov1 is a pseudo-polynomial algorithm. We discuss now how to obtain a polynomial algorithm from CirCov1. The idea is to formulate the circuit cover problem for $(M, p)$ as an integer program and to solve its relaxation. Then we separate out the fractional part of the resulting solution to define a new weight $p'$ (with relatively small entries), and use CirCov1 to solve the circuit cover problem for $(M, p')$. A circuit cover of $(M, p)$ is obtained by adjoining the partial circuit cover corresponding to the integral part of the linear program solution and the circuit cover found by CirCov1. In what follows, $N$ denotes the circuit-{edge,arc} incidence matrix of $M$ and $\mathbf{1}$ denotes the vector of $|\mathcal{C}|$ ones.
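CirCov2 Algorithm
Input: An admissible pair $(M, p)$, where $M = (V, E, A)$ is series-parallel.
Output: A circuit cover $(L, \lambda)$ of $(M, p)$, where $L$ is a list containing at most $2|E \cup A| - 1$ circuits and $\lambda$ is a multiplicity vector whose entries are bounded by $r := \max\{p(e) : e \in E \cup A\}$.
1. Find a basic optimal solution $\lambda = (\lambda_C)_{C \in \mathcal{C}}$ to the following linear program:

$$\max \{ \mathbf{1}\lambda \;:\; \lambda N = p,\ \lambda \ge 0 \} \qquad (1)$$

2. Let $\lambda' := (\lfloor \lambda_C \rfloor)_{C \in \mathcal{C}}$ and $\lambda'' := \lambda - \lambda'$ be the integral and the fractional parts of $\lambda$, and let $p' := \lambda'' N = p - \lambda' N$. (Note that, since $p'$ is a nonnegative combination of circuits, $(M, p')$ is fractionally admissible. As $p$ and $\lambda' N$ are Eulerian, $p'$ is also Eulerian. Thus $(M, p')$ is admissible.)
3. Call CirCov1 with input $(M, p')$ to obtain a circuit cover $L'$ of $(M, p')$.
4. Adjoin $L'$ to the circuit cover $(S, \lambda')$, where $S := \{C \in \mathcal{C} : \lambda'_C > 0\}$, and return the resulting circuit cover $(L, \lambda)$. Halt.

Let us show that $|L|$ is polynomially bounded. As $\lambda$ is a basic solution, it has at most $|E \cup A|$ nonzero entries, and so we have $|S| \le |E \cup A|$. Furthermore, $\lambda'$ together with $L'$ is feasible for (1), so the optimality of $\lambda$ gives $\mathbf{1}\lambda' + |L'| \le \mathbf{1}\lambda = \mathbf{1}\lambda' + \mathbf{1}\lambda''$, that is, $|L'| \le \mathbf{1}\lambda''$. Since each nonzero entry in $\lambda''$ is less than 1 we have $\mathbf{1}\lambda'' < |E \cup A|$ and therefore $|L'| \le |E \cup A| - 1$. Thus $|L| \le |S| + |L'| \le 2|E \cup A| - 1$. As $\max\{p'(e) : e \in E \cup A\} \le |L'| < |E \cup A|$, we conclude that Step 3 can be done in polynomial time.

It remains to show how to solve Step 1 in time bounded by a polynomial in $|E \cup A| \log(r)$, despite the exponential number of variables $\lambda_C$. For that, consider the dual linear program of (1):

$$\min \{ p x \;:\; N x \ge \mathbf{1} \} \qquad (2)$$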
The separation problem for (2) is the following: Given a rational vector $x$, either certify that $x$ satisfies $N x \ge \mathbf{1}$, or find a violated inequality (a circuit in $(M, x)$ having weight less than 1). A theorem of Grötschel, Lovász and Schrijver (see [11]) implies that a basic optimal solution of (1) can be found via the ellipsoid method in time polynomially bounded by $|E \cup A|$ and the input length of $p$, provided that we can solve the separation problem for (2) in time polynomially bounded by $|E \cup A|$ and the input length of $x$. For that, we can use a polynomial-time algorithm that finds a minimum circuit in the weighted series-parallel mixed graph $(M, x)$. In the last section we describe such an algorithm.

It would be interesting to find a combinatorial (polynomial) algorithm for this case. It is not difficult to design such an algorithm when $M$ is an undirected series-parallel graph using Seymour's proof of Theorem 2, and the technique in the proof of Theorem 7.
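To make the separation problem concrete, the Python sketch below enumerates all circuits of a very small mixed graph and reports one of $x$-weight less than 1, if any. It is an exponential-time stand-in chosen purely for illustration, not the polynomial minimum-circuit algorithm the text refers to. The names `circuits_of` and `separate`, the element encoding, and the requirement that vertices be comparable and that there be no loops are all assumptions of this example.

```python
from fractions import Fraction

def circuits_of(elements):
    """Enumerate the circuits of a small loopless mixed graph.

    elements -- dict name -> (kind, u, v), kind "edge" or "arc" (tail u, head v)
    Returns a set of frozensets of element names.
    """
    steps = {}  # vertex -> list of (element name, vertex reached)
    for name, (kind, u, v) in elements.items():
        steps.setdefault(u, []).append((name, v))
        if kind == "edge":               # edges can be traversed both ways
            steps.setdefault(v, []).append((name, u))
    found = set()

    def extend(start, u, used, visited):
        for name, v in steps.get(u, []):
            if name in used:
                continue
            if v == start:               # closed a circuit through start
                found.add(frozenset(used | {name}))
            elif v > start and v not in visited:
                extend(start, v, used | {name}, visited | {v})

    for s in steps:                      # s acts as the smallest vertex of the circuit
        extend(s, s, frozenset(), {s})
    return found

def separate(elements, x):
    """Return a circuit of x-weight < 1 (a violated inequality of (2)), else None."""
    def weight(C):
        return sum(x[e] for e in C)
    best = min(circuits_of(elements), key=weight, default=None)
    return best if best is not None and weight(best) < 1 else None

# Hypothetical instance: directed 4-circuit 0->1->2->3->0 plus edges {0,2}, {1,3}.
M = {"a01": ("arc", 0, 1), "a12": ("arc", 1, 2), "a23": ("arc", 2, 3),
     "a30": ("arc", 3, 0), "e02": ("edge", 0, 2), "e13": ("edge", 1, 3)}
x_bad = {e: Fraction(1, 4) for e in M}   # every triangle has weight 3/4 < 1
x_ok = {e: (Fraction(1, 2) if M[e][0] == "arc" else Fraction(0)) for e in M}
print(separate(M, x_bad) is not None)    # True
print(separate(M, x_ok))                 # None
```

Exact rational arithmetic via `Fraction` avoids rounding issues when a circuit has weight exactly 1.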
5 Eulerian Mixed Graphs and Circuit Double Covers
Recall that a mixed graph $M$ is Eulerian if its {edge,arc}-set can be partitioned into circuits, that is, $(M, \mathbf{1})$ has a circuit cover. We saw in Sect. 3 a characterization of Eulerian series-parallel mixed graphs. In fact, the admissibility of $(M, \mathbf{1})$ is a necessary and sufficient condition for any mixed graph $M$ to be Eulerian, according to the next theorem (see [10,7]). In what follows, we use the notation $f_M(X) := |\delta_G(X)| + |\delta^+(X)| - |\delta^-(X)|$.

Theorem 8. A mixed graph $M$ is Eulerian if and only if $f_M(X) \ge 0$ and $f_M(X)$ is even for every $X \subseteq V(M)$.
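For tiny mixed graphs the condition of Theorem 8 can be tested directly by running over all vertex subsets. The Python sketch below does this by brute force, reading $f_M(X)$ as (number of undirected edges in the cut) + (arcs leaving $X$) - (arcs entering $X$), which is an assumption about the intended formula. The function names are hypothetical and the running time is exponential in the number of vertices.

```python
from itertools import combinations

def f_M(X, edges, arcs):
    """Edges across the cut, plus arcs leaving X, minus arcs entering X."""
    und = sum((u in X) != (v in X) for u, v in edges)
    out = sum(t in X and h not in X for t, h in arcs)
    inn = sum(t not in X and h in X for t, h in arcs)
    return und + out - inn

def satisfies_theorem8(vertices, edges, arcs):
    """Check f_M(X) >= 0 and f_M(X) even for every proper nonempty subset X."""
    for r in range(1, len(vertices)):
        for X in map(set, combinations(vertices, r)):
            value = f_M(X, edges, arcs)
            if value < 0 or value % 2 != 0:
                return False
    return True

# Hypothetical examples on vertices a, b, c with one edge {c, a} and two arcs.
# Arcs a->b, b->c: the whole graph is a single circuit, so it is Eulerian.
print(satisfies_theorem8("abc", [("c", "a")], [("a", "b"), ("b", "c")]))  # True
# Arcs a->b, c->b: two arcs enter b and nothing leaves it.
print(satisfies_theorem8("abc", [("c", "a")], [("a", "b"), ("c", "b")]))  # False
```

Since the condition is imposed on every subset and on its complement, the signed form used here and the form with an absolute value on the arc imbalance accept the same graphs.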
We can try to extend this result by studying circuit $p$-covers where $p$ is an integral constant vector. The problem is to characterize which mixed graphs $M$ have a circuit $p$-cover. Note that the condition $f_M(X) \ge 0$ for every $X \subseteq V(M)$ is clearly a necessary one. If $p$ is odd it is easy to prove (using Theorem 8) that $M$ has a circuit $p$-cover if and only if it is Eulerian. The interesting case is when $p$ is even. We say that $M$ has a circuit double cover if $(M, \mathbf{2})$ has a circuit cover, or in other words, there is a list of circuits of $M$ such that each edge and each arc appears in exactly two of them.

No complete characterization is known for the existence of a circuit double cover ($p$-cover) in an arbitrary mixed graph. The problem is trivial for directed graphs, as a directed graph has a circuit $p$-cover (for any $p$) if and only if it is Eulerian. For undirected graphs, the case $p$ odd is equivalent to characterizing Eulerian graphs, which is well-known (and also a particular case of Theorem 8). The case $p = 4$ and the case $p = 6$ were solved by Jaeger [13,5] and Fan [9], respectively. These results together settle the case $p = 2k$ for any $k \ge 2$.

Theorem 9. Every bridgeless undirected graph has a circuit 4-cover and a circuit 6-cover.

Corollary 2. For any $k \ge 2$, every bridgeless undirected graph has a circuit $2k$-cover.
The case p = 2 is the famous Circuit Double Cover Conjecture.

Conjecture. Every bridgeless undirected graph has a circuit double cover.

This conjecture has been settled for some special classes of graphs, for example graphs with no $P_{10}$-minor (Theorem 3). See [1] for further references on this topic. For mixed graphs, Theorem 5 gives a characterization for the series-parallel graphs.

Theorem 10. A bridgeless series-parallel mixed graph M has a circuit double cover (or a 2k-cover) if and only if $f_M(X) \ge 0$ for every $X \subseteq V(M)$.
In view of the preceding result, a natural question arises here. Is it possible to drop the series-parallelness hypothesis in Theorem 10? Note that an affirmative answer to this question would settle the Circuit Double Cover Conjecture. A natural relaxation of this problem is the following. Suppose that M is an arbitrary bridgeless mixed graph such that $f_M(X) \ge 0$ for every $X \subseteq V(M)$. Is it true that (M, 1) has a fractional circuit cover? In the undirected case it is easy to show that $1 \in \mathrm{Cone}(C(G))$ for every bridgeless undirected graph G. But here we do not have a characterization of $\mathrm{Cone}(C(M))$, and we could find neither a proof of that inclusion for mixed graphs nor a counterexample. We note that to answer these two questions, we can restrict ourselves to mixed graphs M = (V, E, A) such that (V, A) is Eulerian and (V, E) is acyclic. In view of the lack of a solution for the fractional circuit cover problem and Theorem 4, answering these problems seems to be very difficult.
6 Minimum Circuits in Series-Parallel Graphs
In this section, we investigate the problem of finding a minimum circuit in a weighted series-parallel mixed graph (M, w). This is needed to solve the separation problem mentioned in Sect. 4. Here, without loss of generality we assume M is strongly connected, has at least one circuit and does not have cut vertices. As w can have negative entries, we must deal with negative circuits. It is known that for arbitrary mixed graphs the problem of detecting negative circuits is NP-complete [4]. We show here how to compute in polynomial time a minimum circuit in a series-parallel mixed graph (M, w) for any weight w. The next result is often used in this section. The proof follows from the following classical theorem due to Dirac [8]: If G is a simple 2-connected graph with minimum degree at least 3 then G contains a subgraph homeomorphic to $K_4$.

Lemma 3. If G is a 2-connected series-parallel graph with $|V(G)| \ge 3$ then G has a vertex with exactly two neighbours.
We say that a vertex with exactly two neighbours is a special vertex. Since we can compute in polynomial time a minimum circuit of length two in M, if we knew how to find a minimum circuit of length at least three in M we could find a minimum circuit in M simply by taking the lighter of the two solutions. To solve the latter problem, we can restrict ourselves to directed graphs: we replace each edge e of M with two anti-parallel arcs, each one with weight w(e). Thus, it suffices to show how to solve the following problem:

MC3P(D, w): Given a series-parallel directed graph (D, w), find, if it exists, a minimum directed circuit C of length at least 3.

To describe our algorithm we need to consider the following two operations on a directed graph (D, w). It is easy to see that at least one of these operations can always be carried out on a weighted series-parallel directed graph (D, w).
Parallel Arc Deletion (PAD)
1. Let a and b be two parallel arcs;
2. Set $D' := D - b$ and $w(a) := \min\{w(a), w(b)\}$;
3. Let $w'$ be the restriction of w to $D'$.

Special Vertex Elimination (SVE) (assuming D does not have parallel arcs)
Let v be a special vertex and x and y its neighbours;
1. If $(x,v), (v,x), (y,v), (v,y) \in A(D)$ then set $D' := D / \{(y,v), (v,y)\}$; $w(x,v) := w(x,v) + w(v,y)$; $w(v,x) := w(v,x) + w(y,v)$; let $w'$ be the restriction of w to $D'$;
2. If $(x,v), (v,x), (v,y) \in A(D)$ and $(y,v) \notin A(D)$ then set $D' := (D - (v,x)) / (v,y)$; $w(x,v) := w(x,v) + w(v,y)$; let $w'$ be the restriction of w to $D'$;
3. If $(x,v), (v,y) \in A(D)$ and $(v,x), (y,v) \notin A(D)$ then set $D' := D / (v,y)$; $w(x,v) := w(x,v) + w(v,y)$; let $w'$ be the restriction of w to $D'$.

Denote by $\gamma(D, w)$ the weight of a minimum circuit of length at least 3 in (D, w). Clearly, if $(D', w')$ is obtained from (D, w) by a PAD operation then $\gamma(D, w) = \gamma(D', w')$. On the other hand, if $(D', w')$ comes from an SVE operation then $\gamma(D, w) = \min\{w(C), \gamma(D', w')\}$, where C is a minimum circuit of length 3 containing v, x and y (if there is no such circuit, set $w(C) = +\infty$). As D is series-parallel and does not have cut vertices it has a special vertex by Lemma 3. Moreover, operations PAD and SVE preserve series-parallelness and do not create cut vertices. So, a simple polynomial-time algorithm that solves MC3P(D, w) consists of successive applications of these two operations. This immediately gives us a polynomial algorithm to find a minimum circuit in a series-parallel mixed graph.
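The reduction scheme above can be sketched in a few lines. The sketch below is not a transcription of the three SVE cases: it assumes they can be restated uniformly by replacing every directed path a → v → b through the eliminated vertex with one arc (a, b) of the summed weight, and it also discards vertices with fewer than two neighbours, so that cut vertices need no special treatment. The name `min_circuit_len3` and the input format (a list of `(tail, head, weight)` triples) are hypothetical, and neighbourhoods are recomputed in every round, so the running time is polynomial but not optimized.

```python
import math

def min_circuit_len3(arc_list):
    """Weight of a minimum directed circuit of length >= 3 (math.inf if none)."""
    w = {}

    def add_arc(u, v, wt):
        # PAD: of two parallel arcs keep only the lighter one; loops are ignored.
        if u != v and wt < w.get((u, v), math.inf):
            w[(u, v)] = wt

    for u, v, wt in arc_list:
        add_arc(u, v, wt)
    best = math.inf
    while True:
        nbrs = {}
        for u, v in w:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
        if len(nbrs) < 3:
            return best
        v = next((z for z, ns in nbrs.items() if len(ns) <= 2), None)
        if v is None:
            raise ValueError("no vertex with at most two neighbours")
        shortcuts = []
        if len(nbrs[v]) == 2:
            x, y = nbrs[v]
            for a, b in ((x, y), (y, x)):
                if (a, v) in w and (v, b) in w:
                    through = w[(a, v)] + w[(v, b)]
                    if (b, a) in w:
                        # Circuit a -> v -> b -> a of length exactly 3.
                        best = min(best, through + w[(b, a)])
                    shortcuts.append((a, b, through))
        # SVE: remove v, then reconnect its two neighbours by the shortcut arcs.
        for arc in [arc for arc in w if v in arc]:
            del w[arc]
        for a, b, wt in shortcuts:
            add_arc(a, b, wt)
```

An undirected edge of the mixed graph is entered as two anti-parallel arcs of the same weight, as described in the text. Negative weights are allowed, because every shortcut arc stands for a simple path through a vertex that has already been removed.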
7 Concluding Remarks
Several interesting questions concerning circuit covers in mixed graphs still remain open. Among them we mention the (fractional) circuit cover problem in planar graphs, the circuit double cover problem and its fractional relaxation. As for series-parallel mixed graphs, now that we know that there is a polynomial-time (ellipsoid-based) algorithm for the circuit cover problem, it would be interesting to find a combinatorial one. The proof of Theorem 5 presented here was only outlined and needs to be worked out in more detail. This, as well as the proofs not given here and other related results, will be presented elsewhere.
References
1. B. Alspach, L. Goddyn, and C.Q. Zhang. Graphs with the circuit cover property. Trans. Am. Math. Soc., 344(1):131–154, 1994.
2. R.K. Ahuja, T.L. Magnanti, and J.B. Orlin. Network Flows: Theory, Algorithms and Applications. Prentice Hall, 1993.
3. E.M. Arkin and C.H. Papadimitriou. On the complexity of circulations. J. Algorithms, 7:134–145, 1986.
4. E.M. Arkin. Complexity of Cycle and Path Problems in Graphs. PhD thesis, Stanford University, 1986.
5. J.C. Bermond, B. Jackson, and F. Jaeger. Shortest coverings of graphs with cycles. J. Comb. Theory Ser. B, 35:297–308, 1983.
6. J.A. Bondy and U.S.R. Murty. Graph Theory with Applications. MacMillan Press, 1976.
7. V. Batagelj and T. Pisanski. On partially directed eulerian multigraphs. Publ. de l'Inst. Math., Nouvelle série 25(39):16–24, 1979.
8. G.A. Dirac. A property of 4-chromatic graphs and some remarks on critical graphs. J. London Math. Soc., 27:85–92, 1952.
9. G. Fan. Integer flows and cycle covers. J. Comb. Theory Ser. B, 54:113–122, 1992.
10. L.R. Ford and D.R. Fulkerson. Flows in Networks. Princeton U. Press, Princeton, 1973.
11. M. Grötschel, L. Lovász, and A. Schrijver. Geometric Algorithms and Combinatorial Optimization. Springer-Verlag, 1988.
12. A.J. Hoffman. Some recent applications of the theory of linear inequalities to extremal combinatorial analysis. In Proc. Symp. Appl. Math, volume 10, 1960.
13. F. Jaeger. Flows and generalized coloring theorems in graphs. J. Comb. Theory Ser. B, 26:205–216, 1979.
14. A. Schrijver. Theory of Linear and Integer Programming. Wiley, 1986.
15. A. Sebő. Hilbert bases, Carathéodory's theorem and combinatorial optimization. In R. Kannan and W.R. Pulleyblank, editors, Integer Programming and Combinatorial Optimization Proceedings, pages 431–456, Waterloo, 1990. University of Waterloo Press.
16. P.D. Seymour. Sum of circuits. In Graph Theory and Related Topics, pages 341–355. Academic Press, N. York, 1979.
A Linear Time Algorithm to Recognize Clustered Planar Graphs and Its Parallelization
Elias Dahlhaus
Department of Mathematics and Department of Computer Science, University of Cologne
dahlhaus@informatik.uni-koeln.de
and Dept. of Computer Science, University of Bonn
dahlhaus@cs.uni-bonn.de
Germany

Abstract. We develop a linear time algorithm for the following problem: Given a graph G and a hierarchical clustering of the vertices such that all clusters induce connected subgraphs, determine whether G may be embedded into the plane such that no cluster has a hole. This is an improvement on the $O(n^2)$ algorithm of Q.W. Feng et al. [6] and the algorithm of Lengauer [12] that operates in linear time on a replacement system. The size of the input of Lengauer's algorithm is not necessarily linear with respect to the number of vertices.
1 Introduction
In VLSI design and in drawing figures, the following problem comes up. The nodes of a graph are partitioned into subdivisions (clusters), the clusters are again divided into clusters, and so on. One would like to embed the nodes in a cluster quite closely. In VLSI design one is interested in putting nodes that belong to the same electronic unit into the same cluster (see for example [8]). Other applications appear in software visualization [19] and in knowledge representation [9]. The ideal case would be when the graph could be embedded into the plane such that edges do not cross and the clusters look nice. That means clusters should appear as connected areas without holes. One algorithm that recognizes clustered graphs with connected clusters is due to Q.W. Feng et al. [6]. The clusters are given by a cluster tree. In her structure, the size of the input is in the order of the number of vertices. The time bound is $O(n^2)$. A very first algorithm is due to Lengauer [12]. The input is given by a replacement system. The size of the input might exceed the order of the number of vertices. The time bound is linear in the input size, but not linear in the number of vertices. As in [6], we assume that the clusters induce connected subgraphs of the given graph. This is quite reasonable, because vertices in the same cluster should be close to each other with respect to adjacency.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 239–248, 1998. © Springer-Verlag Berlin Heidelberg 1998
Here we present an O(n) time algorithm to recognize clustered planar graphs. The algorithm is based on the decomposition of a graph into its 3-connected components. The major idea of the new algorithm is that we weight the clusters by their sizes and weight each edge by the weight of the cluster of minimum size that contains it. We also weight each face by the maximum weight of its adjacent edges. We shall see that a planar embedding is a clustered planar embedding if and only if for each number i, the faces of weight at least i together with the edges of weight at least i form a connected subgraph of the dual graph. In Sect. 2, we introduce the notation that is necessary for the whole paper. In Sect. 3, we show the key result that characterizes clustered planar embeddings as mentioned above. In Sect. 4, we discuss the decomposition of planar graphs into uniquely embeddable components. In Sect. 5, we introduce the recognition algorithm for clustered planar graphs, show its correctness, and analyze its sequential and parallel complexity. In the last section, we finish with some concluding remarks.
2 Notation and Basic Definitions
A graph G = (V, E) consists of a vertex set V and an edge set E. Multiple edges and loops are not allowed. An induced subgraph is an edge-preserving subgraph, that means $(V', E')$ is an induced subgraph of (V, E) if $V' \subseteq V$ and $E' = \{xy \in E : x, y \in V'\}$. Trees are always directed to the root. The notions of parent, child, ancestor, and descendent are defined as usual. We denote by n the total number of vertices and edges of G. In planar graphs n is in the order of the number of vertices. We call a graph k-connected if for each pair a and b of vertices, there are k paths from a to b that pairwise have only a and b as common vertices.

A graph is planar if it can be embedded into the plane such that edges do not cross. An edge-crossing-free embedding of a planar graph into the plane is also called a planar embedding. The areas into which the plane is subdivided by a planar embedding are called the faces of the planar embedding. The combinatorial embedding of a planar embedding of G consists of the clockwise enumerations of the incident edges of the vertices of G. Two combinatorial embeddings $(f_v)_{v \in V}$ and $(g_v)_{v \in V}$ are equivalent if for each vertex v, the enumerations $f_v$ and $g_v$ define the same cyclic orientation of the incident edges of v. The reversal of a combinatorial embedding $(f_v)_{v \in V}$ is the combinatorial embedding $(g_v)_{v \in V}$ that comes up by reversing the enumerations of incident edges. Two combinatorial embeddings are weakly equivalent if they are equivalent or one is equivalent to the reversal of the other.

The dual graph $D_{emb}$ of a planar embedding emb of G has the set F of faces of emb as vertices; the edges of $D_{emb}$ are the pairs of faces that share an edge of G. Note that each face of emb is determined by the counterclockwise enumeration of its incident edges, and the faces are uniquely determined by the combinatorial embedding associated with emb.

Lemma 1. (see for example [11]) All combinatorial embeddings of a 3-connected planar graph are weakly equivalent.
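As a small illustration of these definitions, the following sketch traces the faces of a combinatorial embedding and builds the dual graph. The representation (a dict mapping each vertex to the cyclic list of its neighbours, which suffices because multiple edges are excluded) and the names `trace_faces` and `dual_graph` are assumptions made for this example, not notation from the paper.

```python
def trace_faces(rotation):
    """Return the faces of a combinatorial embedding as lists of darts (u, v).

    rotation[v] is the cyclic list of the neighbours of v, in clockwise order.
    """
    position = {v: {u: i for i, u in enumerate(nbrs)}
                for v, nbrs in rotation.items()}
    unused = {(u, v) for u in rotation for v in rotation[u]}
    faces = []
    while unused:
        start = dart = next(iter(unused))
        face = []
        while True:
            unused.discard(dart)
            face.append(dart)
            u, v = dart
            nbrs = rotation[v]
            # Leave v along the edge that follows vu in the rotation at v.
            dart = (v, nbrs[(position[v][u] + 1) % len(nbrs)])
            if dart == start:
                break
        faces.append(face)
    return faces

def dual_graph(faces):
    """Pairs of distinct face indices whose borders share an edge of G."""
    face_of = {dart: k for k, face in enumerate(faces) for dart in face}
    return {frozenset((k, face_of[(v, u)]))
            for (u, v), k in face_of.items() if k != face_of[(v, u)]}
```

Every edge contributes two darts, one for each direction, and the successor rule is a permutation of the darts, so each dart lies on exactly one traced face. Reversing all rotation lists yields the reversal of the embedding and the same faces traversed in the opposite direction.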
A clustering of a set V is a subset C of the power set of V, such that for all $c, d \in C$, either c and d are disjoint or they are comparable with respect to the subset relation, and $V \in C$. The elements of C are called clusters. One can also encode a clustering by a cluster tree or dendrogram T, such that the node set of T is C and for each $c \in C$, $c \ne V$, the parent of c is the smallest $d \in C$ with $c \subset d$. A clustering C of V is called a connected clustering of G = (V, E) if each cluster $c \in C$ induces a connected subgraph of G. Given a graph G = (V, E) and a connected clustering C on the vertex set V, a clustered planar embedding emb is a planar embedding of G together with a mapping $emb_C$ from C to the set of areas of $\mathbb{R}^2$, such that
(1) for each $c \in C$, $emb_C(c)$ is a connected closed subset of $\mathbb{R}^2$ and $\mathbb{R}^2 \setminus emb_C(c)$ is a connected open subset of $\mathbb{R}^2$ (i.e. $emb_C(c)$ has no hole),
(2) for each vertex $v \in V$ and each $c \in C$, $v \in c$ if and only if $emb(v) \in emb_C(c)$,
(3) c and d are disjoint (respectively $c \subseteq d$) if and only if $emb_C(c)$ and $emb_C(d)$ are disjoint (respectively $emb_C(c) \subseteq emb_C(d)$).
A graph G = (V, E) together with a clustering C on V is called a clustered planar graph if there is a clustered planar embedding of G and C.
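The set-theoretic part of these definitions is easy to check mechanically. The following sketch tests whether a family of sets is a clustering, derives the cluster tree, and tests whether a cluster induces a connected subgraph. The function names are hypothetical, and the quadratic running time is chosen for brevity rather than efficiency.

```python
def is_clustering(V, C):
    """Check that C is a family of subsets of V that contains V and in which
    any two members are disjoint or comparable."""
    V = frozenset(V)
    C = [frozenset(c) for c in C]
    if V not in C or any(not c <= V for c in C):
        return False
    return all(c.isdisjoint(d) or c <= d or d <= c for c in C for d in C)

def cluster_tree(C):
    """Map every cluster except V to its parent, the smallest proper superset."""
    C = {frozenset(c) for c in C}
    parent = {}
    for c in C:
        supersets = [d for d in C if c < d]
        if supersets:
            parent[c] = min(supersets, key=len)
    return parent

def induces_connected(c, edges):
    """Check that the subgraph induced by the cluster c is connected."""
    c = set(c)
    if not c:
        return True
    adj = {v: set() for v in c}
    for u, v in edges:
        if u in c and v in c:
            adj[u].add(v)
            adj[v].add(u)
    seen, stack = set(), [next(iter(c))]
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(adj[v] - seen)
    return seen == c
```

In a valid clustering the proper supersets of a cluster form a chain, so the smallest one is unique and `cluster_tree` is well defined.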
3 A Combinatorial Characterization of Clustered Embeddings
An almost trivial observation is the following.

Proposition 1. A planar embedding emb of G = (V, E) can be extended to a clustered planar embedding of G and C if and only if for each $c \in C$, the edges joining a vertex in c with a vertex not in c form a cycle in the dual graph of emb.

We also call a planar embedding emb of G a clustered planar embedding of G and C if it can be extended to a clustered planar embedding. Let a clustering C of V be given. For any edge e = vw, let weight(e) be the size of the minimum cluster $c \in C$ that contains v and w. The weight of a face f, say weight(f), is the maximum weight of an edge that is incident with f (that appears at the border of f). For each i, let $F_i$ be the set of faces f of emb with $weight(f) \ge i$ and $E_i$ be the set of edges e of G with $weight(e) \ge i$.

Theorem 1. A planar embedding emb of G = (V, E) is a clustered planar embedding of G and C if and only if, with weight, $F_i$ and $E_i$ defined as above, for each i with $F_i \ne \emptyset$, $(F_i, E_i)$ is a connected subgraph of the dual graph of emb and a face of maximum weight is taken as outer face.

Proof. (Sketch.) Suppose $F_i$ is not connected. Then a component c of $F_i$ is surrounded by a cycle of edge weights < i and therefore by a cluster of weight < i. Therefore this cluster has a hole. The converse may be proved in a similar way.
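The connectivity condition of Theorem 1 can be tested directly once the faces are known. The sketch below assumes that each face is given as the list of its border edges and that $F_i$ and $E_i$ collect the faces and edges of weight at least i. It checks only the connectivity part of the theorem, not the choice of the outer face, and the function names are hypothetical.

```python
def edge_weights(edges, clusters):
    """weight(e): size of the smallest cluster containing both end vertices of e."""
    return {frozenset(e): min(len(c) for c in clusters if set(e) <= set(c))
            for e in edges}

def connectivity_condition(faces, weight):
    """faces[k] lists the border edges of face k, each edge as a frozenset {u, v}."""
    face_weight = [max(weight[e] for e in border) for border in faces]
    incident = {}
    for k, border in enumerate(faces):
        for e in border:
            incident.setdefault(e, []).append(k)
    # F_i and E_i only change when i passes the weight of some edge.
    for i in sorted(set(weight.values())):
        nodes = {k for k, fw in enumerate(face_weight) if fw >= i}
        if not nodes:
            continue
        adj = {k: set() for k in nodes}
        for e, ks in incident.items():
            if weight[e] >= i:
                # An edge on the border of a single face gives a dual loop.
                adj[ks[0]].add(ks[-1])
                adj[ks[-1]].add(ks[0])
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            k = stack.pop()
            if k not in seen:
                seen.add(k)
                stack.extend(adj[k] - seen)
        if seen != nodes:
            return False
    return True
```

As a usage example, take a triangle abc with pendant vertices d and e attached to a, and the clusters {a, b, c} and V. If d is drawn inside the triangle and e outside, both faces have weight 5 but only triangle edges of weight 3 separate them, so the condition fails for i = 5. If d and e are both drawn outside, the condition holds.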
4 The Decomposition of Planar Graphs
We can always produce planar graphs by a graph grammar as follows (for graph grammars see for example [12]). We introduce terminal edges, which are edges of the graph, and nonterminal edges, which can be replaced. Here we start with two vertices x and y that are joined by a terminal edge of maximum weight and one nonterminal edge. A nonterminal edge g = uv can be replaced by
(1) (parallel replacement) a set of nonterminal and terminal edges having the same end vertices as g, or
(2) (sequential replacement) a path of terminal and nonterminal edges starting with u and ending with v, or
(3) a graph $G_g$ consisting of terminal and nonterminal edges, such that $G_g$ together with the edge uv is 3-connected.
Each two-connected graph can be produced in that way in O(n) time (see also for example [15]) or in O(log n) time on a CRCW-PRAM with O(n) processors [16]. This derivation scheme is similar to the PQ-trees [3]. For substitution decomposition in general structures, see [14].

To get all connected planar graphs, one has to allow the following additional production rules. For a vertex v, one can do the following:
(1) add a terminal or nonterminal edge e with v as one endpoint and a new vertex as the other endpoint,
(2) add a two-connected graph $G_v$ where one vertex in $G_v$ is identified with v. The vertex v is also called an articulation vertex.

Note that we can build up a derivation tree $T_D$ with $g_{root} = xy$ as root node and the terminal edges as leaves. The inner nodes are nonterminal edges g, and the children of any nonterminal edge g are the edges that are produced by replacement of g in one step. We denote by $H_g$ the graph consisting of those terminal edges that are descendents of g. Note that the derivation tree is almost identical to the PQ-tree of a planar graph [3]. The axis weight of g, say aw(g), is the weight of the smallest cluster c such that the endpoints u and v of g are in the same connected component of $H_g[c]$. We select a path $p_g$ joining u and v in $H_g[c]$ and call this path $p_g$ the axis of g.

The maximum weight mw(g) is the maximum weight of an edge appearing in $H_g$. Note that the endpoints of g split the outer cycle of $H_g$ into two paths, say $p_1(g)$ and $p_2(g)$. Any path of the dual graph of any embedding of G that crosses both paths $p_1(g)$ and $p_2(g)$ has an edge of weight aw(g). For any face f of emb, call a nonterminal or terminal edge g maximal incident with f if the surrounding cycle of f shares an edge with $H_g$, $H_g$ does not contain all edges of the surrounding cycle of f, and for the parent $g'$ of g, $H_{g'}$ contains all edges surrounding f. The axis weight aw(f) of f is the maximum axis weight of maximal incident edges of f.
Theorem 2. If the embedding emb is a clustered planar embedding then the weight w(f) of any face is identical to its axis weight aw(f).

Proof. (Sketch.) Otherwise f is surrounded by a cycle of smaller maximum weight and one gets a cluster with a hole.

Let g = uv be a nonterminal edge and suppose g is replaced in one step by a graph $G_g$ as in rule (3), so that $G_g$ together with uv is a 3-connected graph. Let $H_g$ be the graph consisting of the terminal edges derived from g. Let $F_g$ be the set of all inner faces of $H_g$ that are not faces of an $H_{g'}$ such that $g'$ is a descendent of g, and let $F'_g$ be the set of all inner faces of $G_g$.

Theorem 3. There is a one-to-one correspondence $q = q_g$ between $F'_g$ and $F_g$. The border edges of any $f' \in F'_g$ are the maximally incident edges of $q(f')$.

Note that the end vertices u and v of g are forced to be in the outer face of $H_g$ and therefore u and v have to be in the outer face of $G_g$. That means the embedding of $G_g$ is unique. Therefore the set of inner faces of $G_g$ is also uniquely determined, i.e. independent of the embedding of the whole graph. Therefore the axis weights of the faces in $F_g$ are uniquely determined, i.e. independent of the embedding.
5 The Algorithm
First we assume that the planar graph G is 2-connected. Here we first recursively compute the axis weights and the axes. With the knowledge of the axes, we determine the wings and their maximum weights. We have to distinguish the different cases of how a nonterminal edge is replaced. With the use of the wing weights we get a planar embedding. Finally we have to check whether we have a clustered embedding. The extension of the algorithm to connected planar graphs in general can be done quite easily.

5.1 Computing a Clustered Embedding for 2-Connected Graphs
Axis Weights of Nonterminal Edges. First we compute a minimum spanning tree $T_S$ with respect to the weights. We can do this in linear time as follows. For each cluster c, let $E_c$ be the set of edges e of E such that c is the smallest cluster containing both end vertices of e. We replace each edge e = uv of $E_c$ by $c_u c_v$, where $c_u$ and $c_v$ are the child clusters of c that contain u and v respectively. For each cluster c, we compute a spanning tree $T_c$ of $E_c$, and for each edge $c_u c_v$ of $T_c$, we pick a representative edge uv of G and put it into an edge set $T'_c$. Then the minimum spanning tree of G is just the union of all $T'_c$. This can be parallelized, and we get a time bound of O(log n) and a processor bound of O(n) on a CRCW-PRAM [17]. Next we root $T_S$ at the vertex x, where x and y are the two vertices of the root graph. Now consider any nonterminal edge g = uv that is finally replaced by $H_g$. We distinguish between
Introverted Nonterminals: The parent of u or the parent of v is in $H_g$, and
Social Nonterminals: The parents of u and of v are not in $H_g$.
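The minimum spanning tree computation described above can be imitated by a Kruskal-style sketch. It is not the linear-time procedure of the text: it sorts the edges by weight and lets a union-find structure play the role of the explicit contraction of child clusters. The names `cluster_weight` and `minimum_spanning_tree` are hypothetical.

```python
def cluster_weight(u, v, clusters):
    """Size of the smallest cluster that contains both u and v."""
    return min(len(c) for c in clusters if u in c and v in c)

def minimum_spanning_tree(vertices, edges, clusters):
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    tree = []
    # Edges inside small clusters come first, so every connected cluster is
    # spanned before any edge leaving it is considered.
    for u, v in sorted(edges, key=lambda e: cluster_weight(e[0], e[1], clusters)):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            tree.append((u, v))
    return tree
```

When the edges of weight |c| are processed, the union-find components inside a cluster c are its child clusters and the vertices of c that lie in no child cluster, provided the clustering is connected. Under that assumption the tree edges chosen at this stage form a spanning tree of the contracted edge set $E_c$, as in the text.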
Axis Weights of Introverted Nonterminals. Suppose the parent of u with respect to $T_S$ is in $H_g$. Then v is an ancestor of u. The axis weight of g is just the maximum weight of an edge on the unique path from u to v in $T_S$. To get this efficiently, we have to proceed as follows.
Check Introvertedness: Check that the edge $u\,Parent_{T_S}(u)$ is a descendent of g in the derivation tree $T_D$ of G.
Determine Axis Weights Recursively: Note that the edges of $G_g$ on the unique path from u to v are all introverted. Determine the maximum axis weight on the unique path from u to v.
The recursion can be done in O(n) time sequentially and by parallel tree contraction in logarithmic time with a linear workload [1].

Axis Weights of Social Nonterminals. Note that if g = uv is social then $T_S[H_g]$ is split into exactly two trees $T_u$ and $T_v$ with roots u and v respectively. Note that the weights of edges joining a vertex in $T_u$ and a vertex of $T_v$ are at least the maximum weight of all edges of $T_u$ or $T_v$. Note that any edge $g'$ that joins a vertex in $T_u$ and a vertex of $T_v$ is social. Therefore the axis weight of g is the minimum weight of an edge $g'$ joining $T_u$ with $T_v$. The axis weight of g is the minimum axis weight of an edge in $G_g$ that joins $T_u$ and $T_v$. This is again a recursive procedure that can be done in linear sequential time and in parallel in logarithmic time with a linear workload. The maximum weight max(g) of any component $H_g$ can trivially be determined in O(n) time and in O(log n) time with O(n log n) processors on an EREW-PRAM.

Determining the Axes. Let $G_g$ be the graph that is derived immediately from g. With u and v as end vertices of g, the axis of $G_g$ is a path from u to v in $G_g$ with a maximum axis weight that is identical with the axis weight of g, and the axis of $H_g$ (the terminal graph that is derived from g) is a path from u to v in $H_g$ that is a refinement of the axis of $G_g$ and that has as maximum weight the axis weight of g. We determine for each $G_g$ a minimum spanning tree $T_g$ as follows.
If an edge $g'$ of $G_g$ is introverted then we put $g'$ into $T_g$. If g is introverted then we are done. Otherwise we add an edge of minimum weight joining $T_u$ and $T_v$ to $T_g$. The axis of g is the unique path of $T_g$ from u to v. The axis of $H_g$ splits $H_g$ in any embedding into two parts, say a left side and a right side of $H_g$. These parts are called wings of $H_g$. Note that a wing with maximum weight k has to be turned into the direction of a face of axis weight at least k.

Maximum Weights of Wings.

Nonterminal Edges Replaced by 3-Connected Graphs. The outer cycle of $G_g$ splits into two paths, say $p_1$ and $p_2$, from u to v. An edge of $p_i$ is put into the wing $W_i^g$ if it does not belong to the axis of g. If an edge $g'$ of $p_i$ belongs to the axis of g then it is put into $W_i^g$ if the inner incident face is of an axis weight that is smaller than the maximum weight of $g'$. The wing weight $w_i^g$ of $W_i^g$ is the maximum of the maximum weights of the edges in $W_i^g$.

Nonterminal Edges Replaced by Parallel Edges. For each $g'$, let $I_{g'}$ be the interval $[aw(g'), max(g'))$. We determine a spanning forest $F_g$ of the intersection graph of the $I_{g'}$, where $g'$ is derived from g. We group the $g'$ appearing in a tree $T'$ into two sets $S_1^{T'}$ and $S_2^{T'}$, such that $g'$ and $g''$ belong to different sets if the corresponding intervals intersect. Let $S_1^{T'}$ be the set with the greater maximum value. $S_1^g$ and $S_2^g$ are the unions of the $S_1^{T'}$ and $S_2^{T'}$ respectively. If $S_2^g$ is not empty then $w_1^g$ and $w_2^g$ are the maximum maximum weights of $S_1^g$ and $S_2^g$. If $S_2^g$ is empty then $w_2^g$ is the smaller wing weight of the edge $g'$ of smallest axis weight. It is quite easily checked that in case there is a clustered planar embedding, one has to order the edges in $S_1^g$ and $S_2^g$ in increasing order with respect to the axis weight in contrary directions to the incident faces of g, i.e. the outer edges of $S_1^g$ and $S_2^g$ are those with maximum axis weight and maximum weight.

Nonterminal Edges Replaced by Sequential Edges. Suppose g is replaced by the sequence $g_1, \ldots, g_k$. The wings of smaller wing weights are directed to one face and the wings of larger wing weights are directed to the other face. Therefore we get as the smaller wing weight of g the maximum over all smaller wing weights of $g_1, \ldots, g_k$ and as larger wing weight the maximum over all larger wing weights of $g_1, \ldots, g_k$. The sequential time bound to get the wing weights is linear. In parallel we use tree contraction, and we get a logarithmic time bound and a linear workload [1].

Construction of the Planar Embedding. We first construct embeddings of the graphs $G_g$. If $G_g$ is 3-connected or a chain then we are done. If $G_g$ consists of parallel edges, the edges have to be ordered as required in the previous step to determine the wing weights of g.
This is done in linear time, or in logarithmic time with a linear processor number. Next we have to find out the relative orientation of a child component $G_{g'}$ of $G_g$. If $g'$ is an inner edge of $G_g$ and the incident faces of $g'$ in $G_g$ are $f_1$ and $f_2$ then we have to swap $G_{g'}$ if the wing with the greater wing weight is directed to the face of smaller axis weight. We give $g'$ the value −1 in that case, otherwise we give $g'$ the value 1. If $g'$ is an outer edge and g is not derived to a sequence of edges then $G_{g'}$ has to be swapped if the incident face f in $G_g$ has smaller axis weight than the larger wing weight of $g'$ and the wing with larger wing weight is directed into this face, or if f has an axis weight that is at least the larger wing weight of $g'$ and the wing with larger weight is directed to the outer face of $G_g$. If g is derived into a sequence of edges then the wings of larger weight are directed into the same direction. The value of $g'$ is −1 if swapping has to be done. Otherwise it gets the value 1. This procedure can be done in linear time sequentially and in logarithmic time with a linear workload in parallel. Whether we have to swap depends on whether an odd or an even number of swaps is necessary. We only have to multiply, for each g, the values of the ancestors of g including g to check whether we have to swap g, i.e. to reverse the combinatorial embedding of $G_g$ or not. This can be done by parallel tree contraction in logarithmic time with a linear workload in parallel and in linear time sequentially. We get combinatorial embeddings of all $G_g$ that we can paste to a combinatorial embedding of the whole graph G. This can be done in linear time sequentially and in logarithmic time with a linear workload.

Checking Clusteredness of the Final Embedding. We try to compute a maximum spanning tree of the dual graph as follows. If a face f has weight p and there is an incident edge with weight p leading to a face $f'$ of larger weight then we select such an $f'$ as parent of f. Then, for each weight p, we compute a spanning forest $F_p$ of the graph consisting of the faces of weight p and the edges of weight p. We select for each tree of $F_p$ a root where the parent has been defined and update the parent function. If the parent is defined for all faces but one, the resulting tree certifies that the embedding is clustered, i.e. for each q the faces of weight at least q and the edges of weight at least q form a connected subgraph. Note that the edge from f to the parent of f is of the weight of f, and the parent of f has a weight that is not smaller than the weight of f. This procedure can be done in linear time and in logarithmic time with a linear processor number, because we can determine connected components in the same bounds (in parallel see [17]).

5.2 Extension of the Algorithm to Connected Graphs in General
We build up the following tree T that consists of the set C of 2-connected components and the set A of articulation vertices of G. An articulation vertex a is joined by an edge with a component $c \in C$ if $a \in c$. We root T at a component $c_r$ that contains an edge of largest weight. We give the articulation vertices weights. The weight of a, say w(a), is the maximum weight of an edge appearing in a descendent component of a. We construct clustered embeddings such that each articulation vertex appears in a face of weight at least w(a). If a is the parent of c then a appears in a face of c of maximum weight, and we can always take such a face as the outer face. We can paste the clustered embeddings of the components to a clustered embedding by putting the components at a into the face of the parent of a of maximum weight that contains a. To get a clustered embedding of G, we have to determine for each 2-connected component c a clustered embedding such that each articulation vertex a in c appears in a face of weight at least w(a). We only have to change the maximum weight function and the wing weights of nonterminal edges. If a nonterminal edge g has end vertices u and v then max(g) is the maximum weight of an edge in $H_g$ and of an articulation vertex $\ne u, v$ in $H_g$. The wing weights are defined analogously.
Theorem 4. Checking whether a clustering of a connected graph with connected clusters has a clustered embedding can be done in linear time sequentially and in logarithmic time with a linear processor number on a CRCW-PRAM.

Proof. We can always get a planar embedding in these bounds [15,16]. We also get the derivation tree $T_D$ in these bounds [13]. We can transform any planar embedding into a clustered embedding as described above in these bounds.
6 Conclusions
It might also be of interest to get a nice planar clustered embedding, e.g. by using lmc-orderings [10], [2] or the planar embedding algorithm of Tutte [18]. The latter has been done by Feng [5]. But it remains open to do this in linear time.
References
1. K. Abrahamson, N. Dadoun, D. Kirkpatrick, T. Przytycka, A Simple Parallel Tree Contraction Algorithm, Journal of Algorithms 10 (1988), pp. 287-302.
2. T. Biedl, G. Kant, A better heuristic for Orthogonal Graph Drawing, ESA 94, LNCS 855, pp. 24-35.
3. K. Booth, G. Lueker, Testing for the Consecutive Ones Property, Interval Graphs, and Graph Planarity Using PQ-Tree Algorithms, Journal of Computer and Systems Sciences 13 (1976), pp. 335-379.
4. R. Cole, Parallel Merge Sort, 27. IEEE-FOCS (1986), pp. 511-516.
5. P. Eades, Q.W. Feng, X. Lin, Straight Line Drawing Algorithms for Hierarchical Graphs and Clustered Graphs, Graph Drawing, GD'96, LNCS 1190, pp. 113-128.
6. Q.W. Feng, R. Cohen, P. Eades, Planarity for Clustered Graphs, ESA'95, LNCS 979, pp. 213-226.
7. A. Gibbons, W. Rytter, Efficient Parallel Algorithms, Cambridge University Press, Cambridge, 1989.
8. D. Harel, On Visual Formalisms, Communications of the ACM 21 (1988), pp. 549-568.
9. T. Kameda, Visualizing Abstract Objects and Relations, World Scientific Series in Computer Science, 1989.
10. G. Kant, Drawing Planar Graphs using the lmc-Ordering, 33rd FOCS (1991), pp. 793-801.
11. T. Lengauer, Combinatorial Algorithms for Integrated Circuit Layout, Applicable Theory in Computer Science, Teubner/Wiley, Stuttgart/New York, 1990.
12. T. Lengauer, Hierarchical Planarity Testing Algorithm, Journal of the ACM 36 (1989), pp. 474-509.
13. Y. Maon, B. Schieber, U. Vishkin, Parallel Ear Decomposition Search (EDS) and st-Numberings in Graphs, Theoretical Computer Science 47 (1986), pp. 277-296.
14. R. Möhring, Algorithmic Aspects of the Substitution Decomposition in Optimization over Relations, Set Systems and Boolean Functions, Ann. Oper. Res., 4 (1985), pp. 195-225.
15. T. Nishizeki, N. Chiba, Planar Graphs: Theory and Algorithms, Annals of Discrete Mathematics 32, North Holland, 1988.
16. V. Ramachandran, J. Reif, Planarity Testing in Parallel, Journal of Computer and Systems Sciences 49 (1994), pp. 517-561.
17. Y. Shiloach, U. Vishkin, An O(log n) Parallel Connectivity Algorithm, Journal of Algorithms 3 (1982), pp. 57-67.
18. W.T. Tutte, How to Draw a Graph, Proceedings London Mathematical Society 3, pp. 743-768.
19. C. Williams, J. Rasure, C. Hansen, The State of the Art of Visual Languages for Visualization, Visualization 92 (1992), pp. 202-209.
A New Characterization for Parity Graphs and a Coloring Problem with Costs
Klaus Jansen
Max-Planck Institut für Informatik, Im Stadtwald, 66 123 Saarbrücken, Germany, jansen@mpi-sb.mpg.de
Abstract. In this paper, we give a characterization for parity graphs. A graph is a parity graph, if and only if for every pair of vertices all minimal chains joining them have the same parity. We prove that $G$ is a parity graph, if and only if the Cartesian product $G \times K_2$ is a perfect graph. Furthermore, as a consequence we get a result for the polyhedron corresponding to an integer linear program formulation of a coloring problem with costs. For the case that the costs $k_{v,3} = k_{v,c}$ for each color $c \ge 3$ and vertex $v \in V$, we show that the polyhedron contains only integral 0/1 extrema if and only if the graph $G$ is a parity graph.
1 Introduction
A graph is a parity graph, if and only if for every pair of vertices all minimal chains joining them have the same parity. Parity graphs are perfect [15] and are a subclass of the Meyniel graphs [11]. In a parity graph, each odd cycle of length at least five has two crossing chords. The class of parity graphs includes bipartite graphs and cographs. A polynomial algorithm for the recognition of parity graphs (based on three operations) is given in [3]. Furthermore, the problems maximum independent set, maximum clique, minimum coloring and minimum partition into cliques can be solved in polynomial time for these graphs (see also [3]). Recently, Cicerone and Di Stefano have given improved algorithms for the recognition, maximum weighted independent set and clique problem [5].

In this paper, we prove that $G$ is a parity graph, if and only if the Cartesian product $G \times K_2$ is a perfect graph. A partial characterization of the Cartesian product $G \times K_2$ has been given already by Ravindra and Parthasarathy [13]. Independently, the characterization of parity graphs was found also by de Werra and Hertz [6]. We give a direct proof that $G \times K_2$ is a perfect graph for each parity graph $G$. Our proof contains also an interesting algorithm to color each induced subgraph of $G \times K_2$ using the combinatorial structure of the parity graphs. An indirect shorter proof (without giving an algorithm) was found by Reed [14].

Furthermore, we study an integer program formulation of a special case of the general optimum cost chromatic partition (GOCCP) problem. The GOCCP problem can be described as follows: An instance is given by an undirected graph $G = (V, E)$ with $n$ vertices and by an $(n \times m)$ cost matrix $(k_{v,c})$ with unrelated costs $k_{v,c}$ to execute job $v$ on machine $c$. The problem is to find a partition of the graph $G$ into independent sets $U_1, \dots, U_m$ such that $\sum_{c=1}^{m} \sum_{v \in U_c} k_{v,c}$ is minimum. A subproblem with only machine dependent costs $k_c = k_{v,c}$ for each $v \in V$ and $m = n$, called the OCCP problem, has been studied in [16,10,9]. In this paper, we consider the restricted case with $m = n$ and costs $k_{v,c} = k_{v,3}$ for $c \ge 3$ and $v \in V$. We prove that the polyhedron corresponding to the ILP contains only integral 0/1 extrema if and only if the graph $G$ in the instance is a parity graph.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 249-260, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Perfect Matrices
In this section, we describe an integer linear program formulation of the restricted coloring problem with costs. Let $I$ be an instance of the GOCCP problem containing a graph $G = (V, E)$ with $n$ vertices and an $(n \times m)$ cost matrix $(k_{v,c})$ with $m = n$ and $k_{v,c} = k_{v,3}$ for $c \ge 3$ and $v \in V$. The objective function and the constraints of the problem can be described as follows:

$$\min \sum_{v \in V} [k_{v,1} x_{v,1} + k_{v,2} x_{v,2} + k_{v,3} (1 - x_{v,1} - x_{v,2})] \tag{1}$$
$$x_{v,1} + x_{v,2} \le 1 \quad \text{for each } v \in V \tag{2}$$
$$\sum_{v \in C} x_{v,c} \le 1 \quad \text{for each clique } C \text{ in } G,\ 1 \le c \le 2 \tag{3}$$
$$x_{v,c} \in \{0, 1\} \quad \text{for each } v \in V,\ 1 \le c \le 2$$
This coloring problem amounts to the decision problem which vertices of the graph receive colors 1 and 2. Once this is settled, the other vertices may receive colors $3, 4, \dots, n$. For each vertex $v \in V$ and color $i \in \{1, 2\}$, the variable $x_{v,i}$ is equal to 1, if vertex $v$ receives color $i$. The total coloring costs are minimized by the objective function (1). The constraints (2) specify that each vertex $v$ receives at most one of the colors 1, 2, and (3) guarantee that vertices that are connected by an edge are colored differently. Notice that the objective function differs only by the constant $\sum_{v \in V} k_{v,3}$ from, and is therefore equivalent to, the linear function

$$\min \sum_{v \in V} [(k_{v,1} - k_{v,3}) x_{v,1} + (k_{v,2} - k_{v,3}) x_{v,2}]$$
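For a very small instance, the integer program (1)-(3) can be checked by brute force. The following is only an illustrative sketch, not the method of the paper: it enumerates, for every vertex, whether it receives color 1, color 2 or neither, rejects assignments in which two adjacent vertices share color 1 or color 2 (for 0/1 vectors this edge condition is the same as the clique condition (3)), and minimizes objective (1). The function name `solve_restricted` and the example costs are assumptions made for the illustration.

```python
from itertools import product

def solve_restricted(vertices, edges, cost):
    """Brute-force the 0/1 program (1)-(3); cost[v] = (k_v1, k_v2, k_v3)."""
    best_value, best_choice = None, None
    # choice[v] in {0, 1, 2}: 0 means x_v1 = x_v2 = 0 (a color >= 3 is used later),
    # so constraint (2) holds by construction.
    for choice in product((0, 1, 2), repeat=len(vertices)):
        assign = dict(zip(vertices, choice))
        # Edge form of (3): adjacent vertices may not share color 1 or color 2.
        if any(assign[u] != 0 and assign[u] == assign[w] for u, w in edges):
            continue
        # Objective (1): pay k_v1, k_v2 or k_v3 depending on the choice for v.
        value = sum(cost[v][assign[v] - 1] if assign[v] else cost[v][2]
                    for v in vertices)
        if best_value is None or value < best_value:
            best_value, best_choice = value, assign
    return best_value, best_choice

# Hypothetical instance: a path a - b - c with illustrative costs.
vertices = ["a", "b", "c"]
edges = [("a", "b"), ("b", "c")]
cost = {"a": (1, 5, 4), "b": (1, 2, 6), "c": (1, 5, 4)}
value, choice = solve_restricted(vertices, edges, cost)
```

The enumeration takes $3^n$ steps, so it is only useful for checking tiny examples against a linear programming solution.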
The coefficient matrix (a zero-one matrix) corresponding to the restrictions (2)-(3) is called $M$. A zero-one matrix $M$ is called perfect if the polyhedron $P(M) = \{x \mid Mx \le 1,\ x \ge 0\}$ has only integral extreme points. It follows that the GOCCP problem can be solved by applying a linear program algorithm, if the matrix $M$ is perfect. The goal of this paper is a characterization of the graphs such that the polyhedron $P(M)$ contains only integral extrema.

The Cartesian product $G_1 \times G_2 = (V_1 \times V_2, E)$ of two graphs $G_1 = (V_1, E_1)$ and $G_2 = (V_2, E_2)$ is defined by the edge set

$$E = \{ \{(u_1, u_2), (v_1, v_2)\} \mid [u_1 = v_1 \wedge \{u_2, v_2\} \in E_2] \vee [u_2 = v_2 \wedge \{u_1, v_1\} \in E_1] \}$$
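The edge set above translates directly into code. The sketch below is illustrative (the function name `cartesian_product` and the edge representation as frozensets are assumptions); it builds $C_5 \times K_2$, which consists of two copies of the 5-cycle joined by a perfect matching.

```python
def cartesian_product(v1, e1, v2, e2):
    """Return (vertices, edges) of G1 x G2; edges are frozensets of two vertices."""
    e1 = {frozenset(e) for e in e1}
    e2 = {frozenset(e) for e in e2}
    vertices = [(a, b) for a in v1 for b in v2]
    edges = set()
    for (a1, a2) in vertices:
        for (b1, b2) in vertices:
            # Same first coordinate and an edge of G2 in the second coordinate ...
            same_first = a1 == b1 and frozenset((a2, b2)) in e2
            # ... or same second coordinate and an edge of G1 in the first.
            same_second = a2 == b2 and frozenset((a1, b1)) in e1
            if same_first or same_second:
                edges.add(frozenset(((a1, a2), (b1, b2))))
    return vertices, edges

# Example: the 5-cycle times K2.
c5_vertices = [0, 1, 2, 3, 4]
c5_edges = [(i, (i + 1) % 5) for i in range(5)]
vertices, edges = cartesian_product(c5_vertices, c5_edges, [1, 2], [(1, 2)])
```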
A graph $G = (V, E)$ is perfect, if and only if for each subset $V' \subseteq V$ the chromatic number $\chi(G[V'])$ of the subgraph $G[V']$ induced by $V'$ is equal to the cardinality $\omega(G[V'])$ of a maximum clique in $G[V']$. It is easy to see that the constraints (2), (3) for $G$ defining the matrix $M$ can be seen as the clique inequalities for $G \times K_2$. The constraints (2) are for the cliques of size 2 consisting of pairs of corresponding vertices $(v, 1), (v, 2)$ of the two copies of $G$. Furthermore, (3) correspond to the cliques in each of the copies of $G$. Thus, the polyhedron $P(M)$ has only integral extreme points (or equivalently $M$ is perfect), if and only if $G \times K_2$ is perfect (Chvatal [4]). Therefore, the goal is to find a characterization of graphs $G$ such that $G \times K_2$ is perfect. We note that $G \times K_2$ is a bipartite graph, if $G$ is bipartite.

The solution $x$ found by the linear program may not correspond directly to a solution of the coloring problem. If $k_{v,3} \ll \min(k_{v,1}, k_{v,2})$ for all $v \in V$, then the best solution of the linear program is $x_{v,1} = x_{v,2} = 0$ for all $v \in V$. This implies that only machines $3, \dots, n$ should be used in the best solution. If $\chi(G) \le n - 2$, then the solution found by the linear program represents also a solution of the coloring problem. If $\chi(G) > n - 2$, then we have to color at least one vertex or two vertices with the colors 1, 2. If $\chi(G) = n$ then $G$ is a complete graph with $n$ vertices, and the optimum solution of the coloring problem can be computed using a minimum weighted matching in a bipartite graph. If $\chi(G) = n - 1$ then the complement of $G$ does not contain two non-incident edges or a triangle. Using this assertion (and weighted matchings), an optimum solution can be found in polynomial time for $\chi(G) > n - 2$.
3 Parity Graphs
Let $\Gamma(x)$ be the set of vertices adjacent to $x$ (the neighbours of $x$). Two vertices $x$ and $y$ are called true twins, if $x$ and $y$ are adjacent and have the same neighbours (that means $\Gamma(x) \cup \{x\} = \Gamma(y) \cup \{y\}$). Two vertices $x$ and $y$ that are not adjacent but have the same neighbours are called false twins. We define the extension of a graph $G$ by a bipartite graph $B = (X_1 \cup X_2, A)$ as the operation that generates a new graph by identification of a subset $X$ of vertices of $X_1$ with a set of false twins of $G$ (possibly with $|X| = 1$).

Theorem 1. Every connected parity graph $G = (V, E)$ is obtained from a single vertex by the following operations [3]:
($\varphi_1$) creation of a false twin,
($\varphi_2$) creation of a true twin,
($\varphi_3$) extension by a bipartite graph.

Parity graphs are also described by a list of forbidden induced subgraphs [3]. The list consists of the odd cycles $C_{2k+1}$ ($k \ge 2$), the odd cycles having a short chord $C^*_{2k+1}$ (called generalized house) and the cycle $C_5$ with two non-crossing chords $C_5^{**}$ (called gem). A metric characterization of parity graphs is given in [2] and parallel algorithms for recognition of parity graphs in [1,12].
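The definition of a parity graph can be tested directly on small graphs, reading "minimal chain" as a chordless (induced) path. The sketch below is a brute-force check under that reading, not the polynomial recognition algorithm of [3]; the function name `is_parity_graph` and the adjacency-dictionary representation are assumptions.

```python
def is_parity_graph(adj):
    """adj: dict vertex -> set of neighbours. Exponential time, small graphs only."""
    def path_parities(path, target, seen):
        # Extend a chordless path and record the parity of the length of
        # every chordless path from path[0] to target.
        last = path[-1]
        if last == target:
            seen.add((len(path) - 1) % 2)
            return
        for nxt in adj[last]:
            # nxt must be new and adjacent to no earlier path vertex except
            # `last`, otherwise the extended path would have a chord.
            if nxt in path or any(nxt in adj[p] for p in path[:-1]):
                continue
            path_parities(path + [nxt], target, seen)

    vertices = list(adj)
    for i, s in enumerate(vertices):
        for t in vertices[i + 1:]:
            seen = set()
            path_parities([s], t, seen)
            if len(seen) > 1:
                return False
    return True

# The 4-cycle is bipartite and hence a parity graph; the 5-cycle and the gem
# are forbidden induced subgraphs.
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
c5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
gem = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {1, 2, 3, 4}}
results = [is_parity_graph(g) for g in (c4, c5, gem)]
```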
4 Main Theorem
In this section, we prove the following result:

Theorem 2. The following two statements are equivalent:
(1) $G \times K_2$ is a perfect graph.
(2) $G$ is a parity graph.

Using this characterization and the theorem of Chvatal [4] (see also the section about perfect matrices), we obtain directly:

Corollary 1. Let $I$ be an instance of the GOCCP problem containing a graph $G = (V, E)$ with $n$ vertices and coloring costs $k_{v,c}$ such that $k_{v,c} = k_{v,3}$ for $c \ge 3$ and $v \in V$. Then, we have the following equivalence: The polyhedron $P(M)$ corresponding to the instance $I$ contains only integral extrema if, and only if, $G$ is a parity graph.

The same result holds also for the OCCP problem with a sequence of coloring costs $k_1 < k_2 < k_3 = \dots = k_n$ (in this case we have only a simpler objective function). Furthermore, we note that the number of cliques is exponential in the number of vertices even in a cograph. Since $G \times K_2$ is a perfect graph, the strong optimization problem to find a vector that minimizes a linear function on $P(M)$ is solvable in polynomial time [8]. This implies the following result:

Corollary 2. The GOCCP problem restricted to parity graphs and coloring costs $k_{v,c}$ such that $k_{v,c} = k_{v,3}$ for $c \ge 3$ and $v \in V$ can be solved in polynomial time.

4.1 First Direction

In this subsection, we prove the first part of the main theorem:

Theorem 3. If $G$ is not a parity graph, then the Cartesian product $G \times K_2$ is not perfect.
Proof. To prove this, we have to consider the forbidden subgraphs. If $G$ is not a parity graph, then $G$ must contain one of the following induced subgraphs:
- an odd cycle $C_{2k+1}$ with $k \ge 2$,
- a generalized house $C^*_{2k+1}$ with $k \ge 2$,
- a gem $C_5^{**}$.

In all three cases, we obtain an odd cycle in the Cartesian product $G \times K_2$ and, therefore, we get a non-perfect induced subgraph in $G \times K_2$. This shows that $G \times K_2$ is not perfect if $G$ is not a parity graph.
Case 1: If we have an odd cycle in $G$, then we have already a non-perfect induced subgraph in $G$ and, therefore, also in $G \times K_2$.
Case 2: A generalized house $C^*_{2k+1}$ generates an odd cycle $C_{2k+3}$ in $G \times K_2$; see also Fig. 1.
Case 3: A gem $C_5^{**}$ generates an odd cycle $C_7$ in $G \times K_2$; see also Fig. 2.
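Case 3 can be verified mechanically. The sketch below assumes a labelling of the gem in which 1-2-3-4 is a path and vertex 5 is adjacent to all four path vertices; this labelling, and the helper names `product_with_k2` and `is_induced_cycle`, are assumptions chosen for illustration. It checks that seven vertices of $G \times K_2$ induce a chordless cycle $C_7$.

```python
def product_with_k2(adj):
    """Adjacency of G x K2; vertices are pairs (v, i) with i in {1, 2}."""
    prod = {}
    for v, nbrs in adj.items():
        for i in (1, 2):
            # Neighbours inside copy i, plus the same vertex in the other copy.
            prod[(v, i)] = {(w, i) for w in nbrs} | {(v, 3 - i)}
    return prod

def is_induced_cycle(adj, cycle):
    """True if consecutive vertices of `cycle` are adjacent and there are no chords."""
    members = set(cycle)
    n = len(cycle)
    for idx, v in enumerate(cycle):
        expected = {cycle[idx - 1], cycle[(idx + 1) % n]}
        # Inside the vertex set of the cycle, v may only see its two cycle neighbours.
        if adj[v] & members != expected:
            return False
    return True

# Gem: the cycle 1-2-3-4-5 with the two non-crossing chords 5-2 and 5-3.
gem = {1: {2, 5}, 2: {1, 3, 5}, 3: {2, 4, 5}, 4: {3, 5}, 5: {1, 2, 3, 4}}
prod = product_with_k2(gem)
c7 = [(1, 1), (2, 1), (3, 1), (4, 1), (4, 2), (5, 2), (1, 2)]
found = is_induced_cycle(prod, c7)
```

The cycle uses the path 1-2-3-4 in the first copy and the path 4-5-1 in the second copy, joined by the two matching edges at vertices 4 and 1.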
Fig. 1. A house $C^*_{2k+1}$ generates an odd cycle $C_{2k+3}$ in $G \times K_2$

Fig. 2. A gem $C_5^{**}$ generates a $C_7$ in $G \times K_2$

4.2 Second Direction
In this subsection, we prove the second part of the main theorem. An example for the algorithm to compute the colorings is given in the next section.

Theorem 4. If $G$ is a parity graph, then the Cartesian product $G \times K_2$ is perfect.

Proof. Let $H = (V_H, E_H)$ be an induced subgraph of $G \times K_2$ with maximum clique size $\omega(H) = k$. Furthermore, let $V_i$ be the set of vertices $\{v \in V \mid (v, i) \in V_H\}$ in the $i$-th part of $G \times K_2$, for $1 \le i \le 2$. If $k = 1$ then $H$ contains only isolated vertices and can be colored with one color. Therefore, we may assume that $k > 1$. Moreover, we may assume that $G$ is a connected parity graph; otherwise we compute a coloring for each corresponding component of $G \times K_2$.

In the following, we construct a $k$-coloring for the induced subgraph $H$. Simultaneously, we compute two $k$-colorings $f_i$ for the induced subgraphs $G[V_i]$ ($1 \le i \le 2$) such that $f_1(v) \ne f_2(v)$ for each vertex $v \in V_1 \cap V_2$. This gives a $k$-coloring for the graph $H$ and proves the theorem. Clearly, for $k > 2$, a maximum clique $C$ lies either in $G[V_1]$ or in $G[V_2]$.

We compute a $k$-coloring using weight functions $c^{(1)}, c^{(2)} : V \to \mathbb{N}_0$. At the beginning, we define

$$c^{(i)}(v) = \begin{cases} 1 & \text{if } v \in V_i \\ 0 & \text{otherwise.} \end{cases}$$

Then, the weight of a maximum weighted clique in $G$ with weights $c^{(i)}(v)$ is equal to the maximum clique size $\omega(G[V_i])$, for $1 \le i \le 2$.

By the reverse operations $\varphi_1^{-1}$, $\varphi_2^{-1}$ (and also $\varphi_3^{-1}$), we can transform the parity graph into a smaller parity graph. Using these reverse operations, we modify the weights of the vertices. In general, a weight $c^{(i)}(\alpha)$ stores the size of a maximum clique for a graph corresponding to the vertex $\alpha$.

Operation $\varphi_1^{-1}$: False twins $a$ and $b$ in $G$. In this case, we transform $G$ into a graph $G^*$ (see also Fig. 3) in which $a$ and $b$ are replaced by a single vertex $\alpha$ with weights $c^{(i)}(\alpha) = \max(c^{(i)}(a), c^{(i)}(b))$.
Fig. 3. The transformation for false twins $a$ and $b$
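Operation $\varphi_2^{-1}$: True twins $a$ and $b$ in $G$. In this case, we transform $G$ into a graph $G^*$ (see also Fig. 4) in which $a$ and $b$ are replaced by a single vertex $\alpha$ with weights $c^{(i)}(\alpha) = c^{(i)}(a) + c^{(i)}(b)$.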
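The two weight updates for twins can be sketched as follows. This is an illustration only: the dictionary representation `weights[v] = (c1(v), c2(v))`, the function names, and the name `"ab"` for the merged vertex are assumptions.

```python
def merge_false_twins(weights, a, b, merged):
    """Reverse of phi_1: the merged vertex gets max(c(a), c(b)) in each copy."""
    new = {v: w for v, w in weights.items() if v not in (a, b)}
    new[merged] = tuple(max(x, y) for x, y in zip(weights[a], weights[b]))
    return new

def merge_true_twins(weights, a, b, merged):
    """Reverse of phi_2: the merged vertex gets c(a) + c(b) in each copy."""
    new = {v: w for v, w in weights.items() if v not in (a, b)}
    new[merged] = tuple(x + y for x, y in zip(weights[a], weights[b]))
    return new

# weights[v] = (c1(v), c2(v)); initially 1 if v lies in V_i and 0 otherwise.
weights = {"a": (1, 1), "b": (1, 0), "u": (0, 1)}
after_true = merge_true_twins(weights, "a", "b", "ab")
after_false = merge_false_twins(weights, "a", "b", "ab")
```

True twins always lie in a common clique, so their weights add up; false twins never do, so only the larger weight is kept.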
Fig. 4. The transformation for true twins $a$ and $b$

Operation $\varphi_3^{-1}$: Extension by a bipartite graph $B = (X_1 \cup X_2, A)$ (see Fig. 5). In this case, we remove the bipartite graph $B$ and get a vertex $\alpha$ combining the false twins $a$ and $b$ with weights $c^{(i)}(\alpha) = \max[c^{(i)}(a), c^{(i)}(b)]$. Notice that we store in vertex $\alpha$ only the weight of the vertices $a$ and $b$ (the number of colors for $a$ and $b$). The cardinality of the maximum clique in $G$ (or the minimum number of colors to color $G$) is given by the maximum of (1) the weight of a maximum weighted clique in $B$ and (2) the weight of a maximum weighted clique in $G^*$. In general, we replace the set $X \subseteq X_1$ of false twins by a vertex $\alpha$ with weights $c^{(i)}(\alpha) = \max_{x \in X} c^{(i)}(x)$.

Recursively, we compute for parity graphs $G' = (V', E')$ with weights $c^{(i)}(v)$ for $v \in V'$ colorings $g_i : V' \to 2^{\{1, \dots, k\}}$ such that the following invariants are satisfied:

$$\text{the cardinalities } |g_i(v)| = c^{(i)}(v), \tag{4}$$
$$\text{if } c^{(1)}(v) = c^{(2)}(v) = 1 \text{ then } g_1(v) \ne g_2(v), \tag{5}$$
$$\text{if } c^{(1)}(v) = c^{(2)}(v) = k - 1 \text{ then } g_1(v) \ne g_2(v). \tag{6}$$
This means that we compute for each vertex $\alpha$ a color set $g_i(\alpha)$ with cardinality equal to the weight $c^{(i)}(\alpha)$. Since the weight is equal to the size of a maximum clique corresponding to $\alpha$, this color set stores a coloring of the graph corresponding to $\alpha$. For the original graph $G = (V, E)$ the weights $c^{(i)}(v)$ are zero or one. Using (5) we obtain color sets $g_i(v)$ with cardinalities one for the vertices $v$ in $G[V_i]$ such that $g_1(v) \ne g_2(v)$ for $v \in V_1 \cap V_2$. The invariant (6) is used to split color sets for operation $\varphi_3$ (extension by a bipartite graph).
Fig. 5. The transformation for an extension by a bipartite graph $B$.

Suppose that vertex $\alpha$ has the weight $c^{(i)}(\alpha) = k - 1$ in both graphs $G[V_i]$, $1 \le i \le 2$. Furthermore, suppose that the color sets are equal: $g_1(\alpha) = g_2(\alpha)$. If $X = \{a\}$ and vertex $a$ is adjacent to a vertex $x \in X_2$ (in the bipartite graph $B$) with weights $c^{(i)}(x) = 1$, then we must color $x$ with the same color in both graphs. To avoid such a situation, we use (6).

In what follows, we compute recursively colorings $g_i$ such that the invariants are satisfied. It must be noted that the sets $g_i(v)$ for the vertices $v$ in $G[V_i]$ form a feasible coloring. We start with a single vertex graph and use color sets $g_i(v)$ of size equal to the weights $c^{(i)}(v)$. For true twins $a$, $b$, we distribute the color sets $g_i(\alpha)$ to both twins. Since we do not use one color for both $a$ and $b$, this generates a feasible coloring. For operation $\varphi_3$, we extend the colorings of $G^*$ to the vertices in $B$ such that adjacent vertices $x$, $y$ in $B$ get color sets with $g_i(x) \cap g_i(y) = \emptyset$.

For a single vertex graph, we can find colorings $g_i$ such that the invariants are satisfied for each $k \ge 2$. We assume now that we have such colorings $g_i$ for the parity graph $G^*$, and apply one of the operations $\varphi_1$, $\varphi_2$ or $\varphi_3$ to get $G$.

Operation $\varphi_1$. False twins $a$ and $b$. This is a special case of $\varphi_3$ with $X_2 = \emptyset$ and $X_1 = X = \{a, b\}$.
Operation $\varphi_2$. True twins $a$ and $b$. In this case, we have to split the color sets $g_1(\alpha)$ and $g_2(\alpha)$ such that the invariants are satisfied for the vertices $a$ and $b$.

Case 1: We have to use two color sets of size $k - 1$ at vertex $a$ or $b$. We may assume that $c^{(1)}(a) = c^{(2)}(a) = k - 1$.

Case 1.1: $c^{(1)}(b) = c^{(2)}(b) = 1$. In this case, $c^{(1)}(\alpha) = c^{(2)}(\alpha) = k$. We choose two different colors $\mathrm{red}, \mathrm{blue} \in \{1, \dots, k\}$ and define the color sets as follows:

$g_1(a) = \{1, \dots, k\} \setminus \{\mathrm{red}\}$, $g_1(b) = \{\mathrm{red}\}$, $g_2(a) = \{1, \dots, k\} \setminus \{\mathrm{blue}\}$, $g_2(b) = \{\mathrm{blue}\}$.

These color sets satisfy the invariants.

Case 1.2: $c^{(1)}(b) = 1$ and $c^{(2)}(b) = 0$. In this case, we have $|g_2(\alpha)| = k - 1$ and know the unique color $\mathrm{red} \in \{1, \dots, k\} \setminus g_2(\alpha)$. Using a color blue different from red, we define the following color sets:

$g_1(a) = \{1, \dots, k\} \setminus \{\mathrm{blue}\}$, $g_1(b) = \{\mathrm{blue}\}$, $g_2(a) = g_2(\alpha)$, $g_2(b) = \emptyset$.
Again, these color sets satisfy the invariants.

Case 1.3: [$c^{(1)}(b) = 0$ and $c^{(2)}(b) = 1$] or [$c^{(1)}(b) = c^{(2)}(b) = 0$]. The first case is symmetrical to Case 1.2 and in the second case, we can use the color sets $g_i(a) = g_i(\alpha)$.

Case 2: We have to use two color sets of size 1 at vertex $a$ or $b$. We may assume that $c^{(1)}(a) = c^{(2)}(a) = 1$. This case can be proved in a similar way as Case 1.

Case 3: It remains the case in which $c^{(1)}(a) \ne c^{(2)}(a)$ or $c^{(1)}(a) \notin \{1, k - 1\}$ and also $c^{(1)}(b) \ne c^{(2)}(b)$ or $c^{(1)}(b) \notin \{1, k - 1\}$. In this case, we can choose an arbitrary splitting of the color sets; e.g. take the first $c^{(i)}(a)$ colors of $g_i(\alpha)$ for $g_i(a)$ and the remaining colors for $g_i(b)$.

In all these cases we have obtained color sets for $a$ and $b$ that satisfy (4)-(6).

Operation $\varphi_3$. Extension by a bipartite graph $B = (X_1 \cup X_2, A)$ where $X \subseteq X_1$ is a set of false twins. Since the maximum clique in $G[V_i]$ is at most $k$, we can use the fact that the weights $c^{(i)}(\alpha) \le k$, $c^{(i)}(x) \le k$ for each $x \in X_1 \cup X_2$ and that the sums $c^{(i)}(x) + c^{(i)}(y) \le k$ for each edge $\{x, y\} \in A$.

Again, we consider by case analysis the color sets $g_1(\alpha)$ and $g_2(\alpha)$ and extend the coloring of $G^*$ to the bipartite graph $B$. In all cases, (using an ordered list of colors) we color first the vertices in $X_1$ as early as possible and, then, the vertices in $X_2$ as late as possible. This implies that a vertex $x \in X_1$ with $c^{(i)}(x) = 1$ gets the first color and that a vertex $x \in X_2$ with $c^{(i)}(x) = 1$ gets the last color. Each vertex $x \in X_1 \cup X_2$ gets exactly $c^{(i)}(x)$ colors. We note that we first must take the colors in $g_i(\alpha)$; otherwise we can generate a conflict in $G^*$.

Case 1: There is a color $\mathrm{red} \in g_1(\alpha) \setminus g_2(\alpha)$. For $G_1 = G[V_1]$, we start with color red and, then, take the colors in $g_1(\alpha) \setminus \{\mathrm{red}\}$ and, finally, the colors in $\{1, \dots, k\} \setminus g_1(\alpha)$. For $G_2 = G[V_2]$, we color the vertices with colors in the order $g_2(\alpha)$, $\{1, \dots, k\} \setminus [\{\mathrm{red}\} \cup g_2(\alpha)]$, $\{\mathrm{red}\}$. In this case, a vertex $x \in X_1$ with weights $c^{(1)}(x) = c^{(2)}(x) = 1$ gets the color sets $g_1(x) = \{\mathrm{red}\}$ and $g_2(x) \ne \{\mathrm{red}\}$.
Furthermore, a vertex $x \in X_2$ with $c^{(1)}(x) = c^{(2)}(x) = 1$ has the color sets $g_2(x) = \{\mathrm{red}\}$ and $g_1(x) \ne \{\mathrm{red}\}$. For a vertex $x \in X_1$ with $c^{(1)}(x) = c^{(2)}(x) = k - 1$, we have $\mathrm{red} \in g_1(x) \setminus g_2(x)$ (since red is the last color for $G[V_2]$). The opposite statement holds for a vertex $x \in X_2$ with weights $c^{(1)}(x) = c^{(2)}(x) = k - 1$. The case in which there is a color $\mathrm{red} \in g_2(\alpha) \setminus g_1(\alpha)$ is similar.

Case 2: $g_1(\alpha) = g_2(\alpha)$ and $2 \le c^{(1)}(\alpha) = c^{(2)}(\alpha) \le k - 2$ and Case 3: $g_1(\alpha) = g_2(\alpha) = \{1, \dots, k\}$ or $g_1(\alpha) = g_2(\alpha) = \emptyset$ are also similar. We note that a case with $g_1(\alpha) = g_2(\alpha)$ and $c^{(1)}(\alpha) = c^{(2)}(\alpha) \in \{1, k - 1\}$ is not possible (otherwise the invariants are not satisfied for $\alpha$). In all three cases, for each $x \in X_1 \cup X_2$ the color sets $g_i(x)$ satisfy (4)-(6).
Fig. 6. The generation of a parity graph $G = G^{(0)}$ using the operations $\varphi_1$, $\varphi_2$ and $\varphi_3$

5 Example
An example of a parity graph $G = G^{(0)}$ and its generation using the three operations $\varphi_1$, $\varphi_2$ and $\varphi_3$ is given in Fig. 6. In Table 1 we have illustrated the computation of the weights for the parity graph starting with $G^{(0)}$. We start with the original graph $G = G^{(0)}$ and weights $c(v) = 1$ for all $v \in V$. Notice that the size of a maximum clique in $G$ is 4, and this size can be found in $G^{(4)}$ as the weight of a weighted clique $C = \{2, 3\}$ in the bipartite graph.

Furthermore, in Table 2, we have computed recursively two colorings for the graphs $G_1^{(0)}$ and $G_2^{(0)}$ using the three invariants. In this computation, we use 4 colors r, b, g and s and start with the single vertex graphs $G_1^{(5)}$ and $G_2^{(5)}$. Observe that the color sets $g_1(v)$ and $g_2(v)$ of size 1 are different for each vertex $v$ in $G_1^{(i)}$ and $G_2^{(i)}$ (for $0 \le i \le 5$). At the end, we obtain two colorings $g_1$, $g_2$ for $G_1^{(0)}$ and $G_2^{(0)}$ such that $g_1(x) \ne g_2(x)$ for all $x \in V$.

Table 1. Computation of the weights

graph    1    2    3    4    5    6    7    8
G^(0)    1    1    1    1    1    1    1    1
G^(1)    1    1    1    1    1    1    1    -
G^(2)    1    1    1    1    1    1    -
G^(3)    1    2    1    1    1    -
G^(4)    1    2    2    1    -
G^(5)    1    -    -    -

Table 2. Recursive computation of colorings

graph      1    2     3     4    5    6    7    8
G_1^(5)    r
G_2^(5)    b
G_1^(4)    r    g,s   r,b   s
G_2^(4)    b    r,s   b,g   r
G_1^(3)    r    g,s   b     s    r
G_2^(3)    b    r,s   g     r    b
G_1^(2)    r    s     b     s    r    g
G_2^(2)    b    r     g     r    b    s
G_1^(1)    r    s     b     s    r    g    b
G_2^(1)    b    r     g     r    b    s    g
G_1^(0)    r    s     b     s    r    g    b    s
G_2^(0)    b    r     g     r    b    s    g    r

6 Conclusions
In this paper, we have proved that the GOCCP problem restricted to parity graphs $G = (V, E)$ can be solved in polynomial time using a linear program if the costs $k_{v,c} = k_{v,3}$ for $c \ge 3$ and $v \in V$. This result follows from the characterization that $G$ is a parity graph, if and only if the Cartesian product $G \times K_2$ is a perfect graph. Furthermore, we can show that the OCCP problem with three different cost values $k_1 = \dots = k_q < k_{q+1} = \dots = k_p < k_{p+1} = \dots = k_n$ can be solved in polynomial time for bipartite graphs. The following questions are interesting for further research:

(1) give a fast combinatorial algorithm for the GOCCP problem restricted to parity graphs and costs $k_{v,c} = k_{v,3}$ for $c \ge 3$,
(2) study the polyhedron for the GOCCP problem with $m = 3$ colors,
(3) find the complexity of the OCCP problem for parity graphs with three different cost values,
(4) study a modified integer linear program formulation for the OCCP problem with two and three different cost values.
References
1. G.S. Adhar and S. Peng, Parallel algorithms for cographs and parity graphs with applications, Journal of Algorithms 11 (1990), 252-284.
2. H.J. Bandelt and H.M. Mulder, Metric characterization of parity graphs, Discrete Mathematics 91 (1991), 221-230.
3. M. Burlet and J.P. Uhry, Parity graphs, Annals of Discrete Mathematics 21 (1984), 253-277.
4. V. Chvatal, On certain polytopes associated with graphs, Journal Combinatorial Theory B-18 (1975), 138-154.
5. S. Cicerone and G. Di Stefano, On the equivalence in complexity among basic problems on bipartite graphs and parity graphs, to appear in: International Symposium on Algorithms and Computation ISAAC 97, Singapore, LNCS (1997).
6. D. de Werra and A. Hertz, On perfectness of sums of graphs, Technical Report ORWP 87-13 and 97-06, Dept. de Mathematiques, Ecole Polytechnique Federale de Lausanne.
7. M.R. Garey and D.S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman, San Francisco, 1979.
8. M. Grötschel, L. Lovasz and A. Schrijver, Geometric Algorithms and Combinatorial Optimization, Springer, Berlin, 1988.
9. K. Jansen, The optimum cost chromatic partition problem, Algorithms and Complexity CIAC 97, Rome, LNCS 1203 (1997), 25-36.
10. L.G. Kroon, A. Sen, H. Deng and A. Roy, The optimal cost chromatic partition problem for trees and interval graphs, Graph Theoretical Concepts in Computer Science WG 96, Como, LNCS (1996).
11. H. Meyniel, The graphs whose odd cycles have at least two crossing chords, Annals of Discrete Mathematics 21 (1984), 115-119.
12. T. Przytycka and D.G. Corneil, Parallel algorithms for parity graphs, Journal of Algorithms 12 (1991), 96-109.
13. G. Ravindra and K.R. Parthasarathy, Perfect product graphs, Discrete Mathematics 20 (1977), 177-186.
14. B. Reed, private communication.
15. H. Sachs, On the Berge conjecture concerning perfect graphs, in: Combinatorial Structures and their Applications, Gordon and Breach, New York, 1970, 377-384.
16. A. Sen, H. Deng and S. Guha, On a graph partition problem with an application to VLSI layout, Information Processing Letters 43 (1992), 87-94.
17. K.J. Supowit, Finding a maximum planar subset of a set of nets in a channel, IEEE Transactions on Computer Aided Design, CAD 6, 1 (1987), 93-94.
On the Clique Operator
Marisa Gutierrez¹ and Joao Meidanis²
¹ Departamento de Matematica, Universidad Nacional de La Plata, C. C. 172, (1900) La Plata, Argentina, mar [email protected]
² Institute of Computing, University of Campinas, P. O. Box 6176, 13083-970 Campinas SP, Brazil, me dan [email protected] camp.br
Abstract. The clique operator $K$ maps a graph $G$ into its clique graph, which is the intersection graph of the (maximal) cliques of $G$. Among all the better studied graph operators, $K$ seems to be the richest one and many questions regarding it remain open. In particular, it is not known whether recognizing a clique graph is in P. In this note we describe our progress toward answering this question. We obtain a necessary condition for a graph to be in the image of $K$ in terms of the presence of certain subgraphs $A$ and $B$. We show that being a clique graph is not a property that is maintained by addition of twins. We present a result involving distances that reduces the recognition problem to graphs of diameter at most two. We also give a constructive characterization of $K^{-1}(G)$ for a fixed but generic $G$.
1 Introduction
The clique operator $K$ transforms a graph $G$ into a graph $K(G)$ having as vertices all the cliques of $G$, with two cliques being adjacent when they intersect. The graph $K(G)$ is called the clique graph of $G$. (This and other definitions can be found in Sect. 2.) In this note we will be interested in the image $K(\mathcal{G})$ of the operator $K$, where $\mathcal{G}$ is the class of all graphs. We are particularly interested in the complexity of the recognition problem for $K(\mathcal{G})$, which is still open. Roberts and Spencer [5], building upon ideas from Hamelink [3], gave a characterization of $K(\mathcal{G})$, but direct application of these results leads to an exponential time algorithm.

While trying to find alternative characterizations that could possibly shed more light on the problem, we came across an interesting question: is $K(\mathcal{G})$ the same as $K^2(\mathcal{G})$? This question turns out to be very difficult, and the present paper represents an effort toward its solution. Our contribution can be summarized as follows.

In Sect. 3 we investigate the structure of graphs in $K(\mathcal{G})$ that are not Helly graphs, finding certain subgraphs that have to be present in this situation. Because Helly graphs are all in $K^2(\mathcal{G})$, non-Helly graphs are the only candidates to separate $K^2(\mathcal{G})$ from the rest of $K(\mathcal{G})$. We show that one of these subgraphs belongs to $K^2(\mathcal{G})$. We do not know if the other one does.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 261-272, 1998. © Springer-Verlag Berlin Heidelberg 1998
In Sect. 4 we show that $H$ is a graph in $K^{-1}(G)$ if and only if it is the intersection graph of an ERS family of $G$ (please see the definition in Sect. 2). We also study several properties of both RS and ERS families of $G$.

In Sect. 5 we obtain results that show that it is enough to study the recognition problem for $K(\mathcal{G})$ in graphs with diameter at most two.

The study of $K^2(\mathcal{G})$ is further complicated by the fact that being in $K(\mathcal{G})$ is not a property inherited by reduced graphs, as we show in Sect. 6. In fact, when $H \notin K(\mathcal{G})$ it is possible to get a graph in $K(\mathcal{G})$ by adding twin vertices to $H$. Of course, this addition will not modify $K(H)$.

Finally, Sect. 7 contains our concluding remarks. Some proofs are omitted for space limitations. All proofs appear in full in the extended version of this paper.
2 Definitions
In this note all graphs are simple, i.e., without loops or multiple edges. Let $G$ be a graph. We denote by $V(G)$ and $E(G)$ the vertex set and edge set of $G$, respectively. A set $C$ of vertices of $G$ is complete when any two vertices of $C$ are adjacent. A maximal complete subset of $V(G)$ is called a clique. We denote by $C(G)$ the clique family of $G$.

Let $\mathcal{F} = (F_i)_{i \in I}$ be a finite family of finite sets. Its dual family $\mathcal{F}^*$ is the family $(F^*(x))_{x \in X}$ where $X = \bigcup_{i \in I} F_i$ and $F^*(x) = \{i \in I \mid x \in F_i\}$. We denote by $\Omega\mathcal{F}$ the intersection graph of $\mathcal{F}$, i.e., $V(\Omega\mathcal{F}) = I$ and two vertices $i$ and $j$ are adjacent if and only if $F_i \cap F_j \ne \emptyset$. We also say that $\mathcal{F}$ represents $\Omega\mathcal{F}$. The 2-section of $\mathcal{F}$, denoted by $\mathcal{F}_2$, is the graph with $V(\mathcal{F}_2) = \bigcup_{i \in I} F_i$ in which two vertices $x$ and $y$ are adjacent if and only if there exists $i \in I$ such that $x, y \in F_i$. It is easy to see that $\Omega\mathcal{F} = (\mathcal{F}^*)_2$ [1].

A family $\mathcal{F}$ of arbitrary sets satisfies the Helly property, or is Helly, when for every subfamily $\mathcal{J} \subseteq \mathcal{F}$ such that any two sets $A, B \in \mathcal{J}$ intersect, we have $\bigcap_{A \in \mathcal{J}} A \ne \emptyset$. A graph is Helly when the family of its cliques is Helly. We denote by $\mathcal{H}$ the class of Helly graphs. A family $\mathcal{F}$ is conformal when the cliques of $\mathcal{F}_2$ are all members of $\mathcal{F}$. This amounts to saying that its dual family $\mathcal{F}^*$ is Helly [1]. A family $\mathcal{F}$ is reduced when none of its members is contained in another one.

As we said earlier, the clique operator $K$ transforms a graph $G$ into a graph $K(G)$ having as vertices all the cliques of $G$, with two cliques being adjacent when they intersect. Thus, $K(G)$ is nothing else than the intersection graph of the family of all cliques of $G$. The graph $K(G)$ is called the clique graph of $G$. In this note we will be interested in the image $K(\mathcal{G})$ of the operator $K$, where $\mathcal{G}$ is the class of all graphs. In particular, we would like to determine the complexity of recognizing whether a graph is a clique graph, that is, is in $K(\mathcal{G})$.

There are only two general results about $K(\mathcal{G})$ in the literature. The first result, due to Hamelink [3], says that $\mathcal{H}$ is properly contained in $K(\mathcal{G})$. In the second result, based on the previous one, Roberts and Spencer [5] find the following characterization of $K(\mathcal{G})$:
Theorem 1 (Roberts and Spencer, 1971). A graph $G$ is in $K(\mathcal{G})$ if and only if there is a family $\mathcal{K}$ of complete sets in $G$ such that:
(1) $\mathcal{K}$ covers all the edges of $G$ (i.e., if $xy \in E(G)$, then $\{x, y\}$ is contained in some element of $\mathcal{K}$).
(2) $\mathcal{K}$ satisfies the Helly property.
In spite of this characterization, the complexity of the recognition problem for $K(\mathcal{G})$ is still open. In the proof of their theorem, Roberts and Spencer build, given a graph $G$ satisfying the hypothesis, another graph $H$ such that $K(H) = G$. We call an RS family of $G$ a family of complete sets in $G$ that fulfills the hypothesis of the Roberts and Spencer theorem. In addition, if the family has a reduced dual, then it is an ERS family of $G$.
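For small graphs, the definitions of this section can be tried out directly. The sketch below is a brute-force illustration, not a recognition algorithm: it lists the cliques with a plain Bron-Kerbosch recursion, builds the clique graph, and tests the two Roberts and Spencer conditions for a given family. All function names and the adjacency-dictionary representation are assumptions, and `is_helly` is exponential in the size of the family.

```python
from itertools import combinations

def maximal_cliques(adj):
    """All maximal complete sets of a small graph (Bron-Kerbosch, no pivoting)."""
    found = []
    def expand(r, p, x):
        if not p and not x:
            found.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    expand(set(), set(adj), set())
    return found

def clique_graph(adj):
    """K(G): one vertex per clique of G, adjacent when the cliques intersect."""
    cliques = maximal_cliques(adj)
    return {c: {d for d in cliques if d != c and c & d} for c in cliques}

def is_helly(family):
    """Every pairwise-intersecting subfamily must have a common element."""
    family = [set(s) for s in family]
    for size in range(2, len(family) + 1):
        for sub in combinations(family, size):
            pairwise = all(a & b for a, b in combinations(sub, 2))
            if pairwise and not set.intersection(*sub):
                return False
    return True

def is_rs_family(adj, family):
    """Conditions (1) and (2) of the Roberts-Spencer theorem for a family of sets."""
    complete = all(w in adj[v] for s in family for v, w in combinations(s, 2))
    covers = all(any({v, w} <= set(s) for s in family)
                 for v in adj for w in adj[v])
    return complete and covers and is_helly(family)

# A path a - b - c, and the six-vertex graph made of a triangle 1, 2, 3 with
# one extra vertex attached to each triangle edge.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
sun = {1: {2, 3, 4, 6}, 2: {1, 3, 4, 5}, 3: {1, 2, 5, 6},
       4: {1, 2}, 5: {2, 3}, 6: {1, 3}}
path_cliques = maximal_cliques(path)
sun_cliques = maximal_cliques(sun)
```

For the path, the clique family itself is an RS family. For the six-vertex graph, the four cliques intersect pairwise but have no common vertex, so the clique family is not Helly.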
3 Non-Helly Graphs in $K(\mathcal{G})$
We denote by $F_2$ the graph depicted in Fig. 1. We say that a graph $G$ has $F_2$ when $G$ has three mutually adjacent vertices $v_1$, $v_2$, and $v_3$, and three other vertices $v_4$, $v_5$, and $v_6$ such that
- $v_4$ is adjacent to $v_1$ and $v_2$ but not to $v_3$,
- $v_5$ is adjacent to $v_2$ and $v_3$ but not to $v_1$,
- $v_6$ is adjacent to $v_1$ and $v_3$ but not to $v_2$.

Fig. 1. The graph $F_2$

Notice that this is different from saying that $G$ has $F_2$ as an induced subgraph, and it is also different from saying that $G$ has a subgraph (not necessarily induced) isomorphic to $F_2$. However, this concept is important because of the following fact. Define a graph to be Helly hereditary when it is Helly and all its induced subgraphs are Helly as well. Prisner [4] showed that $G$ is Helly hereditary if and only if $G$ does not have $F_2$ in the sense defined above. The following result tells us more about the structure of a graph in $K(\mathcal{G})$ that has $F_2$.
Theorem 2. If $G \in K(\mathcal{G})$ and $G$ has $F_2$ then $G$ has a subgraph isomorphic to either $A$ or $B$ (see Fig. 2).

Fig. 2. Graphs $A$ and $B$
Proof. We sketch the main ideas of the proof here. To show that G has a subgraph isomorphic to A, we just need to find a vertex x adjacent to the three central vertices $v_1$, $v_2$, and $v_3$ of an $F_2$ plus at least one peripheral vertex ($v_4$, $v_5$, or $v_6$). If $G \in K(\mathcal{G})$ there is a family $\mathcal{K}$ of complete sets of G that satisfies the RS characterization. The proof is done basically by a case analysis. We ask ourselves the following questions:
Is there a set $L \in \mathcal{K}$ such that $\{v_1, v_2, v_3\} \subseteq L$?
Is there a set $L \in \mathcal{K}$ such that $\{v_1, v_2, v_4\} \subseteq L$?
Is there a set $L \in \mathcal{K}$ such that $\{v_1, v_3, v_6\} \subseteq L$?
Is there a set $L \in \mathcal{K}$ such that $\{v_2, v_3, v_5\} \subseteq L$?
It turns out that if the answer is yes to any of these questions, then the graph G has a subgraph isomorphic to A. On the other hand, if the answer is no to all of them, then there is a vertex x adjacent to $v_1$, $v_2$, and $v_3$ simultaneously (see Fig. 3). If x is adjacent to one of $v_4$, $v_5$, or $v_6$, we are done. Otherwise we can argue that G admits a subgraph isomorphic to B.

A graph can have $F_2$ and also subgraphs isomorphic to A or B without belonging to $K(\mathcal{G})$, as the example in Fig. 4 shows.
On the Clique Operator
Fig. 3. The vertices $v_1$, $v_2$, and $v_3$ are not simultaneously contained in any set of the RS family $\mathcal{K}$. This implies the existence of $x$ (drawing not reproduced)

Fig. 4. Graph with two A's but not a clique graph (drawing not reproduced)
The next corollary follows easily.

Corollary 1. If $G \in K(\mathcal{G})$, then G is Helly hereditary if and only if G does not have subgraphs isomorphic to either A or B.

Proof. Since G is Helly hereditary if and only if G does not have $F_2$ [4], the result follows.

The graphs A and B are therefore the ones that separate the Helly hereditary graphs inside $K(\mathcal{G})$. Hence it is natural to take A and B as candidates to be in $K(\mathcal{G})$ but not in $K^2(\mathcal{G})$. The graph A actually belongs to $K^2(\mathcal{G})$ (see Fig. 5). However, we do not know the status of B with respect to $K^2(\mathcal{G})$. We conjecture that $B \notin K^2(\mathcal{G})$.
4 Results for a Fixed G
Given a graph G in $K(\mathcal{G})$, we characterize the class of graphs whose image under K is G. Before going to the characterization, let us recall a couple of results on intersection graphs.

Lemma 1. If F is a family of complete sets of G which covers all edges of G, then G is the intersection graph of the family $F^*$. [2]

Lemma 2. If G is the intersection graph of a family F then $F^*$ is a family of complete sets of G which covers all edges of G. [2]

The proofs are easy and will be omitted. The following result is a consequence.

Theorem 3. Let G and H be two graphs. Then K(H) = G if and only if H is the intersection graph of an ERS family of G.

Proof. If K(H) = G, then G is the intersection graph of the family C(H). Hence, by Lemma 2, $C(H)^*$ is a family of complete sets of G that covers the edges of G. In addition, $(C(H)^*)^* = C(H)$, which is conformal and reduced, since it is a family of cliques. Therefore $C(H)^*$ is an ERS family of G that represents H.

Conversely, suppose H is the intersection graph of an ERS family F of G. Then H is the 2-section of $F^*$. Notice that, given that F is an ERS family, $F^*$ is conformal and reduced. Therefore $C(H) = F^*$. But since F is also a family of complete sets that covers all the edges of G, we have by Lemma 1 that G is the intersection graph of $F^*$. We conclude that G = K(H).

This result shows that the properties of the RS and ERS families of $G \in K(\mathcal{G})$ are of great importance. We develop some of these properties in the sequel.
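To experiment with the operator K on small graphs, one can compute the cliques of H and build their intersection graph, as in the first line of the proof above. The sketch below is an illustration only; the function names, the adjacency-dictionary input, and the use of the Bron-Kerbosch enumeration are choices made here, not something prescribed by the paper.

```python
from itertools import combinations


def maximal_cliques(adj):
    """List the cliques (maximal complete sets) of a graph, Bron-Kerbosch style."""
    cliques = []

    def expand(r, p, x):
        # r: current complete set, p: candidates, x: already processed vertices.
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def clique_graph(adj):
    """Return (cliques, edges) of K(H), the intersection graph of the cliques of H.

    Vertices of K(H) are indices into the returned list of cliques.
    """
    cliques = maximal_cliques(adj)
    edges = {(i, j) for i, j in combinations(range(len(cliques)), 2)
             if cliques[i] & cliques[j]}
    return cliques, edges


# The path a-b-c has two cliques that share b, so its clique graph is one edge.
path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
# The 4-cycle has four cliques (its edges); its clique graph is again a 4-cycle.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

The enumeration is exponential in the worst case, so this is suitable only for hand-sized examples such as the graphs in the figures.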
Fig. 5. Graph A and two graphs $H_1$ and $H_2$ in $K^{-1}(A)$. Notice that $H_2$ is in $K(\mathcal{G})$, showing that $A \in K^2(\mathcal{G})$ (drawing not reproduced)
The first result tells us how to construct ERS families from RS families on a graph. Its proof is immediate.

Theorem 4. Let F be an RS family of G. If for all $v \in V(G)$ such that there is $u \in V(G)$ with $F(v) \subseteq F(u)$, we add $\{v\}$ to F, the result is an ERS family of G.

The next result tells us how to get a reduced RS family from an RS family.

Theorem 5. If F is an RS family of G then the family of maximal sets in F with respect to inclusion is also an RS family of G.

Proof. It is clear that the family of maximal sets of F covers all the edges of G. The result follows from the fact that every subfamily of a Helly family is Helly.

The following results provide conditions for the presence of two- and three-element sets in RS families of a graph.

Theorem 6. Let F be a reduced RS family of G. A two-element set A is in F if and only if A is a clique of G.

Proof. If $\{x, y\} \in F$ and it is not a clique of G then there is $z \in V(G)$ such that $xz, yz \in E(G)$. Since F is a reduced RS family of G, there are two complete sets $C, L \in F$ such that $\{x, z\} \subseteq C$ and $\{y, z\} \subseteq L$. Thus C, L and $\{x, y\}$ are three pairwise intersecting elements of F, so they have a nonempty intersection. But if x (resp. y) is in the intersection then $\{x, y\} \subseteq L$ (resp. $\{x, y\} \subseteq C$), and that is a contradiction because F is a reduced family. Conversely, if $\{x, y\}$ is a clique of G then $\{x, y\} \in F$ because it is the only complete set of G that covers the edge xy.

In the previous theorem, notice that when a two-element set is a clique it is present in all RS families, reduced or not. A similar result holds for three-element sets.

Theorem 7. If $\{x, y, z\}$ is a clique of G, then $\{x, y, z\}$ belongs to every RS family of G.
Proof. Let F be an RS family of G and let $C, L, T \in F$ be sets that cover the edges xy, yz and xz respectively. By the Helly property there is an element h in the intersection $C \cap L \cap T$. Since $\{x, y, z\}$ is a clique of G, h must be one of these vertices and therefore we have either $\{x, y, z\} \subseteq C$ or $\{x, y, z\} \subseteq L$ or $\{x, y, z\} \subseteq T$. But $\{x, y, z\}$ is a clique of G, hence $\{x, y, z\}$ is one of these three complete sets and belongs to F in any case.

The converse is false. For instance, in graph A of Fig. 2 the triangle $\{v_2, v_4, x\}$ is in all RS families of A, but is not a clique of A.

Theorem 8. If F is an RS family of G and $F_1$, $F_2$ are two members of F with nonempty intersection, then the family $F' = F \cup \{F_1 \cap F_2\}$ is also an RS family of G. If moreover F is an ERS family of G, then $F'$ is also an ERS family.
Proof. Obviously $F_1 \cap F_2$ is a complete set of G and $F'$ covers all the edges in G. We will show that $F'$ is Helly. Let $F_1 \cap F_2, F_3, F_4, \ldots, F_n$ be a pairwise intersecting subfamily of $F'$. Then $F_1, F_3, F_4, \ldots, F_n$ and $F_2, F_3, F_4, \ldots, F_n$ are pairwise intersecting subfamilies of F. Moreover, since $F_1 \cap F_2 \neq \emptyset$, the family $F_1, F_2, F_3, F_4, \ldots, F_n$ is a pairwise intersecting subfamily of F and there is a common element. Thus $F'$ is Helly. The claim involving ERS families is true because adding any set to a family with reduced dual conserves this property.

Note that, if $F_1$ and $F_2$ are the same, adding their intersection to F amounts to duplicating a set already in F. Hence, the mere act of replicating an element in an ERS family generates a new ERS family.
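The three properties used in the proofs above (members are complete sets, every edge is covered, the family is Helly) can be verified by brute force on small examples. The sketch below assumes that these three properties are exactly what "RS family" requires, as the proofs of Theorems 5 to 8 suggest; the function names and the adjacency-dictionary representation are hypothetical.

```python
from itertools import combinations


def is_complete(adj, s):
    """True if the vertices of s are pairwise adjacent."""
    return all(v in adj[u] for u, v in combinations(s, 2))


def covers_edges(adj, family):
    """True if every edge uv lies inside some member of the family."""
    return all(any(u in s and v in s for s in family)
               for u in adj for v in adj[u])


def is_helly(family):
    """True if every pairwise intersecting subfamily has a common element."""
    fam = [frozenset(s) for s in family]
    for k in range(2, len(fam) + 1):
        for sub in combinations(fam, k):
            pairwise = all(a & b for a, b in combinations(sub, 2))
            if pairwise and not frozenset.intersection(*sub):
                return False
    return True


def is_rs_family(adj, family):
    return (all(is_complete(adj, s) for s in family)
            and covers_edges(adj, family)
            and is_helly(family))


# Two triangles sharing the edge 2-3.
G = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
F = [{1, 2, 3}, {2, 3, 4}]
# Theorem 8: adding the intersection of two members keeps the family RS.
F_prime = F + [F[0] & F[1]]
```

The Helly test enumerates all subfamilies, so it is exponential in the size of the family; it is intended for checking examples, not as a recognition algorithm.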
5 Metric Results
In a graph G, the distance d(x, y) between two vertices is the number of edges in a shortest path from x to y. Our main result in this section follows.

Theorem 9. Let G be a connected graph and x, y two vertices of G with d(x, y) > 2. Then $G \in K(\mathcal{G})$ if and only if $G + xy \in K(\mathcal{G})$.

Proof. Since $G \in K(\mathcal{G})$ there is a graph H such that K(H) = G. Let $C_x$, $C_y$ be the cliques of H which represent x and y respectively. Observe that
(1) $C_x \cap C_y = \emptyset$;
(2) if $r \in C_x$ and $s \in C_y$ then $rs \notin E(H)$.

The first statement is true because x and y are not adjacent. To prove the second statement suppose that $rs \in E(H)$. Then there is a clique C in H which contains this edge. Because of (1), $C \neq C_x$ and $C \neq C_y$. Therefore $d_G(x, y) = 2$, a contradiction.

Let $H'$ be the graph obtained by adding a new vertex a adjacent only to the union $C_x \cup C_y$, and let $C'_x = C_x \cup \{a\}$ and $C'_y = C_y \cup \{a\}$. It is clear that $C'_x$ and $C'_y$ are cliques of $H'$. On the other hand, if C is a clique of $H'$, we have two cases. If $a \notin C$ then C is a clique of H. If $a \in C$, the rest of C must be in the union $C_x \cup C_y$, but, by property (2), C cannot meet both $C_x$ and $C_y$, and then C is either $C'_x$ or $C'_y$. Therefore, the family of cliques of $H'$ can be obtained from that of H by substituting $C'_x$ and $C'_y$ for $C_x$ and $C_y$, respectively. Since these two new cliques meet, and the other adjacency relationships are not modified, we have that $K(H') = K(H) + xy = G + xy \in K(\mathcal{G})$.

Conversely, let H be a graph such that K(H) = G + xy and let $C_x$, $C_y$ be the cliques of H which represent x and y respectively. Since $xy \in E(G + xy)$, $C_x \cap C_y \neq \emptyset$. Let $C_x \cap C_y = \{r_1, \ldots, r_t\}$. We construct a new graph $H'$ by doubling these vertices, i.e. $V(H') = V(H) \cup \{r'_1, \ldots, r'_t\}$, where each $r'_i$ has the same neighbors as $r_i$, and $r_i$ and $r'_i$ are not adjacent. The cliques of $H'$ are the same as those of H except $C'_y$, which is obtained from $C_y$ by replacing $r_i$ by $r'_i$, for all $i = 1, \ldots, t$.
Since d(x, y) > 2, G does not have vertices adjacent to both x and y. Then no clique of H, distinct from $C_x$ and $C_y$, meets $\{r_1, \ldots, r_t\}$. Therefore $K(H') = K(H) - xy = G \in K(\mathcal{G})$.

We write $G \triangleleft H$ when there is a pair of vertices x, y in G with d(x, y) > 2 and H = G + xy. Extend this relation to a symmetric relation by defining $G \sim H$ if and only if $G \triangleleft H$ or $H \triangleleft G$. Now extend this relation to an equivalence relation by defining $G \approx H$ if and only if there is a series $G_0, G_1, \ldots, G_k$ of graphs such that
$G = G_0 \sim G_1 \sim \cdots \sim G_k = H$.

The following result is immediate from the previous theorem and definitions.

Corollary 2. Let G and H be two graphs such that $G \approx H$. Then $G \in K(\mathcal{G})$ if and only if $H \in K(\mathcal{G})$.
This shows that it is sufficient to deal with graphs of diameter at most 2 when determining whether a graph belongs to $K(\mathcal{G})$.
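The reduction to diameter at most 2 can be carried out mechanically: as long as some pair of vertices is at distance greater than 2, add the corresponding edge. The following sketch assumes a connected input graph (as in Theorem 9) given as an adjacency dictionary; the function names are hypothetical and the choice of which far pair to join first is arbitrary.

```python
from collections import deque


def distances_from(adj, source):
    """Breadth-first search distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def reduce_to_diameter_two(adj):
    """Repeatedly add an edge xy with d(x, y) > 2 until no such pair remains.

    The input is left untouched; a new adjacency dictionary is returned.
    """
    g = {v: set(nb) for v, nb in adj.items()}
    changed = True
    while changed:
        changed = False
        for x in g:
            far = [y for y, d in distances_from(g, x).items() if d > 2]
            if far:
                y = far[0]
                g[x].add(y)
                g[y].add(x)
                changed = True
                break  # distances changed, so restart the scan
    return g


path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
reduced = reduce_to_diameter_two(path5)
```

Every intermediate graph in the loop is related to its predecessor as in Theorem 9, so the input and the output are equivalent in the sense of Corollary 2.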
6 Twin Vertices
We have seen that the graphs in the inverse image $K^{-1}(G)$ are in one-to-one correspondence with the ERS families of G. There is an infinite number of these families. It is easy to see that if a given complete set L of G appears in an ERS family two or more times, this produces twin vertices in the corresponding $H \in K^{-1}(G)$. We would like to simplify the study of $K^{-1}(G)$ by taking only ERS families with no repeated elements, because there is a finite number of such families. For instance, to test whether a given graph G belongs to $K^2(\mathcal{G})$ we could take all reduced (i.e., without twins) graphs in $K^{-1}(G)$ and check each one for membership in $K(\mathcal{G})$. To do this, however, we need a result similar to Thm. 9, with the addition of edge xy replaced by the addition of a twin. Unfortunately, only half of such a result is true:

Theorem 10. Let G be a graph and u, v twin vertices in G. If $G - u \in K(\mathcal{G})$ then $G \in K(\mathcal{G})$.

Proof. Let F be an RS family of G − u. Add u to each member of F that contains v. The new family is an RS family for G.

The converse of Thm. 10 does not hold. The graph in Fig. 6 is obtained from $F_2$ by adding a twin to one of the central vertices. However, $F_2 \notin K(\mathcal{G})$ while the bigger graph does belong to $K(\mathcal{G})$, because the complete sets $C_1, C_2, \ldots, C_6$ indicated in the figure form an RS family of the graph in question. A weaker converse holds, though.

Theorem 11. Let u and v be twin vertices of a graph G. If there is an RS family F of G with the property that every member of F that has u also has v, then $G - u \in K(\mathcal{G})$.
Fig. 6. A graph in $K(\mathcal{G})$ obtained by addition of a twin to $F_2$, which does not belong to $K(\mathcal{G})$ (drawing not reproduced; the complete sets $C_1, \ldots, C_6$ and the twins u, v are marked)

Proof. Remove u from all sets in F that contain it.

A related question has to do with simplicial vertices whose removal does not destroy cliques. A vertex v is simplicial when v belongs to only one clique of G. A simplicial vertex v is superfluous when the set of its neighbors is a clique in G − v. Notice that K(G) = K(G − v) in this case. Here again we could benefit from the equivalence $G \in K(\mathcal{G}) \iff G - v \in K(\mathcal{G})$ when v is a superfluous simplicial vertex. However, in this case we are not even sure whether any of the implications holds. The only result we were able to prove is for one of the implications in some special circumstances.

Let G be a graph and C a clique of G. We say that an RS family $\mathcal{K}$ of G is C-RS when there is a pairwise intersecting subfamily of $\mathcal{K}$ whose union is C.

Theorem 12. Let C be a clique of a graph G and $v \in C$ a superfluous simplicial vertex. If there is a (C − v)-RS family of G − v, then $G \in K(\mathcal{G})$.

Proof. Let F be a (C − v)-RS family of G − v and let $F_1, \ldots, F_n$ be a pairwise intersecting subfamily of F such that $F_1 \cup \cdots \cup F_n = C - v$. Let $F'$ be the family obtained from F by adding v to each $F_i$, for $i = 1, \ldots, n$. We will prove that $F'$ is an RS family of G. Indeed $F'$ covers all the edges of G because F covers those of G − v; those incident to v are covered because the union of the sets $F_i + v$ is C.

To show that $F'$ is Helly, we take a pairwise intersecting subfamily $H_1, \ldots, H_m$ of $F'$. If $H_i = F_i + v$ for all $i = 1, \ldots, m$, then v is a common element and we are done. If none of them has this form then it is an intersecting subfamily of F and there is a common element as well. In the remaining cases we have a mixture: $F_1 + v, \ldots, F_s + v, H_{s+1}, \ldots, H_m$. Since v is not in $H_j$, $j = s+1, \ldots, m$, and $F_1, \ldots, F_s$ is a pairwise intersecting subfamily of F, we have that $F_1, \ldots, F_s, H_{s+1}, \ldots, H_m$ is a pairwise intersecting subfamily of F and there is a common element.
7 Conclusions
Our goal in this paper was to shed more light on the clique operator K. In particular, we were interested in whether $K(\mathcal{G}) = K^2(\mathcal{G})$. An answer to this question can potentially lead to a characterization of the class $K(\mathcal{G})$ that would help determine the complexity of its recognition problem.

The main contributions of this paper can be summarized as follows. We prove that non-Helly graphs in $K(\mathcal{G})$ must contain subgraphs isomorphic to either A or B of Fig. 2. We prove that recognition of graphs in $K(\mathcal{G})$ can be reduced to graphs with diameter at most two. We show that the addition of twins does not preserve membership in $K(\mathcal{G})$.

A number of questions were raised as well. We do not know whether $B \in K^2(\mathcal{G})$. We do not know whether superfluous simplicial vertices influence membership in $K(\mathcal{G})$. If $K(\mathcal{G}) \neq K^2(\mathcal{G})$ the images of other powers of K will be of interest.
References
1. C. Berge. Hypergraphes. Gauthier-Villars, Paris, 1987.
2. M. Gutierrez. Intersection graphs and clique application. Unpublished manuscript.
3. R. Hamelink. A partial characterization of clique graphs. J. Combin. Theory, 5:192-197, 1968.
4. E. Prisner. Hereditary clique-Helly graphs. J. Comb. Math. Comb. Comput., 1991.
5. F. S. Roberts and J. H. Spencer. A characterization of clique graphs. J. Combin. Theory, Series B, 10:102-108, 1971.
Dynamic Packet Routing on Arrays with Bounded Buffers

Andrei Z. Broder (1), Alan M. Frieze (2), and Eli Upfal (3)

(1) Digital Systems Research Center, 130 Lytton Avenue, Palo Alto, CA 94301, USA.
(2) Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh PA 15213, USA
(3) IBM Almaden Research Center, San Jose, CA 95120, USA and Department of Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel.

[email protected], [email protected], eli@wisdom.weizmann.ac.il
Abstract. We study the performance of packet routing on arrays (or meshes) with bounded buffers in the routing switches, assuming that new packets are continuously inserted at all the nodes. We give the first routing algorithm on this topology that is stable under an injection rate within a constant factor of the hardware bandwidth. Unlike previous results, our algorithm does not require the global synchronization of the insertion times or the retraction and reinsertion of excessively delayed messages, and our analysis holds for a broad range of packet generation stochastic distributions. This result represents a new application of a general technique for the design and analysis of dynamic algorithms that we first presented in [Broder et al., FOCS 96, pp. 390-399].
1 Introduction
The rigorous analysis of the dynamic performance of routing algorithms is one of the most challenging current goals in the study of communication networks. So far, most theoretical work in this area has focused on static routing: a set of packets is injected into the system at time 0, and the routing algorithm is measured by the time it takes to deliver all these packets to their destinations, assuming that no new packets are injected into the system in the meantime (see Leighton [8] for an extensive survey). In practice however, networks are rarely used in this batch mode. Most real-life networks operate in a dynamic mode whereby new packets are continuously injected into the system. Each processor usually controls only the rate at which it injects its own packets and has only a limited knowledge of the global state. This situation is better modeled by a stochastic paradigm whereby packets are continuously injected into the system according to some inter-arrival distribution, and the routing algorithm is evaluated according to its long term behavior.
Research supported in part by NSF Grant CCRR-9530974

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 273-281, 1998. © Springer-Verlag Berlin Heidelberg 1998
In particular, quantities of interest are the maximum arrival rate for which the system is stable (that is, the arrival rate that ensures that the expected number of packets waiting in queues does not grow with time), and the expected time a packet spends in the system in the steady state. The performance of a dynamic algorithm is a function of the inter-arrival distribution. The goal is to develop algorithms that perform close to optimal for any inter-arrival distribution.

Several recent articles have addressed the dynamic routing problem, in the context of packet routing on arrays [7,6,9], on the hypercube and the butterfly [12] and on general networks [11]. The analysis in all these works assumes a Poisson injection rate and requires unbounded queues in the routing switches (though some works give a high probability bound on the size of the queue used [7,6]). Unbounded queues allow the application of some tools from queuing theory (see [4,5]) and help reduce the correlation between events in the system, thus simplifying the analysis at the cost of a less realistic model. Clearly, bounded buffers in the routing switches model real networks more accurately.

A general technique for the design and analysis of dynamic packet routing algorithms has been developed in [1]. The crux of that work is a general theorem showing that any communication scheme (a routing algorithm and a network) that satisfies a given set of conditions, defined only with respect to a finite history, is stable up to a certain inter-arrival rate. Thus, the analysis of the long term behavior of a dynamic algorithm is reduced to the simpler question of analyzing a finite execution of the algorithm. Furthermore, this technique also gives a bound on the expected routing time in the stable state. The theorem applies to any inter-arrival distribution: the stability results and the expected routing time of a packet inside the network depend only on the inter-arrival rate.
The waiting time in queues depends on the inter-arrival distribution and an explicit relation is given in the main theorem of [1]. To apply the general technique one needs to present an algorithm whose performance analysis (on finite segments) satisfies the conditions of the theorem. Several applications of the general technique to routing algorithms for low diameter networks such as the butterfly have been demonstrated in [1].

Here we present the first application to arrays: we consider an $n \times n$ mesh of routing switches with bounded buffers. Each routing switch node is connected to a processor. A processor has its own queue for the packets generated by it that are waiting to enter the network. We can assume this processor queue to be unbounded. (De facto, this queue is finite and when it becomes full the processor stops generating new packets.) Other packets pass only through the routing switch. The routing switch has a routing queue stored in a bounded buffer where packets waiting to be routed are placed. For simplicity we assume that packets have random destinations. This assumption can be relaxed as long as no destination is overloaded with packets.

Theorem 1. There is a packet routing algorithm for the $n \times n$ mesh with bounded buffers that is stable for any inter-arrival distribution with expectation at least $Cn$ for some fixed constant $C$. The expected time a packet spends in the network
is $O(n)$. In the case of Poisson arrival (geometric inter-arrival distribution) the expected time the packet spends in the queue is also $O(n)$.

Since the distance between most pairs of nodes on the mesh is $\Omega(n)$ the above theorem is clearly optimal up to constant factors. Leighton [7] has studied this problem and obtained similar results provided that the buffers in the routing switches are unbounded. More precisely, Leighton's algorithm ensures that at any fixed time, with high probability, no routing queue has more than 4 packets. However, for any sufficiently long execution the maximum size of any queue exceeds any given bound. Our results build on Leighton's analysis, by augmenting his algorithm with a simple flow control mechanism, which ensures that every routing queue is bounded at all times, and thus only finite buffers are needed.

In a different direction, several recent works studied the dynamic performance of deflection (hot potato) routing. In deflection routing packets in the network always move and there are no buffers in the routing switches. However, the analysis in these works requires either strong synchronization between processors in timing injections of new packets to the network [3,10], or a mechanism to retract and reinsert excessively delayed packets [2].
2 The General Technique
We outline here the general technique developed in [1]. The setting is as follows: we are given a routing algorithm $A$ acting on a synchronous network $\Gamma(n)$ with $n$ inputs and $n$ outputs. Each input receives new packets with inter-arrival distribution $F$. The packets are placed into an unbounded FIFO queue at the input node. Packets have an output destination chosen independently and uniformly at random. When a packet reaches the top of its queue, we call it active. At some point after becoming active, the packet is removed from its queue and eventually routed to its destination.

We are interested in determining under which conditions the queuing system is ergodic (or stable), that is, under which conditions the expected length of the input queues is bounded as $t \to \infty$. To this purpose we have to study the inter-departure time, which is the interval from when a packet becomes active until it leaves the queue and the next packet in line (if any) becomes active. Besides stability, we are also interested in the expected time a packet spends in the queue, and the expected time it spends in the network.

Since the inter-arrival times are independent, if the inter-departure times were also independent, then each queue could simply be viewed as a G/G/1 system and the stability condition would trivially be that the inter-departure rate exceeds the inter-arrival rate. However the usual situation is that there are complex interactions among packets during routing and thus the inter-departure times are highly dependent and hard to analyze.

The following theorem defines a set of relatively simple sufficient conditions such that if the routing algorithm satisfies them, then the system is stable up to
a certain inter-arrival rate and we can bound the expected time a packet spends in the queue and in the network.

Theorem 2. Assume that the randomized routing algorithm $A$ acting on the network $\Gamma(n)$ is characterized by four parameters $a$, $b$, $m$, and $T$, where $a$ and $b$ are constants, $m$ and $T$ might depend on $n$, and $m/T < 1$, and that it satisfies the following conditions:
(1) Every packet is delivered at most $O(n^a)$ steps after becoming active.
(2) There exists an event $E_\tau$ (where $\tau$ is a time) with the following properties:
(a) $E_\tau$ depends only on random choices made in the open interval $(\tau - n^b, \tau + n^b)$. (These choices are: random destinations chosen by the packets that became active in the interval, random arrivals in this interval, and random coin flips made by the algorithm.)
(b) $\bar{E}_\tau$ implies that any packet that at time $\tau$ was among the first $m$ packets in its queue is delivered before time $\tau + T$.
(c) For any fixed time $\tau$ the probability $\Pr(E_\tau)$ is bounded by
$$\Pr(E_\tau) \le \frac{(m/T)^7}{n^{2a+2b+3}}.$$
If there exists a positive constant $\epsilon$ such that the inter-arrival distribution $F$ has an inter-arrival rate smaller than $(1 - \epsilon)m/T$, then the system is stable and the time a packet spends in the input queue is bounded by $O(T) + f(T/m)$, where $f$ is a function that depends only on $F$ and not on the routing process. (For reasonable distributions such as Poisson, $f(T/m) = O(T/m)$.) Furthermore the average time elapsed from when a packet becomes active until it is delivered is also $O(T)$.
3 The Main Result
3.1 The Algorithm
The description of the algorithm has three components: path selection; routing switch policy; and flow control.

Path selection. A packet takes the shortest one-bend route from origin to destination: first (left or right) on its origin row to its destination column, then up or down on that column to the packet's destination.

Switch policy. We assume that a switch can receive up to four packets per step, one from each incoming edge, and send four packets per step, one through each outgoing edge. A switch maintains a buffer for each outgoing edge. When there is space in a buffer the switch receives packets to that buffer according to the following rule: A packet is old if it has spent at least $Kn$ steps in the network, where $K$ is a sufficiently large constant ($K \ge 2e^2$ will suffice). Old packets have higher priority. Among the old packets the oldest packet has the highest priority. Old packets of the same age are dealt with in lexicographic order of the pair (origin, destination). Among packets that are not old, the packet that has to travel farthest has the highest priority.
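The path selection and priority rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: nodes are (row, column) pairs, a packet is a dictionary with the hypothetical fields `injected_at`, `origin`, `dest` and `hops_left`, and smaller sort keys mean higher priority.

```python
def one_bend_route(origin, dest):
    """Nodes visited on the one-bend route: origin row first, then destination column."""
    (r0, c0), (r1, c1) = origin, dest
    step_c = 1 if c1 >= c0 else -1
    step_r = 1 if r1 >= r0 else -1
    row_part = [(r0, c) for c in range(c0, c1 + step_c, step_c)]
    col_part = [(r, c1) for r in range(r0 + step_r, r1 + step_r, step_r)]
    return row_part + col_part


def priority_key(packet, now, n, K):
    """Sort key implementing the switch policy (smaller key is served first)."""
    age = now - packet["injected_at"]
    if age >= K * n:
        # Old packets first; oldest first; ties broken by (origin, destination).
        return (0, -age, packet["origin"], packet["dest"])
    # Packets that are not old: farthest to go first.
    return (1, -packet["hops_left"])


route = one_bend_route((0, 0), (2, 3))
```

A switch with a free buffer slot would then admit `min(candidates, key=lambda p: priority_key(p, now, n, K))` among the packets competing for that buffer.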
Admission control. The algorithm uses a token based admission control mechanism. Each input has one token. A token can be in one of three modes: enabled, used or suspended. Initially all the tokens are in enabled mode. To inject a packet into the network the input needs an enabled token. The packet at the top of the processor queue is sent together with the enabled token to its destination, provided there is room in the edge buffer which will receive it. When a packet is delivered the mode of its token switches to used mode and the token (acknowledgment) is returned to the input node where it came from. We use a separate network to route tokens back to their sources, and the analysis of this routing mirrors that of the main network. Details are left to the full paper.

Let $s$ be the last time a given token was sent with a packet, and let $r$ be the last time it returned to its input node. If $r - s \le 2Kn$ then the token becomes enabled again at time $r + 2Kn + Z$, where $Z$ is a random number chosen uniformly from $[0, Kn]$. If $r - s > 2Kn$ then the token mode is switched to suspended mode until time $r + 6n^7 + Z$, when it is switched again to enabled mode, where $Z$ is chosen as above.

This flow mechanism guarantees that an input cannot inject more than one packet within each interval of $2Kn$ steps, and that the input does not inject new packets when the network is congested. Furthermore, observe that the probability that a token becomes enabled at any fixed time is at most $1/(Kn)$.
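The timing rule for re-enabling a token can be written as a small function. This is a minimal sketch under assumptions made here: time steps are integers, $K$ is taken to be an integer so that $Z$ can be drawn with `randint`, and the function name and parameters are hypothetical.

```python
import random


def next_enable_time(s, r, n, K, rng=random):
    """Time at which a token sent at time s and returned at time r is enabled again."""
    z = rng.randint(0, K * n)          # Z uniform on {0, ..., Kn}
    if r - s <= 2 * K * n:
        # Round trip was fast: short back-off of 2Kn steps plus the random offset.
        return r + 2 * K * n + z
    # Round trip was slow (network congested): token is suspended for 6n^7 steps.
    return r + 6 * n ** 7 + z
```

Passing a seeded `random.Random` instance as `rng` makes a simulation reproducible. The random offset is what gives the bound of at most $1/(Kn)$ on the probability that a token becomes enabled at any fixed time.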
3.2 Analysis of the Algorithm
In this section we give an analysis of the performance of the above algorithm for finite intervals, showing that the algorithm satisfies the requirements of the general technique (Thm. 2). This will prove Thm. 1. We will assume Poisson arrivals, i.e. at each step a packet arrives with probability $1/(Cn)$. The general case can be handled as in [1]. The following lemma satisfies requirement (1) of the general theorem:

Lemma 1. Under this protocol, no packet takes more than $S = 7n^7$ steps to reach its destination once it has become active, and at most $2n^5$ steps once it has obtained an enabled token.

Proof. Consider a packet $P$. Let $P_0$ be its predecessor in the queue. $P_0$ does not leave the queue until it has an enabled token. At that time there are no more than $n^2$ other packets in the network. Consider the progress of the highest priority old packet in the network. If this packet is moving along a column then it moves at every time step. If it is moving along a row, then it could fail to move because further along that row there is contention for a column buffer. It waits at most $Kn + n^2$ steps before making another move. This is because the packet waiting to move along the column in question will have become old and it will be the oldest packet trying to get into the column edge buffer. Thus after $Kn + Kn^2 + n^3$ steps it will have reached its destination column and will reach its final destination within a further $n$ steps. So after at most $Kn^3 + Kn^4 + n^5$ steps, $P_0$ will be the highest priority packet in the network and will be delivered
within a further $Kn + Kn^2 + n^3$ steps. Thus $P_0$ gets to its destination at most $Kn + Kn^2 + (K + 1)n^3 + Kn^4 + n^5 \le 2n^5$ steps after leaving the queue. The used token comes back after at most another $2n^5$ steps and after at most $6n^7 + Kn$ steps is re-activated. Finally, after at most another $2n^5$ steps the packet $P$ is delivered. The sum of these delays is less than $7n^7$.

Next we turn to the definition of the event $E_\tau$, and to the probabilistic analysis of the algorithm's performance in finite intervals. Our analysis is based on the technique in [7] as described in [8, Sect. 1.7.2]. As in [7] we relate the execution of the algorithm to an artificial execution on a wide-channel model in which an arbitrary number of packets can traverse an edge at any step, and no packet is ever delayed. We assume that execution on the wide-channel starts at time $\tau - \Delta$.

Let $c$ and $q$ be constants to be defined later, and let
$$d_0 = (c + 3) \log n + \log(2K).$$
This serves as a suitable high probability upper bound on the delay of any given packet. Define the events $E_\tau$ as follows: $E_\tau$ is the event that at least one of the following occurred:
(1) There is an old packet in the system at any time in the interval $[\tau, \tau + Kn]$.
(2) There is a row edge $e$, an $\ell \ge 0$, and an interval $[\tau_0, \tau_0 + \ell + d_0] \subseteq [\tau, \tau + Kn]$, such that $\ell + d_0$ packets traverse edge $e$ in that interval in the wide-channel model.
(3) There is a column edge $e$, an $\ell \ge 0$, and an interval $[\tau_0, \tau_0 + \ell + 2d_0] \subseteq [\tau, \tau + Kn]$, such that $\ell + d_0$ packets traverse edge $e$ in that interval in the wide-channel model.
(4) A routing buffer has $q$ packets in some step in the interval $[\tau, \tau + Kn]$.

We say that a packet was delayed $d$ steps in traversing an edge if there is a gap of $d$ steps between the time it traverses the edge in the wide-channel model and the time it traverses the edge in the standard model. Leighton's analysis in [7] is based on the following fact (see Cor. 1.9 and Lemma 1.10 in [8]): If buffers are unbounded and the farthest to go packet always has the highest priority, then a packet is delayed $d$ steps in traversing a row edge $e$ only if there is an interval of $\ell + d$ steps such that $\ell + d$ packets crossed edge $e$ in that interval in the wide-channel model. Similarly, if a packet is delayed $d$ steps in crossing a column edge $e$ then, assuming no packet is delayed more than $d_0$ steps on a row, there is an interval of $\ell + d_0 + d$ steps in which $\ell + d$ packets cross edge $e$ in the wide-channel model (see [8] for a detailed proof). Thus we have the following corollary that satisfies requirement (2)(b) in the general theorem:

Corollary 1. The event $\bar{E}_\tau$ implies that any packet with an enabled token at time $\tau$ is delivered within the next $2n + 2d_0 \le Kn$ steps.
Proof. Any packet with an enabled token is delivered within $2n$ steps in the wide-channel model. In the standard model its additional delay is at most $2d_0$.

Next we bound the probability of the event $E_\tau$. Assume first that old packets are removed from the system the moment they become old, and all buffers are unbounded. For an edge $e$ and an interval $[\tau_1, \tau_2] \subseteq [\tau, \tau + Kn]$ let $E_0(e, \tau_1, \tau_2, r)$ be the event that in the wide-channel model $r$ packets cross $e$ during the time interval $[\tau_1, \tau_2]$. Note that every token is used at most once in the interval.

Case 1. $e$ is a row edge:
$$\Pr(E_0(e, \tau_1, \tau_2, r)) \le \binom{n}{r} \left(\frac{\tau_2 - \tau_1}{Kn}\right)^{r} \le \left(\frac{e(\tau_2 - \tau_1)}{rK}\right)^{r}.$$
(The nodes on the row under consideration have a total of $n$ tokens. Each token is used at most once in the interval. Choose $r$ of them to transport the packets of interest. The probability that a token becomes enabled at any fixed time is at most $1/(Kn)$.)

Case 2. $e$ is a column edge:
$$\Pr(E_0(e, \tau_1, \tau_2, r)) \le \binom{n^2}{r} \left(\frac{\tau_2 - \tau_1}{Kn^2}\right)^{r} \le \left(\frac{e(\tau_2 - \tau_1)}{rK}\right)^{r}.$$
(There is a total of $n^2$ tokens. Each token is used at most once in the interval. Choose $r$ of them to transport the packets of interest. The probability that a token becomes enabled at any fixed time is at most $1/(Kn)$ and the probability that the packet uses a particular column is $1/n$.)

Thus, under the assumption that no old packets are present in the interval $[\tau, \tau + Kn]$, the probability that there is a row edge $e$, an $\ell \ge 0$, and an interval $[\tau_0, \tau_0 + \ell + d_0] \subseteq [\tau, \tau + Kn]$, such that $\ell + d_0$ packets traverse edge $e$ in that interval in the wide-channel model is bounded by
$$Kn^3 \sum_{\ell \ge 0} \left(\frac{e(d_0 + \ell)}{(d_0 + \ell)K}\right)^{d_0 + \ell} \le 2Kn^3 \left(\frac{e}{K}\right)^{d_0} \le n^{-c}$$
for $K \ge e^2$. (There are $Kn$ possible values for $\tau_0$ and $n^2$ edges.)

Similarly, under the same assumptions, the probability that there is a column edge $e$, an $\ell \ge 0$, and an interval $[\tau_0, \tau_0 + \ell + 2d_0] \subseteq [\tau, \tau + Kn]$, such that $\ell + d_0$ packets traverse edge $e$ in that interval in the wide-channel model is bounded by
$$Kn^3 \sum_{\ell \ge 0} \left(\frac{e(2d_0 + \ell)}{(d_0 + \ell)K}\right)^{d_0 + \ell} \le Kn^3 \sum_{\ell \ge 0} \left(\frac{2e}{K}\right)^{d_0 + \ell} \le 2Kn^3 \left(\frac{2e}{K}\right)^{d_0} \le n^{-c}$$
provided that $K \ge 2e^2$.
Remark 1. It follows (see [8, Sect. 1.7.2, Lemma 1.10]) that with probability at least 1 − 2n^{-c}, for every edge e and time interval I of d0 steps, there is a step in I in which e is empty.

Now we show that the assumption that no old packets are present in the interval [ + Kn] holds with high probability. Consider an interval [ 0 0 + 2Kn]. By definition, the only packets that can become old in the interval [ 0 + Kn 0 + 2Kn] had enabled tokens at time 0 . Thus, in view of the discussion above, the probability that any packet becomes old in [ 0 + Kn 0 + 2Kn] given that there were no old packets present in the interval [ 0 0 + Kn] is at most 2n^{-c}. Hence if the interval [ − + Kn] contains any subinterval of length at least Kn when no old messages are present, we can generously bound the probability that any old messages are present in the interval [ + Kn] by 2( + Kn)n^{-c}. Take = n^2 S + kN. Then if a packet from a particular source becomes old at some time in [ − + Kn], it is delivered within S steps, and its token is not enabled again until after time + Kn. Hence there is an interval of length at least − n^2 S = kN when certainly no old messages are present, and thus with high probability no old messages are present in [ + Kn].

Next we bound the probability that any buffer is full in the interval [ − + Kn]. From Remark 1 we can assume that some row edge has q packets in its buffer only if there is a window of d0 steps in which some input tries to inject at least q packets. The probability of this is at most

2 1 d0 = O(n−c ) ( + Kn)n2 q Cn

for sufficiently large q. From Remark 1 we can further assume that a column edge has q packets in its buffer only if there is a window of d0 steps in which q packets turn at e. Assuming no row delay of d0 or more, the probability of this is bounded by

q n 2d0 = O(n−c ) ( + Kn)n2 Kn2 q

for sufficiently large q (see [8, Sect. 1.7.2, Thm. 1.13]). (The factor ( + Kn)n^2 bounds the number of (interval I = [ 1 2 ], edge e) pairs. There are nq choices of token. There is a probability 1/n that its destination uses e. There is a probability of at most 2d0/(Kn) that it becomes enabled at a time which means it would cross e during [ 1 − d0 2 ] in the wide-channel model.)

Now choose a, b, and c such that

na nb c S = 2((K + 2)n + n2 ) = n2 S + Kn 2a + 2b + 11

Then, summarizing the discussion above,

Pr(E ) n−(2a+2b+3)
Finally we observe that in evaluating the probability of the event E we only used events in the interval [ − + Kn] and thus requirement (2)(a) in the general theorem is satisfied for nb . Thus we showed that all the conditions of Thm. 2 are satisfied for T = 2Kn and m = 1.
4 Conclusion
We have shown how to combine our general technique, first presented in [1], with that of Leighton's analysis [8] of a greedy routing algorithm on an n × n mesh. The results are optimal up to constant factors.
References

1. A. Z. Broder, A. M. Frieze, and E. Upfal. A general approach to dynamic packet routing with bounded buffers. Proceedings of the 37th IEEE Symp. on Foundations of Computer Science. Burlington, 1996, pp. 390-399.
2. A. Z. Broder and E. Upfal. Dynamic deflection routing on arrays. Proceedings of the 28th ACM Symp. on Theory of Computing. Philadelphia, 1996, pp. 348-355.
3. U. Feige. Nonmonotonic phenomena in packet routing, manuscript, March 1997.
4. M. Harchol-Balter and P. Black. Queuing analysis of oblivious packet routing networks. Procs. of the 5th Annual ACM-SIAM Symp. on Discrete Algorithms, 1994, pp. 583-592.
5. M. Harchol-Balter and D. Wolf. Bounding delays in packet-routing networks. Procs. of the 27th Annual ACM Symp. on Theory of Computing, 1995, pp. 248-257.
6. N. Kahale and T. Leighton. Greedy dynamic routing on arrays. Procs. of the 6th Annual ACM-SIAM Symp. on Discrete Algorithms, 1995, pp. 558-566.
7. F. T. Leighton. Average case analysis of greedy routing algorithms on arrays. Procs. of the Second Annual ACM Symp. on Parallel Algorithms and Architectures, 1990, pp. 2-10.
8. F. T. Leighton. Introduction to Parallel Algorithms and Architectures. Morgan-Kaufmann, San Mateo, CA, 1992.
9. M. Mitzenmacher. Bounds on the greedy algorithms for array networks. Procs. of the 6th Annual ACM Symp. on Parallel Algorithms and Architectures, 1994, pp. 346-353.
10. T. Rubshtein. A Dynamic Hot Potato Routing Algorithm for the Torus. M.Sc. thesis, The Weizmann Institute, 1996.
11. C. Scheideler and B. Voecking. Universal continuous routing strategies. SPAA '96.
12. G. D. Stamoulis and J. N. Tsitsiklis. The efficiency of greedy routing in hypercubes and butterflies. Procs. of the 6th Annual ACM Symp. on Parallel Algorithms and Architectures, 1994, pp. 346-353.
On-Line Matching Routing on Trees

Alan Roberts and Antonios Symvonis

Department of Computer Science, University of Sydney, NSW 2006, Australia.
{alanr,symvonis}@cs.su.oz.au

Abstract. We examine on-line heap construction and on-line permutation routing on trees under the matching model. Let T be an n-node tree of maximum degree d. By providing on-line algorithms we prove that:
(1) For a rooted tree of height h, on-line heap construction can be completed within (2d − 1)h routing steps.
(2) For an arbitrary tree, on-line permutation routing can be completed within 4dn routing steps.
(3) For a complete d-ary tree, on-line permutation routing can be completed within 2(d − 1)n + 2d log2 n routing steps.
1 Introduction
In packet routing problems we are given a network (usually represented by a connected, undirected graph) and a set of packets distributed over the nodes of the network. Each packet has an origin node and a destination node and our aim is to route the packets to their destinations as fast as possible. The movement of the packets is subject to a set of routing rules which define the routing model. Typical routing rules found in frequently used models include: restrictions on the number of packets that can reside on a node at any given time instance, restrictions on the number of packets that can be transmitted/received by a node at any single step, restrictions on whether two packets can simultaneously traverse an edge in opposite directions, restrictions on the amount of information that can be used in making routing decisions, etc. The distribution of packet origins and destinations specifies the routing pattern. In this paper, we examine the permutation routing pattern where each node is the origin and destination of exactly one packet.

Packet routing algorithms fall into two main categories, namely off-line and on-line algorithms. In off-line routing, a routing schedule which dictates how each packet moves during each step of the routing is precomputed. Then, when the routing is actually carried out, all packets move according to the routing schedule. In contrast with off-line routing, in on-line routing, routing decisions are made in a distributed manner by the nodes of the network. At each routing step, every node examines the packets that reside in it and decides whether to advance them to neighbouring nodes or to store them in local queues. The decision made by each node depends on local information, usually consisting of the origin/destination nodes of the packets residing in it, and knowledge regarding the topology of the network.

Supported by an ARC Institutional Grant.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 282-292, 1998. © Springer-Verlag Berlin Heidelberg 1998

In this paper we examine permutation routing on trees under the matching model of routing. The model was originally introduced by Alon, Chung and Graham in their study of permutation routing [2]. The only routing operation allowed is the exchange of the packets at the end-points of an edge. Many exchanges can occur simultaneously on a given step; however, these exchanges must occur over a disjoint set of edges, i.e., a matching, of the network. Another feature of the matching model is that packets are not necessarily consumed when they reach their destinations for the first time. Instead they continue moving until all packets reach their destinations simultaneously, at which time they are all consumed and the routing is over.

Most of the work available on the matching model concentrates on off-line routing. Alon, Graham and Chung [2] gave an off-line algorithm for routing permutations on arbitrary trees under the matching model. Their algorithm routes any permutation on a tree of n nodes within 3n steps. They also gave a lower bound of 3n/2 steps for off-line matching routing on trees. Roberts, Symvonis and Zhang [9] established a new upper bound of 2.3n steps for off-line matching routing. They also proved that for the case of bounded degree trees and complete d-ary trees, routing can always be completed within 2n + o(n) steps and n + o(n) steps, respectively. Zhang [10] further reduced the bound for permutation routing on arbitrary trees to 3n/2 + o(n) steps.

The odd-even transposition algorithm [3] (see also [1,5]) can be considered as the first work related to on-line matching routing.
The odd-even transposition algorithm sorts a permutation on a linear array of n nodes within n steps by performing at each step comparisons/exchanges over a set of disjoint array edges. We are not aware of any other on-line algorithms for packet routing under the matching model.

More recently, Pantziou, Roberts and Symvonis [6,7] gave an algorithm for on-line routing on trees under a variation of the matching model called the matching model with consumptions. In this variation of the model, packets are consumed as soon as they reach their destinations, thus making it possible to route any kind of routing pattern. Their algorithm was able to complete the routing of any k-packet pattern on any n-node tree T within d(k − 1) + d · dist routing steps, where d is the maximum degree of T and dist is the maximum origin to destination distance of any packet. They also gave an optimal off-line algorithm which can complete the many-to-many routing of k packets on any tree within 2(k − 1) + dist routing steps. For the same model (matching model with consumptions) Zhang and Krizanc [4] have also given an off-line algorithm for many-to-many routing on trees. However, until now no bounds were known for on-line routing on trees under the original matching model of [2].

In our attempt to derive on-line algorithms for routing on trees under the matching routing model, we ran into a problem of independent interest. This is the problem of heap construction. Consider a rooted tree T and let each of its nodes have a key-value associated with it. We say that T is heap ordered if each non-leaf node satisfies the heap invariant: "the key-value of the node is not larger than the key-values of its children". When the key-value at each node is carried by (or associated with) the packet currently in the node, the problem of heap construction is simply to route the packets on the tree in a way that guarantees that at the end of the routing the packets are heap-ordered based on the key-values they carry. Needless to say, we are interested in forming the heap on-line and in the smallest number of parallel routing steps, where routing is performed according to the matching routing model.

In this paper, we provide and analyze on-line matching routing algorithms that support the following results:
(1) For an arbitrary n-node rooted tree of maximum degree d and height h, heap construction can be completed within 2dh routing steps.
(2) For an arbitrary n-node tree of maximum degree d, permutation routing can be completed within 4dn steps.
(3) For a complete n-node d-ary tree, permutation routing can be completed within 2(d − 1)n + 2d log2 n steps.

These are the first results concerning on-line routing of permutations under the matching model of routing. Note that our results also provide on-line algorithms for routing on general undirected graphs since, in the case of general graphs, routing can be performed over a spanning tree of the graph. Due to space limitations, most of the proofs are omitted and can be found in [8].
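The odd-even transposition algorithm mentioned in the introduction is a convenient first example of the matching model, because every step exchanges packets only over pairwise disjoint edges of a linear array. The following is a minimal sketch, not taken from the paper; the function name and the choice to start with the edges (0,1), (2,3), ... are assumptions made for illustration.

```python
def odd_even_transposition_sort(values):
    """Sort a sequence placed on a linear array using n matching steps.

    Step s compares/exchanges over the disjoint edges (i, i+1) with
    i = s mod 2, s mod 2 + 2, ..., so the active edges of each step
    form a matching of the array.
    """
    a = list(values)
    n = len(a)
    for step in range(n):
        start = step % 2  # alternate between the two edge classes
        for i in range(start, n - 1, 2):
            # Exchange over edge (i, i+1) if the keys are out of order.
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a
```

Because the edges used within one step are disjoint, the inner loop could be executed by all node pairs simultaneously; the sequential loop here only simulates that parallel step.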
2 Preliminaries
A tree T = (V, E) is a connected undirected acyclic graph with node set V and edge set E. A rooted tree is a tree in which one of its nodes, say r, is designated as its root. The depth of a node v of a rooted tree is equal to the distance (i.e., the length of the shortest path) between node v and the root r of the tree. The height of a rooted tree is defined to be the largest depth over all nodes of the tree. Consider an arbitrary node v of a tree rooted at node r. All nodes that appear in the simple path from v to r, including v and r, are called ancestors of v. All nodes that have v as their ancestor form the set of the descendants of v. Note that v is an ancestor and descendant of itself. The subtree rooted at v, denoted by $T_v$, consists of all the descendant nodes of v and the edges that connect them.

Throughout this paper we analyze our algorithms in terms of rooted trees. It is thus necessary to refer to the relative position of nodes, edges and packets within a tree or sub-tree. To do this we use the notions of node levels and edge levels. All nodes of depth k are referred to as level-k nodes while all edges connecting a level-k node with a level-(k + 1) node are referred to as level-k edges. We also frequently use the notions of up and down, above and below. These directions are naturally defined by considering the usual recursive tree layout where the root is drawn at the top of the diagram and the rest of the tree is hung below the root. Finally, by ch(v) we denote the number of children of node v.

In the course of analyzing the algorithms, it is necessary to refer to paths within the tree during routing. Consider a packet p that is routed on a rooted tree. We define M(p, t) to be the path from the node containing p to the root node at the end of step t of the algorithm. Note that M(p, t) includes the node containing p. M(p, 0) denotes the initial path from p to the root of the tree.

In this paper, we assume that the tree topology is known in advance and this knowledge is used in preprocessing. The following two lemmata (considered to be part of the folklore) are used in our algorithms:

Lemma 1. A tree T of maximum degree d can be edge-coloured with exactly d colours. Moreover, a valid edge colouring can be computed in linear time.

Lemma 2. Consider an n-node tree T. There exists a node r of T such that each tree of T \ {r} has at most n/2 nodes. Moreover, node r can be identified in linear time.

In an on-line algorithm the decision about whether or not to exchange the packets at the endpoints of a given edge usually involves comparing the packets. The edges over which comparisons occur on a given step are said to be active on that step. The decision on whether or not to exchange the packets at the endpoints of an active edge, as well as the actual exchange, can both be implemented within one routing step as follows: both nodes at the endpoints of the active edge send the packets they hold to the node at the other side of the edge while, at the same time, they also keep a copy of their own packet. Then, the two packets are available for comparison (according to some ordering criterion) at both nodes and thus a consistent decision can be made regarding which packet is kept at each node and which packet is disregarded.
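The two folklore lemmata can be illustrated with a short sketch. The code below is one possible linear-time realisation under stated assumptions, not the authors' procedure: the tree is given as an adjacency dictionary `adj`, and the helper names `bfs_order`, `edge_colouring` and `find_centroid` are hypothetical. Edges are coloured greedily in breadth-first order (Lemma 1), and the node r of Lemma 2 is found from subtree sizes.

```python
from collections import deque


def bfs_order(adj, root):
    """Return the nodes in BFS order from root, plus a parent map."""
    parent = {root: None}
    order = [root]
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
                queue.append(w)
    return order, parent


def edge_colouring(adj, root):
    """Colour the edges with colours 0..d-1, d being the maximum degree."""
    d = max(len(nbrs) for nbrs in adj.values())
    order, parent = bfs_order(adj, root)
    colour = {}
    for v in order:
        p = parent[v]
        # The edge to the parent was coloured when the parent was visited.
        used = colour[frozenset((v, p))] if p is not None else None
        c = 0
        for w in adj[v]:
            if w == p:
                continue
            if c == used:  # skip the colour of the parent edge
                c += 1
            colour[frozenset((v, w))] = c
            c += 1
    return colour, d


def find_centroid(adj, root):
    """Return a node whose removal leaves components of at most n/2 nodes."""
    n = len(adj)
    order, parent = bfs_order(adj, root)
    size = {v: 1 for v in adj}
    for v in reversed(order):  # accumulate subtree sizes bottom-up
        if parent[v] is not None:
            size[parent[v]] += size[v]
    for v in order:
        pieces = [size[w] for w in adj[v] if w != parent[v]]
        if parent[v] is not None:
            pieces.append(n - size[v])  # the component containing the parent
        if all(2 * s <= n for s in pieces):
            return v
    return None  # not reached for a non-empty tree
```

Both routines visit every edge a constant number of times, which matches the linear-time claims of the lemmata.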
3 On-Line Heap Construction
We present an algorithm that heap orders a rooted tree T of height h and maximum degree d. The algorithm completes the heap construction in 2dh routing steps and its correctness is proved based on potential function arguments. A fine-tuned version of the algorithm saves h additional steps, thus completing the routing in (2d − 1)h routing steps.

3.1 The Heap Construction Algorithm

The algorithm proceeds in cycles of length d. At alternate cycles, alternate-level edges become potentially active. During any cycle, each of the potentially active edges that are incident to any node becomes active exactly once, each at a different step. So, at each time step, the set of active edges forms a matching and exchanges that bring the smallest keys closer to the root take place over these active edges. The algorithm in detail is given in Fig. 1.
Algorithm Heap(T)
/* T is a rooted tree of height h and maximum degree d. */
(1) For each node v of T, label the ch(v) edges incident to v which connect it to its children with distinct labels in {0, ..., ch(v) − 1}.
(2) cycle = 1; t = 1
(3) While cycle ≤ 2h do
    (a) During odd cycles, let the set of potentially active edges consist of all level-(h − i) edges, for all odd integers i in {1, ..., h}, while, during even cycles, let the set of potentially active edges consist of all level-(h − i) edges, for all even integers i in {1, ..., h}.
        /* Note that during the first cycle level-(h − 1) edges become active. */
    (b) While t ≤ d · cycle do
        Out of all potentially active edges, let the edges with label congruent to t mod d be active.
        For all active edges, if the exchange of the keys at the endpoints of the edge results in getting the lower key closer to the root, perform the exchange; otherwise, do nothing.
        t = t + 1
    (c) cycle = cycle + 1
Fig. 1. Algorithm Heap.
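Algorithm Heap can be simulated sequentially to watch the keys settle. The sketch below follows the steps of Fig. 1 under some assumptions that are not in the paper: the tree is given as a `parent` array with `parent[0] = -1` and `parent[v] < v`, each edge is identified by its lower endpoint, and the names `heap_construct` and `is_heap_ordered` are made up for illustration. Since the active edges of a step form a matching, processing them one after another gives the same result as the parallel step.

```python
def heap_construct(parent, keys):
    """Simulate Algorithm Heap; return the keys after 2*d*h steps.

    parent[v] is the parent of node v, node 0 is the root (parent -1),
    and parent[v] < v is assumed so depths can be computed in one pass.
    """
    n = len(parent)
    children = [[] for _ in range(n)]
    depth = [0] * n
    for v in range(1, n):
        children[parent[v]].append(v)
        depth[v] = depth[parent[v]] + 1
    h = max(depth)
    # Maximum degree d of the tree (children plus the parent edge).
    d = max(len(children[v]) + (0 if v == 0 else 1) for v in range(n))
    # Step (1): label the child edges of each node with 0..ch(v)-1.
    label = {}
    for v in range(n):
        for j, c in enumerate(children[v]):
            label[c] = j  # edge (v, c) is identified by its child c
    keys = list(keys)
    t = 1
    for cycle in range(1, 2 * h + 1):
        while t <= d * cycle:
            for c in range(1, n):
                level = depth[c] - 1  # level of edge (parent[c], c)
                # Level-(h - i) edges are potentially active when i has
                # the same parity as the cycle number.
                if (h - level) % 2 == cycle % 2 and label[c] == t % d:
                    p = parent[c]
                    if keys[c] < keys[p]:  # move the lower key upwards
                        keys[p], keys[c] = keys[c], keys[p]
            t += 1
    return keys


def is_heap_ordered(parent, keys):
    """Check that no node holds a key larger than any of its children."""
    return all(keys[parent[v]] <= keys[v] for v in range(1, len(parent)))
```

For example, on the path with `parent = [-1, 0, 1, 2]` and keys `[4, 3, 2, 1]`, the simulation ends with the keys in increasing order from the root downwards.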
3.2 Analysis of Algorithm Heap
In [9] an off-line heap construction algorithm was given. Algorithm Heap is similar to the off-line algorithm of [9] except that it is on-line. However, it is possible that the heap ordering given by Algorithm Heap will be different from that given by the algorithm of [9] when they are run on the same problem instance.

From the statement of the algorithm, it is obvious that it terminates after exactly 2dh routing steps. Thus, we need to show that within 2dh routing steps the tree has been heap ordered. By using the techniques of [9] we develop an analysis of Algorithm Heap. The analysis relies on the combination of colouring and potential function arguments.

We choose an arbitrary packet p and colour each packet of T white or black using a simple scheme that depends on p. Each packet of T that has a larger key than the key of p is coloured black. All other packets (including p) are coloured white. By proving statements about the movement of these white and black packets a bound can be placed on the time that it takes for p to reach its final position.

Consider the path M(p, t) from the root to p for any step t. It consists of alternating blocks of black and white packets. We shall refer to these as black blocks and white blocks. In any block we refer to the packet which is closest to the root as the first packet of the block. In any block the packet which is furthest from the root is referred to as the last packet of the block.
Let the blocks be labelled as shown in Fig. 2. As can be seen the blocks are labelled in order, starting at the root and moving down M(p, t). g(t) denotes the number of black blocks in M(p, t). The white blocks are labelled $W_i^t$, starting with block $W_0^t$ (which can be empty) which is closest to the root and continuing down to $W_{g(t)}^t$ which is furthest from the root and contains packet p. The black blocks are labelled $B_i^t$, starting with block $B_1^t$ which is closest to the root and continuing down to $B_{g(t)}^t$ which is furthest from the root. In the analysis we use the notation |X| to refer to the number of packets in block X.

As mentioned previously, the algorithm proceeds in cycles of d steps. Consider the path M(p, t) from the node of p to the root at the end of step t. As time proceeds black packets move down the path or off the path until eventually at some time t' (t' ≥ t) the path from the node of p to the root contains only white packets. If all packets on M(p, t') are white then all packets on M(p, t') (including p) are in the block $W_0^{t'}$. The aim of the analysis is to bound the time that it takes for this to happen and use this to bound the running time of Algorithm Heap(T).
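[Figure: the path from the root down to packet P, divided into the consecutive blocks $W_0^t$, $B_1^t$, $W_1^t$, ..., $W_{g(t)}^t$.]

Fig. 2. The path M(p, t) for an arbitrary packet p. The packets with keys greater than the key of p are coloured black. All other packets are coloured white. Block $W_0^t$ can be empty.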
Lemma 3. Consider the path M(p, t). Suppose that M(p, t + 1) contains a black packet x that was not in M(p, t). Then during step t + 1, x swapped with a black packet that was on M(p, t).

Lemma 4. Let p be an arbitrary packet in T routed by Algorithm Heap(T). Consider time step t = kd (k ≥ 1) (the end of the last step of cycle k) and an arbitrary black block $B_i^t$ (1 ≤ i ≤ g(t)). Let L denote the node that contains the last packet of block $B_i^t$ (L does not vary when we refer to different time steps). It holds that at the end of cycle k + 1 (i.e. at the end of step t + d), node L contains a white packet.

Lemma 5. Let p be an arbitrary packet in T. Consider time step t = kd (k ≥ 1) (the end of the last step of cycle k). Assume that the number of black blocks at the beginning of cycle k + 1 is the same as the number at the end (g(t) = g(t + d)). Then at least one of the following holds:
- The number of black packets in M(p, t) is greater than the number of black packets in M(p, t + d), or
- $|W_0^t| + 1 \le |W_0^{t+d}|$.

Lemma 6. Let p be an arbitrary packet in T. Consider time step t = kd (k ≥ 1) (the end of the last step of cycle k). Assume that g(t + d) = g(t) − i (i > 0). Then at least one of the following holds true:
- There are at least i + 1 more black packets present in M(p, t) than in M(p, t + d), or
- there are i more black packets present in M(p, t) than in M(p, t + d) and $|W_0^{t+d}| \ge |W_0^t| + 1$.

Theorem 1. Consider an n-node tree T with maximum degree d and height h. Algorithm Heap(T) heap orders T within 2dh steps.

Proof. The proof is achieved using a potential function argument. Let p be an arbitrary packet and let all packets be coloured relative to p. We define a potential function $\Phi(p, t)$ which gives the potential of a packet p at the end of step t of the routing. The magnitude of $\Phi(p, t)$ corresponds to how well sorted the path M(p, t) is, relative to packet p. By proving that $\Phi(p, t)$ is strictly decreasing we are able to place a bound on the running time of Algorithm Heap(T).

We begin by defining a quantity $m_p$ which is used to define $\Phi(p, t)$. Consider the first time step t' for which M(p, t') consists only of white packets. $m_p$ is defined as the number of white packets (including p) on M(p, t'). Note that $m_p = |W_0^{t'}|$. Given this definition of $m_p$, $\Phi(p, t)$ is defined by the following equation:

$$\Phi(p, t) = \max(m_p, |W_0^t|) - |W_0^t| + \sum_{i=1}^{g(t)} |B_i^t| - g(t)$$

Note that if g(t) = 0 then there are no black blocks and so the term $\sum_{i=1}^{g(t)} |B_i^t|$ is zero. We prove (see [8]) that each of the following claims holds for any packet p.

(1) $\Phi(p, t) \ge 0$ for t ≥ 0
(2) If $\Phi(p, t) = 0$ (t ≥ 0) then $\Phi(p, t + 1) = 0$
(3) If t = kd and $\Phi(p, t) > 0$ (k ≥ 1) then $\Phi(p, t + d) \le \Phi(p, t) - 1$
(4) $\Phi(p, d) \le 2h - 1$
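To make the potential function concrete, the following sketch evaluates it for a given colouring of the path M(p, t). It is only an illustration under assumptions: the path is passed as a list of 'W'/'B' colours from the root down to p, the quantity m_p is supplied by the caller (in the proof it is determined by a later time step), and the function name `potential` is hypothetical.

```python
def potential(colours, m_p):
    """Evaluate max(m_p, |W_0|) - |W_0| + sum_i |B_i| - g for one path.

    colours lists the packets on M(p, t) from the root down to p,
    using 'W' for white and 'B' for black; p itself is the last entry.
    """
    # |W_0|: length of the (possibly empty) white block at the root end.
    w0 = 0
    while w0 < len(colours) and colours[w0] == 'W':
        w0 += 1
    # Total number of black packets, i.e. the sum of the |B_i|.
    blacks = sum(1 for c in colours if c == 'B')
    # g: number of maximal runs of consecutive black packets.
    g = sum(
        1
        for i, c in enumerate(colours)
        if c == 'B' and (i == 0 or colours[i - 1] != 'B')
    )
    return max(m_p, w0) - w0 + blacks - g
```

For the path W B B W B W with m_p = 4, the white block at the root has size 1, there are 3 black packets in 2 black blocks, and the potential is (4 − 1) + 3 − 2 = 4; for an all-white path of length m_p the potential is 0.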
These claims allow us to make the following deductions. By (4) each packet has a potential of at most 2h − 1 at the end of the first cycle. By (3) the potential of each packet drops by at least 1 for every cycle of routing beyond the first. By (2) the potential remains at zero once it has reached zero for the first time. By (1) the potential cannot drop below zero. Hence, using all of these claims, it holds that the potential of all packets will be zero after at most 2h cycles, and will remain at zero. The proof of the theorem is completed by showing that when the potential of all packets is 0 the tree is heap ordered.

In conclusion, using all of the above claims it follows that for any packet p, $\Phi(p, 2dh) = 0$. Hence, in the colouring induced by packet p, all packets above p on M(p, 2dh) must be white. This means that for any packet p, all packets above p on M(p, 2dh) have keys less than or equal to the key of p. We conclude that Algorithm Heap(T) heap orders T within 2dh steps.

Observe that in a rooted tree of maximum degree d, the only node which can have d children is the root of the tree. All other nodes can have at most d − 1 children. Thus, in our algorithm, only the cycles where the edges incident to the root, i.e., level-0 edges, are active need to last for d steps. All other cycles need to last at most d − 1 steps. Thus, we can save one step for every alternate cycle, for a total savings of h steps. Thus, Algorithm Heap can be fine-tuned such that it heap orders any tree within 2dh − h = (2d − 1)h steps.

Theorem 2. Any rooted tree of height h and maximum degree d can be heap ordered in (2d − 1)h routing steps by a fine-tuned version of Algorithm Heap.

Even though Algorithm Heap terminates after exactly 2dh steps, for several distributions of the keys, the tree is heap ordered several steps before the termination of the algorithm. Thus, it is tempting to stop the execution of the algorithm earlier. The next theorem shows that this is not always possible. Its proof is based on an instance of the heap ordering problem for which Algorithm Heap completes the heap construction after 2d(h − 1) routing steps.

Theorem 3. There exist trees T and initial distributions of keys such that Algorithm Heap(T) requires at least 2d(h − 1) steps to complete the heap ordering.
4 On-Line Tree Routing
Routing on an arbitrary tree T is performed in a recursive way. We designate a node r of T to be its root, and thus turn T into a rooted tree. Then, we move all packets to their destination subtrees rooted at children of r, but not necessarily to their correct destination node. Lemma 2 guarantees that no subtree rooted at a child of r has more than n/2 nodes. Then, we complete the routing recursively by routing one permutation in each subtree. Note that the routing within all subtrees can proceed in parallel. The detailed algorithm is given in Fig. 3.

The algorithm given in Fig. 4 moves packets across the root into their destination sub-trees. It is used by Algorithm TreeRoute(T) to achieve the routing for each level of recursion. It assumes that each subtree is heap ordered with respect to the keys assigned to the packets by step (2)(a) of Algorithm TreeRoute.

Lemma 7. Algorithm Transfer(T) routes all packets to their destination subtrees within (d − 1)|T| steps.
Algorithm TreeRoute(T)
/* T is an arbitrary tree of maximum degree d */
(1) Turn T into a rooted tree by choosing a node r to be its root such that each subtree rooted at each child of r has at most n/2 nodes (Lemma 2). Denote by $T_j$ (0 ≤ j < ch(r)) the subtree rooted at the j-th child of r.
(2) For all subtrees $T_j$ (0 ≤ j < ch(r)) in parallel, do
    (a) Set the keys of those packets in $T_j$ that have destinations outside of $T_j$ to 0. Set the keys of all other packets in $T_j$ to 1.
    (b) Run Algorithm Heap($T_j$)
(3) Set the key of the packet at r to 0
(4) Run Algorithm Transfer(T)
(5) For each $T_j$ (0 ≤ j < ch(r)) in parallel, run Algorithm TreeRoute($T_j$)
Fig. 3. Algorithm TreeRoute.
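Steps (2)(a) and (3) of Algorithm TreeRoute only assign 0/1 keys, so they are easy to sketch in isolation. The code below is an illustrative reading of those two steps under assumptions not in the paper: the rooted tree is given as a `children` dictionary, `destination[v]` is the destination node of the packet currently at node v, and the names `subtree_nodes` and `assign_keys` are hypothetical.

```python
def subtree_nodes(children, c):
    """Collect the nodes of the subtree rooted at c."""
    stack, nodes = [c], set()
    while stack:
        v = stack.pop()
        nodes.add(v)
        stack.extend(children.get(v, []))
    return nodes


def assign_keys(children, root, destination):
    """Return the 0/1 key of the packet at each node.

    A packet inside a subtree T_j gets key 0 if its destination lies
    outside T_j and key 1 otherwise (step (2)(a)); the packet at the
    root gets key 0 (step (3)).
    """
    keys = {root: 0}
    for c in children.get(root, []):
        inside = subtree_nodes(children, c)
        for v in inside:
            keys[v] = 1 if destination[v] in inside else 0
    return keys
```

With these keys, heap ordering each subtree brings the packets that must leave it (key 0) towards the root of that subtree, which is the precondition Algorithm Transfer relies on.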
Algorithm Transfer(T)
/* T is a rooted tree with root r and maximum degree d */
(1) Label the edges of T with labels in {0, ..., d − 1} so that no two adjacent edges have the same label (Lemma 1).
(2) t = 1
(3) Mark the packet that is destined for the root r with a *
(4) While t ≤ (d − 1)|T| + 1 do /* step t of Algorithm Transfer(T) */
    (a) Make all edges labelled with numbers congruent to t mod d active.
    (b) For each active edge in parallel, do:
        Let e denote the active edge incident on the root r. Let p denote the packet at r and let q denote the packet at the other end-point of e.
        - If p and q must both cross e to advance towards their destinations then set the key of p to 1 and exchange p with q.
        - If p is marked with a * (i.e. destined for r) and q must pass through r to reach its destination, then exchange p with q.
        - Otherwise, do nothing.
        For all other active edges,
        - if a packet that has key 0 can be moved towards the root by exchanging with a packet that has key 1, or with the packet that is marked with a *, then do it.
        - If the packet that is marked with a * can be moved towards the root by exchanging with a packet that has key 1, then do it.
        - Otherwise, do nothing.
    (c) t = t + 1
Fig. 4. Algorithm Transfer.
Theorem 4. Algorithm TreeRoute routes any permutation on an n-node tree T of maximum degree d within 4dn steps.

Proof. The algorithm is recursive. When it is run on a sub-tree T' it partitions T' into sub-trees of size at most |T'|/2. Let $T'_i$ denote the largest sub-tree that the algorithm runs on in the i-th level of recursion ($T'_0 = T$). We know that $|T'_i| \le n/2^i$ and so there cannot be more than log n levels of recursion. Accordingly, log n 0 n 2+ +1
References

1. S. Akl. Parallel Sorting Algorithms. Academic Press, 1985.
2. Alon, Chung, and Graham. Routing permutations on graphs via matchings. SIAM Journal on Discrete Mathematics, 7:513-530, 1994.
3. N. Haberman. Parallel neighbor-sort (or the glory of the induction principle). Technical Report AD-759 248, National Technical Information Service, US Department of Commerce, 5285 Port Royal Road, Springfield VA 22151, 1972.
4. D. Krizanc and L. Zhang. Many-to-one packet routing via matchings. In Proceedings of the Third Annual International Computing and Combinatorics Conference, Shanghai, China, August 1997. To appear.
5. F. T. Leighton. Introduction to Parallel Algorithms and Architectures: Arrays - Trees - Hypercubes. Morgan Kaufmann, San Mateo, CA 94403, 1991.
6. G. Pantziou, A. Roberts, and A. Symvonis. Dynamic tree routing under the "matching with consumption" model. In T. Asano, Y. Igarashi, H. Nagamochi, S. Miyano, and S. Suri, editors, Proceedings of the 7th International Symposium on Algorithms and Computation ISAAC '96 (Osaka, Japan, December 1996), LNCS 1178, pages 275-284. Springer-Verlag, 1996.
7. G. Pantziou, A. Roberts, and A. Symvonis. Many-to-many routing on trees via matchings. Technical Report TR-507, Basser Dept of Computer Science, University of Sydney, July 1996. To appear in Theoretical Computer Science. Available from ftp://ftp.cs.su.oz.au/pub/tr/TR96 507.ps.Z.
8. A. Roberts and A. Symvonis. On Deflection Worm Routing on Meshes. Technical Report TR-514, Basser Dept of Computer Science, University of Sydney, May 1997. Available from ftp://ftp.cs.su.oz.au/pub/tr/TR97 514.ps.Z.
9. A. Roberts, A. Symvonis, and L. Zhang. Routing on trees via matchings. In Proceedings of the Fourth Workshop on Algorithms and Data Structures (WADS'95), Kingston, Ontario, Canada, pages 251-262. Springer-Verlag, LNCS 955, Aug 1995. Also TR 494, January 1995, Basser Dept of Computer Science, University of Sydney. Available from ftp://ftp.cs.su.oz.au/pub/tr/TR95 494.ps.Z.
10. Louxin Zhang. Optimal bounds for matching routing on trees. In Proceedings of the Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 445-453, New Orleans, Louisiana, 5-7 January 1997.
Analyzing Glauber Dynamics by Comparison of Markov Chains*

Dana Randall(1) and Prasad Tetali(2)

(1) School of Mathematics and College of Computing, Georgia Institute of Technology, Atlanta GA 30332, USA
(2) School of Mathematics, Georgia Institute of Technology, Atlanta GA 30332, USA
{randall,tetali}@math.gatech.edu
Abstract. A popular technique for studying random properties of a combinatorial set is to design a Markov chain Monte Carlo algorithm. For many problems there are natural Markov chains connecting the set of allowable configurations which are based on local moves, or "Glauber dynamics." Typically these single site update algorithms are difficult to analyze, so often the Markov chain is modified to update several sites simultaneously. Recently there has been progress in analyzing these more complicated algorithms for several important combinatorial problems. In this work we use the comparison technique of Diaconis and Saloff-Coste to show that several of the natural single point update algorithms are efficient. The strategy is to relate the mixing rate of these algorithms to the corresponding non-local algorithms which have already been analyzed. This allows us to give polynomial bounds for single point update algorithms for problems such as generating tilings, colorings and independent sets.
1 Introduction
Random sampling of combinatorial structures such as lozenge tilings of a triangular lattice, Eulerian orientations and 3-colorings of planar lattices, and independent sets of a graph has attracted the attention of researchers in combinatorics, theoretical computer science and statistical physics in recent years. The Markov chain Monte-Carlo method has played a crucial role in establishing efficient algorithms for almost uniform sampling of such structures and in yielding fully polynomial randomized approximation schemes for the corresponding counting problem (see e.g. [8],[13],[15],[10]).
First we consider planar tiling problems, namely lozenge tilings on regions of the triangular lattice and domino tilings on the Cartesian lattice. For each problem the tiles cover two adjacent cells in the lattice and the tiling is a way of covering the region so that each cell is covered by exactly one tile. The set of tilings (or matchings of the dual lattices) corresponds to dimer configurations in
* Research supported by NSF Grants No. CCR-9703206 and CCR-9503952.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 292-304, 1998. © Springer-Verlag Berlin Heidelberg 1998
statistical physics. In addition, lozenge tilings are well known to correspond to plane partitions which have combinatorial interest.
The second problem is three-coloring subregions of the Cartesian lattice. A proper three-coloring assigns a color to each vertex in the region so that neighbors are differently colored. Again the primary motivation for the generation problem comes from statistical physics where colorings relate to the ice model and the dominant coefficient of the partition function of the three-state Potts model [2].
The final problem we consider is that of generating a random independent set of a graph. The weighted version, namely sampling the independent sets according to a Gibbs measure, was motivated by the interest in estimating the partition function of the hard-core distribution (see e.g. [3]).
Each of these generation problems has natural Markov chain algorithms based on single site updates, also known as Glauber dynamics. For example an obvious algorithm for generating a random independent set can be described as follows. The state space is the set of all (valid) independent sets of the given graph. Starting with an arbitrary independent set of the graph, at each step pick a vertex uniformly at random. If it is contained in the independent set then remove it with a certain probability, and otherwise add it to the independent set, also with an appropriate probability, if this new set remains an independent set. The probabilities are chosen carefully so that the stationary distribution of the Markov chain is exactly the distribution we would like to sample from.
In this work we analyze the single site update/replacement algorithms for the above problems, and we provide low-degree polynomial bounds for the rates of mixing of the corresponding Markov chains. Our analysis makes use of the results on the chains using the fancier moves inspired by heat bath algorithms: the tower moves in the case of tilings and the edge moves in the case of independent sets.
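The single site update for independent sets described above can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes the hard-core weights $\lambda^{|S|}$ with removal probability $1/(1+\lambda)$ and insertion probability $\lambda/(1+\lambda)$ (the values made precise in Sect. 3.3), and the names `glauber_step`, `is_independent` and `run_chain`, as well as the adjacency-dictionary input format, are hypothetical.

```python
import random

def glauber_step(adj, current, lam, rng):
    """One single site update on an independent set of the graph `adj`.

    adj: dict mapping each vertex to the set of its neighbours (assumed format).
    current: frozenset holding the current independent set.
    """
    u = rng.choice(sorted(adj))            # pick a vertex uniformly at random
    if u in current:
        # remove u with probability 1 / (1 + lam)
        if rng.random() < 1.0 / (1.0 + lam):
            return current - {u}
        return current
    # add u with probability lam / (1 + lam), if the set stays independent
    if all(v not in current for v in adj[u]) and rng.random() < lam / (1.0 + lam):
        return current | {u}
    return current

def is_independent(adj, s):
    """True if no two vertices of s are adjacent."""
    return all(v not in s for u in s for v in adj[u])

def run_chain(adj, lam, steps, seed=0):
    """Run the chain from the empty independent set for `steps` updates."""
    rng = random.Random(seed)
    state = frozenset()
    for _ in range(steps):
        state = glauber_step(adj, state, lam, rng)
    return state
```

Every state visited is an independent set by construction; how many steps are needed before the state is close to the Gibbs distribution is exactly the mixing-time question studied in the paper.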
The proof technique uses a comparison theorem due to Diaconis and Saloff-Coste [5]. Their theorem yields a geometric comparison inequality that gives bounds on the eigenvalues of a reversible Markov chain in terms of the eigenvalues of a second chain. The main application in [5] was to give a sharp upper bound on the second eigenvalue of the symmetric exclusion process on a graph. The symmetric exclusion process on a graph is a certain generalization of a simple random walk on a graph and can be described as follows. Start with an arbitrary placement of r particles on r vertices of a graph. At each discrete time step, a particle is chosen at random, and then one of its neighboring vertices is chosen at random. If the neighbor is unoccupied (by a particle) then the chosen particle is moved there, otherwise the system stays as it was. The special case of r = 1 corresponds to the simple random walk on a graph. Diaconis and Saloff-Coste bound the second eigenvalue of this chain by comparing it to a well studied chain (the Bernoulli-Laplace model for diffusion) whose eigenvalues are known.
Our approach is somewhat different in this paper, since the Markov chains we work with are much more combinatorial in nature; in particular, it is very hard to determine the second eigenvalue of these chains or of related chains with the same stationary distributions. However, the known chains in our applications are
chains whose mixing times are known. Using the existing literature on relating the time to reach equilibrium and the second eigenvalue (e.g. [4],[8]), together with the comparison theorem, we derive an inequality relating the mixing times of two chains. This allows us to estimate the rate of mixing of the unknown chains (based on single site updates) mentioned above. Direct analysis of any of these chains seems challenging and might yield tighter bounds on the mixing times.
In Sect. 2 we describe relevant results from the theory of rapidly mixing Markov chains, including relations between mixing times and eigenvalues, and comparison inequalities. In Sect. 3 we describe in detail the application to tilings, colorings and independent sets.
2 Preliminaries
Let $(\Omega, P, \pi)$ denote an ergodic (i.e. irreducible and aperiodic) Markov chain with finite state space $\Omega$, transition probability matrix $P$, and stationary distribution $\pi$. Furthermore, we assume that the chain is reversible, i.e. that we have the detailed balance conditions, $\pi(x)P(x,y) = \pi(y)P(y,x)$, for all $x, y \in \Omega$. Assuming we are dealing with discrete-time Markov chains, for $x, y \in \Omega$ and $t \in \mathbb{Z}^+$, let $P^t(x,y)$ denote the $t$-step probability of going from $x$ to $y$. Then the time a Markov chain takes to be close to equilibrium can be measured using the variation distance between $P^t$ and $\pi$, where the variation distance is given by
$$\Delta_x(t) = \frac{1}{2}\sum_{y\in\Omega}\left|P^t(x,y) - \pi(y)\right|.$$
We also denote by $\Delta(t)$ the variation distance starting from the worst state, i.e. $\Delta(t) = \max_{x\in\Omega}\Delta_x(t)$.
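For a chain small enough to write down explicitly, the variation distance defined above can be evaluated directly. The following sketch assumes the transition matrix is given as a list of rows and $\pi$ as a list; the function names are hypothetical and the two-state chain in the usage note is only an illustration.

```python
def t_step_rows(P, t):
    """Rows of P^t, obtained by t successive vector-matrix products per row."""
    n = len(P)
    rows = [[1.0 if y == x else 0.0 for y in range(n)] for x in range(n)]
    for _ in range(t):
        rows = [[sum(r[k] * P[k][y] for k in range(n)) for y in range(n)]
                for r in rows]
    return rows

def variation_distance_from(P, pi, x, t):
    """Delta_x(t) = (1/2) * sum_y |P^t(x, y) - pi(y)|."""
    row = t_step_rows(P, t)[x]
    return 0.5 * sum(abs(row[y] - pi[y]) for y in range(len(pi)))

def worst_case_variation_distance(P, pi, t):
    """Delta(t) = max over starting states x of Delta_x(t)."""
    return max(variation_distance_from(P, pi, x, t) for x in range(len(P)))
```

For the symmetric two-state chain with `P = [[0.75, 0.25], [0.25, 0.75]]` and `pi = [0.5, 0.5]`, the distance halves at every step, starting from 0.5 at t = 0.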
Mixing Time and the Second Eigenvalue. For $\epsilon > 0$, the mixing time, starting from state $x$, is defined by
$$\tau_x(\epsilon) = \min\{t : \Delta_x(t') \le \epsilon \ \ \forall\, t' \ge t\}.$$
Once again we denote by $\tau(\epsilon)$ the mixing time starting from the worst state, i.e. $\tau(\epsilon) = \max_{x\in\Omega}\tau_x(\epsilon)$. For the rest of the paper, when we refer to mixing time, we always mean $\tau(\epsilon)$.
Let $1 = \lambda_0 > \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_{|\Omega|-1} > -1$ denote the eigenvalues of $P$. The following result of Welsh [22] (which is an extension of a key result from [16]; see also [4], [17]) shows the relationship between mixing times and the maximum eigenvalues. Strictly speaking, $\lambda_1$ in the following theorem should be replaced by $\lambda_{\max} = \max(\lambda_1, |\lambda_{|\Omega|-1}|)$, but in all our applications below we make sure that $\lambda_1 > \lambda_{|\Omega|-1} > 0$ by adding self-loops with weight 1/2.
Theorem 1. For $\epsilon > 0$, we have
(i) $\tau_x(\epsilon) \le \dfrac{1}{1-\lambda_1}\log\dfrac{1}{\pi(x)\epsilon}$ for all $x \in \Omega$;
(ii) $\max_{x}\tau_x(\epsilon) \ge \dfrac{\lambda_1}{2(1-\lambda_1)}\log\dfrac{1}{2\epsilon}$.
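As a sanity check, both parts of Theorem 1 can be compared with the exact mixing time of a two-state chain, whose second eigenvalue is known in closed form. This sketch assumes the chain $P(0,1) = p$, $P(1,0) = q$ with $p + q < 1$ (so that $\lambda_1 = 1 - p - q > 0$) and natural logarithms; both choices, and the function names, are assumptions made for illustration only.

```python
import math

def two_state_mixing_time(p, q, x, eps):
    """tau_x(eps) for the chain with P(0,1) = p and P(1,0) = q, by iteration.

    The variation distance to stationarity never increases with t, so the
    first time it drops to eps or below is the mixing time.
    """
    pi = (q / (p + q), p / (p + q))
    dist = [1.0, 0.0] if x == 0 else [0.0, 1.0]
    t = 0
    while 0.5 * (abs(dist[0] - pi[0]) + abs(dist[1] - pi[1])) > eps:
        dist = [dist[0] * (1 - p) + dist[1] * q,
                dist[0] * p + dist[1] * (1 - q)]
        t += 1
    return t

def theorem1_bounds(p, q, x, eps):
    """Lower bound of part (ii) and upper bound of part (i) for this chain."""
    lam1 = 1.0 - p - q                      # second eigenvalue of the chain
    pi = (q / (p + q), p / (p + q))
    upper = math.log(1.0 / (pi[x] * eps)) / (1.0 - lam1)
    lower = lam1 / (2.0 * (1.0 - lam1)) * math.log(1.0 / (2.0 * eps))
    return lower, upper
```

With p = q = 0.25 and eps = 0.01 the exact mixing time is 6 steps, which lies between the two bounds (roughly 1.96 and 10.6).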
Mixing Time and Coupling Time. Another method for bounding the mixing time $\tau(\epsilon)$ is to construct a coupling for the Markov chain. A coupling is a new Markov chain on the state space $\Omega \times \Omega$ (where $\Omega$ is the original state space, e.g., the set of 3-colorings) with the following properties. Rather than updating two configurations independently, the coupled process correlates the random coin flips while maintaining that each configuration, when observed in isolation, is just performing transitions of the original Markov chain. In addition, we need that if the two configurations agree, the coupled process will force them to agree at all future times. Coupling is a crucial ingredient in all of the applications in Sect. 3.
The following theorem states that the coupling time, which is the expected time it takes for two configurations to meet starting from the worst starting point, provides a good bound on the mixing time. More formally, let $x$ and $y$ be the starting configurations. Then
$$T^{x,y} = \min\{t : X_t = Y_t \mid X_0 = x,\ Y_0 = y\},$$
and define the coupling time to be $T = \max_{x,y} E\,T^{x,y}$. The following result relates the mixing time and the coupling time (see [1]).
Theorem 2. $\tau(\epsilon) \le 6T(1 + \ln \epsilon^{-1})$.
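To make the definition of a coupling concrete, here is a small sketch for the single site chain on independent sets from the Introduction. Both copies use the same random vertex and the same uniform number, so each copy alone follows the original chain, and once the copies agree they agree forever. This "identity" coupling is chosen only for illustration; it is not claimed to be the coupling analyzed in the cited papers, and all names below are hypothetical.

```python
import random

def update(adj, s, u, r, lam):
    """Apply the single site update at vertex u, driven by the uniform number r."""
    if u in s:
        return s - {u} if r < 1.0 / (1.0 + lam) else s
    if r < lam / (1.0 + lam) and all(v not in s for v in adj[u]):
        return s | {u}
    return s

def coupled_step(adj, x, y, lam, rng):
    """Move both copies with shared randomness (same vertex, same coin)."""
    u = rng.choice(sorted(adj))
    r = rng.random()
    return update(adj, x, u, r, lam), update(adj, y, u, r, lam)

def coupling_time(adj, x, y, lam, seed=0, max_steps=100_000):
    """First time the two copies meet, or None if they do not meet in time."""
    rng = random.Random(seed)
    x, y = frozenset(x), frozenset(y)
    for t in range(max_steps + 1):
        if x == y:
            return t
        x, y = coupled_step(adj, x, y, lam, rng)
    return None
```

Averaging `coupling_time` over many seeds and maximizing over starting pairs gives an empirical estimate of $T$, which Theorem 2 converts into a bound on the mixing time.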
Comparison of Eigenvalues (via Dirichlet Forms). Let $\tilde P$ and $P$ denote two reversible Markov chains on the same state space $\Omega$ with the same stationary distribution $\pi$. Then Diaconis and Saloff-Coste (see [5]) provide the following geometric bound between the two eigenvalues $\lambda_1(P)$ and $\lambda_1(\tilde P)$. (Strictly speaking, the result in [5] compares the Dirichlet forms associated with $P$ and $\tilde P$, thus yielding the following comparison result between all nontrivial eigenvalues, and not just the second eigenvalue.)
First we need some more notation. As we shall see, in applications, $\tilde P$ is the chain with known eigenvalues (or known mixing time), and $P$ is the chain whose mixing time we would like to bound by comparing with $\tilde P$. Let $E(P) = \{(x,y) : P(x,y) > 0\}$ and $E(\tilde P) = \{(x,y) : \tilde P(x,y) > 0\}$ denote the sets of edges of the two chains, viewed as directed graphs. For each $(x,y)$ with $\tilde P(x,y) > 0$, define a path $\gamma_{xy}$ using a fixed sequence of states, $x_0 = x, x_1, \ldots, x_{k-1}, x_k = y$ with $P(x_i, x_{i+1}) > 0$. The length $(= k)$ of such a path will be denoted by $|\gamma_{xy}|$. Further let
$$\Gamma(z,w) = \{(x,y) \in E(\tilde P) \text{ such that } (z,w) \in \gamma_{xy}\}$$
denote the set of paths which use the transition $(z,w)$.
Theorem 3. With the above notation, we have
$$(1 - \lambda_1(P)) \ge \frac{1}{A}\,(1 - \lambda_1(\tilde P)),$$
where
$$A = \max_{(z,w)\in E(P)}\left\{\frac{1}{\pi(z)P(z,w)}\sum_{\Gamma(z,w)}|\gamma_{xy}|\,\pi(x)\tilde P(x,y)\right\}.$$
It is worth noting that the quantity $A$ above depends on our choice of paths $\gamma_{xy}$; thus these paths play a role akin to that of the canonical paths in bounding the conductance of a Markov chain. However, the crucial difference, as pointed out in [5], is that we need only define these paths between pairs of states which are adjacent in the known chain.
Our strategy is as follows. We begin with a bound on the mixing time of a chain established, say, via the coupling method and Thm. 2. We then use part (ii) of Thm. 1 above to lower bound the spectral gap of such a chain. Next we use the comparison theorem (Thm. 3) to lower bound the spectral gap of an unknown chain by carefully bounding the parameter $A$. This in turn provides us with a bound on the mixing time of the unknown chain in view of part (i) of Thm. 1. The following technical proposition makes precise the aforementioned strategy, and is thus crucial to our results of the next section.
Let $\tau(\epsilon)$ and $\tilde\tau(\epsilon)$ denote the mixing times of $P$ and $\tilde P$ respectively. Then with $A$ as in Thm. 3, we have the following comparison result relating the mixing times. Let $\pi_*$ denote $\min_{x\in\Omega}\pi(x)$.
Proposition 1. For $0 < \epsilon < 1$, and for all $x \in \Omega$, we have that
$$\tau(\epsilon) \le \frac{4\log(1/(\epsilon\pi_*))}{\log(1/(2\epsilon))}\,A\,\tilde\tau(\epsilon).$$
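Proof. For $0 < \epsilon < 1$, from part (ii) of Thm. 1, we have
$$\tilde\tau(\epsilon) \ge \frac{\lambda_1(\tilde P)}{2(1-\lambda_1(\tilde P))}\log\frac{1}{2\epsilon}.$$
This implies that
$$(1 - \lambda_1(\tilde P)) \ge \frac{1}{4\tilde\tau(\epsilon)}\log\frac{1}{2\epsilon},$$
wherein we also used the trivial bound $\lambda_1(\tilde P) \ge 1/2$. Now using the comparison theorem, we get that
$$1 - \lambda_1(P) \ge \frac{1}{A}\,(1 - \lambda_1(\tilde P)) \ge \frac{1}{4A\tilde\tau(\epsilon)}\log\frac{1}{2\epsilon}.$$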
Finally, using part (i) of Thm. 1 we can bound the mixing time of $P$, starting from any state $x$,
$$\tau_x(\epsilon) \le \frac{4\log(1/(\epsilon\pi(x)))}{\log(1/(2\epsilon))}\,A\,\tilde\tau(\epsilon),$$
completing the proof of the proposition.
Remark 1. The above proposition illustrates the fact that the comparison argument is effective as long as we can control the factor $A$, which depends on the choice of paths in the unknown chain. The dependence on $\pi_*$, albeit not as crucial, can affect the mixing time by another factor involving the size of the input, since $1/\pi_*$ in most cases is at most exponential in the size of the input.
3 Applications
3.1 Lozenge Tilings
Let $R$ be a region of the triangular lattice. A lozenge tiling of $R$ is a covering of the region with lozenge tiles, where each lozenge covers two adjacent cells in $R$ and no two lozenges overlap. Just looking at a lozenge tiling causes a three dimensional surface to appear; in fact the set of lozenge tilings corresponds bijectively with the surfaces formed by placing unit cubes in a larger three-dimensional frame such that each cube is supported on its back three sides. The shape of the frame is uniquely determined from the region $R$ (see Fig. 1).
Fig. 1. A lozenge tiling viewed as a surface
Given this equivalence, there is an obvious Markov chain $M_{ssu}$ for generating tilings. Namely, connect any two tilings whose surfaces differ by the addition or removal of a single cube. In the two dimensional picture of a tiling this corresponds to choosing a hexagonal window and if it is comprised of three tiles, rotate them by $60^\circ$ (see Fig. 2). More precisely, the transition matrix $P(\cdot,\cdot)$ of $M_{ssu}$ is defined to be
$$P(x,y) = \begin{cases} \dfrac{1}{2N} & \text{if } x \oplus y \text{ is a cube (or hexagon)},\\[4pt] 1 - \sum_{z\ne x}P(x,z) & \text{otherwise.}\end{cases}$$
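The cube picture suggests a compact way to experiment with the single site chain. The sketch below assumes the region is a hexagon, so that tilings correspond to stacks of cubes in an $a \times b \times c$ box, stored as an $a \times b$ array of column heights that is weakly decreasing along rows and columns. It proposes a cube addition or removal in a uniformly random column, which is a simplification: the proposal probabilities are not the $1/(2N)$ of the text, although the proposal is symmetric, so the uniform distribution is still stationary. All names are hypothetical.

```python
import random

def is_plane_partition(h, c):
    """Check heights lie in [0, c] and weakly decrease along rows and columns."""
    a, b = len(h), len(h[0])
    for i in range(a):
        for j in range(b):
            if not 0 <= h[i][j] <= c:
                return False
            if i + 1 < a and h[i + 1][j] > h[i][j]:
                return False
            if j + 1 < b and h[i][j + 1] > h[i][j]:
                return False
    return True

def cube_move(h, c, rng):
    """Try to add or remove one cube in a random column; undo if invalid."""
    a, b = len(h), len(h[0])
    i, j = rng.randrange(a), rng.randrange(b)
    delta = rng.choice((-1, 1))
    h[i][j] += delta
    if not is_plane_partition(h, c):
        h[i][j] -= delta               # reject: stay at the current tiling
    return h
```

Starting from the empty box (`h` all zeros) and calling `cube_move` repeatedly walks over the cube stacks of the box, one cube at a time.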
Fig. 2. A move in the Markov chain $M_{ssu}$
In [13] a modified algorithm based on tower moves was analyzed. Again the state space is the set of all lozenge tilings. Two tilings differ by a tower of height $k$ if they differ by the addition or removal of a $1 \times 1 \times k$ vertical column of cubes. Let $x$ and $y$ be lozenge tilings of the region $R$. Let $M_{loz}$ represent the Markov chain in which there is a move from $x$ to $y$ if and only if the symmetric difference of the edges of $x$ and $y$ is a tower. Recall that the transition probabilities $\tilde P(\cdot,\cdot)$ of $M_{loz}$ are defined by
$$\tilde P(x,y) = \begin{cases} \dfrac{1}{2Nh} & \text{if } x \oplus y \text{ is a tower of height } h,\\[4pt] 1 - \sum_{z\ne x}\tilde P(x,z) & \text{otherwise,}\end{cases}$$
where $N$ is the area of the region being tiled. Note that both $M_{ssu}$ and $M_{loz}$ have the uniform distribution as the stationary distribution. Wilson improves the analysis given in [13] to show that the Markov chain based on tower moves mixes in time $O(W^2 N \log N \log(1/\epsilon))$, where $W$ is the width and $N$ is the area of the triangular lattice region to be tiled [23]. However, neither approach gives the mixing time of $M_{ssu}$.
We now define a set of paths and then bound $A$ corresponding to these paths. For each $(x,y)$ which differ by a tower move of height $h_{xy}$ (and are thus adjacent in the known chain), there is a unique minimum length sequence of single site update moves of length $h_{xy}$ which transforms $x$ into $y$. Such a sequence defines a path $\gamma_{xy}$, in a natural way, using transitions of $P(\cdot,\cdot)$. Note that the length of the path is $h_{xy}$ and $\tilde P(x,y) = 1/(2Nh_{xy})$. Consider an arbitrary $(z,w)$ where $z$ and $w$ are lozenge tilings which differ by a single cube. Note that $P(z,w) = 1/(2N)$. Furthermore, for a given $(z,w)$, the number of $(x,y)$ such that the path $\gamma_{xy}$ uses $(z,w)$ is at most $H^2$, where $H$ is the maximum height of a tower. (This is because the bottom and the top of any tower containing a particular single site can each be chosen in at most $H$ ways.) The following bound for the quantity $A$ is then derived from the comparison theorem.
$$A = \max_{(z,w)\in E(P)}\left\{\frac{1}{\pi(z)P(z,w)}\sum_{\Gamma(z,w)}|\gamma_{xy}|\,\pi(x)\tilde P(x,y)\right\} = \max_{(z,w)\in E(P)}\left\{\frac{2N}{\pi(z)}\sum_{\Gamma(z,w)}h_{xy}\,\pi(x)\,\frac{1}{2Nh_{xy}}\right\} = \max_{(z,w)\in E(P)}\sum_{\Gamma(z,w)} 1 \ \le\ H^2.$$
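The counting step behind the bound $H^2$ can be checked by brute force. The sketch below makes the simplifying assumption that every tower containing a given cube lies in one column with levels $0, \ldots, H-1$, so a tower is determined by its bottom and top levels; the function name is hypothetical.

```python
def towers_through(H, s):
    """Number of intervals [bottom, top] within levels 0..H-1 that contain level s."""
    return sum(1
               for bottom in range(H)
               for top in range(bottom, H)
               if bottom <= s <= top)
```

The count equals $(s+1)(H-s)$, one factor for the choice of the bottom and one for the choice of the top, and so never exceeds $H^2$.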
Theorem 4. Let $R$ be a region in the triangular lattice whose convex hull has area $N$. Then the mixing time of $M_{ssu}$ for generating a lozenge tiling of $R$ is given by
$$\tau_{ssu}(\epsilon) = O\!\left(N^4\log N + N^3\log N\log(1/\epsilon)\right).$$
Proof. Clearly the number of lozenge tilings of a region of size $N$ is at most $3^N$ (since we can overcount by replacing each triangle in the underlying region with a triangle with one side identified; those configurations where identified edges line up are the set of valid tilings). The bound on the mixing time of $M_{loz}$ given by Wilson [23] is $\tilde\tau(\epsilon) = O(W^2 N\log N\log(1/\epsilon))$, where $W$ is the width of the region. Therefore, by Prop. 1,
$$\tau_{ssu}(\epsilon) = O\!\left(\frac{\log(1/(\epsilon\pi_*))}{\log(1/(2\epsilon))}\,(H^2)(W^2 N\log N\log(1/\epsilon))\right) = O\!\left(N^4\log N + N^3\log N\log(1/\epsilon)\right),$$
since $H \le W = O(\sqrt{N})$.
Domino Tilings. The single site algorithm for domino tilings follows exactly the same analysis. Starting from any domino tiling, choose a $2 \times 2$ window; if there are two parallel dominos, rotate them by $90^\circ$. This simple algorithm is motivated by the linear time tiling algorithm of Thurston [19] for generating a single tiling. In [13] a tower algorithm is presented which achieves a mixing time of $O(N^{3.5}(1 + \log(1/\epsilon)))$, where $N$ is the area of the Cartesian lattice region to be tiled. We can show that $A \le H^2$, where $H$ is the size of the maximal tower; for square regions this is $O(N^{1/2})$ and in general is at most $N$. In addition, the number of domino tilings is trivially bounded by $4^N$. Thus, the comparison theorem establishes the efficiency of this local algorithm.
3.2 Three-Colorings and Eulerian Orientations
The second application we consider is generating three-colorings of a subregion of the 2-dimensional Cartesian lattice. If we fix the colors around the boundary of the region, the set of three-colorings corresponds to Eulerian orientations of a region in the dual lattice; an Eulerian orientation is a way of directing the
edges within a region so that each vertex has its indegree equal to its outdegree (see, e.g. [2]). In [13] an algorithm based on tower moves was analyzed which is similar to the algorithm for lozenges. We now show that a single point update algorithm is also rapidly mixing. Using the ideas in [15] this can be generalized further to show that the single point update algorithm is rapidly mixing on the set of all colorings (allowing the boundary colors to vary) but we leave the details for the full paper.
The single point update algorithm is defined as follows. Starting from a valid three-coloring (e.g., the 2-coloring arising from the bipartition of the Cartesian lattice) choose a vertex $v$ and a color $c \in \{0, 1, 2\}$ uniformly at random. If $v$ can be recolored using $c$, recolor it with probability 1/2, otherwise remain at the current coloring. In [13] it was shown that this Markov chain is ergodic. Jerrum showed that this single point update algorithm is rapidly mixing on the set of $k$-colorings (choosing $c \in \{0, \ldots, k-1\}$) when $k \ge 2\Delta$, where $\Delta$ is the maximum degree [7]; this implies we can $k$-color the 2-d lattice whenever $k \ge 8$. Here we show the same algorithm is efficient when $k = 3$.
We start by recalling the 3-coloring algorithm based on tower moves. Let $x$ and $y$ be three-colorings of $R$, and let $c_x(v)$ and $c_y(v)$ be the respective colors of vertex $v$. We say $x$ and $y$ differ by a tower of height $k$ if there exists a vertical or horizontal set of vertices where we can add 1 to each of the colors of $x$ (or $y$) to get the other coloring. More precisely, it implies that there exists a vertex $v$ and a unit basis vector $e$ such that
$$c_y(v) = c_x(v) + 1,\quad c_y(v+e) = c_x(v+e) + 1,\quad \ldots,\quad c_y(v+(k-1)e) = c_x(v+(k-1)e) + 1$$
or
$$c_y(v) = c_x(v) - 1,\quad c_y(v+e) = c_x(v+e) - 1,\quad \ldots,\quad c_y(v+(k-1)e) = c_x(v+(k-1)e) - 1$$
(where all sums are taken mod 3), and $c_y(v') = c_x(v')$ for all other vertices $v'$. Then we define our transition matrix $P_{col}$ of $M_{col}$ to be
$$P_{col}(x,y) = \begin{cases} \dfrac{1}{2Nh} & \text{if } x \oplus y \text{ is a tower of height } h,\\[4pt] 1 - \sum_{z\ne x}P_{col}(x,z) & \text{otherwise.}\end{cases}$$
Again, the crucial ingredient is bounding $A$, the factor arising in the comparison theorem. As before, we define our paths between $x$ and $y$ which are connected by a tower move of height $k$ by decomposing the moves into $k$ single site update moves in the obvious way. It is not difficult to see that for any $(z,w)$ which differ by a single vertex, $|\Gamma(z,w)| \le 2H^2$, where $H$ is the maximal height of a tower. Following exactly the same analysis as in the previous subsection yields
$$A \le 2H^2.$$
Since the number of three-colorings is bounded by $3^N$, this, together with the polynomial bound on the mixing time of $M_{col}$, shows that the single site update algorithm is rapidly mixing as well.
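The single point update rule for three-colorings analyzed in this subsection can be sketched on a rectangular piece of the lattice. The sketch assumes a `rows` by `cols` grid whose boundary vertices keep their initial colors (the fixed-boundary setting of the text); picking a boundary vertex is treated as a wasted step, which is an implementation choice made here, and all names are hypothetical.

```python
import random

def neighbors(v, rows, cols):
    """Lattice neighbours of v = (r, c) inside the rows x cols grid."""
    r, c = v
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if 0 <= r + dr < rows and 0 <= c + dc < cols:
            yield (r + dr, c + dc)

def is_proper(col, rows, cols):
    """True if every pair of adjacent vertices receives different colors."""
    return all(col[v] != col[w] for v in col for w in neighbors(v, rows, cols))

def recolor_step(col, rows, cols, rng):
    """Pick a vertex and a color uniformly; recolor with probability 1/2 if allowed."""
    v = (rng.randrange(rows), rng.randrange(cols))
    c = rng.randrange(3)
    if v[0] in (0, rows - 1) or v[1] in (0, cols - 1):
        return                          # boundary colors stay fixed (assumption)
    if all(col[w] != c for w in neighbors(v, rows, cols)) and rng.random() < 0.5:
        col[v] = c
```

A natural starting state is the 2-coloring from the bipartition, `col[(r, c)] = (r + c) % 2`, as suggested in the text.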
3.3 Independent Sets
Consider the problem of sampling from the set $I$ of all independent sets of a graph $G = (V,E)$ with the following distribution. Let $\lambda > 0$ be an arbitrary real. Then we are interested in the probability distribution $\pi$ on $I$ which associates with each $S \in I$,
$$\pi(S) = \frac{\lambda^{|S|}}{Z},$$
where $Z = \sum_{S\in I}\lambda^{|S|}$ is the normalizing factor. This problem is motivated by the hard-core lattice gas model in statistical physics, and in that context $Z$ is referred to as the partition function.
A natural single site replacement algorithm for this problem is the one in which we start with an arbitrary independent set of $G$, and in each step, we pick a vertex $v$ uniformly at random from $V$, and with appropriate probability either (i) include it in the current independent set $S$, if none of its neighbors are in $S$, or (ii) remove it from $S$, if $v$ is already in $S$.
The precise formulation is as follows. Let $|V| = n$ and $|E| = m$. Also, let $\Delta = \Delta(G)$ denote the maximum degree in $G$. Let $x$ denote the current state (i.e. independent set). Then the transitions $P(x,y)$ of such a chain $M_v$, based on single replacements, can be described as follows.
  Choose $u \in V$ uniformly at random.
  If $u \in x$ then $y = x \setminus \{u\}$ with probability $1/(1+\lambda)$ and $y = x$ with the remaining probability.
  If $u \notin x$ and if $x \cup \{u\}$ is an independent set, then $y = x \cup \{u\}$ with probability $\lambda/(1+\lambda)$; otherwise $y = x$.
Recently, Luby and Vigoda considered [14] the following slightly different chain $M_e$ which works by choosing edges uniformly at random. The transition probabilities $\tilde P(\cdot,\cdot)$ of $M_e$ can be described as follows. Let $x$ be the current state and $y$ the next state.
  Choose $e = (u,v) \in E$ uniformly at random.
  Let
$$Y = \begin{cases} x \setminus \{u,v\} & \text{with probability } 1/(1+2\lambda),\\ (x \cup \{u\}) \setminus \{v\} & \text{with probability } \lambda/(1+2\lambda),\\ (x \cup \{v\}) \setminus \{u\} & \text{with probability } \lambda/(1+2\lambda).\end{cases}$$
  If $Y$ is a (valid) independent set, then set $y = Y$, else $y = x$.
In [14], under the assumption that $\lambda \le 1/(\Delta - 3)$, the chain $M_e$ was shown to have the mixing rate $\tilde\tau(\epsilon)$ of $O(n^3\ln(\epsilon^{-1}))$.
Once again, there is a natural way to define a path between pairs $(x,y)$ with $\tilde P(x,y) > 0$. If $(x,y)$ is a valid transition in $P$ (i.e. $P(x,y) > 0$), then we let the same transition denote the (trivial) path of length one between $x$ and $y$. This leaves only the case when $x$ and $y$ are such that there exist adjacent vertices $u$ and $v$ in the graph with $u \in x$, $u \notin y$ and $v \in y$, $v \notin x$. In such a case we define the path $\gamma_{xy}$ to be $x_0 = x$, $x_1 = x \setminus \{u\}$, $x_2 = x_1 \cup \{v\} = y$; note that such a path is of length two. For $(z,w) \in E(P)$, consider
$$A(z,w) = \frac{1}{\pi(z)P(z,w)}\sum_{(x,y)\in\Gamma(z,w)}|\gamma_{xy}|\,\pi(x)\tilde P(x,y).$$
This can be bounded by considering the following possible cases.
Case 1. Suppose $(z,w)$ is such that $z = w \cup \{u\}$ (see Fig. 3). Then $P(z,w) = 1/[(1+\lambda)n]$. Such a transition can be in a path $\gamma_{xy}$ of length one or a path $\gamma_{x'y'}$ of length two. In the former case $x = z$ and $y = w$, and there are at most $\Delta$ edge moves which cause this transition. Similarly, in the latter case $x' = z$ and so $\pi(x') = \pi(z)$ and there are at most $\Delta$ such $(x',y')$. Letting $\Gamma_2(z,w) = \{(x',y') \text{ such that } \gamma_{x'y'} \text{ has length two}\}$, we find that
$$A(z,w) \le \frac{1}{P(z,w)}\left(\tilde P(z,w) + \sum_{(x',y')\in\Gamma_2(z,w)} 2\tilde P(x',y')\right) \le (1+\lambda)n\left(\frac{\Delta}{m(1+2\lambda)} + \frac{2\Delta\lambda}{(1+2\lambda)m}\right) = \frac{(1+\lambda)n\,\Delta(1+2\lambda)}{(1+2\lambda)m} = (1+\lambda)\Delta\frac{n}{m}.$$
Case 2. Suppose $(z,w)$ is such that $z = w \setminus \{u\}$. By reversibility, $\pi(z)P(z,w) = \pi(w)P(w,z)$. For each $(x,y) \in \Gamma(z,w)$, we have a path $(y,x) \in \Gamma(w,z)$, and once again, by reversibility, $\pi(x)\tilde P(x,y) = \pi(y)\tilde P(y,x)$. Therefore $A(z,w) = A(w,z)$, and by Case 1 we again find
$$A(z,w) \le (1+\lambda)\Delta\frac{n}{m}.$$
Thus from both cases, we have
$$A = \max_{(z,w)\in E(P)}\left\{\frac{1}{\pi(z)P(z,w)}\sum_{\Gamma(z,w)}|\gamma_{xy}|\,\pi(x)\tilde P(x,y)\right\} \le (1+\lambda)\Delta\frac{n}{m}.$$
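On a very small graph the quantity $A$ for the pair $M_v$, $M_e$ can be computed by brute force and compared with the bound $(1+\lambda)\Delta n/m$. The sketch below enumerates all independent sets, builds the edge chain transitions for $y \ne x$, routes each one along the path defined above, and accumulates the load on every single site transition. It assumes vertices are numbered $0, \ldots, n-1$, ignores self-loops of the edge chain (they need no path), and uses unnormalized weights $\lambda^{|S|}$ in place of $\pi$, which is harmless because only ratios of $\pi$ enter $A$. The function names are hypothetical.

```python
from itertools import combinations

def independent_sets(n, edges):
    """All independent sets of the graph on vertices 0..n-1, as frozensets."""
    sets = []
    for k in range(n + 1):
        for s in combinations(range(n), k):
            fs = frozenset(s)
            if all(not (u in fs and v in fs) for u, v in edges):
                sets.append(fs)
    return sets

def comparison_constant(n, edges, lam):
    """Brute-force value of A for the single site chain versus the edge chain."""
    m = len(edges)
    states = independent_sets(n, edges)
    valid = set(states)
    weight = {s: lam ** len(s) for s in states}      # unnormalized pi

    def p_site(z, w):
        # single site chain: removal has weight 1, insertion has weight lam
        return (1.0 if len(w) < len(z) else lam) / ((1.0 + lam) * n)

    load = {}
    for x in states:
        p_edge = {}                                  # edge chain, y != x only
        for u, v in edges:
            moves = [(x - {u, v}, 1.0),
                     ((x | {u}) - {v}, lam),
                     ((x | {v}) - {u}, lam)]
            for y, wt in moves:
                if y in valid and y != x:
                    p_edge[y] = p_edge.get(y, 0.0) + wt / ((1.0 + 2.0 * lam) * m)
        for y, p in p_edge.items():
            if len(x ^ y) == 1:
                path = [x, y]                        # also a single site move
            else:
                path = [x, x & y, y]                 # shift: remove, then add
            length = len(path) - 1
            for z, w in zip(path, path[1:]):
                load[(z, w)] = load.get((z, w), 0.0) + length * weight[x] * p
    return max(l / (weight[z] * p_site(z, w)) for (z, w), l in load.items())
```

For the 4-cycle with $\lambda = 0.5$ (so $n = m = 4$, $\Delta = 2$) the bound evaluates to 3, and the brute-force value agrees with it up to rounding, so the bound is attained on this example.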
Fig. 3. Case 1: Edge moves decomposed into one or two step paths in the single site chain. (Panels: removal, with z = x and w = y; shift, with z = x'.)
Remark 2. Vigoda recently discovered an argument for directly bounding the mixing time of the single site dynamics for the independent set problem using a clever potential function and path coupling [20]. This shows that the mixing time of $M_v$ is bounded by $O(n\log n\log(2/\epsilon))$ for $\lambda < 2/(\Delta - 2)$, and $O(n^2\log n\log(2/\epsilon))$ for $\lambda = 2/(\Delta - 2)$.
4 Conclusions
In this paper we address the issue that although two Markov chains appear quite similar, it is often the case that only one admits a simple analysis using currently available tools. We envision that there are many other applications for comparison techniques. For instance, the results from Sect. 3.3 for independent sets can be extended to relate the mixing rate of the single site updates to the mixing rate of a dynamics which would update all the sites in a rectangle of fixed size $a \times b$. This so-called heat bath algorithm is used experimentally in statistical physics to study the uniqueness of the Gibbs state of the hard-core lattice gas model. Using the method given here, we derive a bound on the mixing rate of this new Markov chain which introduces a factor which depends exponentially on $\min(a,b)$.
Finally, we remark that Diaconis and Saloff-Coste have also extended their comparison theorem to compare the log-Sobolev constants of two Markov chains [6]. It would be nice to find applications in computer science for this result as well.
References
1. Aldous, D. Random walks on finite groups and rapidly mixing Markov chains. Séminaire de Probabilités XVII, 1981/82, Springer Lecture Notes in Mathematics 986, pp. 243-297.
2. Baxter, R.J. Exactly solved models in statistical mechanics. Academic Press, London, 1982.
3. van den Berg, J. and Steif, J.E. Percolation and the hard-core lattice gas model. Stochastic Processes and their Applications 49, 1994, pp. 179-197.
4. Diaconis, P. and Stroock, D. Geometric bounds for eigenvalues of Markov chains. Ann. Appl. Probability 1, 1991, pp. 36-61.
5. Diaconis, P. and Saloff-Coste, L. Comparison theorems for reversible Markov chains. Ann. Appl. Probability 3, 1993, pp. 696-730.
6. Diaconis, P. and Saloff-Coste, L. Logarithmic Sobolev Inequalities for Finite Markov Chains. Ann. of Appl. Probab. 6 (1996), pp. 695-750.
7. Jerrum, M. A very simple algorithm for estimating the number of k-colorings of a low-degree graph. Random Structures and Algorithms 7, 1995, pp. 157-165.
8. Jerrum, M.R. and Sinclair, A.J. Approximating the permanent. SIAM Journal on Computing 18 (1989), pp. 1149-1178.
9. Jerrum, M., Valiant, L. and Vazirani, V. Random generation of combinatorial structures from a uniform distribution. Theoretical Computer Science 43 (1986), pp. 169-188.
10. Kannan, R., Tetali, P. and Vempala, S. Simple Markov chain algorithms for generating bipartite graphs and tournaments. Proc. of the 8th ACM-SIAM Symp. on Discrete Algorithms, January 1997.
11. Lieb, E.H. Residual entropy of square ice. Physical Review 162, 1967, pp. 162-172.
12. Lubin, M. and Sokal, A.D. Comment on Antiferromagnetic Potts Model. Phys. Rev. Lett. 71, 1993, pp. 1778.
13. Luby, M., Randall, D. and Sinclair, A. Markov Chain Algorithms for Planar Lattice Structures. Proc. 36th IEEE Symposium on Foundations of Computing (1995), pp. 150-159.
14. Luby, M. and Vigoda, E. Approximately counting up to four. Proc. 29th ACM Symposium on Theory of Computing (1997), pp. 150-159.
15. Madras, N. and Randall, D. Factoring Graphs to Bound Mixing Time. Proc. 37th IEEE Symposium on Foundations of Computing (1996).
16. Sinclair, A.J. and Jerrum, M.R. Approximate counting, uniform generation and rapidly mixing Markov chains. Information and Computation 82 (1989), pp. 93-133.
17. Sinclair, A.J. Algorithms for random generation & counting: a Markov chain approach. Birkhäuser, Boston, 1993, pp. 47-48.
18. Sinclair, A.J. Improved bounds for mixing rates of Markov chains and multicommodity flow. Combinatorics, Probability, & Computing 1 (1992), pp. 351-370.
19. Thurston, W. Conway's tiling groups. American Mathematical Monthly 97, 1990, pp. 757-773.
20. Vigoda, E. Personal communication.
21. Welsh, D.J.A. The computational complexity of some classical problems from statistical physics. In Disorder in Physical Systems (G. Grimmett and D. Welsh, eds.). Clarendon Press, Oxford, 1990, pp. 323-335.
22. Welsh, D.J.A. Approximate counting. In Surveys in Combinatorics (R.A. Bailey, ed.). Cambridge University Press, London Math Society Lecture Notes 241, 1997, pp. 287-317.
23. Wilson, D.B. Mixing times of lozenge tiling and card shuffling Markov chains, draft of manuscript.
The CREW PRAM Complexity of Modular Inversion
Joachim von zur Gathen (1) and Igor Shparlinski (2)
(1) FB Mathematik-Informatik, Universität-GH Paderborn, 33095 Paderborn, Germany, [email protected]
(2) School of Mathematics, Physics, Computing and Electronics, Macquarie University, Sydney, NSW 2109, Australia, [email protected]
Abstract. One of the long-standing open questions in the theory of parallel computation is the parallel complexity of the integer gcd and related problems, such as modular inversion. We present a lower bound $\Omega(\log n)$ for the CREW PRAM complexity for inversion modulo certain $n$-bit integers, including all such primes. For infinitely many moduli, our lower bound matches asymptotically the known upper bound. We obtain a similar lower bound for computing a specified bit in a large power of an integer. Our main tools are certain estimates for exponential sums in finite fields.
1 Introduction
We address the problem of parallel computation of the inverse of integers modulo an integer $M$. That is, given positive integers $M \ge 3$ and $x < M$, with $\gcd(x, M) = 1$, we want to compute the modular inverse $\mathrm{inv}_M(x) \in \mathbb{N}$ defined by the conditions
$$x \cdot \mathrm{inv}_M(x) \equiv 1 \bmod M, \qquad 1 \le \mathrm{inv}_M(x) < M. \qquad (1)$$
Since $\mathrm{inv}_M(x) \equiv x^{\varphi(M)-1} \bmod M$, where $\varphi$ is the Euler function, inversion is a special case of the more general question of modular exponentiation. Both these problems can also be considered over finite fields and other algebraic domains.
For inversion, exponentiation and gcd, several parallel algorithms are in the literature [1,2,8,9,10,11,12,13,14,19,20,17]. The question of obtaining a general parallel algorithm running in poly-logarithmic time $(\log n)^{O(1)}$ for $n$-bit integers $M$ is wide open [10,11].
Some lower bounds on the depth of arithmetic circuits are known [10,14]. On the other hand, some examples indicate that for this kind of problem the Boolean model of computation may be more powerful than the arithmetic model; see discussions of these phenomena in [8,10,14].
We show herein that the method of [4,24] can be adapted to derive non-trivial lower bounds on Boolean CREW PRAMs. It is based on estimates of exponential sums.
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 305-315, 1998. © Springer-Verlag Berlin Heidelberg 1998
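The reduction of inversion to exponentiation via the Euler function can be written out sequentially in a few lines. This is only an illustration of the identity $\mathrm{inv}_M(x) \equiv x^{\varphi(M)-1} \bmod M$, not a parallel algorithm; computing $\varphi(M)$ by trial division is an assumption made for simplicity, and the function names are hypothetical.

```python
def euler_phi(M):
    """Euler's function of M, computed by trial division."""
    phi, m, p = M, M, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            phi -= phi // p            # multiply phi by (1 - 1/p)
        p += 1
    if m > 1:                          # one prime factor above sqrt(M) may remain
        phi -= phi // m
    return phi

def inv_mod(x, M):
    """Inverse of x modulo M as the power x^(phi(M) - 1) mod M."""
    r = pow(x, euler_phi(M) - 1, M)
    if (x * r) % M != 1:
        raise ValueError("x is not invertible modulo M")
    return r
```

For instance, `inv_mod(3, 7)` gives 5, since 3 * 5 = 15 leaves remainder 1 modulo 7.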
Our bounds are derived from lower bounds for the sensitivity $\sigma(f)$ (or critical complexity) of a Boolean function $f(X_1, \ldots, X_n)$ with binary inputs $X_1, \ldots, X_n$. It is defined as the largest integer $m \le n$ such that there is a binary vector $x = (x_1, \ldots, x_n)$ for which $f(x) \ne f(x^{(i)})$ for m values of $i \le n$, where $x^{(i)}$ is the vector obtained from x by flipping its ith coordinate. In other words, $\sigma(f)$ is the maximum, over all input vectors x, of the number of points y on the unit Hamming sphere around x with $f(y) \ne f(x)$; see e.g., [27]. Since [3], the sensitivity has been used as an effective tool for obtaining lower bounds of the CREW PRAM complexity, i.e., the complexity on a parallel random access machine with an unlimited number of all-powerful processors, where each machine can read from and write to one memory cell at each step, but where no write conflicts are allowed: each memory cell may be written into by only one processor, at each time step. By [21], $0.5 \log_2(\sigma(f)/3)$ is a lower bound on the parallel time for computing f on such machines, see also [5,6,7,27]. This yields immediately the lower bound $\Omega(\log n)$ for the OR and the AND of n input bits. It should be contrasted with the common CRCW PRAM, where write conflicts are allowed, provided every processor writes the same result, and where all Boolean functions can be computed in constant time (with a large number of processors).
The contents of this article are as follows. In Sect. 2, we prove some auxiliary results on exponential sums. We apply these in Sect. 3 to obtain a lower bound on the sensitivity of the least bit of the inverse modulo a prime. In Sect. 4, we use the same approach to obtain a lower bound on the sensitivity of the least bit of the inverse modulo an odd square free M. The bound is somewhat weaker, and the proof becomes more involved due to zero-divisors in the residue ring modulo M, but for some such moduli we are able to match the known upper and the new lower bounds. Namely, we obtain the lower bound $\Omega(\log n)$ on the CREW PRAM complexity of inversion modulo an n-bit odd square free M with not 'too many' prime divisors, and we exhibit infinite sequences of M for which this bound matches the upper bound $O(\log n)$ from [10] on the depth of P-uniform Boolean circuits for inversion modulo a 'smooth' M with only 'small' prime divisors; see (7) and (8). For example, the bounds coincide for moduli $M = p_1 \cdots p_s$, where $p_1, \ldots, p_s$ are any $s/\log s$ prime numbers between $s^3$ and $2s^3$.
We apply our method in Sect. 5 to the following problem posed by Allan Borodin (see Open Question 7.2 of [10]): given n-bit positive integers m, x, e, compute the mth bit of $x^e$.
Generally speaking, a parallel lower bound $\Omega(\log n)$ for a problem with n inputs is not a big surprise. Our interest in these bounds comes from their following features: some of these questions have been around for over a decade; no similar lower bounds are known for the gcd; on the common CRCW PRAM, the problems can be solved in constant time; for some types of inputs, our bounds are asymptotically optimal; the powerful tools we use from the theory of finite fields might prove helpful for other problems in this area.
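The sensitivity used throughout can be checked by brute force for small n. The following is a minimal sketch, not part of the original paper; the function names `sensitivity` and `crew_time_lower_bound` are illustrative assumptions, and the second one simply evaluates the bound $0.5\log_2(\sigma(f)/3)$ quoted from [21].

```python
from itertools import product
from math import log2


def sensitivity(f, n):
    """Brute-force sensitivity of f: {0,1}^n -> {0,1} (exponential in n)."""
    best = 0
    for x in product((0, 1), repeat=n):
        fx = f(x)
        # Count coordinates whose flip changes the value of f at x.
        flips = sum(
            1
            for i in range(n)
            if f(x[:i] + (1 - x[i],) + x[i + 1:]) != fx
        )
        best = max(best, flips)
    return best


def crew_time_lower_bound(sigma):
    """Evaluate 0.5 * log2(sigma / 3), the bound quoted from [21]."""
    return 0.5 * log2(sigma / 3)


def or_fn(x):
    return int(any(x))


def and_fn(x):
    return int(all(x))


def parity_fn(x):
    return sum(x) % 2
```

For OR and AND the maximum is attained at the all-zero and all-one input, respectively, which is why both have sensitivity n and hence a lower bound of order log n.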
The CREW PRAM Complexity of Modular Inversion

2 Exponential Sums
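The main tool for our bounds are estimates of exponential sums. For a prime p and a positive integer z, we write $e_p(z) = \exp(2\pi i z/p) \in \mathbb{C}$. The following identity follows from the formula for a geometric sum.
Lemma 1. For any prime p and any integer a,
$$\sum_{0 \le u < p} e_p(au) = \begin{cases} 0 & \text{if } a \not\equiv 0 \bmod p, \\ p & \text{if } a \equiv 0 \bmod p. \end{cases}$$
Lemma 2. For any prime p and any positive integer $H \le p$, we have
$$\sum_{0 \le a < p} \Bigl| \sum_{0 \le x, y < H} e_p(a(y - x)) \Bigr| = pH.$$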
Proof. We note that
$$\sum_{0 \le x, y < H} e_p(a(y - x)) = \Bigl| \sum_{0 \le x < H} e_p(ax) \Bigr|^2 \ge 0.$$
Thus
$$\sum_{0 \le a < p} \Bigl| \sum_{0 \le x, y < H} e_p(a(y - x)) \Bigr| = \sum_{0 \le a < p} \sum_{0 \le x, y < H} e_p(a(y - x)) = \sum_{0 \le x, y < H} \sum_{0 \le a < p} e_p(a(y - x)).$$
From Lemma 1 we see that the last sum is equal to pW, where W is the number of (x, y) with $x \equiv y \bmod p$ and $0 \le x, y < H$. Obviously W = H.
In the sequel, we consider several sums over values of rational functions in residue rings, which may not be defined for all values. We use the symbol $\sum^{*}$ to express that the summation is extended over those arguments for which the rational function is well-defined, i.e., its denominator is relatively prime to the modulus. We give an explicit definition only in the example of the following statement, which is essentially the Weil bound, see [18,23,28].
Lemma 3. Let $f, g \in \mathbb{Z}[X]$ be two polynomials of degrees n, m, respectively, and p a prime number such that the rational function f/g is defined and not constant modulo p. Then
$$\Bigl| \sum^{*}_{0 \le x < p} e_p(f(x)/g(x)) \Bigr| = \Bigl| \sum_{\substack{0 \le x < p \\ \gcd(g(x), p) = 1}} e_p(f(x)/g(x)) \Bigr| \le (n + m - 1) p^{1/2}.$$
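Lemmas 1 and 2 are easy to confirm numerically for small primes. The sketch below is only an illustration under the definitions above; the helper names `e_p`, `lemma1_sum` and `lemma2_sum` are assumptions introduced here, and floating-point arithmetic means the identities hold only up to rounding error.

```python
import cmath


def e_p(z, p):
    """e_p(z) = exp(2*pi*i*z/p); z is reduced mod p first to limit rounding."""
    return cmath.exp(2j * cmath.pi * (z % p) / p)


def lemma1_sum(a, p):
    """Sum of e_p(a*u) over 0 <= u < p (Lemma 1: 0 or p)."""
    return sum(e_p(a * u, p) for u in range(p))


def lemma2_sum(p, H):
    """Sum over 0 <= a < p of |sum over 0 <= x, y < H of e_p(a*(y - x))|."""
    total = 0.0
    for a in range(p):
        inner = sum(e_p(a * (y - x), p) for x in range(H) for y in range(H))
        total += abs(inner)
    return total
```

With p = 11 and H = 4 the second function should return a value close to pH = 44, matching Lemma 2.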
Lemma 4. Let p be a prime number, $f, g \in \mathbb{Z}[X]$ of degrees n, m, respectively, such that f/g is defined and neither constant nor a linear function modulo p. Then for any $N, H \in \mathbb{N}$ with $H \le p$ we have
$$\Bigl| \sum^{*}_{0 \le x, y < H} e_p\Bigl( \frac{f(N + x - y)}{g(N + x - y)} \Bigr) \Bigr| \le (n + m - 1) H p^{1/2}.$$
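Proof. From Lemma 1 we obtain
$$\begin{aligned}
\sum^{*}_{0 \le x, y < H} e_p\Bigl( \frac{f(N + x - y)}{g(N + x - y)} \Bigr)
&= \frac{1}{p} \sum^{*}_{0 \le u < p} e_p(f(u)/g(u)) \sum_{0 \le x, y < H} \sum_{0 \le a < p} e_p(a(u - N - x + y)) \\
&= \frac{1}{p} \sum_{0 \le a < p} e_p(-aN) \sum^{*}_{0 \le u < p} e_p\Bigl( \frac{f(u) + a u g(u)}{g(u)} \Bigr) \sum_{0 \le x, y < H} e_p(a(y - x)),
\end{aligned}$$
so that the absolute value of the left hand side is at most
$$\frac{1}{p} \sum_{0 \le a < p} \Bigl| \sum^{*}_{0 \le u < p} e_p\Bigl( \frac{f(u) + a u g(u)}{g(u)} \Bigr) \Bigr| \, \Bigl| \sum_{0 \le x, y < H} e_p(a(y - x)) \Bigr|.$$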
From Lemma 3 we see that for each a < p the sum over u can be estimated as $(n + m - 1) p^{1/2}$. Applying Lemma 2, we obtain the result.
Throughout this paper, log z means the logarithm of z in base 2, ln z means the natural logarithm, and
$$\operatorname{Ln} z = \begin{cases} \ln z & \text{if } z > 1, \\ 1 & \text{if } z \le 1. \end{cases}$$

3 PRAM Complexity of the Least Bit of the Inverse Modulo a Prime Number
In this section, we prove a lower bound on the sensitivity of the Boolean function representing the least bit of the inverse modulo p, for an n-bit prime p. For $x \in \mathbb{N}$ with gcd(x, p) = 1, we recall the definition of $\mathrm{inv}_p(x) \in \mathbb{N}$ in (1). Furthermore, for $x_0, \ldots, x_{n-2} \in \{0, 1\}$, we let
$$\mathrm{num}(x_0, \ldots, x_{n-2}) = \sum_{0 \le i \le n-2} x_i 2^i \tag{2}$$
be the number with binary representation $x_0, \ldots, x_{n-2}$. We consider Boolean functions f with n − 1 inputs which satisfy the congruence
$$f(x_0, \ldots, x_{n-2}) \equiv \mathrm{inv}_p(\mathrm{num}(x_0, \ldots, x_{n-2})) \bmod 2 \tag{3}$$
for all $x_0, \ldots, x_{n-2} \in \{0, 1\}$ with $(x_0, \ldots, x_{n-2}) \ne (0, \ldots, 0)$. Thus no condition is imposed for the value of $f(0, \ldots, 0)$. Finally we recall the sensitivity $\sigma$ from the introduction.
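A function satisfying congruence (3) can be written down directly for small primes. This is a minimal sketch, not the paper's construction: `num` follows (2), while `inv_mod` uses Fermat's little theorem (the prime case of the identity $x^{\varphi(M)-1}$), and the value returned for the all-zero input is an arbitrary choice, since (3) leaves it unconstrained.

```python
def num(bits):
    """Integer with binary representation bits = (x_0, x_1, ...), x_0 least significant."""
    return sum(b << i for i, b in enumerate(bits))


def inv_mod(x, p):
    """Inverse of x modulo a prime p via x^(p-2) mod p (Fermat's little theorem)."""
    return pow(x, p - 2, p)


def least_bit_of_inverse(bits, p):
    """Least bit of inv_p(num(bits)); the all-zero input is mapped to 0 by convention."""
    x = num(bits)
    if x == 0:
        return 0  # congruence (3) imposes no condition here
    return inv_mod(x, p) & 1
```

For the 3-bit prime p = 7 the function takes n − 1 = 2 input bits; for instance the input (1, 1) encodes x = 3, whose inverse modulo 7 is 5, so the output bit is 1.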
Theorem 1. Let p be a sufficiently large n-bit prime. Suppose that a Boolean function $f(X_0, \ldots, X_{n-2})$ satisfies the congruence (3). Then
$$\sigma(f) \ge \frac{1}{6} n - \frac{1}{3} \log n - 1.$$
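Proof. We let k be an integer parameter to be determined later, with $2 \le k \le n - 3$, and show that $\sigma(f) \ge k$ for p large enough. For this, we prove that there is some integer x with $1 \le x \le 2^{n-k-1}$ and
$$\mathrm{inv}_p(2^k x) \equiv 1 \bmod 2, \qquad \mathrm{inv}_p(2^k x + 2^{i-1}) \equiv 0 \bmod 2 \quad \text{for } 1 \le i \le k,$$
provided that p is large enough. We note that all these $2^k x$ and $2^k x + 2^{i-1}$ are indeed invertible modulo p.
We put $e_0 = 0$, $\delta_0 = 1$, and $e_i = 2^{i-1}$, $\delta_i = 0$ for $1 \le i \le k$. Then it is sufficient to show that there exist integers $x, u_0, \ldots, u_k$ with
$$(2^k x + e_i)^{-1} \equiv 2u_i + \delta_i \bmod p, \qquad 1 \le x \le 2^{n-k-1}, \qquad 0 \le u_i \le (p - 3)/2 \quad \text{for } 0 \le i \le k.$$
Next we put $A = 2^k$, $H = 2^{n-k-2}$, $K = \lfloor (p - 3)/4 \rfloor$, and $\Delta_i = 2K + \delta_i$ for $0 \le i \le k$. Then it is sufficient to find integers $x, y, u_0, \ldots, u_k, v_0, \ldots, v_k$ satisfying
$$(A(H + x - y) + e_i)^{-1} \equiv 2(u_i - v_i) + \Delta_i \bmod p, \qquad 0 \le x, y < H, \qquad 0 \le u_0, \ldots, u_k, v_0, \ldots, v_k < K. \tag{4}$$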
A typical application of character sum estimates to systems of equations proceeds as follows. One expresses the number of solutions as a sum over $a \in \mathbb{F}_p$, using Lemma 1, then isolates the term corresponding to a = 0, and hopes to find that the remaining sum is less than the isolated term. Usually, the challenge is to verify the last part. In the task at hand, Lemma 1 expresses the number of solutions of (4) as
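$$\begin{aligned}
& p^{-(k+1)} \sum_{0 \le x, y < H} \; \sum_{0 \le u_0, \ldots, u_k, v_0, \ldots, v_k < K} \; \sum_{0 \le a_0, \ldots, a_k < p} e_p\Bigl( \sum_{0 \le i \le k} a_i \bigl( (A(H + x - y) + e_i)^{-1} - 2(u_i - v_i) - \Delta_i \bigr) \Bigr) \\
&= p^{-(k+1)} \sum_{0 \le a_0, \ldots, a_k < p} e_p\Bigl( -\sum_{0 \le i \le k} a_i \Delta_i \Bigr) \sum^{*}_{0 \le x, y < H} e_p\Bigl( \sum_{0 \le i \le k} a_i (A(H + x - y) + e_i)^{-1} \Bigr) \sum_{0 \le u_0, \ldots, u_k, v_0, \ldots, v_k < K} e_p\Bigl( \sum_{0 \le i \le k} 2a_i (v_i - u_i) \Bigr) \\
&= p^{-(k+1)} \bigl( H^2 K^{2(k+1)} + R \bigr),
\end{aligned}$$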
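where the first summand corresponds to $a_0 = \cdots = a_k = 0$. For other indices $(a_0, \ldots, a_k)$, the sum over x, y satisfies the conditions of Lemma 4, with n = k and m = k + 1, and thus
$$\begin{aligned}
|R| &\le 2kHp^{1/2} \sum_{0 \le a_0, \ldots, a_k < p} \Bigl| \sum_{0 \le u_0, \ldots, u_k, v_0, \ldots, v_k < K} e_p\Bigl( \sum_{0 \le i \le k} 2a_i (v_i - u_i) \Bigr) \Bigr| \\
&= 2kHp^{1/2} \prod_{0 \le i \le k} \sum_{0 \le a_i < p} \Bigl| \sum_{0 \le u_i, v_i < K} e_p(a_i (v_i - u_i)) \Bigr| \\
&= 2kHp^{1/2} (pK)^{k+1}.
\end{aligned}$$
We have left out the factors $|e_p(-a_i \Delta_i)|$, which equal 1, transformed the summation index $2a_i$ into $a_i$, and used Lemma 2. It is sufficient to show that $H^2 K^{2(k+1)}$ is larger than $|R|$, or that
$$H K^{k+1} > 2k p^{k + 3/2}.$$
Since $K \ge (p - 6)/4$, it is sufficient that
$$2^{n-k-2} > 2k \Bigl( \frac{p}{p - 6} \Bigr)^{k+1} p^{1/2} \, 4^{k+1}. \tag{5}$$
We now set $k = \lfloor (n - 3 \log n)/6 \rfloor$, so that
$$6(k + 1) \le n \le 2^{n-2} \ln 2 < (p - 6) \ln 2.$$
Now $(1 + z^{-1})^z < e$ for real z > 0, and
$$\Bigl( \frac{p}{p - 6} \Bigr)^{k+1} < e^{6(k+1)/(p-6)} < 2.$$
Furthermore, $p^{1/2} < 2^{n/2}$ and $32n/3 < n^{3/2}$, and (5) follows from
$$2^{n/2} > \frac{32}{3} \, n \, 2^{n/2 - \frac{3}{2} \log n} = 64 \cdot \frac{n}{6} \cdot 2^{n/2 - \frac{3}{2} \log n} \ge 64k \, 2^{3k}.$$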
From [21] we know that the CREW PRAM complexity of any Boolean function f is at least $0.5 \log(\sigma(f)/3)$, and we have the following consequence.
Corollary 1. Any CREW PRAM computing the least bit of the inverse modulo a sufficiently large n-bit prime needs at least $0.5 \log n - 3$ steps.
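The parameter choice in the proof can be sanity-checked with exact integer arithmetic. The sketch below is an illustration only: the function names are assumptions, the closing inequality $2^{n/2} > 64k\,2^{3k}$ is tested in its squared form to avoid square roots, and the threshold n ≥ 114 is the point from which $32n/3 < n^{3/2}$ holds, not a claim about how large p must be in Theorem 1.

```python
import math


def choose_k(n):
    """k = floor((n - 3*log2(n)) / 6), the parameter set in the proof of Theorem 1."""
    return math.floor((n - 3 * math.log2(n)) / 6)


def final_inequality_holds(n):
    """Check 2^(n/2) > 64*k*2^(3k) exactly, by comparing the squares of both sides."""
    k = choose_k(n)
    return 2 ** n > (64 * k) ** 2 * 2 ** (6 * k)


def time_lower_bound(n):
    """0.5 * log2(k / 3): the CREW PRAM bound of [21] applied with sensitivity k."""
    return 0.5 * math.log2(choose_k(n) / 3)
```

For every n in the tested range the resulting time bound stays above the $0.5 \log n - 3$ stated in Corollary 1.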

4 PRAM Complexity of the Least Bit of the Inverse Modulo an Odd Square Free Integer
In this section, we prove a lower bound on the PRAM complexity of finding the least bit of the inverse modulo an odd square free integer. To avoid complications with gcd computations, we make the following (generous) definition. Let M be an odd square free n-bit integer, and f a Boolean function with n inputs. Then f computes the least bit of the inverse modulo M if and only if
$$\mathrm{inv}_M(\mathrm{num}(x)) \equiv f(x) \bmod 2$$
for all $x \in \{0, 1\}^{n-1}$ with gcd(num(x), M) = 1. Thus no condition is imposed for integers $x \ge 2^n$ or that have a nontrivial common factor with M.
Theorem 2. Let M > 2 be an odd square free integer with $\omega(M)$ distinct prime divisors, and f a Boolean function representing the least bit of the inverse modulo M, as above. Then
$$\sigma(f) \ge \frac{0.5 \ln M - \omega(M) \operatorname{Ln} \ln M}{\operatorname{Ln} \ln \omega(M)} + O(1).$$
Proof. (Sketch.) The proof follows the same lines as the proof of Thm. 1, replacing Lemma 3 by its analogue for square free moduli (where the distinct prime divisors show up) and a lower bound on the number of values of a rational function which are relatively prime to M.
Our bound takes the form
$$\sigma(f) = \Omega(n / \operatorname{Ln} \ln n) \tag{6}$$
for an odd square free n-bit M with $\omega(M) \le \varepsilon \ln M / \operatorname{Ln} \ln M$ for some constant $\varepsilon < 0.5$. We recall that $\omega(M) \le (1 + o(1)) \ln M / \operatorname{Ln} \ln M$ for any M > 1, and that $\omega(M) = O(\operatorname{Ln} \ln M)$ for almost all odd square free numbers M.
We denote by $i_{\mathrm{PRAM}}(M)$ and $i_{\mathrm{BC}}(M)$ the CREW PRAM complexity and the Boolean circuit complexity, respectively, of inversion modulo M. We know from [10,20] that
$$i_{\mathrm{PRAM}}(M) \le i_{\mathrm{BC}}(M) = O(n) \tag{7}$$
for any n-bit integer M. The smoothness $\gamma(M)$ of an integer M is defined as its largest prime divisor, and M is b-smooth if and only if $\gamma(M) \le b$. Then
$$i_{\mathrm{PRAM}}(M) \le i_{\mathrm{BC}}(M) = O(\log(n \gamma(M))). \tag{8}$$
Since we are mainly interested in lower bounds in this paper, we do not discuss the issue of uniformity.
Corollary 2.
$$i_{\mathrm{BC}}(M) \ge i_{\mathrm{PRAM}}(M) \ge (0.5 + o(1)) \log n$$
for any odd square free n-bit integer M with
$$\omega(M) \le 0.49 \ln M / \operatorname{Ln} \ln M. \tag{9}$$
Theorem 3. There is an infinite sequence of moduli M such that the CREW PRAM complexity and the Boolean circuit complexity of computing the least bit of the inverse modulo M are both $\Theta(\log n)$, where n is the bit length of M.
Proof. We show how to construct infinitely many odd square free integers M with $\omega(M) \le 0.34 \ln M / \operatorname{Ln} \ln M$, thus satisfying the lower bound (9), and with smoothness $\gamma(M) = O(\log^3 M)$, thus satisfying the upper bound $O(\ln \ln M) = O(\log n)$ of [10] on the depth of Boolean circuits for inversion modulo such M. For each integer s > 1 we select $s/\ln s$ primes between $s^3$ and $2s^3$, and let M be the product of these primes. Then $M \ge s^{3s/\ln s} = \exp(3s)$, and thus $\omega(M) = s/\ln s \le 0.34 \ln M / \ln \ln M$, provided that s is large enough.

5 Complexity of One Bit of an Integer Power
For nonnegative integers u and m, we let $\mathrm{Bt}_m(u)$ be the mth lower bit of u, i.e., $\mathrm{Bt}_m(u) = u_m$ if $u = \sum_{i \ge 0} u_i 2^i$ with each $u_i \in \{0, 1\}$. If $u < 2^m$, then $\mathrm{Bt}_m(u) = 0$. In this section, we obtain a lower bound on the CREW PRAM complexity of computing $\mathrm{Bt}_m(x^e)$. For small m, this function is simple, for example $\mathrm{Bt}_0(x^e) = \mathrm{Bt}_0(x)$ can be computed in one step. However, we show that for larger m this is not the case, and the PRAM complexity is $\Omega(\log n)$ for n-bit data.
Exponential sums modulo M are easiest to use when M is a prime, as in Sect. 3. In Sect. 4 we had the more difficult case of a square free M, and now we have the extreme case $M = 2^m$.
Theorem 4. Let m and n be positive integers with $n \ge m + m^{1/2}$, and let f be the Boolean function with 2n inputs and
$$f(x_0, \ldots, x_{n-1}, e_0, \ldots, e_{n-1}) = \mathrm{Bt}_{m-1}(x^e),$$
where $x = \mathrm{num}(x_0, \ldots, x_{n-1})$ and $e = \mathrm{num}(e_0, \ldots, e_{n-1})$; see (2). Then
$$\sigma(f) \ge \gamma m^{1/2} + o(m^{1/2}),$$
where $\gamma = 3 - 7^{1/2} = 0.3542\ldots$.
The proof is based on similar considerations as the proofs of Thms. 1 and 2, using the bound of [26] on exponential sums with denominator $2^m$.
Corollary 3. Let $n \ge m + m^{1/2}$. The CREW PRAM complexity of finding the mth bit of an n-bit power of an n-bit integer is at least $0.25 \log m - o(\log m)$. In particular, for m = n/2 it is $\Omega(\log n)$.
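The function studied in this section is straightforward to evaluate sequentially. The following is a small sketch with illustrative names (`bt`, `bit_of_power`, `bit_of_power_fast` are assumptions, not notation from the paper); the fast variant relies on the fact that bit m of $x^e$ depends only on $x^e \bmod 2^{m+1}$.

```python
def bt(m, u):
    """Bt_m(u): the m-th lower bit of the nonnegative integer u (0 if u < 2^m)."""
    return (u >> m) & 1


def bit_of_power(m, x, e):
    """Bt_m(x^e), computed from the full power."""
    return bt(m, x ** e)


def bit_of_power_fast(m, x, e):
    """Bt_m(x^e), computed modulo 2^(m+1) so that intermediate values stay small."""
    return bt(m, pow(x, e, 1 << (m + 1)))
```

For example 3^4 = 81 = 1010001 in binary, so bit 2 is 0 and bit 4 is 1; and for e ≥ 1 the lowest bit of x^e always equals the lowest bit of x, which is the easy case m = 0 mentioned above.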

6 Conclusion and Open Problems
Inversion in arbitrary residue rings can be considered along these lines. There are two main obstacles for obtaining similar results. Instead of the powerful Weil estimate of Lemma 3, only essentially weaker (and unimprovable) estimates are available [16,25,26]. Also, we need a good explicit estimate, while the bounds of [16,25] contain non-specified constants depending on the degree of the rational function in the exponential sum. The paper [26] deals with polynomials rather than with rational functions, and its generalization has not been worked out yet.
Question 1. Extend Thm. 2 to arbitrary moduli M.
Moduli of the form $M = p^m$, where p is a small prime number, are of special interest because Hensel's lifting allows one to design efficient parallel algorithms for them [2,10,14]. Thm. 4 and its proof demonstrate how to deal with such moduli and what kind of results should be expected.
Each Boolean function $f(X_1, \ldots, X_n)$ can be uniquely represented as a multilinear polynomial of degree $d \le n$ over $\mathbb{F}_2$ of the form
$$f(X_1, \ldots, X_n) = \sum_{0 \le k \le d} \; \sum_{1 \le i_1 < \cdots < i_k \le n} A_{i_1 \ldots i_k} X_{i_1} \cdots X_{i_k} \in \mathbb{F}_2[X_1, \ldots, X_n].$$
We define its weight wt f as the number of nonzero coefficients in this representation. Both the weight and the degree can be considered as measures of complexity of f. In [4,24], the same method was applied to obtain good lower bounds on these characteristics of the Boolean function f deciding whether x is a quadratic residue modulo p. However, for the Boolean functions of this paper, the same approach produces rather poor results.
Question 2. Obtain lower bounds on the weight wt f and the degree deg f of the Boolean function f of Thm. 2.
It is well known that the modular inversion problem is closely related to the GCD-problem.
Question 3. Obtain a lower bound on the PRAM complexity of computing integers u, v such that Mu + Nv = 1 for given relatively prime integers M, N > 1.
In the previous question we assume that gcd(N, M) = 1 is guaranteed. Otherwise one can easily obtain the lower bound $\sigma(f) = \Omega(n)$ on the sensitivity of the Boolean function f which, on input of two n-bit integers M and N, returns 1 if they are relatively prime, and 0 otherwise. Indeed, if M = p is an n-bit prime, then the function returns 0 for N = p and 1 for all other n-bit integers. That is, the PRAM complexity of this Boolean function is at least $0.5 \log n + O(1)$.
Acknowledgment. This paper was essentially written during a sabbatical visit to the University of Paderborn by the second author, who gratefully acknowledges its hospitality and excellent working conditions.
References
1. L. M. Adleman and K. Kompella, 'Using smoothness to achieve parallelism', Proc. 20th ACM Symp. on Theory of Comp., (1988), 528-538.
2. P. W. Beame, S. A. Cook and H. J. Hoover, 'Log depth circuits for division and related problems', SIAM J. Comp., 15 (1986), 994-1003.
3. S. A. Cook, C. Dwork and R. Reischuk, 'Upper and lower time bounds for parallel random access machines without simultaneous writes', SIAM J. Comp., 15 (1986), 87-97.
4. D. Coppersmith and I. E. Shparlinski, 'On polynomial approximation and the parallel complexity of the discrete logarithm and breaking the Diffie-Hellman cryptosystem', Research Report RC 20724, IBM T. J. Watson Research Centre, 1997, 1-103.
5. M. Dietzfelbinger, M. Kutylowski and R. Reischuk, 'Exact time bounds for computing Boolean functions on PRAMs without simultaneous writes', J. Comp. and Syst. Sci., 48 (1994), 231-254.
6. M. Dietzfelbinger, M. Kutylowski and R. Reischuk, 'Feasible time-optimal algorithms for Boolean functions on exclusive-write parallel random access machine', SIAM J. Comp., 25 (1996), 1196-1230.
7. F. E. Fich, 'The complexity of computation on the parallel random access machine', Handbook of Theoretical Comp. Sci., Vol. A, Elsevier, Amsterdam, 1990, 757-804.
8. F. E. Fich and M. Tompa, 'The parallel complexity of exponentiating polynomials over finite fields', J. ACM, 35 (1988), 651-667.
9. S. Gao, J. von zur Gathen and D. Panario, 'Gauss periods and fast exponentiation in finite fields', Lecture Notes in Comp. Sci., 911 (1995), 311-322.
10. J. von zur Gathen, 'Computing powers in parallel', SIAM J. Comp., 16 (1987), 930-945.
11. J. von zur Gathen, 'Inversion in finite fields using logarithmic depth', J. Symb. Comp., 9 (1990), 175-183.
12. J. von zur Gathen, 'Efficient and optimal exponentiation in finite fields', Comp. Complexity, 1 (1991), 360-394.
13. J. von zur Gathen, 'Processor efficient exponentiation in finite fields', Inform. Proc. Letters, 41 (1992), 81-86.
14. J. von zur Gathen and G. Seroussi, 'Boolean circuits versus arithmetic circuits', Inform. and Comp., 91 (1991), 142-154.
15. L.-K. Hua, Introduction to number theory, Springer-Verlag, 1982.
16. D. Ismailov, 'On a method of Hua Loo-Keng of estimating complete trigonometric sums', Adv. Math. (Beijing), 23 (1992), 31-49.
17. R. Kannan, G. Miller and L. Rudolph, 'Sublinear parallel algorithm for computing the greatest common divisor of two integers', SIAM J. Comp., 16 (1987), 7-16.
18. R. Lidl and H. Niederreiter, Finite fields, Addison-Wesley, MA, 1983.
19. B. E. Litow and G. I. Davida, 'O(log(n)) parallel time finite field inversion', Lect. Notes in Comp. Science, 319 (1988), 74-80.
20. M. Mnuk, 'A div(n) depth Boolean circuit for smooth modular inverse', Inform. Proc. Letters, 38 (1991), 153-156.
21. I. Parberry and P. Yuan Yan, 'Improved upper and lower time bounds for parallel random access machines without simultaneous writes', SIAM J. Comp., 20 (1991), 88-99.
22. J. B. Rosser and L. Schoenfeld, 'Approximate formulas for some functions of prime numbers', Ill. J. Math., 6 (1962), 64-94.
23. I. E. Shparlinski, Computational and algorithmic problems in finite fields, Kluwer Acad. Publ., Dordrecht, The Netherlands, 1992.
24. I. E. Shparlinski, 'Number theoretic methods in lower bounds of the complexity of the discrete logarithm and related problems', Preprint, 1997, 1-168.
25. I. E. Shparlinski and S. A. Stepanov, 'Estimates of exponential sums with rational and algebraic functions', Automorphic Functions and Number Theory, Vladivostok, 1989, 5-18 (in Russian).
26. S. B. Steckin, 'An estimate of a complete rational exponential sum', Proc. Math. Inst. Acad. Sci. of the USSR, Moscow, 143 (1977), 188-207 (in Russian).
27. I. Wegener, The complexity of Boolean functions, Wiley Interscience Publ., 1987.
28. A. Weil, Basic number theory, Springer-Verlag, NY, 1974.

Communication-Efficient Parallel Multiway and Approximate Minimum Cut Computation
Friedhelm Meyer auf der Heide and Gabriel Teran Martinez
Heinz Nixdorf Institute and Department of Computer Science, University of Paderborn, D-33095 Paderborn, Germany
fmadh@, gab@hni.uni-paderborn.de
Abstract. We examine different variants of minimum cut problems on undirected weighted graphs on the p-processor bulk synchronous parallel (BSP) model of Valiant. This model and the corresponding cost measure guide algorithm designers to develop work efficient algorithms that need only very little communication. Karger and Stein have presented a recursive contraction algorithm to solve minimum cut problems. They suggest a PRAM implementation of their algorithm that works in polylogarithmic time but is not work-optimal. Typically the problem size n is much larger than the number of processors p on real-world parallel computers ($p \ll n$). For this setting we present improved BSP implementations of the algorithm of Karger and Stein. For the case of multiway cut and approximate minimum cut we obtain optimal, communication efficient results. A nice effect, besides the optimality, is that communication is efficient for a large spectrum of BSP-parameters. In the case of the minimal cut problem our results are close to optimal.

1 Introduction
Most of the research on parallel algorithms for graph optimization problems has focused on fine grained massively parallel models of computation [8,18], such as the parallel random access machine (PRAM), in the last decades. However, the PRAM model is unrealistic because it assumes that interprocessor communication is as fast as internal computation. The weak communication performance of existing parallel systems is a major impediment to the efficient application of parallelism in practice. That is, the true bottleneck in parallel computation is interprocessor communication [4,14,17,20], which mainly suffers from low communication bandwidth and high network latency. The Bulk Synchronous Parallel (BSP) model of Valiant [20] is based on latency and communication bandwidth. A way to attack bandwidth limitation is to develop techniques for distributing data among the processors such that most of the time each processor only needs data from its local memory. BSP algorithms have already been designed for various problems like sorting, multisearch and computational geometry problems [1,5,2,6,7]. A goal of this paper is the design of communication efficient parallel algorithms for several variants of the minimum cut problem for undirected weighted graphs.
Supported by DFG-Graduate College "Parallele Rechnernetzwerke in der Produktionstechnik", ME 872/4-1, by DFG-Sonderforschungsbereich 376 "Massive Parallelität: Algorithmen, Entwurfsmethoden, Anwendungen", by EU ESPRIT Long Term Research Project 20244 (ALCOM-IT).
C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 316-330, 1998. © Springer-Verlag Berlin Heidelberg 1998

1.1 The Problem
Consider some undirected graph G = (V, E) with n nodes, m edges, and positive weights c(e) for all $e \in E$. A cut (A, B) is a partition of V. A cross-edge is an edge which has one node in A and one node in B. The value of a cut is defined as the sum of the weights of the corresponding cross-edges. A solution of a minimum cut problem is a cut with minimum value. Denote by $c_{opt}$ the value of a minimum cut. The objective of the approximate minimum cut problem is to find an $\alpha$-minimal cut for $\alpha > 1$, i.e., a cut (A, B) satisfying $c_{opt} \le c(A, B) \le \alpha \, c_{opt}$. The minimum r-way cut problem is comparable to the minimum cut problem, but V is partitioned into r subsets for r > 2.

1.2 The BSP Model
In the BSP (Bulk Synchronous Parallel) Model of parallel computing of Valiant [20], a parallel computer consists of a number of processors, each of which is equipped with a large local memory. The processors are connected by a router that transports messages between processors. A computation in the BSP model proceeds in a succession of supersteps. Conceptually, a superstep consists of a computation and a communication phase followed by a barrier synchronization. In the computation phase, processors independently perform operations on data that resides in their local memories at the beginning of the superstep. In the communication phase, the messages are transported to their destinations by the router. After the communication phase a barrier synchronization takes place.
In the BSP model a parallel computer is characterized by the following parameters: The parameter p is the number of processors. The parameter l models the communication latency and the time needed for a barrier synchronization. One can view l as the minimum time for a superstep. The parameter g models the gap, i.e., the minimum time between the arrival of succeeding words of data; $g^{-1}$ reflects the available communication bandwidth per processor.
Consider a BSP computation consisting of L supersteps, where in the i-th superstep each processor performs at most $w_i$ local operations and a $c_i$-relation is realized, i.e., in the communication phase each processor sends and receives at most $c_i$ messages. The runtime of the i-th superstep is $w_i + g \cdot c_i + l$. The runtime of the BSP computation is defined as
$$W + g \cdot C + l \cdot L,$$
where $W = \sum_{i=1}^{L} w_i$ and $C = \sum_{i=1}^{L} c_i$. The costs W and C are denoted computation time and communication volume, respectively.
Classical parallel models like the PRAM only aim to be work efficient, i.e., to use time close to T/p (T: sequential complexity of the underlying problem; p: number of processors). The design goal of a BSP algorithm is, in addition to being work efficient, also to reduce the number of supersteps and the communication volume as much as possible, because communication is a main bottleneck in real parallel machines.
Let $T^{A}_{seq}(n)$ be the sequential runtime of algorithm A for some problem with input size n. We call the BSP algorithm work optimal with respect to A if $W = O(T^{A}_{seq}(n)/p)$. It is communication efficient if $g \cdot C = O(W)$ and $L = O(W/l)$ for a large range of values for g, l and p. For a detailed discussion of the BSP model and BSP algorithms see, e.g., [20,5,2].

1.3 Previous Work
The classical approach to solve the minimum cut problem is via reduction to the maximum flow problem. New algorithms for solving the minimum cut problem have been proposed recently. These algorithms are superior to previous algorithms with respect to simplicity and time complexity. Typically the new techniques do not perform any maximum flow computations. For instance, one of these techniques is based on the (deterministic or randomized) identification and contraction of edges which do not belong to a minimum cut. The Scan First Search procedure of Nagamochi and Ibaraki [16] identifies an edge with this property deterministically within linear time. The time complexity of their algorithm is $O(mn + n^2 \log n)$. Stoer and Wagner [19] proposed a simpler version of this algorithm which has the same time complexity. Matula [15] found a linear time $(2 + \varepsilon)$-approximation algorithm for the minimum cut problem.
A randomized version of contraction techniques has been discovered by Karger [9]. The algorithm works as follows: In each round one edge is selected at random and is contracted. This process is repeated until the graph is composed of only two meta-nodes. Karger shows that a certain minimum cut has survived the whole sequence of contractions with probability $\Omega(1/n^2)$.
Stein and Karger [13] proposed the recursive contraction algorithm as a far more efficient variant of the contraction algorithm [9]. It can be thought of as a binary computation tree of depth $\log((n/2)^2)$, whose nodes represent contracted graphs. Each of the $2^k$ contracted graphs of level k, $0 \le k \le \log((n/2)^2)$, has $n/2^{k/2}$ nodes. Particularly, the leaves of the computation tree are contracted graphs with two meta-nodes. Each of these leaves may represent a minimum cut. The algorithms of Karger and Stein use different strategies for traversing the computation tree. For instance, their sequential algorithm makes use of a depth first search-strategy for traversing the computation tree. This algorithm finds a particular minimum cut with probability $\Omega(1/\log n)$. The time complexity of this sequential search process is $O(n^2 \log n)$. Their PRAM implementation uses $n^2$ processors for realizing a breadth first search-strategy. The transition from one level to the next is done in polylogarithmic time. The runtime of their best RNC-implementation (RNC is the class of problems that can be solved by a randomized algorithm in polylogarithmic time using a PRAM with a polynomial number of processors) is $O(\log^3 n)$¹ with $n^2$ processors, i.e., this algorithm is not work optimal.
The methods for the multiway cut and the approximate minimum cut problems are similar. The sequential algorithm finds a particular $\alpha$-minimum cut with probability $\Omega(1/\log n)$ in time $O(n^{2\alpha})$. The parallel implementation uses $n^{2\alpha}$ processors and performs work $O(n^{2\alpha} \log^3 n)$. A particular minimum r-way cut of the multiway cut problem is found sequentially in time $O(n^{2(r-1)})$ or, using parallel methods, in $O(\log^3 n)$ time with $n^{2(r-1)}$ processors. Both PRAM algorithms are not work optimal.

1.4 Our Results
The number p of processors of existing parallel computers is typically much smaller than the input size n. For such values p we obtain trivial BSP algorithms by simulating the corresponding PRAM algorithm on the p processors. A BSP implementation of, e.g., the parallel minimum cut algorithm of Karger and Stein with $p \le n^2$ processors needs $O(\log^3 n)$ supersteps, communication volume $O(n^2 \log^3 n)$, and runtime $O(\frac{n^2}{p} \log^3 n)$ to find a particular minimum cut with probability $\Omega(1/\log n)$. Thus, the high amount of communication makes the algorithm infeasible on real parallel machines. As far as we know, no other algorithms based on the practically relevant BSP model exist for the graph optimization problems treated in this paper.
We use the approximate minimum cut problem in order to explain why the PRAM-algorithms of Karger and Stein are not work optimal. The objective of Karger and Stein is to traverse the computation tree in polylogarithmic time. This is achieved by means of a transformation from one level within the computation tree to the next which is done in polylogarithmic time. The size of all subproblems increases with the level in the computation tree. For instance, the size of all subproblems in the lowest level is $n^{2\alpha}$. Indeed, for this level the PRAM-algorithm needs $n^{2\alpha}$ processors. On the other hand, the problem size is only $n^2$ for level 0. Thus, not all $n^{2\alpha}$ processors are busy in the first levels.
Our method is based on a different idea. We begin with the sequential algorithm, i.e., with only one processor. If we have more processors at our disposal we try to find a strategy for traversing the computation tree which is still optimal. For instance, if two processors are available then each of these processors is assigned one of the two subtrees which have the root of the computation tree as a parent. Both subtrees are then traversed sequentially by the corresponding processor. For larger numbers p of processors we proceed analogously, but not before level log p. A nice side effect, in addition to work optimality, is a saving of communication.
¹ $\log^k n$ denotes $(\log n)^k$.
For the approximate minimum cut problem and the multiway cut problem, respectively, the BSP-algorithms are work optimal for
$$p^{1 - 1/\alpha} \le \frac{n^{2(\alpha - 1)}}{\log n} \quad \text{resp.} \quad p^{\frac{r-2}{r-1}} \le \frac{n^{2(r-2)}}{\log n}$$
and communication efficient for
$$g = o\Bigl( \frac{n^{2(\alpha - 1)}}{p^{1 - 1/\alpha} \log p} \Bigr) \quad \text{resp.} \quad g = o\Bigl( \frac{n^{2(r-2)}}{p^{1 - \frac{1}{r-1}} \log p} \Bigr);$$
for r = 3, $g = o(n^2 / (\sqrt{p} \log p))$.
Our BSP-algorithm for the minimum cut problem is communication efficient only for low values of g and not work optimal, because the best sequential algorithm [11] finds all minimum cuts with high probability in time $O(n^2 \log n)$. Nevertheless, it is by far more efficient than the simulation of the PRAM algorithm of Karger and Stein.
In the next section we review the contraction algorithms. Our BSP algorithms are presented in Sect. 3.
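The BSP cost measure of Sect. 1.2 can be made concrete with a few lines of code. This is a minimal sketch: the `Superstep` record and both function names are illustrative assumptions, and the explicit constant in `is_communication_efficient` merely stands in for the O-notation of the definition.

```python
from dataclasses import dataclass


@dataclass
class Superstep:
    work: int      # w_i: maximum number of local operations of any processor
    relation: int  # c_i: maximum number of messages sent or received by any processor


def bsp_runtime(steps, g, l):
    """Total BSP runtime W + g*C + l*L of a sequence of supersteps."""
    W = sum(s.work for s in steps)
    C = sum(s.relation for s in steps)
    L = len(steps)
    return W + g * C + l * L


def is_communication_efficient(steps, g, l, const=1):
    """Check g*C <= const*W and l*L <= const*W (const is an assumed stand-in for O())."""
    W = sum(s.work for s in steps)
    C = sum(s.relation for s in steps)
    return g * C <= const * W and l * len(steps) <= const * W
```

For two supersteps with (w, c) = (100, 10) and (50, 5), gap g = 4 and latency l = 20, the runtime is 150 + 4·15 + 20·2 = 250, which equals the sum of the per-superstep costs $w_i + g c_i + l$.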

2 The Contraction Algorithm
In this section, we introduce the contraction algorithm for the minimum cut problem following Karger [9]. The main operation of this algorithm is the contraction of an edge $(v_1, v_2)$. This operation replaces two nodes $v_1$ and $v_2$ by one node v and two edges $(u, v_1)$ and $(u, v_2)$ by one edge (u, v) with weight $c(u, v) = c(u, v_1) + c(u, v_2)$. The rest of the graph remains unchanged. The implementation of this algorithm makes use of the weighted $n \times n$ adjacency matrix associated with the graph G. Using this representation of the graph, the contraction of an edge is reduced to elementary manipulations of the rows and columns of the adjacency matrix. For a detailed description of this implementation see [13]. Denote by G/(v, u) the graph resulting from G by the contraction of the edge (v, u). Likewise, denote by G/F the graph resulting from G by the contraction of a set $F \subseteq E$ of edges. In the following, $n \to k$ denotes the contraction of the graph from n to k nodes.

proc Contract(G, n, k)
  repeat until G has k meta-nodes
    choose edge (v, u) with probability proportional to c(v, u);
    G := G/(v, u)
  return G

Fig. 1. The contraction algorithm of Karger
Theorem 1 (Karger [9]). A fixed minimum cut of the graph G survives the contractions to k nodes with probability of at least $\binom{k}{2} / \binom{n}{2} = \Omega((k/n)^2)$. For k = 2 the contraction algorithm returns a certain minimum cut of the graph G with probability $\Omega(1/n^2)$.
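The procedure of Fig. 1 can be sketched on a weighted adjacency matrix as follows. This is an illustrative implementation, not the one described in [13]: the names `contract`, `cut_value` and `repeated_contraction` are assumptions, the edge list is rebuilt in every round for clarity rather than speed, and repeating the contraction a fixed number of times is only the simplest way to exploit the success probability of Theorem 1.

```python
import random


def contract(weights, k, rng):
    """Contract a graph (symmetric weight matrix, zero diagonal) down to k meta-nodes.

    Returns the meta-nodes as a list of sets of original vertices.
    """
    n = len(weights)
    w = [row[:] for row in weights]           # working copy of the adjacency matrix
    groups = {v: {v} for v in range(n)}       # meta-node representative -> members
    while len(groups) > k:
        alive = sorted(groups)
        edges = [(u, v) for i, u in enumerate(alive) for v in alive[i + 1:] if w[u][v] > 0]
        if not edges:                          # graph is disconnected: nothing to contract
            break
        # Choose an edge with probability proportional to its weight.
        u, v = rng.choices(edges, weights=[w[a][b] for a, b in edges])[0]
        # Merge v into u: add row/column v to row/column u.
        for x in alive:
            if x != u and x != v:
                w[u][x] += w[v][x]
                w[x][u] = w[u][x]
        groups[u] |= groups.pop(v)
    return list(groups.values())


def cut_value(weights, side):
    """Sum of the weights of all cross-edges between `side` and its complement."""
    n = len(weights)
    return sum(weights[a][b] for a in side for b in range(n) if b not in side)


def repeated_contraction(weights, runs, seed=0):
    """Run Contract(G, n, 2) several times and keep the best cut found."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        parts = contract(weights, 2, rng)
        value = cut_value(weights, parts[0])
        if best is None or value < best[0]:
            best = (value, parts)
    return best
```

On a small graph made of two heavy triangles joined by one light edge, a few hundred repetitions should recover the bridge as the minimum cut with overwhelming probability.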

2.1 The Recursive Contraction Algorithm
A more efficient version of the contraction algorithm is based on the following idea of Karger and Stein: The probability of the first contracted edge to belong to the minimum cut is only 2/n. However, for the last contraction we already have a probability of 2/3. Thus, the probability for a minimum cut to survive a few contractions is rather large. For instance, this probability is approximately 1/2 if the graph is contracted to $n/\sqrt{2}$ nodes. Therefore, a certain minimum cut is expected to survive almost surely a (twice) repeated contraction. The recursive continuation of this approach leads to the recursive contraction algorithm.

proc RContract(G, n)
  if n = 2
    then return cut
    else repeat the following steps twice
      G' := Contract(G, n, n/√2);
      RContract(G', n/√2)

Fig. 2. The recursive contraction algorithm
Theorem 2 (Karger and Stein [13]). The recursive contraction finds a particular minimum cut with probability $\Omega(1/\log n)$ and time $O(n^2 \log n)$. It finds all minimum cuts with high probability in time $O(n^2 \log^3 n)$.
The recursive contraction algorithm can be represented by means of a binary computation tree. Each node of this tree is associated with a contracted graph; an edge represents a reduction of the nodes by the factor $\sqrt{2}$ by executing Contract(G, n, $n/\sqrt{2}$). The reduction factor for the minimum cut problem is said to be $\sqrt{2}$. The leaves of the tree are contracted graphs with two meta-nodes which may represent a minimum cut. The depth of the tree is $\log(n^2) + O(1)$, the number of leaves is $O(n^2)$.
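The shape of this computation tree can be tabulated directly. The sketch below idealizes the node counts as real numbers $n/2^{k/2}$ (an actual implementation would round them), and the function names are illustrative assumptions. It shows why each level carries the same total adjacency-matrix size $n^2$, which together with the logarithmic depth is consistent with the $O(n^2 \log n)$ sequential time.

```python
import math


def tree_depth(n):
    """Depth of the computation tree: levels needed to shrink n nodes to 2 by factor sqrt(2)."""
    return math.ceil(2 * math.log2(n / 2))


def level_profile(n):
    """List of (level k, number of graphs 2^k, idealized nodes per graph n / 2^(k/2))."""
    return [(k, 2 ** k, n / 2 ** (k / 2)) for k in range(tree_depth(n) + 1)]


def total_matrix_entries(n, k):
    """Total adjacency-matrix size at level k: 2^k graphs with (n / 2^(k/2))^2 entries each."""
    return 2 ** k * (n / 2 ** (k / 2)) ** 2
```

For n = 16 the tree has depth 6 and 64 = (n/2)^2 leaves, each a graph with two meta-nodes.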
3 The BSP Algorithms
We describe our BSP algorithms by giving a strategy for traversing the computation tree. Reducing the communication volume and the number of supersteps are two important objectives in the design of BSP algorithms. Taking these objectives into account, we need a clever distribution of the nodes of the computation tree among the processors. Essentially, the computation tree is traversed in $\log p + 1$ phases, where the last phase consists only of local computations. In phase $k$ ($k = 0, \ldots, \log p - 1$), all contractions associated with the transition from level $k$ to level $k + 1$ within the computation tree are performed in parallel in several supersteps; each contraction is assigned the same number of processors. In phase $\log p$, each of the $p$ nodes of level $\log p$ is assigned to exactly one processor. Thus, the computation is local in this phase: no communication is needed in order to traverse each of the $p$ subtrees.

We denote by $n_k$ the number of nodes of a contracted graph in level $k$ ($k = 0, \ldots, \log p$). All $2^k$ contracted graphs in level $k$ have the same number of nodes $n_k$. For representing the graph, we use the adjacency matrix as a data structure. At the beginning, each of the $p$ processors stores $n/p$ rows of the adjacency matrix. Without loss of generality we assume that the number of processors $p$ is a power of 2. For each level $k$ ($k = 0, \ldots, \log p$), we group the $p$ processors into $2^k$ sets of processors $P_{k,j} = [P_{j\,p/2^k + 1}, \ldots, P_{j\,p/2^k + p/2^k}]$, $0 \le j < 2^k$.

The invariant of the BSP algorithm over all phases $k$ ($k = 0, \ldots, \log p$) is: at the beginning of phase $k$, each of the contracted graphs in level $k$ is stored in exactly one processor set $P_{k,j}$, $0 \le j < 2^k$. The rows of the $n_k \times n_k$ adjacency matrix of a contracted graph are distributed uniformly among the processors of $P_{k,j}$. The invariant holds for phase 0 because of the assumed distribution of the adjacency matrix of the input graph. We give a proof of the invariant in the next section by a detailed description of the transition from level $k$ to $k + 1$ in the computation tree.

The computation trees for the minimum cut, the approximately minimum cut, and the multiway cut problem are comparable. In the subsequent sections we explain the BSP algorithms for these problems.

3.1 Minimum Cut
The reduction factor for the minimum cut problem is $\sqrt{2}$. We summarize the properties of the computation tree for this problem.
Lemma 1. (1) The number of nodes of a contracted graph in level $k$ is $n / 2^{k/2}$. (2) The depth of the computation tree is $\log((n/2)^2) + O(1)$. (3) Each contracted graph in level $\log p$ has $n/\sqrt{p}$ nodes.

Proof. (1) The result follows from $n_0 = n$ and $n_{k+1} = n_k / \sqrt{2}$. (2) The depth $t$ of the computation tree follows from $n_t = 2$. (3) The number of nodes of a contracted graph in level $\log p$ is $n_{\log p} = n / 2^{(\log p)/2} = n/\sqrt{p}$.
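As a quick numeric illustration of Lemma 1, the following Python snippet lists the sizes $n_k = n/2^{k/2}$ for the levels $0, \ldots, \log p$. It is a sketch only: the function name `level_sizes` and the sample values are assumptions made here, and $p$ is assumed to be a power of 2 as in the text.

```python
import math

def level_sizes(n, p):
    """Sizes n_k = n / 2**(k/2) for levels k = 0 .. log2(p)."""
    levels = int(math.log2(p))          # p is assumed to be a power of 2
    return [n / 2 ** (k / 2) for k in range(levels + 1)]

sizes = level_sizes(1024, 16)
print(sizes)
```

For $n = 1024$ and $p = 16$ the last entry is $1024/\sqrt{16} = 256$, in agreement with part (3) of the lemma.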
We give a detailed description of our implementation in the following two paragraphs. We first consider the last phase $\log p$. The description of the transition from phase $k$ to $k + 1$ in the subsequent paragraph is the induction step in the proof of the invariant.
The Phase $\log p$. In phase $\log p$, each of the $p$ contracted graphs of level $\log p$ of the computation tree is assigned to exactly one of the $p$ processors. That is, a processor stores exactly one graph. This will become clear from the discussion of the implementation of the previous phases in the next paragraph. From the above lemma, we know that each of the graphs has $n/\sqrt{p}$ nodes. In this phase, each of the $p$ computation subtrees is traversed sequentially by the corresponding processor. Therefore, this phase consists of one superstep without any communication, whose computation time is $O\left(\frac{n^2}{p} \log n\right)$.

The Transition from Phase $k$ to $k + 1$. We describe the phases 0 to $\log p - 1$ by a detailed description of the transition from level $k$ to $k + 1$, for $0 \le k < \log p$. At the beginning of phase $k$ each processor set $P_{k,j}$, $j = 0, \ldots, 2^k - 1$, stores one contracted graph $G_{k,j}$ with $n_k$ nodes from level $k$ of the computation tree. Because all processor sets $P_{k,j}$ execute the same procedure, we describe the algorithm only for processor set $P_{k,0}$.

proc Contract($G_k$, $n_k \to n_{k+1}$, $P_{k+1}$)
    generate permutation $L$ of the edges of $G_k$ using exponentially distributed scores;
    return Compact($G_k$, $n_{k+1}$, $L$, $P_{k+1}$)
Fig. 3. A parallel version of Contract
At the beginning of phase $k$ we divide the processor set $P_{k,0}$ into two equally sized processor sets $P_{k+1,0}$ and $P_{k+1,1}$. So that each of the processor sets $P_{k+1,0}$ and $P_{k+1,1}$ can execute the procedure Contract($G_{k,0}$, $n_k \to n_{k+1}$), both processor sets must store the adjacency matrix of $G_{k,0}$. We achieve this by realizing a simple $2(n_k)^2/p_k$-relation. The idea is that each processor of $P_{k+1,0}$ informs exactly one processor of $P_{k+1,1}$ about the part of the graph it has stored, and vice versa. With Contract($G_k$, $n_k \to n_{k+1}$, $P_{k+1}$) we denote the procedure in which the processor set $P_{k+1,0}$ executes the procedure Contract($G_{k,0}$, $n_k \to n_{k+1}$).

Our procedure Contract($G_k$, $n_k \to n_{k+1}$, $P_{k+1}$) is a BSP implementation of a parallel algorithm of Karger [9] (see Fig. 3), with $p_k = p/2^{k+1}$ processors. Instead of repeatedly choosing one edge at random, it generates a list of edges. This list determines the order in which edges are contracted. The procedure Compact($G_k$, $n_{k+1}$, $L$, $P_{k+1}$) is responsible for the contraction of edges until the graph consists of $n_{k+1}$ nodes. It has to find a prefix $L'$ of edges from the list $L$ whose contraction results in a graph having $n_{k+1}$ nodes (see Fig. 4). This prefix is found by means of binary search. The number of connected components of the graph $(V(G_k), L')$ is determined in order to test whether the contraction of a prefix $L'$ of $L$ yields a contracted graph having $n_{k+1}$ nodes.
proc Compact($G_k$, $n_{k+1}$, $L$, $P_{k+1}$)
    if $G_k$ has $n_{k+1}$ nodes then return $G_k$
    else let $L_1$ be the first and $L_2$ the second half of $L$
        $l_1 :=$ number of connected components of $(V(G_k), L_1)$
        if $l_1 \le n_{k+1}$ then Compact($G_k$, $n_{k+1}$, $L_1$, $P_{k+1}$)
        else Compact($G_k/L_1$, $n_{k+1}$, $L_2/L_1$, $P_{k+1}$)
Fig. 4. The procedure Compact
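The interplay of Fig. 3 and Fig. 4 can be sketched sequentially in Python. The sketch makes several simplifying assumptions that are not part of the paper: it runs on one processor, it draws exact exponential scores with `random.expovariate` (rate equal to the edge weight), and it replaces the recursive halving of Compact by an ordinary binary search that recounts the connected components of each candidate prefix with a union-find structure. All function names are hypothetical.

```python
import random

def count_components(n, edges):
    """Number of connected components of ({0..n-1}, edges) via union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    comps = n
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            comps -= 1
    return comps

def random_edge_order(weighted_edges, rng=random):
    """Order edges by exponentially distributed scores (rate = weight > 0)."""
    scored = [(rng.expovariate(w), u, v) for u, v, w in weighted_edges]
    scored.sort()
    return [(u, v) for _, u, v in scored]

def shortest_prefix(n, order, target):
    """Smallest m such that (V, order[:m]) has at most `target` components.

    The component count never increases with m and drops by at most one
    per edge, so the prefix found has exactly `target` components
    (assuming the whole graph has at most `target` components).
    """
    lo, hi = 0, len(order)
    while lo < hi:
        mid = (lo + hi) // 2
        if count_components(n, order[:mid]) <= target:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

Contracting the edges of `order[:m]` then yields a graph with `target` meta-nodes. The procedure in Fig. 4 avoids the repeated recounting by contracting $L_1$ before recursing on $L_2/L_1$.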
In the case of an unweighted graph the generation of the list $L$ of edges is relatively simple. It works as follows: each edge is assigned a score which is chosen at random from the unit interval according to the uniform distribution. Then, the list $L$ is determined by sorting the edges according to the score values. This method also works for weighted graphs. However, in this case the score of an edge is the realization of an exponentially distributed random variable. Karger proposed an efficient implementation in his PhD thesis [10].

Lemma 2 (Karger [10]). In $O(\log n_k)$ steps per edge, it is possible to assign to each edge an approximately exponentially distributed score such that all comparisons are the same as for exact exponential distributions, with high probability.

Now we are in a position to explain the BSP implementation in more detail and to estimate the corresponding cost. Our BSP implementation uses the BSP sorting algorithm of Goodrich [6] and the BSP algorithm of Caceres et al. [3] for computing the connected components of a graph.

Lemma 3. Consider a phase $k$, $0 \le k < \log p$.

(1) The cost for generating the list $L$ of the edges of $G_k$ is
$$W = O\left(\frac{(n_k)^2}{p_k} \log (n_k)^2\right), \quad C = O\left(\frac{(n_k)^2}{p_k} \cdot \frac{\log (n_k)^2}{\log \frac{(n_k)^2}{p_k}}\right), \quad \text{and} \quad L = O\left(\frac{\log (n_k)^2}{\log \frac{(n_k)^2}{p_k}}\right).$$

(2) The cost of the BSP implementation of Compact($G_k$, $n_{k+1}$, $L$, $P_{k+1}$) is
$$W = O\left(\frac{n_k^2}{p_k} \log p_k\right), \quad C = O\left(\frac{n_k^2}{p_k} \log p_k\right), \quad \text{and} \quad L = O\left(\log p_k \, \log (n_k)^2\right).$$

(3) The cost of the BSP implementation of Contract($G_k$, $n_k \to n_{k+1}$, $P_{k+1}$) is
$$W = O\left(\frac{(n_k)^2}{p_k} \log (n_k)^2\right), \quad C = O\left(\frac{(n_k)^2}{p_k} \left(\frac{\log (n_k)^2}{\log \frac{(n_k)^2}{p_k}} + \log p_k\right)\right), \quad \text{and} \quad L = O\left(\log p_k \, \log (n_k)^2\right).$$
Proof. (1) The generation of a score for each edge is a local computation which can be done within one superstep. Since each processor stores at most $(n_k)^2/p_k$ edges, the corresponding computation time is $O\left(\frac{n_k^2}{p_k} \log n_k^2\right)$ by Lemma 2. In order to sort the $n_k^2$ edges according to their scores, we apply the BSP algorithm of Goodrich [6] with $p_k$ processors. The cost of this BSP algorithm is
$$W = O\left(\frac{(n_k)^2}{p_k} \log (n_k)^2\right), \quad C = O\left(\frac{(n_k)^2}{p_k} \cdot \frac{\log (n_k)^2}{\log \frac{(n_k)^2}{p_k}}\right), \quad \text{and} \quad L = O\left(\frac{\log (n_k)^2}{\log \frac{(n_k)^2}{p_k}}\right).$$

(2) For computing the connected components in the implementation of the procedure Compact we make use of the BSP algorithm of Caceres et al. [3]. For the computation time of Compact it holds that
$$W((n_k)^2, p_k) = O\left(\frac{(n_k)^2 \log p_k}{p_k}\right) + W((n_k)^2/2, p_k),$$
where the first part of the right-hand side essentially expresses the computation time of the BSP algorithm of Caceres et al. [3] for computing the connected components of $(V(G_k), L_1)$. We solve this recurrence to
$$W((n_k)^2, p_k) = O\left(\frac{(n_k)^2 \log p_k}{p_k}\right).$$
Analogously, for the communication volume we have
$$C((n_k)^2, p_k) = O\left(\frac{(n_k)^2 \log p_k}{p_k}\right) + C((n_k)^2/2, p_k),$$
where the first part of the right-hand side corresponds to the communication volume of the BSP algorithm of Caceres et al. The number of supersteps is
$$L((n_k)^2, p_k) = O(\log p_k) + L((n_k)^2/2, p_k) = O(\log p_k \, \log (n_k)^2).$$
The term $\log p_k$ is the number of supersteps in the algorithm of Caceres et al. [3].

(3) The overall cost results from (1) and (2).

We are now in a position to estimate the cost for all phases.

Theorem 3. Our BSP algorithm finds a particular minimum cut with probability $\Omega(1/\log n)$. Its cost is
$$W = O\left(\frac{n^2 \log p \, \log n}{p}\right), \quad C = O\left(\frac{n^2 \log^2 p}{p}\right), \quad \text{and} \quad L = O(\log n \, \log^2 p).$$
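Proof. The computation time of the phases $k$, $0 \le k < \log p$, is given by
$$\sum_{k=0}^{\log p - 1} W(\text{Phase } k) = \sum_{k=0}^{\log p - 1} O\left(\frac{(n_k)^2}{p_k} \log (n_k)^2\right) = O\left(\sum_{k=0}^{\log p - 1} \frac{n^2/2^k}{p/2^{k+1}} \log \frac{n^2}{2^k}\right) = O\left(\frac{n^2}{p} \log p \, \log n\right).$$
The computation time of the last phase $\log p$ is given by
$$O\left(\left(\frac{n}{\sqrt{p}}\right)^2 \log \frac{n}{\sqrt{p}}\right) = O\left(\frac{n^2}{p} \log n\right).$$
Thus, we obtain
$$W = O\left(\frac{n^2 \log p \, \log n}{p}\right)$$
for the computation time of the algorithm. For the communication volume of all phases it holds that
$$C = \sum_{k=0}^{\log p - 1} C(\text{Phase } k) = \sum_{k=0}^{\log p - 1} O\left(\frac{(n_k)^2}{p_k} \left(\frac{\log (n_k)^2}{\log \frac{n_k^2}{p_k}} + \log p_k\right)\right) = O\left(\frac{n^2}{p} \log^2 p\right).$$
The number of supersteps of the algorithm is
$$L = \sum_{k=0}^{\log p - 1} L(\text{Phase } k) = \sum_{k=0}^{\log p - 1} O(\log p_k \, \log (n_k)^2) = O(\log n \, \log^2 p).$$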
The work performed by our algorithm is larger by a factor $O(\log p)$ than the work of the sequential algorithm by Karger and Stein (but better by a factor $O(\log^2 n / \log p)$ than that of their PRAM algorithm). This is a consequence of the algorithm of Caceres et al., which needs $O(\log p)$ supersteps for computing the connected components.
3.2 Approximate Minimum Cuts
The contraction algorithm of Karger and Stein can be adapted in order to compute approximations of minimum cuts. Let $\alpha > 1$.

Theorem 4 (Karger and Stein [13]). (1) The number of $\alpha$-minimal cuts is $O((2n)^{2\alpha})$. (2) A particular $\alpha$-minimal cut can be found with probability $\Omega(1/\log n)$ in time $O(n^{2\alpha})$. (3) All $\alpha$-minimal cuts can be found in time $O(n^{2\alpha} \log n)$ w.h.p.

The algorithm for generating a particular $\alpha$-minimal cut corresponds to the recursive contraction algorithm with the following modifications:
The reduction factor is $2^{1/(2\alpha)}$, i.e., Contract($G$, $n \to n/2^{1/(2\alpha)}$) is executed instead of Contract($G$, $n \to n/\sqrt{2}$).
The contraction procedure ends if the number of nodes of the graph is $\lceil 2\alpha \rceil$.

Lemma 4. The computation tree for the $\alpha$-minimal cut has the following properties: (1) The depth of the computation tree is $\log\bigl((n/\lceil 2\alpha \rceil)^{2\alpha}\bigr) + O(1)$. (2) The number of nodes of a contracted graph in level $k$ is $n_k = n / 2^{k/(2\alpha)}$. (3) A contracted graph in level $\log p$ has $n / p^{1/(2\alpha)}$ nodes.

Proof. Analogous to that of Lemma 1.

Our BSP scheme for traversing the computation tree is the same for the minimum cut problem and the $\alpha$-minimum cut problem. The computation is divided into $\log p + 1$ phases. Each processor stores one instance of an $\alpha$-minimum cut problem with $n / p^{1/(2\alpha)}$ nodes at the beginning of the final phase $\log p$. These problems can be solved by the corresponding processor within one superstep and without communication between the $p$ processors. The phase $k$, $0 \le k < \log p$, is composed of several supersteps. Each Contract($G$, $n_k \to n_k/2^{1/(2\alpha)}$) is executed with $p_k = p/2^{k+1}$ processors.

Theorem 5. Our BSP algorithm finds a particular $\alpha$-minimal cut with probability $\Omega(1/\log n)$. Its cost is
$$W = O\left(\frac{n^2}{p^{1/\alpha}} \log n^2 + \frac{n^{2\alpha}}{p}\right), \quad C = O\left(\frac{n^2}{p^{1/\alpha}} \log p\right), \quad \text{and} \quad L = O(\log n \, \log^2 p).$$

Proof. Analogous to that of Theorem 3.

Corollary 1. Our BSP algorithm finds all $\alpha$-minimal cuts w.h.p. in optimal time $O\left(\frac{n^{2\alpha}}{p} \log^2 n\right)$ for $n^{2(\alpha-1)} \ge p^{1-1/\alpha} \log n$. It is communication efficient for $g = o\left(\frac{n^{2(\alpha-1)}}{p^{1-1/\alpha} \log p}\right)$.
3.3 Multiway Cuts
Karger and Stein have shown that their techniques can also be applied to the minimum $r$-way cut problem. Let $r > 3$ be a fixed integer.

Theorem 6 (Karger and Stein [13]). (1) The number of minimum $r$-way cuts is $O(n^{2(r-1)})$. (2) A particular minimum $r$-way cut can be found in time $O(n^{2(r-1)})$ with probability $\Omega(1/\log n)$. (3) All minimum $r$-way cuts can be found w.h.p. in time $O(n^{2(r-1)} \log n)$.

The algorithm for generating all minimum $r$-way cuts corresponds to the recursive contraction algorithm modified in the following way:
The reduction factor is $2^{1/(2(r-1))}$, i.e., Contract($G$, $n \to n/2^{1/(2(r-1))}$) is executed.
The contraction process is finished if the graph has $r$ nodes.

Lemma 5. The computation tree for the minimum $r$-way cut has the following properties: (1) The depth of the computation tree is $\log n^{2(r-1)} + O(1)$. (2) The number of nodes of a contracted graph in level $k$ is $n_k = n / 2^{k/(2(r-1))}$. (3) A contracted graph in level $\log p$ has $n / p^{1/(2(r-1))}$ nodes.

Proof. Analogous to that of Lemma 1.

The proof of the following result is similar to that of Theorem 3.

Theorem 7. Our BSP algorithm finds a particular minimum $r$-way cut with probability $\Omega(1/\log n)$. Its cost is
$$W = O\left(\frac{n^2}{p^{1/(r-1)}} \log n^2 + \frac{n^{2(r-1)}}{p}\right), \quad C = O\left(\frac{n^2}{p^{1/(r-1)}} \log p\right), \quad L = O(\log n \, \log^2 p).$$

Corollary 2. Our BSP algorithm finds all minimum $r$-way cuts w.h.p. in optimal time $O\left(\frac{n^{2(r-1)}}{p} \log^2 n\right)$ for $n^{2(r-2)} \ge p^{(r-2)/(r-1)} \log n$ and is communication efficient for $g = o\left(\frac{n^{2(r-2)}}{p^{1 - 1/(r-1)} \log p}\right)$.
4 Conclusion
We have proposed efficient BSP algorithms for the minimum cut problem, the $\alpha$-minimum cut problem, and the multiway cut problem. These algorithms are based on the contraction algorithm of Karger and Stein [13]. The question whether an optimal BSP implementation for the new algorithm of Karger exists has not been answered yet. An algorithm for computing the connected components with $O(1)$ supersteps would improve our algorithm by the factor $\log p$. For the multiway and $\alpha$-minimum cut problems with realistic parameters our BSP algorithms are optimal and communication efficient.

A nice property of the recursive contraction algorithm is the generation of $p$ small problems in level $\log p$. These small problems can be solved by other sequential algorithms, such as the deterministic algorithm of Nagamochi and Ibaraki [16] in time $O(n^3/p^{3/2})$ or the new algorithm of Karger [11] in time $O(n^2/p)$, for the minimum cut problem. This implies that only $O(\log p \, \log n)$ instead of $O(\log^2 n)$ repetitions are necessary in order to achieve high probability. This effect may be quite interesting for practical implementations.

For the all terminal network reliability problem Karger [12] has proposed a fully polynomial randomized approximation scheme (FPRAS). An important part of this algorithm is the generation of all $\alpha$-minimum cuts using the recursive contraction algorithm. It is an interesting question whether our BSP algorithm for the $\alpha$-minimum cut problem can be extended to a BSP-FPRAS for the all terminal network reliability problem.
References
1. M. Adler, J. W. Byers, R. M. Karp, Parallel sorting with limited bandwidth, SPAA 95, pages 129–136, 1995.
2. A. Bäumker, W. Dittrich, and F. Meyer auf der Heide, Truly efficient parallel algorithms: 1-optimal multisearch for an extension of the BSP model, ESA 95, 1995.
3. E. Caceres, F. Dehne, A. Ferreira, P. Flocchini, I. Rieping, A. Roncato, N. Santoro, S. W. Song, Efficient parallel graph algorithms for coarse grained multicomputers and BSP, ICALP 97, Bologna, Italy, 1997.
4. D. E. Culler, R. M. Karp, D. A. Patterson, A. Sahay, K. E. Schauser, E. Santos, R. Subramonian, T. von Eicken, LogP: Towards a realistic model of parallel computation, in Proc. 4th ACM SIGPLAN Symp. on Princ. and Practice of Parallel Programming, pages 1–12, 1993.
5. A. V. Gerbessiotis, L. G. Valiant, Direct bulk-synchronous parallel algorithms, J. of Parallel and Distributed Computing, 22:251–267, 1994.
6. M. T. Goodrich, Communication-efficient parallel sorting, STOC 96, 1996.
7. M. T. Goodrich, Randomized fully-scalable BSP techniques for multisearching and convex hull construction, SODA 97, 1997.
8. J. Jaja, An Introduction to Parallel Algorithms, Addison-Wesley, Reading, Mass., 1992.
9. D. R. Karger, Global min-cuts in RNC and other ramifications of a simple mincut algorithm, SODA 93, pages 21–30, 1993.
10. D. R. Karger, Random Sampling in Graph Optimization Problems, PhD thesis, Stanford University, 1994.
11. D. R. Karger, Minimum cuts in near-linear time, STOC 96, pages 56–63, 1996.
12. D. R. Karger, A randomized fully polynomial approximation scheme for the all terminal network reliability problem, STOC 95, pages 11–17, 1995.
13. D. R. Karger, C. Stein, A new approach to the minimum cut problem, Journal of the ACM, 43(4):601–640, 1996.
14. Y. Mansour, N. Nisan, U. Vishkin, Trade-offs between communication throughput and parallel time, STOC 94, pages 372–381, 1994.
15. D. W. Matula, A linear $2 + \varepsilon$ approximation algorithm for edge connectivity, SODA 93, pages 500–504, 1993.
16. H. Nagamochi, T. Ibaraki, Computing edge connectivity in multigraphs and capacitated graphs, SIAM J. of Discrete Mathematics, 5:54–66, 1992.
17. C. Papadimitriou, M. Yannakakis, Towards an architecture-independent analysis of parallel algorithms, STOC 88, pages 510–513, 1988.
18. J. H. Reif, Synthesis of Parallel Algorithms, Morgan Kaufmann Publishers, Inc., San Mateo, CA, 1993.
19. M. Stoer, F. Wagner, A simple min cut algorithm, ESA 94, pages 141–147, 1994.
20. L. G. Valiant, A bridging model for parallel computation, Comm. ACM, 33:103–111, 1990.
The Geometry of Browsing (Invited Paper)
Richard Beigel (1) and Egemen Tanin (2)
(1) Lehigh University, Dept. of EE&CS, 19 Memorial Dr W Ste 2, Bethlehem, PA 18015-3084, USA, [email protected]
(2) University of Maryland, Human Computer Interaction Laboratory, College Park, MD 20742-3251, USA, [email protected]
Abstract. We present a geometric counting problem that arises in browsing and solve it in constant time per query using nonexhaustive tables. On the other hand, we prove that several closely related problems require exhaustive tables, no matter how much time we allow per query.
1 Introduction
In this paper we address some algorithmic problems that arise in connection with browsing a really large collection of datasets. The interface paradigms we present have been developed and user-tested as part of the visual data mining effort (cf. [14]) at the Human Computer Interaction Laboratory (HCIL) in the University of Maryland. The target application is the EOSDIS collection of datasets in development by the U.S. National Aeronautics and Space Administration (NASA).
1.1 What Is EOSDIS?
The NASA EOSDIS (Earth Observing System Data and Information System) project is attempting to provide online access to a rapidly growing archive of scientific data about the Earth's land, water, and air. Data is collected from satellites, ships, aircraft, and ground crews, and stored in designated archive centers. Scientists, teachers, and the general public are given access to this data via the Internet. HCIL is developing user interfaces that use the dynamic query paradigm and the query preview paradigm to facilitate the browsing and retrieval of data from this very large archive. The background for interfaces to EOSDIS is discussed further on the web at [6].

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 331–340, 1998. © Springer-Verlag Berlin Heidelberg 1998
1.2 What Are Dynamic Queries?
Dynamic queries are an interface paradigm that allows the user to interactively control query parameters and generate a rapidly animated visual display of database search results [1,2,5,7,8,10,11,12,13,15,16]. As users adjust sliders or buttons, results are updated nearly continuously on the display. Each adjustment to a slider and each button click is called a query. The answer to the query is presented graphically. Experimental results have shown that dynamic queries are a fast, effective, fun, and easy-to-use tool for novice and expert users to find trends and spot exceptions [2,16]. Dynamic query user interfaces apply the principles of direct manipulation to query formulation and provide: a visual representation of the query and the results; rapid, incremental, and reversible actions; selection by pointing (not typing); and immediate and continuous display of results. Some demos of dynamic queries are available from [10,12].

1.3 What Are Query Previews?
In a networked information system, there are three major obstacles facing users in a querying process: slow network performance, large data volume, and data complexity. EOSDIS has all of these. The collection is predicted to reach into the petabytes ($10^{15}$ bytes). Additionally, EOSDIS datasets have numerous attributes, such as when and where the data was collected, and the types of features or measurements in the dataset. With a forms-based interface, finding a particular dataset or a group of datasets that match certain characteristics would typically involve several iterations of querying and waiting for results over the network. This would not only slow down the process of finding datasets, but would slow down the network and servers for all users.

Query previews, as developed in [4], take advantage of the fact that in many cases, perhaps most, users are only interested in a small subset of the entire collection. For example, a user might only be interested in data for Europe, instead of the whole world. Narrowing the scope of the collection can greatly improve the efficiency of browsing and querying.

A query previewer gives the user overviews of the entire collection, such as a map showing the distribution of datasets over the Earth. The total number of datasets is also displayed. Using a dynamic-query interface, the user can narrow his search to a selected region of latitude and longitude by adjusting sliders. With another slider, the user can narrow his search to a certain range of years. With checkboxes, the user can narrow his search to only those datasets containing selected attributes (for example, temperature and pressure). As search parameters are adjusted (query), the distribution of datasets and the total count are updated.

Once the scope of the search is sufficiently narrow, i.e., the number of datasets matching the dynamic query is manageably small, the user and the system are ready for more detailed query and exploration. This second query phase, called query refinement, is another dynamic-query interface to a more detailed view of the datasets selected by the query preview. The details of query refinement are independent of the query previewer and will not be addressed in this paper.

Practical Requirements. Queries will be answered via a computation that consults a table of summary information about all datasets. Since the query previewer is a dynamic-query interface, updates should ideally appear to be continuous. Studies suggest that most users will in fact tolerate a delay of at most 0.1 or 0.2 seconds per update [1]. Since even this is much shorter than typical World Wide Web turnaround times, it is necessary that the aforementioned table be stored on the user's own node. Thus, disk-space limitations and download times both dictate that the table not be very large; 1 megabyte seems like a good rule of thumb. To summarize, we need small tables that support fast querying.
2 The Query-Previewing Problem
In the EOSDIS example, each dataset is described by a 4-dimensional record whose fields indicate the scope of information that the dataset contains: (1) range of latitude, (2) range of longitude, (3) range of years, and (4) set of attributes. The user's query is also a record of this type. An EOSDIS dataset matches the query if its range of latitudes overlaps with the user's range of latitudes, and so on for the other three dimensions. The result of the query is the number of EOSDIS datasets that match the query.

In order to compute the distribution of datasets over the Earth, the map is partitioned into squares and one query of the type described above is evaluated for each square in order to determine the number of datasets with information about that region of the Earth. Because we will, in fact, give a constant-time querying algorithm, the partition of the Earth into squares can in practice be very fine.

2.1 Geometric Interpretation
The first three fields in a record are ranges of numbers, so they can be represented as intervals. Let us ignore temporarily the fourth field (set of attributes). Then the record can be represented as a 3-dimensional rectangle, the Cartesian (cross) product of those three intervals. An EOSDIS record matches the query if the two corresponding rectangles overlap, i.e., if their intersection is nonempty. Please forgive us for belaboring an obvious point: rectangle overlap is a logical AND, i.e., two rectangles overlap if the intervals along the first dimension overlap and the intervals along the second dimension overlap and the intervals along the third dimension overlap.

Let us return to the fourth field in our EOSDIS records, the set of attributes. We will represent each attribute by a number, so the fourth field is a set of numbers. At the risk of forcing the geometric metaphor, we will call a set of numbers a generalized interval.

Because the Earth is round, it may also be necessary to consider intervals in dimension (2) that wrap around from the right edge of the map to the left edge. These are called wrapped intervals.

2.2 Formal Problem Statements
In general we will consider records with $d$ fields, giving rise to $d$-dimensional problems.

Definition 1.
$\mathbb{N} = \{0, 1, \ldots\}$, the set of natural numbers.
$\mathbb{N}^d$ is the set of all $d$-dimensional lattice points in the 1st quadrant.
An interval is a set $\{a, a + 1, \ldots, b\}$ of consecutive natural numbers.
A wrapped interval in $\{1, \ldots, m\}$ is either an interval or the union of two intervals that contain 1 and $m$.
A generalized interval is a subset of $\mathbb{N}$.
A rectangle in $\mathbb{N}^d$ is a cross-product $I_1 \times \cdots \times I_d$ of $d$ intervals $I_1, \ldots, I_d$.
A wrapped rectangle in $\mathbb{N}^d$ is a cross-product $I_1 \times \cdots \times I_d$ of $d$ wrapped intervals $I_1, \ldots, I_d$.
A generalized rectangle in $\mathbb{N}^d$ is a cross-product $I_1 \times \cdots \times I_d$ of $d$ generalized intervals $I_1, \ldots, I_d$.
$A$ intersects $B$ if $A \cap B \ne \emptyset$.
$A$ is skew to $B$ if there is no hyperplane parallel to the coordinate axes that intersects both $A$ and $B$.

In this paper, we will be mainly interested in three problems:

Rectangle Intersection (logical AND).
Data to be preprocessed: A list $D$ of rectangles in $\mathbb{N}^d$
Problem Instance: A single rectangle $Q$ in $\mathbb{N}^d$
Question: How many elements of $D$ intersect $Q$?

Wrapped Rectangle Intersection (logical AND).
Data to be preprocessed: A list $D$ of wrapped rectangles in $\{1, \ldots, m\}^d$
Problem Instance: A single wrapped rectangle $Q$ in $\{1, \ldots, m\}^d$
Question: How many elements of $D$ intersect $Q$?

Generalized Rectangle Intersection (logical AND).
Data to be preprocessed: A list $D$ of generalized rectangles in $\mathbb{N}^d$
Problem Instance: A single generalized rectangle $Q$ in $\mathbb{N}^d$
Question: How many elements of $D$ intersect $Q$?

Although we do not state it as a formal problem, in practice we are interested in the mixed case, where some dimensions of the records are intervals, others are wrapped intervals, and yet others are generalized intervals. Our results for the homogeneous problems described above are directly applicable to the mixed case.

The following problem corresponds to queries based on logical OR rather than logical AND. Because such queries are sometimes useful in browsing, we consider them as well.

Rectangle Nonskewness (logical OR).
Data to be preprocessed: A list $D$ of rectangles in $\mathbb{N}^d$
Problem Instance: A single rectangle $Q$ in $\mathbb{N}^d$
Question: How many elements of $D$ are not skew to $Q$?

2.3 Complexity Bounds
Throughout, let $R$ denote a fixed rectangle in $\mathbb{N}^d$ that contains each element of $D$. We present algorithms for rectangle intersection and rectangle nonskewness that use tables whose size depends only on the dimension $d$ and the bounding rectangle $R$, and answer queries in time that depends only on $d$. The preprocessing time depends on $D$, but the cost for adding a single dataset to the list depends only on $d$ and $R$.

Results. Assume that $R$ is an $n_1 \times \cdots \times n_d$ rectangle. Let $\mu = (2n_1 - 1) \cdots (2n_d - 1) \le 2^d |R|$.

Rectangle Intersection can be solved with $O(\mu)$ preprocessing per element of $D$, using tables of size $\mu$, in time $O(d 2^d)$ per query.
Rectangle Nonskewness can be solved with $O(\mu)$ preprocessing per element of $D$, using tables of size $\mu$, in time $O(d 4^d)$ per query.
Generalized Rectangle Intersection requires exhaustive tables (size $2^{n_1} 2^{n_2} \cdots 2^{n_d}$).
Wrapped Interval Intersection requires exhaustive tables (size $n_1(n_1 - 1) n_2(n_2 - 1) \cdots n_d(n_d - 1)$).
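To get a feel for the gap between these bounds, the table sizes can be evaluated for a concrete bounding rectangle. The following Python sketch is only an illustration; the function name `table_sizes` and the sample dimensions are assumptions made here, and `math.prod` requires Python 3.8 or later.

```python
from math import prod

def table_sizes(dims):
    """Table sizes for an n_1 x ... x n_d bounding rectangle.

    Returns a tuple:
      nonexhaustive - (2 n_1 - 1) ... (2 n_d - 1), rectangle intersection
      padded        - 2^d |R|, the upper bound on the value above
      generalized   - 2^{n_1} ... 2^{n_d}, exhaustive table
      wrapped       - n_1 (n_1 - 1) ... n_d (n_d - 1), exhaustive table
    """
    nonexhaustive = prod(2 * n - 1 for n in dims)
    padded = prod(2 * n for n in dims)
    generalized = prod(2 ** n for n in dims)
    wrapped = prod(n * (n - 1) for n in dims)
    return nonexhaustive, padded, generalized, wrapped

print(table_sizes((3, 4)))
```

For a $3 \times 4$ bounding rectangle the four values are 35, 48, 128, and 72; the exhaustive table for generalized intervals grows exponentially in each $n_i$, while the other sizes grow polynomially.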
3 Geometric Algorithms
We identify each rectangle in $\mathbb{N}^d$ with the polytope obtained upon replacing each of its points $p$ with a unit $d$-cube centered at $p$. A face of a bounded polytope $S$ is interior to $S$ if it is not the exterior face and it is not entirely contained in the boundary of $S$. Let $F_k(S)$ denote the number of $k$-dimensional faces of $S$ and let $F^{\circ}_k(S)$ denote the number of $k$-dimensional faces interior to $S$.

Lemma 1. If $S$ is a bounded, connected $d$-dimensional polytope then
$$\sum_{0 \le k \le d} (-1)^{d-k} F^{\circ}_k(S) = 1.$$

Proof. By Euler's theorem (see, for example, [9]),
$$\sum_{0 \le k \le d} (-1)^k F_k(S) = 1 + (-1)^d.$$
Let $B$ denote the boundary of $S$. Then $B$ is a connected $(d-1)$-dimensional polytope, so by Euler's theorem
$$\sum_{0 \le k \le d-1} (-1)^k F_k(B) = 1 + (-1)^{d-1}.$$
Therefore
$$\sum_{0 \le k \le d} (-1)^k F^{\circ}_k(S) = \sum_{0 \le k \le d} (-1)^k F_k(S) - (-1)^d - \sum_{0 \le k \le d-1} (-1)^k F_k(B) = 1 + (-1)^d - (-1)^d - \bigl(1 + (-1)^{d-1}\bigr) = -(-1)^{d-1} = (-1)^d,$$
so
$$\sum_{0 \le k \le d} (-1)^{d-k} F^{\circ}_k(S) = (-1)^d (-1)^d = 1.$$
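A small worked check of Lemma 1 for $d = 2$ may help; the two example shapes are chosen here for illustration, and $F^{\circ}_k$ is used for the number of interior $k$-dimensional faces of a polytope built from unit squares.

```latex
% S_1: a 2 x 1 rectangle made of two unit squares.
% Interior faces: 2 squares (k = 2), 1 shared edge (k = 1), no vertex (k = 0).
\[
\sum_{0 \le k \le 2} (-1)^{2-k} F^{\circ}_k(S_1) = 2 - 1 + 0 = 1 .
\]
% S_2: a 2 x 2 rectangle made of four unit squares.
% Interior faces: 4 squares, 4 interior edges, 1 interior vertex.
\[
\sum_{0 \le k \le 2} (-1)^{2-k} F^{\circ}_k(S_2) = 4 - 4 + 1 = 1 .
\]
```

In both cases the alternating count over interior faces collapses to 1, which is what allows a table indexed by unit faces to count each intersecting rectangle exactly once.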
For each 0-, 1-, ..., or $d$-dimensional cube $c$, let $\#(c)$ denote the number of rectangles $r$ in the list $D$ such that the interior of $r$ intersects the interior of $c$. In particular, the answer to the query $Q$ is $\#(Q)$. We have
$$\#(Q) = \sum_{0 \le k \le d} (-1)^{d-k} \sum_{c \in C} \#(c) \qquad (1)$$
where $C = \{c : c$ is a $k$-dimensional unit cube in the interior of $Q\}$. Why? Because each rectangle in $D$ that does not intersect $Q$ contributes 0 to the sum, and each rectangle in $D$ that intersects $Q$ contributes 1 to the sum.

Let $\dim(r)$ denote the dimension of a rectangle. If we stored $(-1)^{d-\dim(c)} \#(c)$ for each unit cube $c$, then we could compute the sum specified in (1) by summing over each unit cube contained in $Q$. Better yet, as noted in [3], if we store $d$-dimensional prefix sums, we can evaluate that sum in constant time. For each unit cube $a$ we store $\sum_{b \le a} (-1)^{d-\dim(b)} \#(b)$, where the inequality must hold on every coordinate. Given such a table, the sum specified in (1) may be obtained with $2^d - 2$ additions and subtractions, by the principle of inclusion and exclusion. The total storage needed is the number of unit cubes interior to $R$ whose dimension is $d$ or less, which is exactly $\mu$.

For concreteness we present the table construction and the query algorithm for the case $d = 2$ (Fig. 1). In order to simplify the algorithm, we have used a table of size $2^d |R|$, which is larger than we claimed. This allows us to store 0s along the edges and avoid special cases.
The Geometry of Browsin
procedure CountRecord(x1, x2, y1, y2)
    for i = 2x1 − 1 to 2x2 − 1 do
        for j = 2y1 − 1 to 2y2 − 1 do
            table[i, j] = table[i, j] + (−1)^(i+1) (−1)^(j+1)
end

procedure CountAllRecords
    for i = 0 to 2n1 − 1 do
        for j = 0 to 2n2 − 1 do
            table[i, j] = 0
    for each rectangle (x1, x2, y1, y2) in the list D do
        CountRecord(x1, x2, y1, y2)
end

procedure ComputePartialSums
    for i = 2 to 2n1 − 1 do
        for j = 1 to 2n2 − 1 do
            table[i, j] = table[i − 1, j] + table[i, j]
    for i = 1 to 2n1 − 1 do
        for j = 2 to 2n2 − 1 do
            table[i, j] = table[i, j − 1] + table[i, j]
end

procedure BuildTable
    CountAllRecords
    ComputePartialSums
end

function IncludeExclude(a1, a2, b1, b2)
    return table[a2, b2] − table[a2, b1] − table[a1, b2] + table[a1, b1]
end

function Query(x1, x2, y1, y2)
    return IncludeExclude(2x1 − 2, 2x2 − 1, 2y1 − 2, 2y2 − 1)
end

Fig. 1. Table Construction and Query Algorithm for the case d = 2.
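The procedures of Fig. 1 can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `build_table` and `query` are hypothetical, rectangles are assumed to be given as tuples (x1, x2, y1, y2) of 1-based inclusive unit-cell coordinates, and the table is a plain list of lists with row 0 and column 0 left at zero, as in the figure.

```python
def build_table(rects, n1, n2):
    """Signed cell counts followed by 2-dimensional prefix sums (Fig. 1)."""
    # Indices 0 .. 2*n - 1: odd indices are unit intervals, even indices are
    # the points between them; index 0 stays zero to avoid special cases.
    table = [[0] * (2 * n2) for _ in range(2 * n1)]
    for x1, x2, y1, y2 in rects:
        for i in range(2 * x1 - 1, 2 * x2):
            for j in range(2 * y1 - 1, 2 * y2):
                table[i][j] += (-1) ** (i + 1) * (-1) ** (j + 1)
    # Prefix sums along the first coordinate, then along the second.
    for i in range(1, 2 * n1):
        for j in range(2 * n2):
            table[i][j] += table[i - 1][j]
    for i in range(2 * n1):
        for j in range(1, 2 * n2):
            table[i][j] += table[i][j - 1]
    return table


def query(table, x1, x2, y1, y2):
    """Number of stored rectangles meeting the query, by inclusion-exclusion."""
    a1, a2 = 2 * x1 - 2, 2 * x2 - 1
    b1, b2 = 2 * y1 - 2, 2 * y2 - 1
    return table[a2][b2] - table[a2][b1] - table[a1][b2] + table[a1][b1]
```

Each query touches four table entries regardless of the number of stored rectangles, which is the point of storing prefix sums of the signed counts.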
Richard Beigel and Egemen Tanin

3.1 Rectangle Nonskewness
By the principle of inclusion and exclusion, rectangle nonskewness is reduced to $2^d - 1$ instances of rectangle intersection. Therefore it can be solved with exactly the same table, in time $O(d\,4^d)$.
4 Lower Bounds
There are exactly $2^n$ generalized intervals and exactly $n(n-1)$ wrapped intervals in $\{1, \ldots, n\}$. Therefore, Generalized Rectangle Intersection can be solved by table lookup with a table of size $2^{n_1} \cdots 2^{n_d}$, and Wrapped Rectangle Intersection can be solved by table lookup with a table of size $n_1(n_1 - 1) \cdots n_d(n_d - 1)$. We will show that no smaller tables suffice for either problem, no matter how much time is allowed. For simplicity we will consider only the case d = 1. In the full version of this paper we will explain how the general case is a corollary of this one. Henceforth let $n = n_1$.

4.1 Generalized Rectangle Intersection
Suppose that given some table T we can answer questions of the form "how many elements of D intersect Q?", where D is a fixed multiset of generalized intervals and Q is a generalized interval. Previously, we said that the intersection questions are really logical-AND questions. Actually, they are ANDs over all dimensions. But in each single dimension, the question is an OR, i.e., does at least one square of the query interval belong to the dataset rectangle?

Let $\vee$ denote logical OR, and $\wedge$ denote logical AND. Let $\#(x_1 \vee \cdots \vee x_k)$ denote the number of generalized rectangles r in D such that $x_1 \in r \vee \cdots \vee x_k \in r$. Let $\#(x_1 \wedge \cdots \wedge x_k)$ denote the number of generalized rectangles r in D such that $x_1 \in r \wedge \cdots \wedge x_k \in r$. By assumption, we can compute $\#(x_1 \vee \cdots \vee x_k)$ from T. By the principle of inclusion and exclusion, we have

$$\#(x_1 \wedge x_2) = \#(x_1) + \#(x_2) - \#(x_1 \vee x_2)$$

Thus we can compute $\#(x_1 \wedge x_2)$ from T. By a simple induction, we can compute $\#(x_1 \wedge \cdots \wedge x_k)$ from T.

Let $\#(x_1 \wedge \cdots \wedge x_k \wedge \neg x_{k+1} \wedge \cdots \wedge \neg x_m)$ denote the number of generalized rectangles r in D such that $x_1 \in r \wedge \cdots \wedge x_k \in r \wedge x_{k+1} \notin r \wedge \cdots \wedge x_m \notin r$. We have

$$\#(x_1 \wedge \cdots \wedge x_k \wedge \neg x_{k+1}) = \#(x_1 \wedge \cdots \wedge x_k) - \#(x_1 \wedge \cdots \wedge x_{k+1})$$

Thus we can compute $\#(x_1 \wedge \cdots \wedge x_k \wedge \neg x_{k+1})$ from T. By a simple induction we can compute $\#(x_1 \wedge \cdots \wedge x_k \wedge \neg x_{k+1} \wedge \cdots \wedge \neg x_n)$ from T, where $\{x_1, \ldots, x_n\} = \{1, \ldots, n\}$. Thus we can determine from T exactly how many times the generalized rectangle $\{x_1, \ldots, x_k\}$ appears in the multiset D. Since we can recover $2^n$ independent numbers from T, the size of T must be at least $2^n$.

Note: a similar argument shows that if we limit the size of generalized intervals to k then we still can't get by with nonexhaustive tables.
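The recovery argument of Section 4.1 can be checked on small instances with the sketch below. It is not the induction used in the text: it swaps in Möbius inversion over subsets, which recovers the same multiplicities from the same intersection counts. The names `subsets` and `recover_multiset` are hypothetical, generalized intervals are modelled as nonempty frozensets of {1, ..., n}, and the query oracle is assumed to be a function returning how many elements of D meet a given set.

```python
from collections import Counter
from itertools import combinations


def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def recover_multiset(n, intersect_count):
    """Recover every multiplicity in D from intersection counts alone."""
    universe = frozenset(range(1, n + 1))
    total = intersect_count(universe)  # every nonempty element meets the universe

    def contained(s):
        # Elements that miss the complement of s are exactly those inside s.
        rest = universe - s
        return total - (intersect_count(rest) if rest else 0)

    recovered = Counter()
    for s in subsets(universe):
        if not s:
            continue
        # Moebius inversion: elements equal to s, from counts of elements inside t.
        mult = sum((-1) ** (len(s) - len(t)) * contained(t) for t in subsets(s))
        if mult:
            recovered[s] = mult
    return recovered
```

Because all multiplicities can be read back from the answers, the answers must encode $2^n$ independent numbers, which is the content of the lower bound.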
4.2 Wrapped Rectangle Intersection
Suppose that given some table T we can answer questions of the form "how many elements of D intersect Q?", where D is a fixed multiset of wrapped intervals and Q is a wrapped interval. If $i \le j$, let #[i, j] denote the number of elements of D contained in the interval [i, j]. Then #[i, j] is equal to the number of elements of D (that intersect [1, n]) minus the number of elements of D that intersect $\{j+1, \ldots, n, 1, \ldots, i-1\}$, so we can compute #[i, j] from T. An element contained in [i, j] but different from [i, j] is contained in [i + 1, j] or in [i, j − 1], so the number of times that the interval [i, j] appears in the list D is given by the formula:

$$\#[i, j] - \#[i + 1, j] - \#[i, j - 1] + \#[i + 1, j - 1]$$

By a similar argument we can determine the number of times each wrapped interval appears in D. Since we can recover $n(n - 1)$ independent numbers from T, the size of T must be at least $n(n - 1)$.

Acknowledgments. This work was supported by NASA grant 52895. The research was performed while the first author was on sabbatical from Yale University, visiting the Human Computer Interaction Laboratory at the University of Maryland. He is also partially supported by NSF under grants CCR-9700417 and CCR-9796317. Both authors are grateful to Catherine Plaisant and Ben Shneiderman for their part in formulating this problem and to Dave Mount and Dan Spielman for helpful discussions.
References

1. Ahlberg, C. and Shneiderman, B., Visual Information Seeking: Tight Coupling of Dynamic Query Filters with Starfield Displays, Proc. ACM SIGCHI (1994) 313-317
2. Ahlberg, C. and Wistrand, E., IVEE: An Information Visualization and Exploration Environment, Proc. IEEE Info. Vis., (1995) 66-73
3. Bestul, T., Parallel paradigms and practices for spatial data, Ph.D. Thesis, Univ. Maryland Dept. Comp. Sci., TR-2897, (1992)
4. Doan, K., Plaisant, C., and Shneiderman, B., Query Previews in Networked Information Systems, Proc. Forum Adv. Digit. Libr., IEEE Comp. Soc. Press, (1996) 120-129
5. Eick, S., Data Visualization Sliders, Proc. User Interf. Softw. Techn. (1994) 119-120
6. HCIL, http://www.cs.umd.edu/projects/hcil/Research/1995/dq-for-eosdis.html
7. Fishkin, K. and Stone, M. C., Enhanced Dynamic Queries via Movable Filters, Proc. ACM SIGCHI (1995) 415-420
8. Goldstein, J. and Roth, S. F., Using Aggregation and Dynamic Queries for Exploring Large Data Sets, Proc. ACM SIGCHI (1994) 23-29
9. Harary, F., Graph Theory, Addison Wesley (1969)
10. HCIL, ftp://ftp.cs.umd.edu/pub/hcil/Demos/DQ/dq-home.zip. Down-loadable PC demonstration.
11. Ioannidis, Y., Dynamic Information Visualization, ACM SIGMOD Rec., 25 (1996) 16-20
12. Information Visualization and Exploration Environment (IVEE) Development AB, http://www.ivee.com/. Online Java demo and down-loadable demos for various platforms
13. Shneiderman, B., Dynamic Queries for Visual Information Seeking, IEEE Softw., 11 (1994) 70-77
14. Shneiderman, B., Racing to the winning line with visual data mining, http://www.ivee.com/corporate/columns/race.html
15. Tanin, E., Beigel, R., and Shneiderman, B., Incremental Data Structures and Algorithms for Dynamic Query Interfaces, ACM SIGMOD Rec., 25 (1996) 21-24
16. Williamson, C. and Shneiderman, B., The Dynamic HomeFinder: Evaluating Dynamic Queries in a Real-Estate Information Exploration System, Proc. ACM SIGIR (1992) 339-346
Fast Two-Dimensional Approximate Pattern Matching*

Ricardo Baeza-Yates and Gonzalo Navarro

Dept. of Computer Science, University of Chile. Blanco Encalada 2120, Santiago, Chile. {rbaeza,gnavarro}@dcc.uchile.cl
Abstract. We address the problem of approximate string matching in two dimensions, that is, to find a pattern of size $m \times m$ in a text of size $n \times n$ with at most k errors (substitutions, insertions and deletions). Although the problem can be solved using dynamic programming in time $O(m^2 n^2)$, this is in general too expensive for small k. So we design a filtering algorithm which avoids verifying most of the text with dynamic programming. This filter is based on a one-dimensional multi-pattern approximate search algorithm. The average complexity of our resulting algorithm is $O(n^2 k \log_\sigma m / m^2)$ for $k < m(m+1)/(5 \log_\sigma m)$, which is optimal and matches the best previous result which allows only substitutions. For higher error levels, we present an algorithm with time complexity $O(n^2 k / (w \sqrt{\sigma}))$ (where w is the size in bits of the computer word and $\sigma$ is the alphabet size). This algorithm works for $k < m(m+1)(1 - e/\sqrt{\sigma})$, where e = 2.718..., a limit which is not possible to improve. These are the first good expected-case algorithms for the problem. Our algorithms work also for rectangular patterns and rectangular text and can even be extended to the case where each row in the pattern and the text has a different length.
1 Introduction
A number of important problems related to string processing lead to algorithms for approximate string matching: text searching, pattern recognition, computational biology, audio processing, etc. Two dimensional pattern matching with errors has applications, for instance, in computer vision.

The edit distance between two strings a and b, ed(a, b), is defined as the minimum number of edit operations that must be carried out to make them equal. The allowed operations are insertion, deletion and substitution of characters in a or b. The problem of approximate string matching is defined as follows: given a text of length n, and a pattern of length m, both being sequences over an alphabet of size $\sigma$, find all segments (or "occurrences") in text whose edit distance to pattern is at most k, where 0 < k < m. The classical solution is O(mn) time and involves dynamic programming [19].

* Support from Fondecyt grants 1-95-0622 and 1-96-0881 is gratefully acknowledged.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 341-351, 1998. © Springer-Verlag Berlin Heidelberg 1998
Krithivasan and Sitalakshmi (KS) [14] proposed the following extension of edit distance for two dimensions. Given two images of the same size, the edit distance is the sum of the edit distances of the corresponding row images. This definition is justified when the images are transmitted row by row and there are not too many communication errors. On the other hand, it is not clear how to lift the row restriction (i.e. letting insertions and deletions occur along rows and columns), as then an approximate match is harder to define. Fig. 1 gives an example.
[Figure: two panels, labelled "General" and "KS".]

Fig. 1. Alternative error models.
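As a concrete reading of the KS definition, the sketch below computes the distance between two images as the sum of the edit distances of corresponding rows. The names `edit_distance` and `ks_distance` are hypothetical, images are assumed to be lists of strings with the same number of rows, and the row distance is the textbook dynamic programming recurrence rather than anything specific to [14].

```python
def edit_distance(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def ks_distance(image_a, image_b):
    """KS distance: errors are counted row by row and never cross rows."""
    if len(image_a) != len(image_b):
        raise ValueError("images must have the same number of rows")
    return sum(edit_distance(ra, rb) for ra, rb in zip(image_a, image_b))
```

A row that is shifted by one position costs two errors under this model, for example `ks_distance(["abcd"], ["bcda"])` is 2, while the same shift applied down a column would cost one error per affected row.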
Using this model they define an approximate search problem where a subimage of size $m \times m$ is searched in a large image of size $n \times n$, which they solve in $O(m^2 n^2)$ time using a generalization of the classical one-dimensional algorithm. We use the same model and improve the expected case using a filter algorithm based on multiple one-dimensional approximate string matching, in the same vein as [9,8,7]. Our algorithm has $O(n^2 k \log_\sigma m / m^2)$ average-case behavior for $k < m(m+1)/(5 \log_\sigma m)$, using $O(m^2)$ space. This time matches the best known result for the same problem allowing only substitutions and is optimal [12], the restriction on k being only a bit more strict. For higher error levels, we present an algorithm with time complexity $O(n^2 k / (w \sqrt{\sigma}))$ (where w is the size in bits of the computer word), which works for $k < m(m+1)(1 - e/\sqrt{\sigma})$. We also show that this limit on k cannot be improved.

Given a two-dimensional string S, we denote as S[i] its i-th row ($i \ge 1$), and S[i][j] the j-th column of row i ($j \ge 1$). The two-dimensional strings we use are the pattern P and the text T.
2 Previous Work
The classical O(mn) dynamic programming solution to the one-dimensional problem [19] keeps an array C[0..m], which for each new text position T[j] is updated to C'[0..m] with the formula

    C'[0] ← C[0]
    C'[i] ← if P[i] = T[j] then C[i − 1] else 1 + min(C[i − 1], C'[i − 1], C[i])

and a match is reported whenever $C[m] \le k$. This solution was later improved by a number of algorithms. The different approaches can be divided in three main areas:

- Those that use cleverly the geometric properties of the dynamic programming matrix, e.g. [15,21,10]. These algorithms normally achieve O(kn) time complexity in the worst or the average case.
- Those that filter the text, quickly leaving out most of the text and verifying only the areas that seem interesting, e.g. [20,6]. They achieve sublinear expected time in many cases (e.g. $O(kn \log_\sigma m / m)$) for small k/m ratios.
- Those that parallelize the computation of a classical algorithm in the bits of computer words [22,23,4]. We call w the number of bits in the computer word, which is assumed to be $\Theta(\log n)$. These algorithms obtain in the best case a factor of $O(1/\log n)$ over their classical counterparts.

On the other hand, multi-pattern approximate search has only recently been considered. In [16], hashing is used to search thousands of patterns in parallel, although with only one error. In [5], extensions of [4] and [6] are presented based on superimposing automata. In [17], a counting filter is bit-parallelized to keep the state of many searches in parallel. Most multipattern algorithms consist of a filter which discards most of the text at low cost, and verify using dynamic programming the text areas that cannot be discarded. If the error level is low enough, the average number of verifications is so low that their total cost is of lower order and can be neglected. Otherwise the cost of verifications dominates and the algorithm is not useful, as it is as costly as plain dynamic programming.

Finally, the case of two dimensional approximate string matching usually considers only substitutions for rectangular patterns, which is much simpler than the general case with insertions and deletions. For substitutions, the pattern shape matches the same shape in the text (e.g. if the pattern is a rectangle, it matches a rectangle of the same size in the text). For insertions and deletions, instead, rows and/or columns of the pattern can match pieces of the text of different length.

If we consider matching the pattern with at most k substitutions, one of the best results on the worst case is due to Amir and Landau [2], which achieves $O((k + \log \sigma) n^2)$ time but uses $O(n^2)$ space. A similar algorithm is presented in Crochemore and Rytter [11]. Ranka and Heywood, on the other hand, solve the problem in $O((k + m) n^2)$ time and O(kn) space. Amir and Landau also present a different algorithm running in $O(n^2 \log n \log\log n \log m)$ time. On average, the best algorithm is due to Kärkkäinen and Ukkonen [12], with its analysis and space usage improved by Park [18]. The expected time is $O\left(\frac{n^2 k}{m^2} \log_\sigma m\right)$ for

$$k \le \frac{m}{\log_\sigma (m^2)} \cdot \frac{m}{2} - 1 \approx \frac{m^2}{4 \log_\sigma m}$$

using $O(m^2)$ space (O(k) space on average). This time result is optimal for the expected case.

Under the KS definition (i.e. allowing insertions and deletions along rows), Krithivasan [13] presents an $O(m(k + \log m) n^2)$ algorithm that uses O(mn) space. This was improved (for k < m) by Amir and Landau [2] to $O(k^2 n^2)$ worst case time using $O(n^2)$ space. Amir and Farach [1] also considered non-rectangular patterns, achieving $O(k(k + \sqrt{m \log m} \sqrt{k \log k}) n^2)$ time. This algorithm is very complicated, as it uses numerical convolutions.
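The column update given at the start of this section translates almost line by line into code. The sketch below is illustrative only: the function name `approximate_search` is hypothetical, strings are 0-based Python strings, and reported positions are 1-based end positions in the text.

```python
def approximate_search(text, pattern, k):
    """End positions j (1-based) where the pattern matches with at most k errors."""
    m = len(pattern)
    col = list(range(m + 1))       # C[i] = i before any text character is read
    ends = []
    for j, ch in enumerate(text, start=1):
        new = [0] * (m + 1)        # C'[0] = 0: an occurrence may start anywhere
        for i in range(1, m + 1):
            if pattern[i - 1] == ch:
                new[i] = col[i - 1]
            else:
                new[i] = 1 + min(col[i - 1], new[i - 1], col[i])
        col = new
        if col[m] <= k:
            ends.append(j)
    return ends
```

The only state carried between text positions is one column of m + 1 integers, which is why the two-dimensional generalization discussed later can be restarted at any text position with a small amount of memory.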
3 Error Model for Two Dimensions
We assume that pattern and text are rectangular, of sizes $m_1 \times m_2$ and $n_1 \times n_2$ respectively (rows $\times$ columns). We sometimes use $M = m_1 m_2$ and $N = n_1 n_2$ as the size of the pattern and the text respectively. However, our algorithms can be easily extended to the more general case where each row in the pattern and the text has a different length. For simplicity we only explain the rectangular case in this paper. Sometimes we simplify even more, considering the case $m_1 = m_2 = m$ and $n_1 = n_2 = n$.

In the KS error model we allow errors along rows, but errors cannot occur along columns. This means that, for instance, a single insertion cannot move all the characters of its column one position down. Nor can we perform $m_2$ deletions along a row and eliminate the row. All insertions and deletions displace the characters of the row they occur in.

In this simple model every row is exactly where it is expected to be in an exact search. That is, we can see the pattern as an $m_1$-tuple of strings of length $m_2$, and each error is a one-dimensional error occurring in exactly one of the strings. Formally,

Definition: Given a pattern P of size $m_1 \times m_2$ and a text T of size $n_1 \times n_2$, we say that the pattern P occurs in the text at position (i, j) with at most k errors if

$$\sum_{r=1}^{m_1} led(T[i + r - 1][1..j], P[r]) \le k$$

where $led(t[1..j], pat) = \min_{i \in 1..j} ed(t[i..j], pat)$.

Observe that in this case the problem still makes sense for $k > m_2$, although it must hold that $k < m_1 m_2$ (since otherwise every text position matches the pattern by performing $m_1 m_2$ substitutions).

The natural generalization of the classical dynamic programming algorithm for one dimension to the case of two dimensions was presented in [14]. Its complexity is O(MN), which is also a natural extension of the O(mn) complexity for one-dimensional text. The algorithm is presented in Fig. 2 as it is the basic procedure for the verification phase of our filtering algorithm. Instead of the single column vector C[j] of length m + 1 used in [19], we have an $m_1 \times (m_2 + 1)$ matrix indexed by pattern rows and columns, C[r][j], for $r \in 1..m_1$, $j \in 0..m_2$.
for i ← 1 to n1-m1+1
    --- initialize C ---
    for r ← 1 to m1
        for j ← 0 to m2
            C[r][j] ← j
        C'[r][0] ← 0
    --- compute values for each text column j ---
    for j ← 1 to n2
        err ← 0
        for r ← 1 to m1
            for s ← 1 to m2
                if P[r][s] = T[i+r-1][j]
                    then C'[r][s] ← C[r][s-1]
                    else C'[r][s] ← 1 + min(C[r][s-1], C[r][s], C'[r][s-1])
            err ← err + C'[r][m2]
        exchange C and C'    --- just exchange pointers ---
        if err <= k then report match at (i,j)

Fig. 2. Two dimensional approximate matching by dynamic programming. The variable err sums up the errors along the rows of the pattern.
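A runnable counterpart of Fig. 2 might look as follows. This is a sketch under stated assumptions: the function name `two_dim_search` is hypothetical, text and pattern are lists of equal-length strings, a fresh matrix is allocated per text column instead of exchanging pointers, and positions are reported 1-based as (top row, end column).

```python
def two_dim_search(text, pattern, k):
    """All (i, j) where the pattern occurs with at most k errors (KS model)."""
    n1, n2 = len(text), len(text[0])
    m1, m2 = len(pattern), len(pattern[0])
    matches = []
    for i in range(n1 - m1 + 1):                     # 0-based top row
        C = [list(range(m2 + 1)) for _ in range(m1)]
        for j in range(n2):
            nxt = [[0] * (m2 + 1) for _ in range(m1)]  # column 0 stays 0
            err = 0
            for r in range(m1):
                ch = text[i + r][j]
                for s in range(1, m2 + 1):
                    if pattern[r][s - 1] == ch:
                        nxt[r][s] = C[r][s - 1]
                    else:
                        nxt[r][s] = 1 + min(C[r][s - 1], C[r][s], nxt[r][s - 1])
                err += nxt[r][m2]                    # errors of pattern row r
            C = nxt
            if err <= k:
                matches.append((i + 1, j + 1))
    return matches
```

For instance, with text rows "abcde", "fghij", "klmno" and pattern rows "bc", "gh", the exact occurrence is reported at (1, 3), and allowing k = 2 errors also reports the neighbouring end columns 2 and 4.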
This algorithm uses O(M) extra space, which is the only state information it needs to be started at any text position. Although Amir and Landau have an $O(k^2 n^2)$ algorithm, notice that dynamic programming is always better if k > m, so depending on k we have to choose the best algorithm.
4 A Fast Algorithm on Average
We begin by proving a lemma which allows us to quickly discard large areas of the text.

Lemma 1. If the pattern occurs with k errors at position (i, j) in the text, and $r_1, r_2, \ldots, r_s$ are s different rows in the range 1 to $m_1$, then

$$\min_{t = 1 \ldots s} led(T[i + r_t - 1][1..j], P[r_t]) \le \lfloor k/s \rfloor$$

Proof. Otherwise, $led(T[i + r_t - 1][1..j], P[r_t]) \ge 1 + \lfloor k/s \rfloor > k/s$ for all t. Just summing up the errors in the s selected rows we have strictly more than $s \cdot k/s = k$ errors and therefore a match is not possible.
The Lemma can be used in many ways. The simplest case is to set s = 1. This tells us that if we cannot find a row r of the pattern with at most k errors at text row i, then the pattern cannot occur at row i − r + 1. Therefore, we can search for all rows of the pattern at text row $m_1$. If we cannot find a match of any of the pattern rows with at most k errors, then no possible match begins at text rows $1..m_1$. There cannot be a match at text row 1 because pattern row $m_1$ was not found at text row $m_1$. There cannot be a match at text row 2 because pattern row $m_1 - 1$ was not found at text row $m_1$. Finally, there cannot be a match at text row $m_1$ because pattern row 1 was not found at text row $m_1$.

This shows that we can search only text rows $i \cdot m_1$, for $i = 1 \ldots \lfloor n_1/m_1 \rfloor$. Only in the case that we find a match of pattern row r at text position $(i \cdot m_1, j)$ must we verify a possible match beginning at text row $i \cdot m_1 - r + 1$. We must perform the verification from text column $j - m_2 - k + 1$ to j, using the dynamic programming algorithm. However, if $k > m_2$ we can start at $j - 2m_2 + 1$, since otherwise we would pay more than $m_2$ insertions, in which case it is cheaper to just perform $m_2$ substitutions. This verification costs $O(m_1 m_2^2) = O(m^3)$.

To avoid re-verifying the same areas due to overlapping verification requirements, we can force all verifications to be made in ascending row order and ascending column order inside rows. By remembering the state of the last verified positions we avoid re-verifying the same columns, this way keeping the worst case of this algorithm at $O(m^2 n^2)$ cost instead of $O(m^3 n^2)$. Figure 3 shows how the algorithm works.
[Figure: example with n = 24, m = 6, k = 3. Legend: text rows searched with 1-dimensional multipattern; pattern row i found; possible position of an approximate occurrence; text area to verify with dynamic programming.]

Fig. 3. Example of how the algorithm works.
We have still not explained how to perform a multi-pattern search for all the rows of the pattern at text rows numbered $i \cdot m_1$. We can use any available one-dimensional multi-pattern filtering algorithm. Each such algorithm has a different complexity and a maximum error level (i.e. k/m ratio) up to where it works well. For higher error levels, the filter triggers too many verifications, which dominate the search time.

A problem with this approach is that, if $k \ge m_2$ holds in our original problem, this filtration phase will be completely ineffective (since all text positions will match all the patterns, and all the text will be verified with dynamic programming). Even for $k < m_2$ the error level $k/m_2$ can be very high for the multipattern filter we choose.

This is where the s of the Lemma comes into play. We can search, instead of all text rows of the form $i \cdot m_1$, all text rows of the form $i \lfloor m_1/2 \rfloor$, for all patterns, with $\lfloor k/2 \rfloor$ errors. This corresponds to s = 2. If we find nothing at rows $i \lfloor m_1/2 \rfloor$ and $(i + 1) \lfloor m_1/2 \rfloor$, then no occurrence can be found at text rows $(i - 1) \lfloor m_1/2 \rfloor + 1$ to $i \lfloor m_1/2 \rfloor$, because that occurrence already has two rows with more than $\lfloor k/2 \rfloor$ errors.

In general, we can search only the text rows numbered $i \lfloor m_1/s \rfloor$, for all the patterns, with $\lfloor k/s \rfloor$ errors. In the extreme case, we can search all text rows with $\lfloor k/m_1 \rfloor$ errors (which is always $< m_2$ and therefore filtering is in principle possible).

There is another alternative way to use s, which is to search only the first $\lfloor m_1/s \rfloor$ rows of the pattern with k errors and consider the text rows of the form $i \lfloor m_1/s \rfloor$. That is, reduce the number of patterns instead of reducing the error level (this is because the tolerance to errors of some filters is reduced as the number of patterns grows). This alternative, however, is not promising since we pay s times more searches of (1/s)-th of the patterns. If the search cost for r patterns is C(r), we pay sC(r/s). The aim of any multi-pattern matching algorithm is precisely that C(r) < sC(r/s) (since the worst thing that can happen is that searching for r patterns costs the same as r searches for one pattern, i.e. C(r) = sC(r/s)).
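The filtering scheme of this section can be sketched as below. Several simplifications are assumptions of this sketch rather than the method of the paper: the one-dimensional multi-pattern filter is replaced by running the classical column dynamic programming once per pattern row, a flagged starting row is verified over all its columns rather than over the window ending at column j, and no verification state is shared between rows. All function names are hypothetical; positions are 1-based.

```python
def row_errors(text_row, pat):
    """Entry j-1 holds led(text_row[1..j], pat), via the classical column DP."""
    m = len(pat)
    col, out = list(range(m + 1)), []
    for ch in text_row:
        new = [0] * (m + 1)
        for s in range(1, m + 1):
            if pat[s - 1] == ch:
                new[s] = col[s - 1]
            else:
                new[s] = 1 + min(col[s - 1], col[s], new[s - 1])
        col = new
        out.append(col[m])
    return out


def verify_row(text, pattern, i, k):
    """Occurrences whose top row is text row i (1-based), over all columns."""
    per_row = [row_errors(text[i - 1 + r], p) for r, p in enumerate(pattern)]
    return [(i, j + 1) for j in range(len(text[0]))
            if sum(errs[j] for errs in per_row) <= k]


def brute_force(text, pattern, k):
    """Reference answer: verify every possible top row."""
    out = []
    for i in range(1, len(text) - len(pattern) + 2):
        out.extend(verify_row(text, pattern, i, k))
    return out


def filtered_search(text, pattern, k, s=1):
    """Search only text rows i * (m1 // s) with k // s errors, then verify."""
    n1, m1 = len(text), len(pattern)
    if not 1 <= s <= m1:
        raise ValueError("s must lie in 1..m1")
    step = m1 // s
    candidates = set()
    for row in range(step, n1 + 1, step):          # sampled text rows
        for r in range(1, m1 + 1):                 # pattern row found there
            top = row - r + 1                      # implied top row of a match
            if top < 1 or top > n1 - m1 + 1 or top in candidates:
                continue
            if min(row_errors(text[row - 1], pattern[r - 1])) <= k // s:
                candidates.add(top)
    out = []
    for top in sorted(candidates):
        out.extend(verify_row(text, pattern, top, k))
    return out
```

By Lemma 1 the candidate set contains every top row of a true occurrence, so `filtered_search` returns the same list as `brute_force` for any valid s; the saving comes from the rows that are never verified.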
5 Average Case Analysis
Once we have selected a given one-dimensional multipattern search algorithm to support our two-dimensional filter, two values of the one-dimensional algorithm influence the analysis of the two-dimensional filter:

- C(m, k, r), which is the cost per text character to search r patterns of length m with k errors. Notice that in our case, $m = m_2$ and $r = m_1$. Hence, the cost to search a text row with this algorithm is $n_2 C(m_2, k, m_1)$.
- L(m, r), which is the maximum acceptable value for k/m up to where the one-dimensional algorithm works. That is, the cost of the search is C(m, k, r) per text character, plus the verifications. If the error level is low enough (i.e. k/m < L(m, r)), the number of those verifications is so low that their cost can be neglected. Otherwise the cost of verifications dominates and the algorithm is not useful, as it is as costly as plain dynamic programming and our whole scheme does not work. Again, in our case, $m = m_2$ and $r = m_1$.
Given a multi-pattern search algorithm, our search strategy for the two-dimensional filter is as follows. If we search with k/s errors, it must hold that

$$\frac{k/s}{m_2} < L(m_2, m_1) \implies s = \frac{k}{m_2 L(m_2, m_1)} \qquad (1)$$

Since we traverse only the text rows of the form $i \lfloor m_1/s \rfloor$, we work on $O(n_1 s / m_1)$ rows, and therefore our total complexity to filter the text is

$$O\left(\frac{n_1 s}{m_1} n_2 C(m_2, k/s, m_1)\right) = O\left(\frac{N k}{M L(m_2, m_1)} C(m_2, m_2 L(m_2, m_1), m_1)\right) \qquad (2)$$

where we recall that L has been selected so that the cost of verifications has, on average, lower order and therefore we neglect verification costs.

The algorithm is applicable when it holds that $s \le m_1$, i.e. for

$$k < m_2 (m_1 + 1) L(m_2, m_1) \qquad (3)$$

since if it requires $s > m_1$, this means that the error level is too high even if we search all rows of the text ($s = m_1$).

We consider specific multi-pattern algorithms now, each one with given C and L functions. As we only reference the algorithms, we do not include here their analysis leading to C and L, which is done in the original papers.

- Exact Partitioning [5] can be implemented such that C(m, k, r) = O(1) (i.e. linear search time). For our $O(m_1 m_2^2) = O(r m^2)$ verification costs, we have $L(m, r) = 1/\log_\sigma(m^3 r^2)$. Therefore, using this algorithm we would select (Eq. (1))

$$s = \frac{k \log_\sigma(m_1^2 m_2^3)}{m_2} = \frac{5 k \log_\sigma m}{m}$$

our average search cost would be (Eq. (2))

$$O\left(\frac{N k \log_\sigma \max(m_1, m_2)}{M}\right) = O\left(\frac{n^2 k \log_\sigma m}{m^2}\right)$$

and the algorithm would be applicable for $k < m_2(m_1 + 1)/\log_\sigma(m_1^2 m_2^3) = m(m + 1)/(5 \log_\sigma m)$ (Eq. (3)).
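- Superimposed Automata [5] has $L(m, r) = 1 - e/\sqrt{\sigma}$ (where e = 2.718...), and $C(m, k, r) = O(mr / (\sigma w (1 - k/m)))$ in its best version (automaton partitioning). Therefore, we have (Eq. (1))

$$s = \frac{k}{m_2 (1 - e/\sqrt{\sigma})} = \frac{k}{m (1 - e/\sqrt{\sigma})}$$

the average complexity is (Eq. (2))

$$O\left(\frac{N k}{M (1 - e/\sqrt{\sigma})} \cdot \frac{m_2 m_1}{\sqrt{\sigma}\, w e}\right) = O\left(\frac{N k}{w \sqrt{\sigma}}\right) = O\left(\frac{n^2 k}{w \sqrt{\sigma}}\right)$$

and the algorithm is applicable for $k < m_2(m_1 + 1)(1 - e/\sqrt{\sigma}) = m(m + 1)(1 - e/\sqrt{\sigma})$ (Eq. (3)).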
- Counting [17] has $L(m, r) = e^{-m/\sigma}$ and $C(m, k, r) = O((r/w) \log m)$. Therefore, using this algorithm we would select (Eq. (1))

$$s = \frac{k e^{m_2/\sigma}}{m_2} = \frac{k e^{m/\sigma}}{m}$$

the average search cost would be (Eq. (2))

$$O\left(\frac{N k e^{m_2/\sigma}}{M} \cdot \frac{m_1 \log m_2}{w}\right) = O\left(\frac{N k e^{m_2/\sigma} \log m_2}{m_2 w}\right) = O\left(\frac{n^2 k e^{m/\sigma} \log m}{m w}\right)$$

and the algorithm would be applicable for $k < m_2(m_1 + 1) e^{-m_2/\sigma} = m(m + 1) e^{-m/\sigma}$ (Eq. (3)). Notice that this algorithm is asymmetric with respect to the shape of the pattern, i.e. it works better on tall patterns than on wide ones. This is because its cost formula and error level are not symmetric in terms of m and r as the previous ones are.

- One Error [16] can only search with k = 1 errors (i.e. L(m, r) = 2/m), with time cost C(m, k, r) = m. Therefore we must have s = k/2 + 1, which means that we can only apply the algorithm for $k < 2m_1$. In this case, the complexity would be

$$O\left(\frac{N k}{M} \cdot \frac{m_2 m_2}{2}\right) = O\left(\frac{N k m_2}{m_1}\right) = O(n^2 k)$$

This algorithm is asymmetric with respect to the error level it tolerates, also preferring taller rather than wider patterns.

The best algorithm on average turns out to be a hybrid. Counting is the best option for small patterns (i.e. $m e^{-m/\sigma} / \log_2 m > \sqrt{\sigma}$), superimposed automata is the best option for intermediate patterns (i.e. $m^2 / \log_2 m < w \sqrt{\sigma} / \log_2 \sigma$), and exact partitioning is the best option for larger patterns. The combined complexity is therefore

$$O\left(\frac{n^2 k \log m}{m w \max\left(\frac{m \log \sigma}{w}, \frac{\sqrt{\sigma} \log m}{m}, e^{-m/\sigma}\right)}\right)$$

As m grows, the best (and optimal) complexity is given by the exact partitioning, $O(n^2 k \log_\sigma m / m^2)$. However, this is true for $k < m(m + 1)/(5 \log_\sigma m)$, because otherwise the verification phase dominates. Once $s = m_1$ and we cannot reduce the error level further by increasing s (i.e. by searching on more rows), the approach most resistant to the error level is superimposed automata, which works up to $k < m(m + 1)(1 - e/\sqrt{\sigma})$ (at that point its cost is $O(m^2 n^2 / (w \sqrt{\sigma}))$, very close to simple dynamic programming, and the verification time becomes dominant).

Moreover, we prove in [4] that if $k/m_2 \ge 1 - e/\sqrt{\sigma}$ the number of text positions matching the pattern is high. Therefore, the limit for automaton partitioning is not just the limit of another filtering algorithm, but the true limit up to where it is possible at all to filter the text. In this sense, this filter has optimal tolerance to errors.

We summarize our results in Fig. 4, where the best algorithm for each case is presented.
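[Figure: regions of best algorithm, with pattern size on the horizontal axis and error level $k/m^2$ on the vertical axis. Region labels: Dynamic Programming, $O(n^2 m^2)$; Automaton Partitioning, $O(n^2 k / (w \sqrt{\sigma}))$; Exact Partitioning, $O(n^2 k \log_\sigma m / m^2)$; Counting, $O(n^2 k e^{m/\sigma} \log m / (m w))$. Marked error levels: $1 - e/\sqrt{\sigma}$ and $1/(5 \log_\sigma m)$. Marked pattern sizes: $m e^{-m/\sigma} = \sqrt{\sigma} \log_2 m$ and $m^2 = w \sqrt{\sigma} \log_2 m / \log_2 \sigma$.]

Fig. 4. The best algorithm with respect to the pattern length and error level. The complexity of each algorithm is also included.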
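Equations (1) and (3) of this section are easy to evaluate numerically for a given filter. The sketch below does so for the error-level limits quoted for exact partitioning and superimposed automata. The function names are hypothetical, and rounding s up to an integer is an assumption of this sketch, since Eq. (1) is stated without rounding.

```python
import math


def limit_exact_partitioning(m, r, sigma):
    """L(m, r) = 1 / log_sigma(m^3 r^2), as quoted for exact partitioning."""
    return 1.0 / math.log(m ** 3 * r ** 2, sigma)


def limit_superimposed_automata(sigma):
    """L(m, r) = 1 - e / sqrt(sigma), as quoted for superimposed automata."""
    return 1.0 - math.e / math.sqrt(sigma)


def choose_s(k, m2, limit):
    """Eq. (1): s = k / (m2 * L), rounded up here (assumption of this sketch)."""
    return max(1, math.ceil(k / (m2 * limit)))


def is_applicable(k, m1, m2, limit):
    """Eq. (3): the filter is usable while k < m2 * (m1 + 1) * L."""
    return k < m2 * (m1 + 1) * limit


if __name__ == "__main__":
    m1 = m2 = 16
    sigma = 32
    for name, lim in (("exact partitioning", limit_exact_partitioning(m2, m1, sigma)),
                      ("superimposed automata", limit_superimposed_automata(sigma))):
        for k in (10, 100, 200):
            print(name, k, choose_s(k, m2, lim), is_applicable(k, m1, m2, lim))
```

With a 16 x 16 pattern over an alphabet of size 32, exact partitioning has L = 1/4 and stops being applicable at k = 68, while superimposed automata tolerates a noticeably higher error level, in line with the discussion around Fig. 4.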
6 Concluding Remarks
We present the first filtering algorithm for two dimensional approximate string matching allowing also insertions and deletions. This filter avoids verifying most of the text with the expensive dynamic programming algorithm, and is based on a one-dimensional multi-pattern approximate search algorithm. Our analysis gives the complexity of the filtering algorithm, obtaining expected case time $O(n^2 k \log_\sigma m / m^2)$ for $k < m^2/(5 \log_\sigma m)$. This time is optimal on average [12].

The edit distance that we use is simplified (row-wise) and does not model well simple cases of approximate matching in other settings. For example, we could have a match that only has the middle row of the pattern missing. In the KS definition (which we use), the edit distance would be $O(m^2)$ if all pattern rows are different. Intuitively, the right answer should be m, because only m characters were deleted in the pattern. We are currently working on more general error models [3], but as they are more general, the search complexity should be higher.
References

1. A. Amir and M. Farach. Efficient 2-dimensional approximate matching of non-rectangular figures. In Proc. SODA'91, pages 212-223, 1991.
2. A. Amir and G. Landau. Fast parallel and serial multidimensional approximate array matching. Theoretical Computer Science, 81:97-115, 1991.
3. R. Baeza-Yates. Similarity in two dimensional strings. Dept. of Computer Science, University of Chile, 1996.
4. R. Baeza-Yates and G. Navarro. A faster algorithm for approximate string matching. In Proc. CPM'96, LNCS 1075, pages 1-23, 1996. ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/cpm96.ps.gz.
5. R. Baeza-Yates and G. Navarro. Multiple approximate string matching. In Proc. WADS'97, LNCS 1272, pages 174-184, 1997. ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/wads97.ps.gz.
6. R. Baeza-Yates and C. Perleberg. Fast and practical approximate pattern matching. In Proc. CPM'92, LNCS 644, pages 185-192, 1992.
7. R. Baeza-Yates and M. Regnier. Fast two dimensional pattern matching. Information Processing Letters, 45:51-57, 1993.
8. T. Baker. A technique for extending rapid exact string matching to arrays of more than one dimension. SIAM Journal on Computing, 7:533-541, 1978.
9. R. Bird. Two dimensional pattern matching. Inf. Proc. Letters, 6:168-170, 1977.
10. W. Chang and J. Lampe. Theoretical and empirical comparisons of approximate string matching algorithms. In Proc. CPM'92, LNCS 644, pages 172-181, 1992.
11. M. Crochemore and W. Rytter. Text Algorithms. Oxford University Press, Oxford, UK, 1994.
12. J. Kärkkäinen and E. Ukkonen. Two and higher dimensional pattern matching in optimal expected time. In Proc. SODA'94, pages 715-723. SIAM, 1994.
13. K. Krithivasan. Efficient two-dimensional parallel and serial approximate pattern matching. Technical Report CAR-TR-259, University of Maryland, 1987.
14. K. Krithivasan and R. Sitalakshmi. Efficient two-dimensional pattern matching in the presence of errors. Information Sciences, 43:169-184, 1987.
15. G. Landau and U. Vishkin. Fast string matching with k differences. J. of Computer Systems Science, 37:63-78, 1988.
16. R. Muth and U. Manber. Approximate multiple string search. In Proc. CPM'96, LNCS 1075, pages 75-86, 1996.
17. G. Navarro. Multiple approximate string matching by counting. In Proc. WSP'97, pages 125-139, 1997. ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/wsp97.1.ps.gz.
18. K. Park. Analysis of two dimensional approximate pattern matching algorithms. In Proc. CPM'96, LNCS 1075, pages 335-347, 1996.
19. P. Sellers. The theory and computation of evolutionary distances: pattern recognition. J. of Algorithms, 1:359-373, 1980.
20. E. Sutinen and J. Tarhio. On using q-gram locations in approximate string matching. In Proc. ESA'95, LNCS 979, 1995.
21. Esko Ukkonen. Finding approximate patterns in strings. J. of Algorithms, 6:132-137, 1985.
22. S. Wu and U. Manber. Fast text searching allowing errors. CACM, 35(10):83-91, October 1992.
23. S. Wu, U. Manber, and E. Myers. A sub-quadratic algorithm for approximate limited expression matching. Algorithmica, 15(1):50-67, 1996.
Improved Approximate Pattern Matching on Hypertext*

Gonzalo Navarro

Dept. of Computer Science, Univ. of Chile. Blanco Encalada 2120, Santiago, Chile.
[email protected]
Abstract. The problem of approximate pattern matching on hypertext is defined and solved by Amir et al. in $O(m(n \log m + e))$ time, where $m$ is the length of the pattern, $n$ is the total text size and $e$ is the total number of edges. Their space complexity is $O(mn)$. We present a new algorithm which is $O(mk(n + e))$ time and needs only $O(n)$ extra space, where $k < m$ is the number of allowed errors in the pattern. If the graph is acyclic, our time complexity drops to $O(m(n + e))$, improving Amir's results.
1 Introduction
Approximate string matching problems appear in a number of important areas related to string processing: text searching, pattern recognition, computational biology, audio processing, etc. The edit distance between two strings $a$ and $b$, $ed(a, b)$, is defined as the minimum number of edit operations that must be carried out to make them equal. The allowed operations are insertion, deletion and substitution of characters in $a$ or $b$. The problem of approximate string matching is defined as follows: given a text of length $n$, and a pattern of length $m$, both being sequences over an alphabet of size $\sigma$, and a maximum number of allowed errors $k < m$, find all segments (or "occurrences") in text whose edit distance to pattern is at most $k$. That is, report all text positions $j$ such that there is a suffix $x$ of $text[1..j]$ such that $ed(x, patt) \le k$.

The classical solution is $O(mn)$ time and involves dynamic programming [11]. This solution is the most flexible to allow different distance functions. For the particular case of $ed()$, a number of algorithms have been presented to improve the worst case to $O(kn)$ or the average case, e.g. [8,13,5,12,4,14,15,3].

Pattern matching on hypertext [6] has been considered only recently. The model is that the text forms a graph of $N$ nodes and $E$ edges, where a string is stored inside each node, and the edges indicate alternative texts that may follow the current node. The pattern is still a simple string of length $m$. It is also customary to transform this graph into one where there is exactly one character per node (by converting each node containing a text of length $\ell$ into a chain of $\ell$ nodes). This graph has $n$ nodes and $e$ edges (note that $n$ is the text size and $e = n - N + E$).

Approximate string matching over hypertext is not only motivated by the structure of the World-Wide-Web and the possibility to search sequences of elements across paths of references, but also because graphs naturally model complex processes. In [7] the possibility of using approximate string matching as a model for data mining is considered, where the symbols are in fact events and sequences of interesting events (perhaps separated by uninteresting events) are sought. This corresponds to allowing only insertions into the pattern. A graph may be a functional description of a process (paths representing possible alternative sequences of events), and we may want to identify potentially dangerous sequences of events in the process under analysis.

The first attempt to define pattern matching on hypertext is due to Manber and Wu [9], who view a hypertext as a graph of files with no links inside (it is easy to transform any hypertext to that form, by ending the node at its first reference). They solve the problem for an acyclic graph in $O(N + mE + R \log\log m)$ time (where $R$ is the size of the answer). Akutsu [1] solved the problem of exact pattern matching on a hypertext which has a tree structure in $O(n)$ time, while Park and Kim [10] extended this result to an $O(n + mE)$ algorithm for directed acyclic graphs and for graphs with cycles where no text node can match the pattern in two places.

Amir et al. [2] were the first to consider approximate string matching over hypertext. In this case they consider the graph with $n$ nodes and $e$ edges and want to report all nodes $v$ where in the text graph there is a suffix $x$ ending at node $v$ such that $ed(x, patt) \le k$. We say that $x$ is a text suffix ending at $v$ if there is a path in the graph ending at $v$ such that the concatenation of all characters of the traversed nodes yields $x$. Amir et al. prove that the problem is NP-Complete if the errors can occur in the text, and give an algorithm to solve the case of errors only in the pattern, which is $O(m(n \log m + e))$ time and $O(mn)$ space. Their algorithm can handle general graphs, not only acyclic ones.

We present a new algorithm for approximate pattern matching over hypertext graphs. For acyclic graphs, the algorithm is $O(m(n + e))$, which rises to $O(mk(n + e))$ for graphs with cycles. In both cases, our space complexity is $O(n)$, which is by far smaller than that of [2]. On the other hand, we improve their time complexity for a small number of errors, namely for $k = O(\log m)$ if $e = O(n)$ and for $kn = O(e \log m)$ otherwise. We also improve previous work in the case of acyclic graphs.

* This work has been supported in part by Fondecyt grants 1-950622 and 1-960881.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 352-357, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Rethinking the Classical Algorithm
The classical algorithm to solve the general approximate string matching problem [11] is defined in terms of a matrix $C[i, j]$. When used to compute edit distance between two strings $a$ and $b$, we have that $C[i, j]$ is the edit distance between $a[1..i]$ and $b[1..j]$. Therefore $C[i, 0] = C[0, i] = i$ for all $i$, and the update formula is

    C[i, j] = (a[i] == b[j]) ? C[i−1, j−1] : 1 + min(C[i−1, j], C[i, j−1], C[i−1, j−1])

where in the minimization the term $C[i-1, j]$ corresponds to deleting the current character of the pattern, $C[i, j-1]$ to inserting the current text character into the pattern, and $C[i-1, j-1]$ to replacing the current character of the pattern by the current text character.

Now, if $a$ turns out to be a short pattern of length $m$ and $b$ a long text of length $n$, and we want to search for the approximate occurrences of the pattern in the text (i.e. text positions $j$ such that the pattern occurs with at most $k$ errors in a suffix of $text[1..j]$), almost the same algorithm can be applied. The only modification needed is to set $C[0, j] = 0$ for all $j$ (so as to give each text position a chance to start a match).

The problem with a large text is space. In principle, we should store the $O(mn)$ size matrix $C$, which is prohibitively expensive. It is not hard to see, however, that to compute column $j$ of the matrix we only need to keep column $j - 1$. Therefore, it is enough to keep an old and a new column to do the job, at a total space complexity of $O(m)$, which is very low. The time complexity does not change. For obvious reasons, the other alternative of computing the matrix row by row, keeping old and new rows at a space complexity of $O(n)$, has never been considered. However, this is what we propose if the text is a graph. See Fig. 1.
[Figure: the matrix $C[i, j]$, with the text along one axis and the pattern along the other; arrows show the dynamic programming processing direction and our processing direction.]
Fig. 1. The classical and our traversal of the dynamic programming matrix
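The classical column-wise scheme can be sketched in a few lines of Python. This is only an illustration under our own naming: the function name `approximate_search` and the choice of reporting 1-based end positions are assumptions, not part of the original presentation. It keeps one old and one new column, hence $O(m)$ space.

```python
def approximate_search(pattern, text, k):
    """Return the 1-based positions j such that some suffix of text[1..j]
    matches pattern with at most k errors (column-wise dynamic programming)."""
    m = len(pattern)
    old = list(range(m + 1))          # column 0: C[i, 0] = i
    ends = []
    for j, c in enumerate(text, start=1):
        new = [0] * (m + 1)           # C[0, j] = 0: a match may start anywhere
        for i in range(1, m + 1):
            if pattern[i - 1] == c:
                new[i] = old[i - 1]   # C[i-1, j-1]
            else:
                # 1 + min(C[i-1, j], C[i, j-1], C[i-1, j-1])
                new[i] = 1 + min(new[i - 1], old[i], old[i - 1])
        old = new
        if old[m] <= k:
            ends.append(j)
    return ends
```

For instance, searching "abc" in "abxc" with one error reports every position from 2 on, since "ab", "abx" and "abxc" are all within distance 1 of the pattern.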
3 Applying the Algorithm to a Hypertext
Following [2], we first consider hypertexts where each node has just one character (it is easy to convert any hypertext to this form). Since the pattern keeps its linear structure but the text does not, implementing the classical algorithm column-wise is difficult, because in a graph the notion of "advancing in the text" is not as clear as in the linear version. However, we take advantage of the fact that the pattern is still linear and apply the classical algorithm row-wise. That is, we perform $m$ long iterations. At the end of iteration $i$, we have computed for every node $v$ of the graph the best edit distance between $pattern[1..i]$ and any text suffix in the graph which ends at node $v$. We recall that $x$ is a text suffix ending at $v$ if there is a path in the graph ending at $v$ such that the concatenation of all characters of the traversed nodes gives $x$.

We denote by $t[v]$ the text character at node $v$. The algorithm needs to keep a state per node, called $C[v]$. At each iteration new values for all $C[v]$, denoted $C'[v]$, are computed. This accounts for our $O(n)$ extra space. The pseudocode for the algorithm is presented in Fig. 2.

    for all v ∈ V, C[v] ← 0.
    for i = 1 to m
        for all v ∈ V
            C'[v] ← min_{(u,v) ∈ E} f(u, v, patt[i])
        for all v ∈ V, C[v] ← C'[v]
Fig. 2. Our algorithm for approximate string matching on hypertext. The function f() depends on the distance function used

It is not hard to see that this algorithm takes $O(m(n + e))$ time and needs $O(n)$ extra space. To follow the idea of the classical algorithm, the f function of the algorithm should be defined as

    f(u, v, x) = (t[v] == x) ? C[u] : 1 + min(C[u], C[v], C'[u])

the problem being to ensure that $C'[u]$ has been already computed. If the graph has no loops this is easily achieved by computing the new $C'$ values in topological order (a topological sorting takes $O(n + e)$ time). This improves the previous result [2] both in time and space complexity.

However, this does not work in case of loops. The problem is that the insertion of the current text character into the pattern makes the current value of $C[v]$ depend on its predecessors in the graph up to $k$ nodes away. In a loop of length less than $k$, there seems to be no easy way to determine the proper place to start the computation of the values of the loop.

We solve the problem by not considering insertions in the f function. Instead, insertions are simulated by modifying the pattern. We take a new character # that does not belong to the alphabet. This character can be deleted at zero cost, but replacing it costs the same as an insertion. We insert $k$ such characters after each letter of the pattern. Therefore, if the algorithm would insert a text character between two pattern characters, what it does now is to replace one of the # characters. The others can be deleted at zero cost. We insert $k$ special characters at each position to allow all the $k$ insertions to occur at the same place, if necessary. Therefore, if the pattern is "loh" and $k = 3$, we search for "l###o###h###" and our new f function is

    f(u, v, x) = (t[v] == x) ? C[u] : min(1 + C[u], del(C[v], x))

where

    del(C[v], x) = (x == #) ? C[v] : 1 + C[v]

Since our pattern is now of length $mk$, the cost of the algorithm becomes $O(mk(n + e))$ when the graph has loops. This improves the previous result of [2] especially in space, since we need $O(n)$ extra space and they need $O(mn)$ extra space. We improve their $O(m(n \log m + e))$ time complexity for the case $k = O(\log m)$ if $e = O(n)$, and $kn = O(e \log m)$ otherwise.
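Both variants of the row-wise algorithm can be sketched in Python as follows. This is a hedged illustration rather than the author's implementation. The names `search_acyclic`, `search_general` and `FILLER`, and the graph encoding (a string `text` with one character per node, plus a list of `(u, v)` edge pairs), are our own assumptions. The pseudocode of Fig. 2 takes the minimum over incoming edges only; we additionally let a text suffix start at $v$ itself, so that nodes without predecessors are handled. For the acyclic case we take the minimum over all three classical terms regardless of a match, which is safe and gives the same values.

```python
from collections import deque

FILLER = object()   # plays the role of the special character outside the alphabet


def _predecessors(n, edges):
    preds = [[] for _ in range(n)]
    for u, v in edges:
        preds[v].append(u)
    return preds


def search_acyclic(text, edges, pattern, k):
    """Nodes v of an acyclic graph where some text suffix ending at v
    matches pattern with at most k errors; text[v] is the character at v."""
    n = len(text)
    preds = _predecessors(n, edges)
    # Topological order (Kahn's algorithm): new[u] is ready before new[v].
    succs = [[] for _ in range(n)]
    for u, v in edges:
        succs[u].append(v)
    indegree = [len(p) for p in preds]
    queue = deque(v for v in range(n) if indegree[v] == 0)
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succs[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    if len(order) != n:
        raise ValueError("the graph has a cycle")

    C = [0] * n                                  # row 0: empty pattern prefix
    for i, x in enumerate(pattern, start=1):
        new = [0] * n
        for v in order:
            cost = 0 if text[v] == x else 1
            # Delete x from the pattern, or start the text suffix at v itself.
            best = min(C[v] + 1, (i - 1) + cost)
            for u in preds[v]:
                # Match/replacement through u, or insertion of t[v].
                best = min(best, C[u] + cost, new[u] + 1)
            new[v] = best
        C = new
    return [v for v in range(n) if C[v] <= k]


def search_general(text, edges, pattern, k):
    """Same report for graphs that may have cycles: insertions are simulated
    by k fillers after each pattern letter, so no row depends on itself."""
    n = len(text)
    preds = _predecessors(n, edges)
    extended = []
    for x in pattern:
        extended.append(x)
        extended.extend([FILLER] * k)
    C = [0] * n
    letters = 0                                  # real letters consumed so far
    for x in extended:
        new = [0] * n
        for v in range(n):
            if x is FILLER:
                best = C[v]                      # delete the filler at zero cost
                for u in preds[v]:
                    best = min(best, C[u] + 1)   # replace the filler by t[v]
            else:
                cost = 0 if text[v] == x else 1
                best = min(C[v] + 1, letters + cost)
                for u in preds[v]:
                    best = min(best, C[u] + cost)
            new[v] = best
        if x is not FILLER:
            letters += 1
        C = new
    return [v for v in range(n) if C[v] <= k]
```

In the general variant only values at most $k$ are guaranteed to be exact edit distances, which is all the final threshold test needs. On the chain a, b, x, c with a back edge from the last node to the first, searching "abc" with $k = 1$ reports nodes 1, 2 and 3 (0-based).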
4 Generalizations
We consider now the case where the text has a string at each node, instead of a single character. In this case we distinguish the total text size, $n$, from the number of nodes, $N$. Since inside each node the text is linear, we can search at $O(kn)$ worst-case cost inside the node. The state of the search at character $j$ of a node depends only on characters from $j - m - k + 1$ to $j$. Therefore, although the first $(m + k)$ text characters of each node still depend on the state of the global search (i.e. previous characters in the graph), the rest of the search at each node can be computed independently. The final state of the search in node $v$ has the information needed by the global search at the nodes that follow $v$ in the graph. It is easy to modify the dynamic programming algorithm to keep count of the number of insertions performed in the pattern at each position. With that information it is possible to deduce which would be the state of the search at the end of node $v$ for the global algorithm that uses the modified pattern.

Therefore, if there are $N$ nodes we must perform at most $O(\min(n, N(m + k))) = O(\min(n, Nm))$ iterations of the global algorithm. The rest of the search on the whole text proceeds internally at each node at $O(kn)$ total cost. Since our algorithm pays $O(mk)$ per node and per edge of the graph, our search cost is $O(mk(\min(n, mN) + E) + kn)$. This is $O(kn)$ provided $N = O(n/m^2)$ and $E = O(n/m)$.

The distance function can be easily modified to allow exact searching, or searching allowing only insertions (which is the case in data mining), or to give a particular edit cost to each operation.
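As a baseline for the string-per-node case, the plain expansion mentioned in the Introduction (each node with a text of length $\ell$ becomes a chain of $\ell$ one-character nodes) can be sketched as below. The function name `expand_nodes` and the input encoding are hypothetical; the sketch assumes every node stores a non-empty string, which is what makes $e = n - N + E$ hold.

```python
def expand_nodes(node_texts, edges):
    """Turn a hypertext with one string per node into a graph with one
    character per node.  Returns (chars, char_edges)."""
    chars = []
    first, last = [], []
    char_edges = []
    for s in node_texts:
        if not s:
            raise ValueError("every node is assumed to hold a non-empty string")
        start = len(chars)
        chars.extend(s)
        first.append(start)
        last.append(len(chars) - 1)
        # Chain the characters of this node: len(s) - 1 internal edges.
        char_edges.extend((p, p + 1) for p in range(start, len(chars) - 1))
    # Each original edge links the last character of u to the first of v.
    char_edges.extend((last[u], first[v]) for u, v in edges)
    return chars, char_edges
```

With three nodes "ab", "c", "de" and three edges, the expanded graph has $n = 5$ nodes and $5 - 3 + 3 = 5$ edges. The scheme described above avoids running the global algorithm over all of these nodes; the expansion is only the simplest way to reuse the one-character algorithm unchanged.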
5 Conclusions and Further Work
We have addressed the problem of approximate string matching when the text is a hypertext and the pattern is a string. The only previous algorithm is [2], which is $O(m(n \log m + e))$ time and $O(mn)$ space. We presented a new algorithm which in case of acyclic graphs is $O(m(n + e))$ and in case of graphs with loops is $O(mk(n + e))$ time. Our algorithm needs only $O(n)$ extra space.

The main problem that prevents an $O(m(n + e))$ time and $O(n)$ space algorithm is the combination of loops in the graph with operations that allow inserting text characters in the pattern. This situation creates circular dependencies that cannot be easily broken. We solved the problem by disallowing such operations and simulating them with a different, longer pattern. This solution is open to improvements and we are working on it.
References
1. T. Akutsu. A linear time pattern matching algorithm between a string and a tree. In Proc. CPM'93, pages 1-10, 1993.
2. A. Amir, M. Lewenstein, and N. Lewenstein. Pattern matching in hypertext. In Proc. WADS'97, LNCS 1272, pages 160-173, 1997.
3. R. Baeza-Yates and G. Navarro. A faster algorithm for approximate string matching. In Proc. CPM'96, LNCS 1075, pages 1-23, 1996. ftp://ftp.dcc.uchile.cl/pub/users/gnavarro/cpm96.ps.gz.
4. R. Baeza-Yates and C. Perleberg. Fast and practical approximate pattern matching. In Proc. CPM'92, LNCS 644, pages 185-192, 1992.
5. W. Chang and J. Lampe. Theoretical and empirical comparisons of approximate string matching algorithms. In Proc. CPM'92, LNCS 644, pages 172-181, 1992.
6. J. Conklin. Hypertext: An introduction and survey. IEEE Computer, 20(9):17-41, September 1987.
7. G. Das, R. Fleischer, L. Gasieniec, D. Gunopulos, and J. Kärkkäinen. Episode matching. In Proc. CPM'97, LNCS 1264, pages 12-27, 1997.
8. G. Landau and U. Vishkin. Fast string matching with k differences. J. of Computer Systems Science, 37:63-78, 1988.
9. U. Manber and S. Wu. Approximate string matching with arbitrary costs for text and hypertext. In Proc. IAPR Workshop on Structural and Syntactic Pattern Recognition, pages 22-33, Bern, Switzerland, 1992.
10. K. Park and D. Kim. String matching in hypertext. In Proc. CPM'95, pages 318-329, 1995.
11. P. Sellers. The theory and computation of evolutionary distances: pattern recognition. J. of Algorithms, 1:359-373, 1980.
12. E. Sutinen and J. Tarhio. On using q-gram locations in approximate string matching. In Proc. ESA'95, LNCS 979, 1995.
13. E. Ukkonen. Finding approximate patterns in strings. J. of Algorithms, 6:132-137, 1985.
14. S. Wu and U. Manber. Fast text searching allowing errors. CACM, 35(10):83-91, October 1992.
15. S. Wu, U. Manber, and E. Myers. A sub-quadratic algorithm for approximate limited expression matching. Algorithmica, 15(1):50-67, 1996.
Solving Equations in Strings: On Makanin's Algorithm
Claudio Gutierrez
Wesleyan University, Middletown, CT 06459, U.S.A.
Abstract. We present a further simplification of Makanin's algorithm, still the only known general procedure for solving string equations. We also give pseudo-code, a thorough analysis of its complexity, and complete proofs of correctness and termination.
1 Introduction
Checking if two strings are identical is a rather trivial problem. Theoretically it corresponds to solving an equation with both sides constant. For example, are these strings equal?

    ababababbbbabaaabbbba $\stackrel{?}{=}$ ababababbbababaaabbbba

Finding patterns in strings is slightly more complicated. This corresponds to solving equations in strings, one of whose sides is a constant, the text, and the other contains patterns (variables). For example, are there strings $s_1$ and $s_2$ in the alphabet $\{a, b\}$ such that when replacing $x$ by $s_1$ and $y$ by $s_2$ in

    xxabxby $\stackrel{?}{=}$ abaabababaaabbababababa

you get the same string on both sides? Equations of this kind are not difficult to solve. Indeed, many cases of this problem have very efficient algorithms and are the subject of the field of pattern matching (see [2]).

Finding solutions to equations in strings in general (i.e. where both sides contain variables) is a surprisingly difficult problem.¹ Try to find a solution to this simple equation (or show it has none):

    xaxby $\stackrel{?}{=}$ bybyx

Partial solutions to this problem were known long ago: in the seventies Lentin [7], Plotkin [11] and Siekmann [12] gave semi-decision procedures (which give a solution if the equation has one, but if not, could run forever). In 1971, Hmelevskii [6] solved the problem for equations in three variables.

¹ The current bound on its time computational complexity is $O(2^{2^{2^{|E|}}})$ where $|E|$ is the length of the equation $E$. Other anecdotal numbers: The paper in which Makanin presented the algorithm for the first time has 70 pages; later simplified versions (Jaffar, Schulz) have more than 30 pages each. Also there have been at least two Ph.D. theses [1], [10], studying this algorithm, possible simplifications and implementations.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 358-373, 1998. © Springer-Verlag Berlin Heidelberg 1998
In 1977 Makanin [8] solved the problem in its complete generality giving us the first (and still the only known) algorithm to find solutions for arbitrary string equations. It was later extended by Jaffar [5] to give all possible solutions to an equation as well. In the meantime, there has been some work simplifying various aspects of the algorithm and even some implementations [10], [1], [14], [13].

The problem of solving equations in (equationally defined free) algebras is a well-established area in computer science called Unification, with a wide range of applications (see [3]). Solving equations in strings has potential applications in many areas, e.g. string unification in PROLOG-3, extensions of string rewrite systems, unification in some theories with associative non-commutative operators, which, due to the current state of the art of the problem, are still of no practical use. This highlights the importance of studying the only currently known general algorithm for solving string equations, its complexity and possible improvements.

Makanin's original paper focused on proving that the question "Does the word equation E have a solution?" is decidable. He was not interested in either complexity or implementation. Afterwards Pecuchet, Abdulrab, Jaffar and Schulz, among others, simplified some of the technicalities of the algorithm and its proof of correctness and termination, and started to approach the problem from a computational point of view. On the other hand, Jaffar, Koscielski and Pacholski started a systematic study of its complexity.

In this paper, we present one more step towards its simplification which also gives better complexity bounds. First, we introduce a substantially simpler data type for the concept of generalized equation which considerably simplifies the algorithm, making it more understandable and allowing shorter and simpler proofs of the correctness and termination of the algorithm (compare [5], [13]). Secondly, we introduce the associated Diophantine equations for an equation, which prune the search tree significantly, and by itself could possibly give another approach to solve string equations. Third, we give a thorough analysis of the complexity of the algorithm, obtaining smaller bounds (although still in the same complexity class) than Jaffar's [5] (on which [13] and [9] are based). Last, but not least, we include complete proofs of correctness and termination, and present for the first time pseudo-code ready to be implemented in any language.

Finally let us say that our presentation owes much to Schulz [13], particularly in Sect. 4. We use the terms word and string interchangeably.
2 Word Equations: Basic Concepts and Examples
Definition 1. Let $C = \{a_1, \ldots, a_r\}$ be a finite set of constants, and $V = \{v_1, v_2, \ldots\}$ be an infinite set of variables. A word $w$ over $C \cup V$ is a (possibly empty) finite sequence of elements of $C \cup V$. The length of $w$, denoted $|w|$, is the length of the sequence. The exponent of periodicity of a word $w$ is the maximal number $p$ such that $w$ can be written as $u v^p z$ for some words $u, v, z$ with $v$ non-empty.

A word equation $E$ is a pair $(w_1, w_2)$ of words over $C \cup V$, usually written as $w_1 = w_2$. The number $|E| = |w_1| + |w_2|$ is the length of the equation $E$. Note that in $E$ only a finite number of variables occur, let us say $X = \{x_1, \ldots, x_n\} \subset V$. A unifier of $E$ is a sequence $U = (U_1, \ldots, U_n)$ of words over $C \cup V$ such that both sides of the equation become graphically identical when we replace all occurrences of $x_i$ by $U_i$, for each $i = 1, \ldots, n$. The exponent of periodicity of the unifier $U$ is the maximal exponent of periodicity of the words $U_i$.

It is very convenient to have a graphical idea of word equations. Consider the equation $xaby = ybax$. The variables $x, y$ represent unknown words. Graphically, $xaby$ will be represented as

[Figure: a horizontal line divided by vertical lines into four segments labelled x, a, b, y.]

where the length of the horizontal line in each case is unknown, except those of the constants, which are always of unit length. The vertical lines will be called boundaries. In this representation, the equation has a solution if there is a way of consistently overlapping both sides of it such that the words (represented by segments of horizontal lines) between boundaries are the same. In general, there may be many such overlappings. Below we show two possibilities among many others (we draw the variables in different levels in order to highlight the limits of each variable):

[Figure: two overlappings, (a) and (b), of the sides x a b y and y b a x of the equation.]
The next step is to replace equals by equals (elimination of variables) from left to right, e.g. in case (a) we can replace $y = xa$ in the other occurrence of $y$. After this, we have to guess the order of some boundaries again, and so on.

This example contains the basic idea at the heart of the algorithm: (i) guess an ordering of the boundaries, that is, which comes first, which second, and so on, for all the initial boundaries on both sides of the equation, and (ii) proceed from left to right replacing equals by equals. But this naive recipe has some problems: (1) the number of occurrences of some variables starts growing after replacement, (2) what to do in cases where there is no evident replacement (cf. example (b) above), and (3) you can go on forever (cf. the equation $xa = ax$). That is why a more elaborate idea is needed.

The starting point is to build, for each word equation, an arrangement like the above. It is convenient (to avoid problem 1) to work with an equivalent system of equations in which each variable occurs no more than twice. Note that this is always possible. Consider for example the equation $bxyx = yaxz$. It is equivalent to $bx_1yx_1 = yax_2z$ and $x_1 = x_2$. One possible arrangement of boundaries will look like
[Figure (1): an arrangement over boundaries 1 to 7 of the bases b, x1, y, x1 (one side) and y, a, x2, z (other side), with an additional copy of x2 and of z in the last columns.]
Note that $x_1 = x_2$ can be easily expressed in the same arrangement (see the last two columns). It also will be convenient to have exactly two copies of each variable in the arrangement (that is why we put one more copy of $z$ on top of it in the last column). This presentation of a word equation is the starting point of Makanin's algorithm.
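The basic notions of this section are easy to experiment with. The following Python sketch is only an illustration under our own conventions: words are plain strings, every symbol is one character, variables are the keys of a dictionary, and the unifier maps variables to words over constants only. The names `substitute`, `is_unifier` and `exponent_of_periodicity` are hypothetical helpers, not part of the paper.

```python
def substitute(word, unifier):
    """Replace every variable of word by its image; other symbols are constants."""
    return "".join(unifier.get(symbol, symbol) for symbol in word)


def is_unifier(lhs, rhs, unifier):
    """True if both sides become graphically identical after substitution."""
    return substitute(lhs, unifier) == substitute(rhs, unifier)


def exponent_of_periodicity(w):
    """Maximal p such that w = u v^p z for some words u, v, z with v non-empty
    (brute force over every start position and every period length)."""
    best = 0
    n = len(w)
    for start in range(n):
        for length in range(1, n - start + 1):
            v = w[start:start + length]
            p, pos = 1, start + length
            while w.startswith(v, pos):      # count consecutive copies of v
                p += 1
                pos += length
            best = max(best, p)
    return best
```

For the equation $xaby = ybax$, both `{"x": "b", "y": ""}` and `{"x": "a", "y": "aa"}` pass the check. For $xa = ax$ every `{"x": "a" * n}` is a unifier, with exponent of periodicity $n$, which illustrates why the naive recipe can go on forever.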
3 Generalized Equations
The main concept in Makanin's algorithm is that of Generalized Equation, essentially a data type that codifies arrangements such as those shown above. The version presented here differs somewhat from the classical ones [8], [5], [13], allowing a considerably simpler algorithm.

Definition 2. A generalized equation GE consists of
(1) Two finite sets $C$ and $X$, the labels.
(2) A finite linear ordered set $(BD, \le)$, the boundaries.
(3) A finite set $BS$ of bases. A base $bs$ has the form $(t, (e_1, \ldots, e_n))$, where $n \ge 2$, $t \in C \cup X$, and $E_{bs} = (e_1, \ldots, e_n)$ is a sequence of boundaries ordered by $\le$.
subject to the following conditions:²
(C1) For each $x \in X$, there are exactly two bases with label $x$, called duals, and (abusing notation) denoted by $x$ and $\bar{x}$ respectively. Also, their respective boundary sequences $E_x$, $E_{\bar{x}}$ must have the same length.
(C2) For each base $bs$ with $t \in C$, the boundary sequence $E_{bs}$ has exactly two elements and they are consecutive in the order $\le$.

Some definitions and conventions to ease the notation: A base $bs = (t, E_{bs})$ is called constant if $t \in C$, and variable if $t \in X$. The first element in $E_{bs}$ is called the left boundary of the base, denoted $left(bs)$, and the last, the right boundary, $right(bs)$. Letters $x, y, z$ will be used as meta variables for variable bases. Also letters $i, j$ will denote boundaries. A pair $(i, j)$ of boundaries with $i \le j$ is called a column of GE. Columns $(i, i)$ are called empty; columns $(i, i+1)$ are called indecomposable. The column of a base $bs$ is defined as $col(bs) = (left(bs), right(bs))$. A base is empty if its column is empty. A generalized equation is solved if all its variable bases are empty.

Definition 3. A unifier of GE is a function $U$ that assigns to each indecomposable column of GE a word over $C \cup V$ (extend it by concatenation to all non-empty columns of GE) with the following properties:
(1) For each constant base $bs$ of label $c$, $U(col(bs)) = c$.
(2) For every pair of dual variables $x, \bar{x}$, and for every $e_j \in E_x$, $U(e_1, e_j) = U(\bar{e}_1, \bar{e}_j)$ (recall $\bar{e}_1, \bar{e}_j \in E_{\bar{x}}$). In particular $U(col(x)) = U(col(\bar{x}))$.
$U$ is strict if $U(i, i+1)$ is non-empty for every $i \in BD$. The index of $U$ is the number $|U(b_1, b_M)|$, where $b_1$ is the first and $b_M$ the last element of $BD$. The exponent of periodicity of $U$ is the maximal exponent of periodicity of the words $U(col(x))$, where $x$ is a variable base.

² The data above is intended to represent arrangements like (1). So we must impose some additional conditions. (C1) says that we have exactly two occurrences of each variable. The boundary sequences are to record known information about identical columns in these pairs of variables. Intuitively they are coding "the word between column $e_i$ and $e_j$ is equal to that between $\bar{e}_i$ and $\bar{e}_j$". (C2) says that constants have length 1.
Definition 4. For a generalized equation GE, and $c \in C$, the associated system of linear Diophantine equations, $L(GE, c)$, is defined by:
(1) A variable $Z_i$ for each indecomposable column $(i, i+1)$ of GE.
(2) For each pair of dual variable bases $(x, (e_1, \ldots, e_n))$ and $(\bar{x}, (\bar{e}_1, \ldots, \bar{e}_n))$ define $(n - 1)$ equations, for $j = 1, \ldots, n - 1$:

$$\sum_{e_j \le i < e_{j+1}} Z_i \;=\; \sum_{\bar{e}_j \le i < \bar{e}_{j+1}} Z_i$$

(3) For each constant base $(t, (i, i+1))$, define the equation $Z_i = 1$ if $t = c$ and $Z_i = 0$ if $t \ne c$.
Lemma 1. If GE has a unifier, then $L(GE, c)$ is solvable for each $c \in C$.

Proof. Let $U$ be a unifier of GE and $c \in C$. Define $Z_i = |U(i, i+1)| - D_c$ where $D_c$ is the number of occurrences of constants different from $c$ in the word $U(i, i+1)$. Using the fact that $U$ is a unifier, it is easy to check that this is a solution to $L(GE, c)$.

Checking solvability of systems of linear Diophantine equations is decidable, although expensive (NP-complete). A generalized equation GE whose system $L(GE, c)$ is solvable for all $c \in C$ is called admissible.
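The construction in the proof of Lemma 1 can be checked mechanically on a small instance. The sketch below is an illustration under our own encoding, not the paper's data type: boundaries are consecutive integers, a base is a pair `(label, boundary_sequence)`, the two duals of a variable are simply the two bases carrying its label, and a unifier is a dictionary from the left boundary `i` of each indecomposable column `(i, i+1)` to a word over constants. The generalized equation used in the usage note is a hypothetical example built from $xab = abx$ with $x = ab$.

```python
def diophantine_system(bases, variables, c):
    """Equations of L(GE, c) as triples (left, right, k), meaning
    sum(Z[i] for i in left) == sum(Z[i] for i in right) + k."""
    equations = []
    duals = {}
    for label, seq in bases:
        if label in variables:
            duals.setdefault(label, []).append(seq)
        else:
            i, _ = seq                       # constant base (t, (i, i+1))
            equations.append(([i], [], 1 if label == c else 0))
    for seq, dual_seq in duals.values():     # exactly two bases per variable (C1)
        for j in range(len(seq) - 1):
            left = list(range(seq[j], seq[j + 1]))
            right = list(range(dual_seq[j], dual_seq[j + 1]))
            equations.append((left, right, 0))
    return equations


def lemma1_assignment(unifier, constants, c):
    """Z_i = |U(i, i+1)| - D_c, with D_c the number of occurrences of
    constants different from c in U(i, i+1)."""
    return {i: sum(1 for s in word if not (s in constants and s != c))
            for i, word in unifier.items()}


def satisfies(equations, Z):
    return all(sum(Z[i] for i in left) == sum(Z[i] for i in right) + k
               for left, right, k in equations)


# Hypothetical example: x over (1, 3), its dual over (3, 5), constants a b a b.
BASES = [("a", (1, 2)), ("b", (2, 3)), ("a", (3, 4)), ("b", (4, 5)),
         ("x", (1, 3)), ("x", (3, 5))]
UNIFIER = {1: "a", 2: "b", 3: "a", 4: "b"}
```

With these values, `satisfies(diophantine_system(BASES, {"x"}, c), lemma1_assignment(UNIFIER, {"a", "b"}, c))` holds for both `c = "a"` and `c = "b"`, as the lemma predicts. The sketch only verifies a given assignment; deciding solvability of $L(GE, c)$ in general needs an integer programming procedure, which is not attempted here.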
3.1 The Translation from Word Equations to Generalized Equations
Given a word equation $E$, we can obtain (possibly) many generalized equations by a procedure like that of Sect. 2: for each possible overlapping of both sides of $E$, proceed as in the examples in Sect. 2, and then check admissibility. The detailed description of the algorithm and the checking of the properties below is straightforward, so we will omit it.

Lemma 2. There exists an algorithm Gen which for every word equation $E$ outputs a finite set $Gen(E)$ of generalized equations with the following properties:
(1) $E$ has a unifier with exponent of periodicity $p$ if and only if some $GE \in Gen(E)$ has a strict unifier with exponent of periodicity $p$.
(2) For each $GE \in Gen(E)$, every boundary is the right or left boundary of a base. Also, every boundary sequence contains exactly these two boundaries.
(3) For $GE \in Gen(E)$, the number of bases of GE does not exceed $2|E|$.
(4) Every $GE \in Gen(E)$ is admissible.

As an illustration let us show an element in $Gen(bxyx = yaxz)$, the one corresponding to the arrangement (1) in Sect. 2. The corresponding generalized equation is: $C = \{a, b\}$, $X = \{x_1, x_2, y, z\}$, $BD = \{1, \ldots, 7\}$ and $BS = \{(b, (1, 2)), (a, (3, 4)), (x_1, (2, 3)), (\bar{x}_1, (5, 7)), (y, (3, 5)), (\bar{y}, (1, 3)), (x_2, (4, 6)), (\bar{x}_2, (5, 7)), (z, (6, 7)), (\bar{z}, (6, 7))\}$.
4 The Transformation Algorithm
Now we know that every word equation $E$ has a set of generalized equations $Gen(E)$ equivalent to it in the sense of Lemma 2. Hence the problem is reduced to work on generalized equations. Given a generalized equation, the basic idea of the algorithm, as was shown in Sect. 2, is the successive replacement of equal variables from left to right. The naive idea is to pick the leftmost and biggest variable (called the carrier) and transport all its columns to the position of its dual. Unfortunately sometimes not all its columns can be moved without losing essential information (see example (b) in Sect. 2). What is to be done? Answering this question is the motivation of the following two definitions. Let us fix throughout this section a non-solved generalized equation $GE = (C, X, BD, BS)$.

Definition 5. The carrier of GE, denoted $x_c$, is the non-empty variable base with smallest left boundary. If there is more than one, $x_c$ is the one with largest right boundary. If there is still more than one, choose one among them randomly. We will denote $lc = left(x_c)$ and $rc = right(x_c)$. The critical boundary of GE is defined as $cr = \min\{left(y) : rc \in col(y)\}$ if the set is non-empty, and $cr = rc$ if not.

Definition 6. Let $bs$ be a base of GE which is not the carrier. Then
(1) $bs$ is superfluous if $col(bs) = (i, i)$ with $i < lc$.
(2) $bs$ is transport if $lc \le left(bs) < cr$ or $col(bs) = (cr, cr)$.
(3) $bs$ is fixed if it is not superfluous and not transport.
Note that all variable bases with $left(x) < lc$ are empty by definition of the carrier. Also, each base except the carrier is exactly one of these: superfluous, transport or fixed, depending on what region of the diagram below its left boundary is:

[Diagram: the line of boundaries from $b_1$ to $b_M$ with the marks $lc$, $cr$ and $rc$; the region before $lc$ is labelled superfluous, the region from $lc$ to $cr$ transport, and the regions from $cr$ to $rc$ and from $rc$ to $b_M$ fixed.]
Let us illustrate these definitions with the examples in Sect. 2. In (a) the carrier is $y$, $lc = 1$, $rc = cr = 3$. In (b) the carrier is $x$, $lc = 1$, $cr = 4$ and $rc = 5$.

Now we know what bases should be moved: the transport bases. It is time to define where to move them. The next definition points to this problem.

Notation. For each boundary $lc \le i \le rc$ in $BD$, let us introduce a new symbol $i^{tr}$ (which will indicate the place where the boundary $i$ should go) and denote $tr(E_x) = tr(e_1, \ldots, e_n) = (e_1^{tr}, \ldots, e_n^{tr})$.

Definition 7. A print of GE is a linear order $\preceq$ on the set $BD \cup \{i^{tr} : i \in [lc, rc]\}$ satisfying the following conditions:
(1) $\preceq$ extends the order of $BD$ and $j^{tr} \prec k^{tr}$ for $lc \le j < k \le rc$.
(2) $tr(E_c) = E_{\bar{c}}$. (The structure of the carrier overlaps its dual.)
(3) If $x$ is transport, $\bar{x}$ fixed, then if for some $e_i \in E_x$, $e_i^{tr} = \bar{e}_i$, then $tr(E_x) = E_{\bar{x}}$. (The order is consistent with the boundary sequence information.)
(4) If $(c, (i, j))$ is a constant base, then $i, j$ (and also $i^{tr}, j^{tr}$ if $i, j \in [lc, rc]$) are consecutive in the order $\preceq$. (Constants are preserved.)

Finally we are ready to present the heart of Makanin's algorithm, the procedure Transport, which corresponds to the replacement of equals by equals from left to right. Once we have the classification of bases and a print (a guess about where each boundary to be transported will go), things are relatively straightforward: leave the fixed bases untouched and move all transport bases. All the intricacies of the algorithm will then rely on the carrier and its dual (lines 1-5 and 7-8). We need some notation to describe it: for a set of boundaries $A$, $E_x \restriction A$ will denote the subsequence of $E_x$ of the elements in $A$. Similarly, $E_x \cup A$ is the super-sequence of $E_x$ obtained by adding the elements of $A$ in the corresponding order. $E_c$ is a shorthand for $E_{x_c}$. Let $\preceq$ be a print of GE.

Transport(GE, $\preceq$)
 1. if $cr < rc$ then
 2.     $E_c \leftarrow E_c \restriction \{i \in BD : cr \le i\}$
 3.     $E_{\bar{c}} \leftarrow E_{\bar{c}} \restriction \{i \in BD : cr^{tr} \preceq i\}$
 4. else    ($cr = rc$)
 5.     $E_c \leftarrow tr(E_c)$    (recall $tr(E_c) = E_{\bar{c}}$)
 6. for each transport base $bs \in BS$ do
 7.     $E_c \leftarrow E_c \cup \{i : i \in E_{bs}$ and $cr \le i\}$
 8.     $E_{\bar{c}} \leftarrow E_{\bar{c}} \cup \{i^{tr} : i \in E_{bs}$ and $cr \le i\}$
 9.     $E_{bs} \leftarrow tr(E_{bs})$
10. for each variable base $x \in BS$ with $col(x) = col(\bar{x})$ do
11.     $E_x \leftarrow (right(x), right(x))$
12.     $E_{\bar{x}} \leftarrow E_x$
13. $BS \leftarrow \{bs \in BS : cr \le left(bs)\}$
14. $BD \leftarrow \{i \in BD : i \in E_{bs}$ and $bs \in BS\}$
15. $X \leftarrow \{x \in X : (x, E) \in BS\}$
16. $C \leftarrow \{c \in C : (c, E) \in BS\}$
17. return $(C, X, BD, BS)$.
Remarks. First note that fixed bases are left untouched. Lines 1-5 process the carrier and its dual: if $cr < rc$, then $x_c$ and $\bar x_c$ are shrunk; if $cr = rc$, $x_c$ is transported completely onto $\bar x_c$. Lines 6-9 process transport bases: add new boundary equations to the carrier (lines 7-8) and give the new position (line 9). Lines 10-12 optimize: once two duals overlap, they are not necessary anymore. Lines 13-16 eliminate superfluous bases, boundaries, and labels respectively.

An example will help. In the diagrams below, on the left there is a generalized equation $GE$ (suppose $E_y = (2, 4, 5)$, $E_{\bar y} = (3, 5, 7)$ and $E_z = (left(z), right(z))$ for all the other bases) and on the right Transport($GE$, $\preceq$) for a print $\preceq$ with $1^{tr} = 5$, $2^{tr} = 6$, $3^{tr} = 7$ and $5^{tr} = 8$. Note that $4^{tr}$ introduces a new boundary.

[Figure: the generalized equation $GE$ (left) and Transport($GE$, $\preceq$) (right), with the bases $x_c$, $\bar x_c$, $y$, $\bar y$, $u$, $\bar u$ drawn over the boundaries 1 to 8; on the right, the new boundary $4^{tr}$ lies between 7 and 8.]
The critical boundary of $GE$ is 3. Because $3 < 5 = rc$, $x_c$ and $\bar x_c$ are shrunk. Next, the transport bases $u$, $y$ are moved to their new positions. Note that when moving $y$ we lose information, e.g. that $y$ and $\bar y$ have a common segment, column $(3, 4)$. The algorithm keeps track of it by adding the boundary 4 to $E_c$ and $4^{tr}$ to $E_{\bar c}$, i.e. the new $E_c = (3, 4, 5)$ and $E_{\bar c} = (7, 4^{tr}, 8)$ (lines 7-8 in the code), and hence the segments continue to be equal through the 'boundary equation' which says that columns $(3, 4)$ and $(7, 4^{tr})$ are the same. Observe that $u$ produces no new boundary equation and the fixed bases $\bar u$, $\bar y$ remained untouched. Finally, we can delete the boundaries to the left of $cr = 3$ (line 14).

The next lemma follows easily from the definitions and the code.

Lemma 3. Transport($GE$, $\preceq$) is a generalized equation.

Note that a generalized equation has only finitely many different prints. So the following procedure returns a finite set of generalized equations.
Transf($GE$)
1. $S \leftarrow \emptyset$
2. $Print \leftarrow$ the set of all prints of $GE$
3. for each print $\preceq \in Print$ do
4.    $GE' \leftarrow$ Transport($GE$, $\preceq$)
5.    if $GE'$ is admissible then
6.       $S \leftarrow S \cup \{GE'\}$
7. return $S$

Lemma 4. The following assertions hold:
(1) If $GE$ has a strict unifier $S$ with index $I$ and exponent of periodicity $p$, then Transf($GE$) has an element $GE'$ which has a strict unifier $S'$ with index $I' < I$ and exponent of periodicity $p' \le p$.
(2) If an element of Transf($GE$) has a unifier, then $GE$ has a unifier.

Proof (sketch). Proof of (1). Because $S$ is a unifier, in particular $S(col(x_c)) = S(col(\bar x_c)) = u_1 \cdots u_s$, with $u_i \in V \cup C$. Hence, we have a function $f$ from the boundaries in $[lc, rc] \cup [\bar{lc}, \bar{rc}]$ to $\{1, 2, \ldots, s+1\}$ with $S(i, j) = u_{f(i)} \cdots u_{f(j)-1}$. Extend it by defining $f(i^{tr}) = f(i)$ for $i \in [lc, rc]$. Then the following order $\preceq$ in $BD' = BD \cup \{i^{tr} : i \in [lc, rc]\}$ is a print of $GE$:

For $i, j \in BD$, define $i \preceq j$ iff $i \le_{BD} j$.
For $lc \le_{BD} j \le_{BD} rc$, define $lc^{tr} \preceq j^{tr} \preceq rc^{tr}$.
For $i, j \in BD'$ and $lc^{tr} \preceq i, j \preceq rc^{tr}$, define $i \preceq j$ iff $f(i) \le f(j)$.

Define $GE' = $ Transport($GE$, $\preceq$). In order to construct the unifier $S'$ of $GE'$, for $i \in BD'$ define $S'(i, i+1) = u_{f(i)} \cdots u_{f(i+1)-1}$ if $lc^{tr} \preceq i \prec rc^{tr}$, and $S'(i, i+1) = S(i, i+1)$ otherwise. Then $S'$ is a unifier of $GE'$, and it is strict if $S$ is strict. Also, from $lc <_{BD} cr$ it follows that the index of $S'$ is smaller than the index of $S$. Also, the exponent of periodicity of $S'$ does not exceed that of $S$ since $S'(col(x))$ is a suffix of $S(col(x))$ for any base $x$ of $GE'$.

Proof of (2). Suppose $S'$ is a unifier of some $GE' \in$ Transf($GE$), and write $BD'$ for the boundaries of $GE'$. Let $i, j$ be consecutive in $BD$. Define $S(i, j)$ as $S'(i, j)$ if $i, j \in BD'$; as $S'(i^{tr}, j^{tr})$ if $i^{tr}, j^{tr} \in BD'$; as $c$ if there is a constant base $(c, (i, j))$ in $GE$; and finally as the empty word if there is no base $(c, (i, j))$ in $GE$ and $i$ or $j$ is not in $BD'$. It can be checked that $S$ is a unifier of $GE$.
5 The Final Algorithm
Given a word equation $E$, define its associated Makanin's tree, $T(E)$, recursively as follows:

The root of $T(E)$ is $E$.
The children of $E$ are Gen($E$) (see Lemma 2).
For each node $GE$ (not the root), the set of its children is Transf($GE$).

Theorem 1. Let $E$ be a word equation. Then $E$ has a unifier if and only if $T(E)$ has a node labelled with a solved generalized equation.

Proof. Suppose $E$ has a unifier. By Lemma 2 there is an element of Gen($E$) which has a strict unifier with some index $I_E$. By induction on the depth of the node, using Lemma 4, it can be proven that $T(E)$ has a branch in which each node is labelled by a strictly-unifiable generalized equation and the index decreases for every child. Since the index is non-negative, the branch is finite; hence there must be a node $GE$ to which Transf does not apply. The only possibility is that $GE$ is solved. For the other direction, apply induction again, using Lemmas 2 and 4.

Theorem 1 immediately gives a semi-decision procedure: examine all nodes of $T(E)$ to find out if $E$ has a solution. But in general, the tree could be infinite. Here comes the kernel of Makanin's algorithm: there exists a finite number $K_E$ that bounds the number of nodes we have to visit.

Makanin($E$)
1. $K \leftarrow K_E$    [bound of the search]
2. $S \leftarrow$ Gen($E$)
3. Search($S$, $K$)

Search($S$, $K$)
1. if all elements of $S$ are marked then
2.    return FAILURE
3. else
4.    pick a non-marked $GE = (C, X, BD, BS) \in S$
5.    if $GE$ is solved then
6.       return SUCCESS
7.    else if $|BD| > K$ then
8.       mark $GE$; Search($S$, $K$)
9.    else
10.      $S \leftarrow S \cup$ Transf($GE$)
11.      mark $GE$; Search($S$, $K$)
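The control structure of Search can be sketched independently of generalized equations. The Python fragment below is only an illustration under stated assumptions: the callbacks `is_solved`, `size` and `children` are hypothetical stand-ins for "is solved", $|BD|$ and Transf, and the recursion of the pseudocode is replaced by an explicit worklist. It is not an implementation of Gen, Transf or Transport.

```python
from typing import Callable, Hashable, Iterable


def search(start: Iterable[Hashable],
           is_solved: Callable[[Hashable], bool],
           size: Callable[[Hashable], int],
           children: Callable[[Hashable], Iterable[Hashable]],
           bound: int) -> bool:
    """Worklist version of Search(S, K): True is SUCCESS, False is FAILURE."""
    seen = set(start)       # plays the role of the set S
    unmarked = list(seen)   # elements of S that are not marked yet
    while unmarked:
        node = unmarked.pop()           # pick a non-marked element and mark it
        if is_solved(node):             # line 5 of Search
            return True
        if size(node) > bound:          # line 7: too many boundaries, do not expand
            continue
        for child in children(node):    # line 10: S <- S union Transf(GE)
            if child not in seen:
                seen.add(child)
                unmarked.append(child)
    return False                        # every element of S is marked


# Toy instance (hypothetical): nodes are integers, a node n has children n + 1 and 2n.
toy_children = lambda n: [n + 1, 2 * n]
print(search([1], lambda n: n == 12, lambda n: n, toy_children, bound=10))
print(search([1], lambda n: n == 23, lambda n: n, toy_children, bound=10))
```

In the toy instance the bound plays the part of $K_E$: nodes larger than the bound are still tested for being solved, but they are never expanded, which is what makes the loop stop even though the underlying tree is infinite.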
6 Correctness and Termination
From now on, let us fix a word equation $E$, and let $T(E)$ be its associated Makanin tree. All generalized equations will be nodes of $T(E)$. For $GE = (C, X, BD, BS)$ with parameters $M = |BD|$, $N = |BS|$, $V = 2|X|$ (the number of variable bases) write $GE(M, N, V)$. The cornerstones of Makanin's algorithm are the next two theorems. The first is based on a deep result in word combinatorics, stated by Bulitko in 1970, whose bound was improved recently by Koscielski and Pacholski [9].
Theorem 2. If a word equation $E$ is unifiable, then it has a unifier with exponent of periodicity $p_E \le 3|E|\,2^{1.07|E|}$.

A simple proof (for a weaker bound) which gives a good intuition of why, if $E$ is unifiable, there must be unifiers of this kind can be found in [8]. The next theorem is due to Makanin.

Theorem 3. If $GE(M, N, V)$ is a node of $T(E)$, then the exponent of periodicity of all its strict unifiers exceeds $\frac{2\log_V(M/N - 2)}{V^3}$.

The proof of this result is tricky, and the rest of the paper is devoted to it. Before proving it, let us show why Makanin's algorithm works.

Theorem 4. Makanin is correct and terminates.

Proof. Let $E$ be a word equation. Termination of Makanin($E$) reduces to showing that Search(Gen($E$), $K_E$) terminates. Define $p = 3|E|\,2^{1.07|E|}$ and $K_E = 2^{4p|E|^3 \log 2|E| + \log 2|E|} + 4|E|$. Now observe that there are only finitely many generalized equations $GE(M, N)$ with fixed parameters $M$, $N$, and that at every stage in Search, every element $GE(M, N) \in S$ has $M \le K_E$ (line 7 of Search) and $N \le 2|E|$ (Lemma 2(3) and line 13 in Transport). Hence, because in each loop one more element of $S$ is marked, Search will eventually stop.

Makanin is correct. If $E$ has no unifier, then by Thm. 1 there is no solved node in $T(E)$. Hence Search will never reach line 6. Therefore eventually all nodes will be marked and Search will output FAILURE. Now suppose that $E$ has a unifier. Then by Thm. 2, it has a unifier with exponent of periodicity at most $p$. From the proof of Thm. 1 it follows that there is a branch in $T(E)$ ending in a node labelled with a solved generalized equation $SGE$. By Lemmas 2(1) and 4(1), it follows that each node $GE(M, N, V)$ in the branch has a strict unifier with exponent of periodicity $p' \le p$. Also from Thm. 3 we have $\frac{2\log_V(M/N - 2)}{V^3} < p'$. So we can conclude, using $V \le N \le 2|E|$, that $M \le 2^{0.5pV^3 \log V + \log N} + 2N \le K_E$. Hence all the nodes in the branch will eventually be in $S$, so Search will visit $SGE$, check that it is solved (line 5) and return SUCCESS.

Now let us prove Thm. 3.
The general lines are as follows: (i) From each $GE \in T(E)$ we can obtain (using the relations generated by the boundary sequences of the bases) certain chains of words. This is Prop. 2, whose proof is long and very technical; (ii) By a counting argument it follows that a large number of boundaries in $GE$ produces long chains of words. This is Prop. 3; (iii) Combine (ii), using Lemma 5, with a combinatorics result (Prop. 1) establishing a relation between long chains of words and high exponent of periodicity.

Let us begin with the formal definitions of those chains of words.

Definition 8. A domino tower is a sequence of words $B_1C_1, \ldots, B_kC_k$ ($B_i$ and $C_i$ non-empty) such that for all $1 \le i < k$:
1. There are (possibly empty) words $S_i$ such that $B_{i+1} = S_i B_i$.
2. There are (possibly empty) words $R_i$, $T_i$ such that $C_i R_i = C_{i+1} T_i$.
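The two conditions of Definition 8 can be checked mechanically. The sketch below is only an illustration; the function name and the representation of a tower as a list of $(B_i, C_i)$ string pairs are assumptions made for this example. Condition 1 says that $B_i$ is a suffix of $B_{i+1}$, and condition 2 holds exactly when one of $C_i$, $C_{i+1}$ is a prefix of the other.

```python
def is_domino_tower(pairs):
    """Check Definition 8 for a list of (B_i, C_i) pairs of strings."""
    # Every B_i and C_i must be non-empty.
    if any(not b or not c for b, c in pairs):
        return False
    for (b, c), (b_next, c_next) in zip(pairs, pairs[1:]):
        # Condition 1: B_{i+1} = S_i B_i, i.e. B_i is a suffix of B_{i+1}.
        if not b_next.endswith(b):
            return False
        # Condition 2: C_i R_i = C_{i+1} T_i for some R_i, T_i,
        # i.e. one of C_i and C_{i+1} is a prefix of the other.
        if not (c.startswith(c_next) or c_next.startswith(c)):
            return False
    return True


tower = [("a", "bab"), ("ba", "ba"), ("aba", "b")]
print(is_domino_tower(tower), len(tower))
```

The number of pairs in the list is the quantity that the text calls the height of the tower.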
[Figure: two consecutive levels $B_iC_i$ and $B_{i+1}C_{i+1}$ of a domino tower, showing the words $S_i$, $R_i$ and $T_i$.]
The length of the sequence is called the height of the domino tower.

The following result (whose proof can be found in [14]) establishes a relation between the length of a domino tower and the exponent of periodicity of some of its words.

Proposition 1. Let $\mathcal{X} = \{X_1, \ldots, X_N\}$ be a set of non-empty words. Suppose the sequence of words $B_1C_1, \ldots, B_kC_k$ is a domino tower of height $k$ and each $B_iC_i \in \mathcal{X}$. If for all $i$, $|B_{i+m}| > |B_i|$, then some word $B_tC_t$ has the form $B_tC_t = P^sQ$, where $P$ is non-empty and $s + 1 \ge \frac{k}{mN^2}$.

So we need to generate long domino towers whose building blocks are elements of $\mathcal{X}$. In this way, one variable will have a large exponent of periodicity.

Definition 9. Let $GE$ be a generalized equation, $x$ a variable base of $GE$.
(1) A sub-base of $x$, $S_x$, is a column of the form $(left(x), i)$ with $i \in E_x$. If $left(x) = i$ the sub-base is called empty.
(2) Each sub-base $S_x$ has its dual (the corresponding column in the dual variable), denoted $\bar S_x$ or $S_{\bar x}$. This pair is called a boundary equation and denoted $S_x \sim \bar S_x$.
Note that if $U$ is a unifier of $GE$, then $U(S) = U(\bar S)$.

$GE'$ will denote Transport($GE$, $\preceq$). Also $left'$ is the corresponding function in $GE'$, etc. So, if $S = (left(x), i)_x$ is a sub-base of $GE$, then $S'$ will denote its 'image' in $GE'$, i.e. $(left'(x), i^{tr})_x$ if $x$ is transport and $(left'(x), i)_x$ otherwise. In case $S'$ is empty or it is not a sub-base of $GE'$ (i.e. $x$ becomes empty in $GE'$, or $x = x_c$ and $i < cr$ in $GE$) then $S$ is called a terminal sub-base.

Definition 10. Let $S_1, S_2, \ldots, S_n$ be sub-bases of $GE$.
(1) Let S1 = (b1 i) and S2 = (b2 i) be sub-bases with the same second boundary. S1 is a su x of S2 if b2 b1 . We write S1 S2 . (2) A (mono one) su x chain in GE is a sequence S1 S2 Sn of sub-bases with S1
S2
S2
S3
S3
Sn−1
Sn−1
Sn
We will denote this chain by S1 Sn . (3) A convex su x chain is a sequence S1 St Sn such that S1 St and St St and St Sn . We write S1 Sn . Note that when t = 1 or t = n we have chains as in 1. (i.e. convex chains eneralize monotone chains.) The nex lemma (whose proof is an easy check) shows he rela ion be ween su x chains and domino owers.
Lemma 5. Let S1 Sk be a monotone su x chain of GE and U a uni er of GE. Suppose Sj is a sub-base of x j . Then n. (1) U (Sj ) is a su x of U (Sj+1 ) for all j = 1 (2) De ne Bj = U (Sj ) and Cj such that U (col(x j )) = Bj Cj . Then the sequence Bk Ck forms a domino tower of hei ht k. of words B1 C1 The nex lemmas have long (bu s raigh forward) proofs by exhaus ion of all possible cases of he bases involved ( ranspor , xed, carrier, i s dual). We will do one case o give he flavor of he echnique. Lemma 6. Let Sx Sy in GE and Sx Sy be non-terminal. Then 1. If y is the carrier or its dual, then Sx0 Sy0 or Sx0 Sy0 in GE 0 . 2. If y is neither the carrier nor its dual, then Sx0 Sy0 in GE 0 . Proof of 1. Le Sx = (b1 i)x (b2 i)y = Sy . b1 . Suppose rs ha x is ranspor , i.e. (a) y is he carrier. So lc = b2 tr i ) (crtr itr ) (cr i) (cr i) = Sy0 in GE 0 . b1 cr. I holds ha Sx0 = (btr 1 Now, suppose x is xed, i.e. cr b1 . We have Sx0 = (b1 i) (cr i) = Sy0 in GE 0 . No e ha his also works if x is he dual of he carrier. b1 . No e (b) Now, assume y is he dual of he carrier, ha is lc = b2 ha x canno be he carrier now. So le us suppose x is nei her he carrier nor i s dual. Because he dual of he carrier is xed (always), x mus be xed oo b1 ). Hence we have S 0 = (b1 i) (crtr i) = Sy0 or Sx0 Sy0 , (cr lc = b2 tr tr b1 . depending on whe her b1 c or c Lemma 7. Let Sx
Sy
Sy
Sz in GE and Sx be non-terminal.
(1) If z is the carrier or its dual and y is the carrier or its dual, then Sz is non-terminal and Sx0 Sz0 in GE 0 . (2) If z is the carrier or its dual, Sz is non-terminal, and y is neither the carrier nor its dual, then Sx0 Sz0 . (3) If z is neither the carrier nor its dual, Sz is non-terminal, then Sx0 Sz0 . Lemma 8. Suppose Sx
Sz in GE, and Sx Sz are non-terminal. Then
(1) If z is the carrier or its dual, Sx0 Sz0 in GE 0 . (2) If z is neither the carrier nor its dual, Sx0 Sz0 in GE 0 . Proof. A simul aneous induc ion for (1) and (2) on he leng h of he chain. The base cases are lemmas 6 and 7. Lemma 9. (convex chains) Let Sx and Sz be sub-bases of GE which are nonterminal. Suppose there is a convex chain from Sx to Sz in GE. Then there is a convex chain from Sx0 to Sz0 in GE 0 . St Proof. Induc ion on he leng h of S1 St poin t and he possibles cases for he chains S1
Sn . Consider he urning St and Sn St .
All he preceding work was done in order o prove he nex lemma. Ex end he no a ion S B o allow B o be a cons an base, i.e. S = (b i) (l r) = B i i = r and b l. Similarly for S B.
Proposition 2. For each non-empty sub-base S of GE T (E), there is a conSn B with B = col(bs) for some base bs of GE. vex chain S = S1 Proof. Induc ion on he dep h of s ruc ure of T (E). For elemen s GE Gen(E), he only sub-bases are of he form (left(x) ri ht(x)) (Lemma 2(2)), so he s a emen is rivially rue. Now suppose he s a emen is valid for GE. We will prove i for GE 0 = Transport(GE ). Le S 0 be a non-emp y sub-base of GE 0 . I is an image of a sub-base S in Sn B = col(bs) in GE GE, so by hypo hesis here is a convex chain S = S1 and S is non- erminal because S 0 is i s image. If Sn is non- erminal, applying Sn0 B 0 is convex and B 0 = col0 (bs). Lemma 9 i follows ha S10 So suppose Sn is erminal. Le t (1 t n) be he smalles index such ha Sn are all erminal sub-bases. So St−1 is non- erminal, and by Lemma St 0 in GE 0 . Le us show ha i can be 9 here is a convex chain from S10 o St−1 comple ed o end wi h col(bs) for some base bs. Deno e St−1 = (ly j)y and St = (lz j)z . 0 = (lytr j tr )y (j tr j tr )z = If z is nei her he carrier nor i s dual, hen St−1 col0 (z) in GE 0 if y is ranspor . If y is xed hen cr ly , hence St−1 St and 0 = (ly j)y (cr j) so S1 St−1 because he chain is convex. Then St−1 tr tr tr tr 0 (cr j ) (j j )z in GE . If z is he carrier hen j cr (St is erminal), so y is ranspor . Now, if 0 = (lytr j tr )y B 0 . If St+1 = Su St+1 = B cons an , i mus be xed, so St−1 wi h u a variable, u can nei her be he carrier nor i s dual (because St+1 is also 0 = (lytr j tr )y (j tr j tr )u in GE 0 . If z is he dual of he erminal), hence St−1 carrier he analysis is similar. A strict convex chain is one in which each sub-base appears jus once. (No e ha a sub-base is charac erized by i s column and i s base.) Lemma 10. Let S0 = (b0 i) be a xed sub-base of GE(M N V ). The number of di erent sub-bases S of GE such that there is a strict convex chain S = Sj = S0 of len th j k is less than V k . S1 Sj wi h S S +1 or S S +1 for Proof. Consider he se of chains S1 each i. 
Clearly it contains the set of strict convex chains. For $j = 1$ note that if $(b, i)$ is related to $(b_0, i)$ in either of these two ways, $b$ must be a left boundary of a variable base, and there are less than $V$ such boundaries different from $b_0$. The general case follows by simple combinatorics, i.e. there are no more than $V^k$ chains of that type.

Proposition 3. Let $GE(M, N, V) \in T(E)$. Then there is a strict convex chain of length bigger than $\log_V(M/N - 2)$ in $GE$.

Proof. A sub-base is of the form $(b, i)_x$ with $i \in E_x$. There are at least $(M - 2N)$ different non-empty sub-bases (the number of boundaries, see line 14 of Transport, minus the left and right boundaries of each base). By Prop. 2, for each such sub-base $S$ there is a convex chain from $S = S_1$ through $S_n$ to $B = col(bs)$ for some base $bs$. But there are $N$ bases in $GE$, hence there is a base $bs_0$ such that at least $(M - 2N)/N$ sub-bases have a convex chain to $bs_0$ (which is strict because all sub-bases were different). Now, by Lemma 10, $V^k > (M - 2N)/N$, hence $k > \log_V(M/N - 2)$.

Proof of Theorem 3. By Prop. 3, for $n = \log_V(M/N - 2)$ there is a strict convex chain $S_1, \ldots, S_t, \ldots, S_n$. Hence the part from $S_1$ to $S_t$ or the part from $S_n$ to $S_t$ is a (monotone) chain of length $k > n/2$. Let $U$ be a strict unifier of $GE$ and consider the domino tower $B_1C_1, \ldots, B_kC_k$ of height $k$ associated to the chain as in Lemma 5, all of whose words $B_jC_j = U(col(x_j))$ are in $\{U(col(x)) : x \in X\}$. There are $V$ variable bases, so, for every $i$, two sub-bases of the same variable must appear in $S_i, \ldots, S_{i+V}$. Now because all sub-bases are different (strict chain), $|B_{i+V}| = |U(S_{i+V})| > |U(S_i)| = |B_i|$. We conclude from Prop. 1 that there is a word $B_jC_j$ of the form $P^sQ$ with $P$ non-empty and $s + 1 > \frac{k}{V|X|^2} > \frac{2\log_V(M/N - 2)}{V^3}$.
7 Final Remarks
There are three key points in estimating the time complexity of Makanin. The first point is the bound on $p_E$, the exponent of periodicity of word equations. Thm. 2 is almost optimal: it is known that $p_E$ can be as large as $2^{0.29|E|}$ (see [9]). The second point is the bound on $K_E$, the depth of the search. Jaffar's estimate [5] was of the order $16N^{15}\,p(6|E|^2)^2\,(2N)^{32p(6|E|^2)N^5}$. We improved it to $2^{0.5p(|E|)V^3 \log V + \log N}$ where $p(x) = 3x2^{1.07x}$ and $V \le N \le 2|E|$. The third point is the bound on Search. A rough bound is the number of all different $GE(M, N)$ with $N \le 2|E|$ and $M \le K_E$. There seems to be much room for improvement on these last two points. Also a finer analysis of Transport would imply a clearer picture of the interplay among prints, associated Diophantine equations, and the kind of search needed. Rounding, the current time complexity bound on Makanin is triple exponential in $|E|$.

Finally, let us say that it is easy to add two lines to Transport in order to get explicit solutions: we need an extra variable $U$ (a list of pairs of boundaries) for each pair of duals to keep track of the value of the original variable. The proof of Lemma 4 tells how they have to be updated.
References
1. H. Abdulrab, 1987. Resolution d'equations sur les mots: etude et implementation LISP de l'algorithme de Makanin, Ph.D. dissertation, Univ. Rouen, Rouen.
2. A.V. Aho, 1990. Algorithms for finding patterns in strings, in Handbook of Theoretical Computer Science (J. van Leeuwen, ed.), Elsevier Sc. Pub., pp. 256-300.
3. F. Baader, J.H. Siekmann, 1994. Unification Theory, in Handbook of Logic in Artif. Int. and Logic Prog., Vol. 2, (D. Gabbay et al, ed.), Clarendon Press, Oxford.
4. V.K. Bulitko, 1970. Equations and Inequalities in a Free Group and a Free Semigroup, Tul. Gos. Ped. Inst. Ucen. Zap. Mat. Kafedr. Vyp. 2 Geometr. i Algebra (1970), 242-252.
5. J. Jaffar, 1990. Minimal and Complete Word Unification, Journal ACM, Vol. 37, No. 1, January 1990, pp. 47-85.
6. J.I. Hmelevskii, 1971. Equations in free semigroups, Trudy Mat. Inst. Steklov. 107 (1971). English translation: Proc. Steklov Inst. Math. 107 (1971).
7. A. Lentin, 1972. Equations in Free Monoids, in Automata Languages and Programming (M. Nivat ed.), North Holland, 67-85.
8. G.S. Makanin, 1977. The problem of solvability of equations in a free semigroup, Math. USSR Sbornik 32 (1977) (2), 129-198.
9. A. Koscielski, L. Pacholski, 1996. Complexity of Makanin's Algorithm, Journal of the ACM, Vol. 43, July 1996, pp. 670-684.
10. J.P. Pecuchet, 1981. Equations avec constantes et algorithme de Makanin, These de doctorat, Laboratoire d'informatique, Rouen.
11. G.D. Plotkin, 1972. Building-in equational theories, Mach. Int. 7, 1972, pp. 73-90.
12. J. Siekmann, 1972. A modification of Robinson's Unification Procedure, M.Sc. Thesis.
13. K.U. Schulz, 1993. Word Unification and Transformation of Generalized Equations, Journal of Automated Reasoning 11: 149-184, 1993.
14. K.U. Schulz, 1990. Makanin's Algorithm for Word Equations: two improvements and a generalization, in LNCS 572, pp. 85-150.
Spelling Approximate Repeated or Common Motifs Using a Suffix Tree

Marie-France Sagot

Service d'Informatique Scientifique, Institut Pasteur, 28, rue du Dr. Roux - Paris
and Institut Gaspard Monge, Université de Marne la Vallée, 2, rue de la Butte Verte - Noisy le Grand
Abstract. We present in this paper two algorithms. The first one extracts repeated motifs from a sequence defined over an alphabet $\Sigma$. For instance, $\Sigma$ may be equal to {A, C, G, T} and the sequence represents an encoding of a DNA macromolecule. The motifs searched correspond to words over the same alphabet which occur a minimum number $q$ of times in the sequence with at most $e$ mismatches each time ($q$ is called the quorum constraint). The second algorithm extracts common motifs from a set of $N \ge 2$ sequences. In this case, the motifs must occur, again with at most $e$ mismatches, in $1 \le q \le N$ distinct sequences of the set. In both cases, the words representing the motifs may never be present exactly in the sequences. We therefore speak of the motifs, repeated in a sequence or common to a set of them, as being external objects and denote them by the expression valid models if they verify the quorum constraint $q$. The approach we introduce here for finding all valid models corresponding to either repeated or common motifs starts by building a suffix tree of the sequence(s) and then, after some further preprocessing, uses this tree to simply spell the models. Assuming an alphabet of fixed size, the total time needed is $O(nN^2\,\mathcal{V}(e, k))$ using $O(nN^2/w)$ space, where $n$ is the (average) length of the sequence(s), $k$ is the length of the models sought or is the length of the longest possible valid models, $w$ is the size of a word machine and $\mathcal{V}(e, k)$ is the number of words of length $k$ at a Hamming distance at most $e$ from another $k$-length word. $\mathcal{V}(e, k)$ may be majored by $k^e|\Sigma|^e$. This improves on an algorithm by Waterman [23]. It is also a better time bound than our previous approach [15] for the common motifs problem whenever $N < k$, and a better space bound when $N < wk$. It is a better time and space bound in absolute for the repeated motifs problem. The complexities obtained in this second case are $O(n\,\mathcal{V}(e, k))$ and $O(n)$ respectively. Finally, we suggest how to extend these algorithms to deal with gaps.
1 Introduction
We present in this paper two algorithms. The first one extracts repeated motifs from a sequence, typically of DNA, that is, a sequence defined over $\Sigma$ = {A, C, G, T}. The motifs searched correspond to words over the same alphabet which occur a minimum number $q$ of times in the sequence with at most $e$ mismatches each time. The second algorithm extracts common motifs from a set of $N \ge 2$ sequences. In this last case, the motifs must occur, again with at most $e$ mismatches, in $1 \le q \le N$ distinct sequences of the set. In both cases, the words representing the motifs may never be present exactly in the sequences. We therefore speak of the motifs, repeated in a sequence or common to a set of them, as being external objects and denote them by the term models. We also call $q$ the quorum constraint such models have to verify to be considered valid.

C. L. Lucchesi, A. V. Moura (Eds.): LATIN'98, LNCS 1380, pp. 374-390, 1998. © Springer-Verlag Berlin Heidelberg 1998

Objects such as these were first introduced in the literature by Waterman (under the name of consensus patterns) [7] [21] [22] [23] and later employed by ourselves [15] with the aim of solving the common motifs problem.

The main drawback of Waterman's approach is that it obtains the models either by generating all words of $\Sigma^k$ for some $k$ and then looking for them in the sequences, or by looking only for those models, also of length $k$, that have a chance of being valid, but this requires more space. In the first case, the amount of memory necessary is $O(nN)$ where $n$ is the average length of the sequences, however the time complexity is $O(nNk|\Sigma|^k)$. In the second case, models of length $k$ which have a chance of being valid are those in the $e$-neighborhood [13] of the words of same length present in the sequences of the set. A model $m$ of length $k$ is said to be in the $e$-neighborhood of a word $u$ if the (in this case Hamming) distance from $m$ to $u$ is no more than $e$ (i.e. we need at most $e$ substitutions to obtain $m$ from $u$). The $e$-neighborhood of a word $u$ of length $k$ contains $\sum_{j=0}^{e} \binom{k}{j} (|\Sigma| - 1)^j \le k^e|\Sigma|^e$ elements; we denote this number by $\mathcal{V}(e, k)$ ($\mathcal{V}$ for Vicinity).
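The size of the $e$-neighborhood can be checked on small cases by brute force. The sketch below is an illustration only; the function names are chosen for this example and do not come from the paper. It counts the neighborhood with the closed formula and also enumerates it explicitly by choosing $j \le e$ positions and substituting a different symbol at each of them.

```python
from itertools import combinations, product
from math import comb


def vicinity_size(e, k, sigma):
    """Closed formula: sum over j <= e of C(k, j) * (sigma - 1)^j."""
    return sum(comb(k, j) * (sigma - 1) ** j for j in range(e + 1))


def neighborhood(u, e, alphabet):
    """All words at Hamming distance at most e from the word u."""
    words = set()
    for j in range(e + 1):
        # Choose the j positions that will differ from u ...
        for positions in combinations(range(len(u)), j):
            # ... and, at each of them, a symbol different from the one in u.
            choices = [[a for a in alphabet if a != u[p]] for p in positions]
            for letters in product(*choices):
                w = list(u)
                for p, a in zip(positions, letters):
                    w[p] = a
                words.add("".join(w))
    return words


print(len(neighborhood("ACGT", 2, "ACGT")), vicinity_size(2, 4, 4))
```

Explicit enumeration is only practical for small $k$ and $e$; the point of the formula is that the count depends on $k$, $e$ and $|\Sigma|$ but not on the word $u$ itself.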
The time requirement for the second approach to Waterman's algorithm can be reduced to $O(nNk\,\mathcal{V}(e, k))$ but this now requires $O(nN|\Sigma|^k)$ space (as given in [23] using a window of length $n$) since they have to remember which models have already been generated. In both cases, the method is therefore limited to small values of $k$ (typically 6). It is also suitable for small alphabets only.

One could improve Waterman's approach by using more efficient techniques of pattern matching with $e$ mismatches against a text like those by Baeza-Yates [1], Manber [24] [25] or Myers [14] that are based on bit-parallelism. This would reduce the time complexity to $O(nN|\Sigma|^k)$ or $O(nN\,\mathcal{V}(e, k))$ but one would then still have to deal with a multiplicative factor of $|\Sigma|^k$ in the time or space complexity of the algorithm that would make such approaches prohibitive for big alphabets (if one deals with proteins for instance instead of DNA sequences) and/or big values of $k$ (as happens with some DNA signals such as the CRP binding site which is believed to be 22 bases long [10]).

Our own algorithm for the common motifs problem [15] generates the models by increasing lengths by simulating the traversal of a lexicographic tree of all possible objects over $\Sigma^+$ where at each node $x$ are preserved the occurrences in $s$ of the model $m$ labeling the path from the root to $x$. The traversal is kept efficient because the tree may be pruned at the branch leading to a model $m$ whenever $m$ does not verify the quorum constraint anymore. Models $m'$ having $m$ as prefix are thus never considered. The counterpart of this approach as against Waterman's is that, since models are built by increasing lengths, a multiplicative factor of $k$ is introduced in the time and space complexities, where $k$ is the length of the models looked for. However, neither complexity depends anymore on $|\Sigma|^k$. The search for all models of length $k$ having occurrences at a Hamming distance at most $e$ in at least $q$ distinct sequences of the set of $N$ then takes $O(nNk\,\mathcal{V}(e, k))$ time and $O(nNk)$ space as indicated in the paper. A more space demanding version of the algorithm allows one to perform the same search in $O(nN \log k\,\mathcal{V}(e, k))$ time but with $O(nN\,\mathcal{V}(e, k))$ space. Let us observe that, in practice, $(k|\Sigma|)^e$ is much lower than $|\Sigma|^k$, since $e/k$ is approximately 10-15%. Consider for instance the following not untypical values of $k = 16$, $e = 2$ and $|\Sigma| = 4$. We then have $(k|\Sigma|)^e / |\Sigma|^k = 4^{-10}$. In both versions of the algorithm, as the process of model construction is not based on the generation of all those of a given length $k$, we have the further advantage of not having to fix the value of $k$ beforehand as is the case with Waterman. We may therefore look for valid models of maximum length still verifying the quorum, or for all those between lengths $k_1$ and $k_2$ for $1 \le k_1 \le k_2$ (using the first version) while remaining within the same bounds.

When either Waterman's algorithm or our own is applied to the repeated motifs problem, both time and space complexities remain the same except that $N$ is now equal to 1.

The new approach we introduce here starts by building a suffix tree of the sequences and then, after some further preprocessing, uses this tree to simply spell the valid models. It is therefore this tree that is now traversed to obtain such models.
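The arithmetic behind the example values $k = 16$, $e = 2$ and $|\Sigma| = 4$ can be verified exactly with rational numbers. The helper name below is hypothetical and serves only this check.

```python
from fractions import Fraction


def bound_ratio(k, e, sigma):
    """Exact value of (k * sigma)^e / sigma^k."""
    return Fraction((k * sigma) ** e, sigma ** k)


# (16 * 4)^2 = 4^6 and 4^6 / 4^16 = 4^(-10).
ratio = bound_ratio(16, 2, 4)
print(ratio == Fraction(1, 4 ** 10))
```

With these values the neighborhood-based factor is about a millionth of the factor obtained by generating every word of length $k$, which is the comparison the text is making.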
Assuming an alphabet of fixed size, the tree can be constructed in $O(nN)$ time employing $O(nN)$ space and the preprocessing takes time $O(nN^2/w)$, where $w$ is the size of a word machine, and space $O(nN^2/w)$. The time needed for the model spelling operation itself is $O(nN^2\,\mathcal{V}(e, k))$ with $O(nN)$ additional space required. This is a better time bound for the common motifs problem whenever $N < k$, and also a better space bound when $N/w < k$. It is a better time and space bound in absolute for the repeated motifs problem since the complexities then become $O(n\,\mathcal{V}(e, k))$ and $O(n)$ respectively. Observe that, in this second case, if no errors are allowed, we obtain the same time and space complexities, in $O(n)$, of the best algorithms for identifying repeated motifs [3] [5]. This is not true for the common motifs problem where we have an $O(nN^2)$ time bound whereas Hui obtains an $O(nN)$ bound [9]. His approach should thus be preferred when $e = 0$. Since both algorithms share similar structures, we show that only a minor modification to ours is needed so as to be able to switch to Hui's when $e$ is zero (which seldom happens when one is dealing with biological sequences).

Suffix trees for approximate searches (allowing mismatches and gaps) have been used before, notably by Ukkonen [20] and Cobbs [4]. Although in both cases what is searched is known beforehand, making of it a quite different problem (of pattern matching as against pattern extracting), their approaches and ours share some similarity, if only because the same basic data structure (a suffix tree) and the same technique (a form of dynamic programming) are used. However, a much simpler traversal of the tree is required here. This is obviously the case when mismatches only are allowed, but is true also should gaps be permitted. Indeed, we quickly sketch an extension of the algorithm for the repeated motifs problem (the common motifs problem would be handled in a similar way) that deals with gaps by following the same philosophy. The time complexity is then $O(n\,\mathcal{N}(e, k))$ where $\mathcal{N}(e, k)$ is the number of models $m$ at a (this time) Levenshtein distance at most $e$ from a word of length $k$. We therefore avoid introducing an additional factor of $k$ in the complexity as would be the case should we adopt Cobbs' method, but may bring in instead a factor in, at most, 2e in relation to this approach. Since in general we have 2e < k, we nevertheless obtain a better bound besides presenting a simpler algorithm.

This paper is organized as follows. We give in section 2 some basic definitions and state the two problems we wish to solve. We then discuss in section 3 the solution to the repeated motifs problem first. We start by recalling the suffix tree data structure and introduce the further preprocessing of it we have to perform before using the tree to obtain the models. We then show how to use the tree to spell the models and discuss the time and space complexities obtained. In section 4, we present the modifications we have to make to the previous algorithm to treat the common motifs problem. These concern both the preprocessing of the suffix tree before the spelling operation and the spelling itself. We end by suggesting in section 5 how to extend the first of the two algorithms to be able to deal with gaps as well as mismatches.
2 Basic Definitions and Statement of the Problems
In what follows, we denote by $s$ a (unique) sequence where repeated motifs are searched and by $s_i$, $1 \le i \le N$ for some $N \ge 2$, a set of sequences from which we want to extract common motifs. In the case of DNA sequences, $s$ and $s_i$ are therefore elements of $\Sigma^+$ where $\Sigma$ = {A, C, G, T}. The empty word is denoted by $\lambda$. We call $u$ a word in $s$, or $s_i$, if the sequence is equal to $xuy$ with $x, y \in \Sigma^*$. A model $m$ is also an element of $\Sigma^+$. It is said to occur (or to be present) in a sequence $s$ if there is at least one word $u$ in $s$ of same length as $m$ and such that $Hamming(m, u) \le e$, where $Hamming(m, u)$ is the Hamming distance between $m$ and $u$ (it is the minimum number of substitutions needed to transform $m$ into $u$) and $e$ is a non-negative integer.

The problems we wish to solve may then be stated as follows:

The Repeated Motifs Problem. Given a sequence $s$ and two integers $e \ge 0$ and $q \ge 2$, find all models $m$ such that $m$ is present at least $q$ times in $s$ (some of the occurrences of $m$ may overlap);

The Common Motifs Problem. Given a set of $N$ sequences $s_i$ (for $1 \le i \le N$) and two integers $e \ge 0$ and $2 \le q \le N$, find all models $m$ such that $m$ is present in at least $q$ distinct sequences of the set.

In both cases, models satisfying the above conditions are called valid. We are therefore looking for all valid models that correspond either to repeated or common motifs depending on the problem.

In [15], we proposed an algorithm for solving the common motifs problem (actually, a little more than that since we also deal with gaps). It is easy to modify it so that it can handle the repeated motifs problem as well. However, the approach described there did not try to take advantage of the structure of the sequence (or sequences) in order to obtain the valid models, as was done in [9] but for identically repeated motifs only (no mismatches allowed) or in [11] but for fixed-length motifs that had to appear at least once exactly in the sequence. In the present paper, this underlying structure is exploited to obtain a new model building algorithm dealing with a Hamming distance that has a better complexity in absolute for the first problem stated above, and a better complexity over some range of parameters, which we make explicit later on, in the case of the second problem. We recall that the models need never be present exactly in the sequence(s).

We start by looking at this new way of solving the problem when repeated motifs are sought.
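The two problem statements above can be turned into a brute-force reference procedure, which is useful mainly for testing faster algorithms on tiny inputs. The sketch below is an illustration under explicit assumptions: it fixes the model length $k$ in advance, enumerates every word of $\Sigma^k$ (exactly the exponential enumeration the paper sets out to avoid), and uses function names invented for this example. It is not the suffix tree algorithm of this paper.

```python
from itertools import product


def hamming(a, b):
    """Hamming distance between two words of the same length."""
    return sum(x != y for x, y in zip(a, b))


def occurrences(model, s, e):
    """Start positions (0-based) of words of s within distance e of model."""
    k = len(model)
    return [i for i in range(len(s) - k + 1) if hamming(model, s[i:i + k]) <= e]


def repeated_models(s, k, e, q, alphabet="ACGT"):
    """Models of length k present at least q times in s (overlaps allowed)."""
    models = ("".join(p) for p in product(alphabet, repeat=k))
    return [m for m in models if len(occurrences(m, s, e)) >= q]


def common_models(seqs, k, e, q, alphabet="ACGT"):
    """Models of length k present in at least q distinct sequences."""
    models = ("".join(p) for p in product(alphabet, repeat=k))
    return [m for m in models
            if sum(1 for s in seqs if occurrences(m, s, e)) >= q]


print(repeated_models("ACGTACGA", 3, 0, 2))
print("ACC" in repeated_models("ACGTACGA", 3, 1, 2))
print(common_models(["ACGT", "TTCG", "AAAA"], 2, 0, 2))
```

In the second call the model ACC is valid although it never appears exactly in the sequence, which illustrates why models are treated as external objects.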
3 Solving the Repeated Motifs Problem

3.1 Preprocessing
Constructing the Suffix Tree. We do not describe the suffix tree construction; this can be found in either [12], [19] or (for a review of this and other data structures and text algorithms) [6] and [8]. We just recall here some of the basic properties of such structures (these are taken from [12]).

Basic Properties of the Suffix Tree $T$ of a Sequence $s$.
(1) A branch of $T$ may represent any nonempty substring of $s$;
(2) Each node of $T$ that is not a leaf, except for the root, must have at least two offspring branches (compact version of the tree);
(3) The strings represented by sibling branches of $T$ must begin with different symbols of $\Sigma$.

Observe that property 2 means that a branch of $T$ may be labeled by an element of $\Sigma^k$ for $k \ge 2$ (for space considerations, each branch of $T$ is in fact labeled by a pair of numbers corresponding to the start and end positions in $s$ of the substring it represents, or to its start position and length). The key feature of a suffix tree is that for any leaf $i$, the concatenation of the labels of the branches on the path from the root to leaf $i$ spells the suffix of $s$ starting at position $i$. Reciprocally, the path spelled by every suffix of $s$ leads to a distinct leaf if we assume that the last symbol of $s$ appears nowhere else
Spelling Approximate Repeated or Common Motifs Using a Suffix Tree
in $s$. To achieve this, we just need to concatenate at the end of $s$ a symbol not appearing in $\Sigma$. An example of a suffix tree $T$ for $s$ = AACCACG is given in Fig. 1. This is adapted from [6].
[Figure: tree diagram with branch labels and leaves numbered 1 to 7]

Fig. 1. Compact suffix tree for $s$ = AACCACG
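The key feature recalled above (each suffix spells a path to a distinct leaf) can be checked on the same example with a small sketch. It builds an uncompacted suffix trie by inserting the suffixes one by one, which takes quadratic time and is not McCreight's or Ukkonen's linear-time construction. The class `Node`, the terminator `"$"` and the helper names are assumptions made for illustration.

```python
class Node:
    def __init__(self):
        self.children = {}   # symbol -> Node
        self.start = None    # 1-based start of the suffix, set on leaves only

def build_suffix_trie(s, end="$"):
    """Naive uncompacted suffix trie of s followed by a terminator."""
    if end in s:
        raise ValueError("the terminator must not appear in s")
    text = s + end
    root = Node()
    for i in range(len(s)):              # one path per nonempty suffix of s
        node = root
        for c in text[i:]:
            node = node.children.setdefault(c, Node())
        node.start = i + 1               # leaf i spells the suffix starting at i
    return root

def descend(root, word):
    """Node reached by spelling word exactly from the root, or None."""
    node = root
    for c in word:
        node = node.children.get(c)
        if node is None:
            return None
    return node

def leaf_positions(node):
    """Leaf labels below node, visiting children in symbol order."""
    if not node.children:
        return [node.start]
    out = []
    for c in sorted(node.children):
        out.extend(leaf_positions(node.children[c]))
    return out
```

For $s$ = AACCACG, `leaf_positions(descend(root, "AC"))` lists the start positions of the two exact occurrences of AC, which is the information later read off the leaves once the valid models have been spelled.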
We shall assume we have adopted McCreight's compact suffix tree construction. However, we need some more information to be added to our tree. This is described next.

Adding Information to the Nodes of the Tree. In the case of the first problem concerning us (finding repeated motifs in a sequence), the further preprocessing of the tree that we need to do is easy to realize. Indeed, in order to be able to spell the models present at least $q$ times in $s$, all that remains for us to know is, for each node $x$ of $T$, how many leaves are contained in the subtree of $T$ having $x$ as root. Let us denote this number by $Leaves_x$ for each node $x$. This information can be added to the tree by a simple traversal of it.

3.2 Spelling the Models
Let us consider first the case where $e = 0$, that is, no mismatches are allowed. Valid models verify two properties. (1) All their prefixes are also valid; (2) Spelling these models (which is the same as spelling any of their occurrences) leads to a node $x$ in $T$ for which $Leaves_x$ is at least $q$. Once errors are allowed, the first property is still verified but spelling all the occurrences of a valid model may now lead to more than one node of the tree. However, the values of $Leaves_x$ for all such nodes $x$ also sum up to at least $q$. The second property above is thus replaced by:
2. Spelling all the occurrences of a model $m$ leads to nodes $x_1, \ldots, x_l$ in $T$ for which $\sum_{j=1}^{l} Leaves(x_j)$ is at least $q$.
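The restated property can be illustrated with a short sketch: it annotates every node of an uncompacted suffix trie with its number of leaves, then spells a model while allowing at most `e` substitutions and sums the leaf counts of the nodes reached. The nested-dict trie, the `id`-keyed table and the function names are assumptions made for illustration, not structures taken from the paper.

```python
def build_suffix_trie(s, end="$"):
    """Uncompacted suffix trie of s + end, as nested dicts (symbol -> child)."""
    root, text = {}, s + end
    for i in range(len(s)):
        node = root
        for c in text[i:]:
            node = node.setdefault(c, {})
    return root

def count_leaves(node, leaves):
    """Post-order traversal storing Leaves_x for every node x, keyed by id(x)."""
    total = 1 if not node else sum(count_leaves(c, leaves) for c in node.values())
    leaves[id(node)] = total
    return total

def node_occurrences(root, m, e, end="$"):
    """Pairs (node, err) reached by spelling m with at most e substitutions."""
    occ = [(root, 0)]
    for a in m:
        occ = [(child, err + (a != b))
               for node, err in occ
               for b, child in node.items()
               if b != end and err + (a != b) <= e]   # the terminator never matches
    return occ

def nb_real_occurrences(s, m, e):
    """Sum of Leaves_x over the node-occurrences of m in the trie of s."""
    root = build_suffix_trie(s)
    leaves = {}
    count_leaves(root, leaves)
    return sum(leaves[id(x)] for x, _ in node_occurrences(root, m, e))
```

With $s$ = AACCACG and the model AC, one node is reached when `e = 0` and three nodes (those spelling AA, AC and CC) when `e = 1`; a model is then valid when the returned sum is at least $q$.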
As a matter of fact, it is not occurrences we shall spell, but instead the models, which will be read off the tree. The main difference with our previous approach is therefore that, in the present case, we extract models from the suffix tree of $s$ whereas in [15] we constructed them by a simulated traversal (within bounds) of the lexicographic tree of all possible models, i.e. of all possible words over $\Sigma$. This means in particular that occurrences are now grouped into classes, and real ones, that is occurrences considered as individual words in $s$, are never directly manipulated. In the present case, occurrences of a model are thus in fact nodes of the suffix tree (we denote them by the term node-occurrences) and are extended in the tree instead of in the sequence as in [15]. Once the process of model spelling has ended, the start positions of the real occurrences of the valid models may be recovered by traversing the subtrees of the nodes reached so far and reading the labels of their leaves. As in [15], the suffix tree need not be entirely traversed, although this time the tree itself must be fully constructed. There are two reasons that may lead us to stop the spelling of a model:

- we have reached the length required for models;
- the model may not be further extended while remaining valid.

For any model, we may also stop the descent down a path in the tree as soon as too many mismatches have been accumulated along it. The algorithm is a development of the recurrence formula given in the lemma below, where $x$ denotes a node of the tree, $father(x)$ its father and $err$ the number of misspellings between the label of the path going from the root to $x$ and a model $m$.

Lemma 1. $(x, err)$ is a node-occurrence of $m' = m\alpha$ with $m \in \Sigma^k$ and $\alpha \in \Sigma$ if, and only if, one of the following two conditions is verified:

(match) $(father(x), err)$ is a node-occurrence of $m$ and the label of the branch from $father(x)$ to $x$ is $\alpha$;
(subst.) $(father(x), err - 1)$ is a node-occurrence of $m$ and the label of the branch from $father(x)$ to $x$ is $\beta \ne \alpha$.

A sketch of the procedure to follow is shown in Fig. 2 for the case where models of a given length $k$ are sought. In order to do this model spelling operation, we have to make use of the following:

- a set $Ext_m$ of symbols by which a model may be extended at the next step (implemented as a bit-vector);
- a set $Occ_m$ of node-occurrences of a model $m$. We recall that these correspond in fact to classes of occurrences. Each node-occurrence $x$ is represented by a pair $(x, x_{err})$ where $x_{err}$ is the number of mismatches between $m$ and the label of the path leading from the root to $x$ in the tree;
- a variable $nbocc$ that counts the number of real occurrences of the model we are currently trying to extend;
- a function $KeepModel(m)$ that either stores all information concerning a valid model of the required length for printing later, or immediately prints this information.

The function SpellModels is called with arguments $(0, \lambda, Occ_\lambda = \{(root, 0)\}, Ext_\lambda)$ where $Ext_\lambda = \Sigma$ if $e > 0$, and $Ext_\lambda = \{label_b$ for branches $b$ leaving the root$\}$ otherwise.
Where valid models of maximum length are sought, we just need to change lines 1 and 2 into the code given in Fig. 3. Variable $kmax$ is initialized to 0 before first entering function SpellModels.

For the sake of simplicity, the code shown here assumes we are dealing with an uncompact version of the tree (that is, with a trie). Using a compact version affects the operations done in lines 9, 10, 12, 14, 15 and 17. Indeed, we need in that case to know at any given step whether we are:

- at a node $x$, or
- inside a branch $b$ between nodes $x$ and $x'$

and, if we can travel down $T$ with a symbol $\alpha$, whether that:

- gets us to a new node $x'$, or
- keeps us inside branch $b$.

This means that additional information has to be kept relative to each node-occurrence and, consequently, extending such an occurrence implies more work. This however increases the algorithm's time and space complexity by a constant factor only.

3.3 Complexity
Assuming an alphabet of fixed size, a compact suffix tree $T$ may be constructed in $O(n)$ time where $n$ is the length of sequence $s$, and occupies $O(n)$ space [12] [19]. Adding information to the nodes of the tree as described in section 3.1 takes time $O(n)$ and requires $O(1)$ space per node of $T$. Concerning the spelling operation, we have that:

Lemma 2. Spelling all valid models for the Repeated Motifs Problem given $T$ requires $O(nV(e, k))$ time where $k$ is either the length of the models sought or is a maximum length.

Proof. Let us consider that valid models of length $k$ are searched for. Spelling them requires descending down the tree at most $k$ levels (we may sometimes not reach that level if a given model stops verifying the quorum constraint $q$ at an earlier stage). At level $k$, there are $p \le n$ nodes (since there are exactly
SpellModels(l, m, Occ_m, Ext_m)
 1. if (l = k)
 2.   KeepModel(m)
 3. else if (l < k)
 4.   for each symbol α in Ext_m
 5.     nbocc = 0
 6.     Ext_mα = ∅
 7.     Occ_mα = ∅
 8.     for each pair (x, x_err) in Occ_m
 9.       if there is a branch b leaving node x with a label starting with α
10.         add to Occ_mα the pair (x', x_err) where x' is the node reached by following branch b from x
11.         nbocc = nbocc + Leaves_x'
12.         Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if x_err = e, Σ otherwise
13.       if x_err < e
14.         for each branch b leaving x except the one labeled by α if it exists
15.           add to Occ_mα the pair (x', x_err + 1) where x' is the node reached by following branch b from x
16.           nbocc = nbocc + Leaves_x'
17.           Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if x_err = e − 1, Σ otherwise
18.     if nbocc ≥ q
19.       SpellModels(l + 1, mα, Occ_mα, Ext_mα)
Fig. 2. Sketch of the procedure for spelling models corresponding to repeated motifs
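The procedure of Fig. 2 can be sketched in runnable form as follows, for models of a fixed length `k` on an uncompacted trie. This is an illustrative transcription under stated assumptions rather than a reference implementation: the trie is built naively as nested dicts, `Ext` is recomputed from `Occ` at each call instead of being accumulated during the extension, and `KeepModel` is replaced by storing the model with its number of real occurrences in a dictionary.

```python
def build_suffix_trie(s, end="$"):
    """Uncompacted suffix trie of s + end, as nested dicts (symbol -> child)."""
    root, text = {}, s + end
    for i in range(len(s)):
        node = root
        for c in text[i:]:
            node = node.setdefault(c, {})
    return root

def count_leaves(node, leaves):
    """Store Leaves_x for every node x, keyed by id(x)."""
    total = 1 if not node else sum(count_leaves(c, leaves) for c in node.values())
    leaves[id(node)] = total
    return total

def spell_models(s, k, e, q, alphabet="ACGT"):
    """Valid models of length k for the Repeated Motifs Problem, with counts."""
    root = build_suffix_trie(s)
    leaves = {}
    count_leaves(root, leaves)
    valid = {}

    def spell(m, occ):
        if len(m) == k:
            valid[m] = sum(leaves[id(x)] for x, _ in occ)    # stands in for KeepModel
            return
        # Ext_m: every symbol if some node-occurrence can still absorb a
        # mismatch, otherwise only the symbols labelling outgoing branches.
        if any(err < e for _, err in occ):
            ext = set(alphabet)
        else:
            ext = {b for x, _ in occ for b in x if b in alphabet}
        for a in sorted(ext):
            new_occ, nbocc = [], 0
            for x, err in occ:
                for b, child in x.items():
                    if b not in alphabet:                    # skip the terminator
                        continue
                    cost = err + (a != b)                    # match or substitution
                    if cost <= e:
                        new_occ.append((child, cost))
                        nbocc += leaves[id(child)]
            if nbocc >= q:                                   # quorum still satisfied
                spell(m + a, new_occ)

    spell("", [(root, 0)])
    return valid
```

For example, `spell_models("AACCACG", 2, 1, 4)` reports the five models of length 2 that have at least four occurrences with at most one mismatch. The recursion depth is bounded by `k`, but the naive trie construction uses quadratic space, so the sketch is meant for small inputs only.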
1. if (l ≥ kmax)
2.   if l > kmax
3.     throw away all preceding kept models
4.     kmax = l
5.   KeepModel(m)
Fig. 3. Modification to apply to the code of Fig. 2 in order to generate valid models of maximum length (the lines given here replace lines 1 and 2)
$n$ leaves in $T$ and the number of leaves is greater than the number of nodes at any particular level $k$ above that of the lowest leaf). From these $p$ nodes, there are $p$ paths up to the root of $T$ (note that in our algorithm, we in fact go down the paths, not up) and $V(e, k)$ ways of misspelling their labels, that is, spelling the labels with at most $e$ mismatches. This corresponds also to an upper bound on the total number of visits that may have to be done to the branches of $T$ (or, equivalently, its nodes) in order to obtain the requested models. Since each visit to a branch costs us constant time (basically, we need to increment a counter, add a node-occurrence to one set and a list of symbols to another set, which is a constant time operation if the list is implemented as a boolean array), the total number of operations needed to spell all the valid models given $T$ is bounded above by $pV(e, k)$, that is, by $O(nV(e, k))$.
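To give a feel for the factor $V(e, k)$ in this bound, the sketch below assumes, in line with its use in the proof, that $V(e, k)$ counts the words lying at Hamming distance at most $e$ from a given word of length $k$. Under that assumption it equals $\sum_{i=0}^{e} \binom{k}{i} (|\Sigma| - 1)^i$, which the code checks against an explicit enumeration. The function names are hypothetical.

```python
from itertools import product
from math import comb

def neighbourhood_size(e, k, sigma=4):
    """Words within Hamming distance e of a fixed word of length k.

    Choose the i mismatching positions, then one of the sigma - 1 other
    symbols at each of them.
    """
    return sum(comb(k, i) * (sigma - 1) ** i for i in range(min(e, k) + 1))

def neighbourhood_size_brute(word, e, alphabet="ACGT"):
    """Same quantity obtained by enumerating every word of the same length."""
    return sum(1 for cand in product(alphabet, repeat=len(word))
               if sum(a != b for a, b in zip(word, cand)) <= e)
```

The value grows quickly with `e`: for the DNA alphabet, a word of length 4 has 13 neighbours within one mismatch and 67 within two.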
Lemma 3. Spelling all valid models for the Repeated Motifs Problem given $T$ requires $O(n)$ space.

Proof. The space required is that of the tree plus that of the auxiliary structures $Occ_m$ and $Ext_m$. We need to keep such structures for the model $m$ currently being treated and for all its prefixes (we are traversing the tree recursively). However, the sets $Occ_{m'}$ are all pairwise disjoint so that $\sum_{m' \text{ prefix of } m} |Occ_{m'}| \le n$. Since $\sum_{m' \text{ prefix of } m} |Ext_{m'}| \le k|\Sigma|/w$, the total space complexity is $O(n + n + k|\Sigma|/w) = O(n)$ if we assume an alphabet of fixed size.

Note that if $e = 0$, then $V(e, k) = 1$. Let us point out also that $V(e, k)$ is an upper bound for the number of models that corresponds to the maximum size of the output and is seldom observed.
4 Solving the Common Motifs Problem

4.1 Generalized Suffix Trees
Trees for representing all the suffixes of a set of sequences $s_i$, $1 \le i \le N$ for some $N \ge 2$, are called generalized suffix trees and are constructed in a way very similar to the construction of the suffix tree for a single sequence [2] [9]. We denote these generalized trees by $GT$. They share all the properties of a suffix tree given in section 3.1 with, in property 1, sequence $s$ substituted by sequences $s_1, \ldots, s_N$. In particular, a generalized suffix tree $GT$ verifies the fact that every suffix of every sequence $s_i$ in the set leads to a distinct leaf. When $p \ge 2$ sequences have a same suffix, the generalized tree therefore has $p$ leaves corresponding to this suffix, each associated with a different sequence. To achieve this property during construction, we just need to concatenate to each sequence $s_i$ of the set a symbol that is not in $\Sigma$ and is specific to that sequence.
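The role of the sequence-specific end symbols can be illustrated with a naive sketch that inserts every suffix of every sequence into one uncompacted trie. The terminator of sequence $s_i$ is modelled here by the tuple `("$", i)`, an assumption chosen so that it can never collide with a symbol of the alphabet; the class and function names are likewise hypothetical, and the construction is quadratic rather than linear.

```python
class Node:
    def __init__(self):
        self.children = {}   # symbol or terminator -> Node
        self.leaf = None     # (sequence index i, 1-based start) on leaves only

def build_generalized_trie(seqs):
    """Uncompacted generalized suffix trie of the sequences s_1, ..., s_N."""
    root = Node()
    for i, s in enumerate(seqs, start=1):
        end = ("$", i)                       # end symbol specific to s_i
        for start in range(len(s)):
            node = root
            for c in list(s[start:]) + [end]:
                node = node.children.setdefault(c, Node())
            node.leaf = (i, start + 1)
    return root

def descend(root, word):
    """Node reached by spelling word exactly from the root, or None."""
    node = root
    for c in word:
        node = node.children.get(c)
        if node is None:
            return None
    return node

def leaves_below(node):
    """All (sequence index, start) labels in the subtree rooted at node."""
    if node.leaf is not None:
        return [node.leaf]
    return [lf for child in node.children.values() for lf in leaves_below(child)]
```

With the sequences ACG and CG, the shared suffix CG ends in two distinct leaves, one per sequence, exactly because the two end symbols differ.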
4.2 Adding Information to the Nodes of the Tree
Although the construction of a $GT$ for a set $\{s_i\}$ of sequences is similar to that of a suffix tree $T$ for a single sequence $s$, it is not enough anymore to know the values of $Leaves_x$ for each node $x$ in $GT$ in order to be able to solve the Common Motifs Problem. We could then modify our preprocessing of the tree so that we calculate, for each node $x$, no longer the number of leaves in the subtree of $GT$ having $x$ as root, but the number of different sequences those leaves refer to. Computing this number is what is called the Color Set Size Problem by Hui [9]. The color set size of a node $x$ is precisely the number of different leaf colors in the subtree rooted at $x$, where a leaf is assigned color $i$ if it represents a suffix of $s_i$. Let us call this new number $CSS_x$ as in [9]. Knowing $CSS_x$ for all nodes $x$ is however all that is required only in the case where $e = 0$ (in this case, a model has only one node-occurrence). When $e > 0$, we also have to be able to tell which colors are common to 2 or more nodes of the tree. In order to do that, we need to associate to each node $x$ in the $GT$ of a set $\{s_i\}$ an array, denoted $Colors_x$, of dimension $N$ that is defined, for $1 \le i \le N$, by:

$Colors_x[i] = 1$ if at least one leaf in the subtree rooted at $x$ represents a suffix of $s_i$, and $Colors_x[i] = 0$ otherwise.
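The definition of $Colors_x$ can be sketched with one integer bitmask per node, obtained by a single bottom-up traversal. The sketch also derives the color set size as the number of bits set, which is a shortcut for illustration and not Hui's algorithm. The nested-dict trie, the 0-based color indices, the `id`-keyed table and the function names are assumptions.

```python
def build_generalized_trie(seqs):
    """Uncompacted generalized suffix trie as nested dicts.

    The terminator of sequence number i (0-based) is the key ("$", i).
    """
    root = {}
    for i, s in enumerate(seqs):
        for start in range(len(s)):
            node = root
            for c in s[start:]:
                node = node.setdefault(c, {})
            node.setdefault(("$", i), {})
    return root

def annotate_colors(node, colors):
    """Fill colors[id(x)] with a bitmask: bit i is set iff some leaf below x
    represents a suffix of sequence i."""
    mask = 0
    for key, child in node.items():
        if isinstance(key, tuple):           # terminator: child is a leaf of color i
            colors[id(child)] = 1 << key[1]
            mask |= 1 << key[1]
        else:
            mask |= annotate_colors(child, colors)
    colors[id(node)] = mask
    return mask

def color_set_size(mask):
    """Number of distinct colors recorded in a bitmask."""
    return bin(mask).count("1")

def descend(root, word):
    """Node reached by spelling word exactly from the root, or None."""
    node = root
    for c in word:
        node = node.get(c)
        if node is None:
            return None
    return node
```

Taking the bitwise OR of the masks of several nodes gives the set of sequences covered by those nodes together, which is the operation needed when a model has more than one node-occurrence.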
$Colors_x$ may be implemented as a bit-vector, or as $N/w$ bit-vectors if $N > w$, where $w$ is the size of a machine word. The array $Colors_x$ for all $x$ may be obtained by a simple traversal of the tree with each visit to a node taking $O(N/w)$ time (for adding $N/w$ bit-vectors). The additional space required is $O(N/w)$ per node. We shall also use the information provided by $CSS_x$, which Hui showed can be obtained in $O(nN)$ time and uses $O(1)$ space per node. Considering $CSS_x$ is not strictly necessary but may be useful in practice, as is suggested when we analyze the complexity of this algorithm later on.

4.3 Spelling the Models
For ease of presentation, we assume here once more that we are looking for all valid models of a fixed length $k$, and that we are working with an uncompact version of the $GT$. A sketch of the algorithm for solving the Common Motifs Problem is given in Fig. 4. We use the same auxiliary structures $Occ_m$ and $Ext_m$ as in the previous algorithm, to which we add the following:

- a variable $CSS_x$ as defined in the previous section;
- a boolean array $Colors_x$ (possibly $N/w$ arrays if $N > w$) as defined in the previous section;
- a variable $minseq$ that indicates the minimum of $CSS_x$ for all node-occurrences $x$ of the extended model;
- a variable $maxseq$ that indicates the sum of $CSS_x$ for all node-occurrences $x$ of the extended model;
- a boolean array $Colors_m$ (possibly $N/w$ arrays if $N > w$) defined by: $Colors_m[i] = 1$ if $m$ occurs in $s_i$, and $Colors_m[i] = 0$ otherwise.

Observe that, in all cases, we have:

$minseq \le$ (number of distinct sequences the model is present in) $\le maxseq$.

4.4 Complexity
What produces an increase in the complexity of the algorithm of Fig. 4 in relation to that of Fig. 2 concerns simply the data structure $Colors_x$: the time needed to create and manipulate it, and the space required to store it. The space requirement of $Colors_x$ is $O(N/w)$ per node if it is implemented as a bit-vector having the same size $w$ as a machine word. The total space requirement of the algorithm is therefore now bounded above by $O(nN^2/w)$. This is smaller than our previously obtained bound [15] of $O(nNk)$ when $N/w < k$. Creating $Colors_x$ for every node $x$ of the tree takes time $O(nN^2/w)$; however, manipulating it, in particular performing the operation indicated in line 31, requires $O(N)$ time per model. Since there can be $O(nN V(e, k))$ valid models in the worst case, the algorithm's time complexity becomes $O(nN^2 V(e, k))$. This is a better bound than the one given in [15] for $N < k$. The tests of lines 26 and 29 should improve the algorithm's behaviour on average. Observe that the test of line 29 has more chance of being true at the beginning of the algorithm (where models match almost everything) while that of line 26 has a better chance of being verified the longer the model is (because the number of its occurrences will then be quite close to $q$).

As mentioned in the introduction, when $e = 0$ we do not obtain Hui's better bound of $O(nN)$. In this case though, we only need to remove lines 7, 16, 24, 31 and 32 to fall back to the algorithm Hui introduced in [9]. Observe that this also means getting rid of the $Colors$ structure, which is no longer necessary. It is easy to modify the algorithm of Fig. 4 so that the instructions contained in the lines just indicated are performed only if $e > 0$.
5 Sketch of Extension Dealing with Gaps
We sketch in this section how to extend the algorithms so as to be able to treat gaps as well as mismatches. This is done only for the repeated motifs problem. The common motifs problem would be dealt with in a quite similar manner. The algorithm is presented without further ado in Figs. 5 and 6. Node-occurrences must be maintained in $Occ_m$, for $m$ a model, in the order in which they would be encountered if the tree were traversed in a depth-first manner. This preorder follows naturally from the way nodes are processed at each step of the algorithm.
SpellModels(l, m, Occ_m, Ext_m)
 1. if (l = k)
 2.   KeepModel(m)
 3. else if (l < k)
 4.   for each symbol α in Ext_m
 5.     maxseq = 0
 6.     minseq = ∞
 7.     Colors_mα is initialized with no colors
 8.     Ext_mα = ∅
 9.     Occ_mα = ∅
10.     for each pair (x, x_err) in Occ_m
11.       if there is a branch b leaving node x with a label starting with α
12.         add to Occ_mα the pair (x', x_err) where x' is the node reached by following branch b from x
13.         maxseq = maxseq + CSS_x'
14.         if CSS_x' < minseq
15.           minseq = CSS_x'
16.         add colors in Colors_x' to Colors_mα
17.         Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if x_err = e, Σ otherwise
18.       if x_err < e
19.         for each branch b leaving x except the one labeled α if it exists
20.           add to Occ_mα the pair (x', x_err + 1) where x' is the node reached by following branch b from x
21.           maxseq = maxseq + CSS_x'
22.           if CSS_x' < minseq
23.             minseq = CSS_x'
24.           add colors in Colors_x' to Colors_mα
25.           Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if x_err = e − 1, Σ otherwise
26.     if maxseq < q
27.       return (no hope)
28.     else
29.       if minseq ≥ q
30.         SpellModels(l + 1, mα, Occ_mα, Ext_mα)
31.       else if the number of bits at 1 in Colors_mα is no less than q
32.         SpellModels(l + 1, mα, Occ_mα, Ext_mα)
Fig. 4. Sketch of the procedure for spelling models corresponding to common motifs
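A runnable sketch in the spirit of Fig. 4 is given below for models of a fixed length `k`. It is an illustration under several assumptions and not the author's implementation: the generalized trie is built naively, colors are kept as Python integer bitmasks, $CSS_x$ is taken as the number of bits set in $Colors_x$ instead of being computed by Hui's algorithm, every symbol of the alphabet is tried instead of maintaining `Ext`, and a failed `maxseq` test moves on to the next symbol.

```python
def build_generalized_trie(seqs):
    """Nested-dict trie of all suffixes; ("$", i) ends the suffixes of sequence i."""
    root = {}
    for i, s in enumerate(seqs):
        for start in range(len(s)):
            node = root
            for c in s[start:]:
                node = node.setdefault(c, {})
            node.setdefault(("$", i), {})
    return root

def annotate_colors(node, colors):
    """colors[id(x)] = bitmask of the sequences having a suffix below x."""
    mask = 0
    for key, child in node.items():
        if isinstance(key, tuple):
            colors[id(child)] = 1 << key[1]
            mask |= 1 << key[1]
        else:
            mask |= annotate_colors(child, colors)
    colors[id(node)] = mask
    return mask

def popcount(mask):
    return bin(mask).count("1")

def common_models(seqs, k, e, q, alphabet="ACGT"):
    """Models of length k present in at least q sequences, with that number."""
    sigma = set(alphabet)
    root = build_generalized_trie(seqs)
    colors = {}
    annotate_colors(root, colors)
    valid = {}

    def spell(m, occ):
        if len(m) == k:
            mask = 0
            for x, _ in occ:
                mask |= colors[id(x)]
            valid[m] = popcount(mask)
            return
        for a in alphabet:
            new_occ, maxseq, minseq, mask = [], 0, float("inf"), 0
            for x, err in occ:
                for b, child in x.items():
                    if b not in sigma:              # skip terminators
                        continue
                    cost = err + (a != b)
                    if cost <= e:
                        new_occ.append((child, cost))
                        css = popcount(colors[id(child)])
                        maxseq += css
                        minseq = min(minseq, css)
                        mask |= colors[id(child)]
            if maxseq < q:                          # quorum out of reach for m + a
                continue
            if minseq >= q or popcount(mask) >= q:  # cheap test first, then exact one
                spell(m + a, new_occ)

    spell("", [(root, 0)])
    return valid
```

The two inexpensive tests on `maxseq` and `minseq` bracket the exact count obtained from the bitmask, so the union of colors only decides the outcome when the two bounds disagree.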
We do not prove it here, but the only thing that changes in the complexity of the algorithm is that $V(e, k)$ is replaced by $N(e, k)$, where $N(e, k)$ is the number of models $m$ at a (this time) Levenshtein distance at most $e$ from a word of length $k$. This comes from the fact that, since each node of the tree is considered at most once as a node-occurrence, $Occ_m$ remains bounded above by $O(n)$. There may simply now be more models.

One can think of the operation performed by the procedure Treat of Fig. 6 as adding the last row of a dynamic programming matrix of model $m$ against the suffix tree of $s$ as in [20] or [4]. As mentioned in the introduction however, the current algorithm has a different way of accounting for the real occurrences of a model than the one by Cobbs [4]. Indeed, in his case, for each position $i$ in $s$, only one occurrence is kept that ends at $i$. Doing that however implies verifying certain things and this may cost him as much as $k$ additional operations per occurrence. In our case, although we may keep up to $2e$ node-occurrences per valid ending position (the total number of nodes remaining less than $Cn$ for a small constant $C$), we still get a better time bound since in general $2e < k$. The algorithm is also simpler. Furthermore, it may be interesting in some cases (e.g. when searching for tandem repeats [16]) to know both the start and end positions of an occurrence.
SpellModels(l, m, Occ_m, Ext_m)
 1. if (l = k)
 2.   KeepModel(m)
 3. else if (l < k)
 4.   for each symbol α in Ext_m
 5.     nbocc = 0
 6.     Ext_mα = ∅
 7.     Occ_mα = ∅
 8.     for each pair (x, x_err) in Occ_m
 9.       remove (x, x_err) from Occ_m
10.       Treat(Occ_m, Occ_mα, Ext_mα, Leaves, x, x_err, α, nbocc, 0)
11.     if nbocc ≥ q
12.       SpellModels(l + 1, mα, Occ_mα, Ext_mα)
Fig. 5. Sketch of the procedure for spelling models corresponding to repeated motifs when gaps as well as mismatches are allowed (internal procedure Treat is given in Fig. 6)
Treat(Occ_m, Occ_mα, Ext_mα, Leaves, x, x_err, α, nbocc, level)
 1. if (level = 0)   /* deletion */
 2.   if x_err < e
 3.     add to Occ_mα the pair (x, x_err + 1)
 4.     nbocc = nbocc + Leaves_x
 5.     Ext_mα = Ext_mα ∪ {label_b' for b' leaving x} if (x_err + 1) = e, Σ otherwise
 6. for each x' obtained by following, in lexicographic order, a branch (labeled β) from x
 7.   if (x', x'_err) is the next pair in Occ_m
 8.     remove (x', x'_err) from Occ_m
 9.     let minerr = min { x_err       if β = α   /* match */
                           x_err + 1   if β ≠ α   /* substitution */
                           x'_err + 1             /* deletion */
                           x_err + 1              /* insertion */ }
10.     if minerr ≤ e
11.       add to Occ_mα the pair (x', minerr)
12.       nbocc = nbocc + Leaves_x'
13.       Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if minerr = e, Σ otherwise
14.       Treat(Occ_m, Occ_mα, Ext_mα, Leaves, x', minerr, α, nbocc, level + 1)
15.     else
16.       remove from Occ_m the sons of x' and all the sons thereof recursively
17.   else
18.     let minerr = min { x_err       if β = α   /* match */
                           x_err + 1   if β ≠ α   /* substitution */
                           x_err + 1              /* insertion */ }
19.     if minerr ≤ e
20.       add to Occ_mα the pair (x', minerr)
21.       nbocc = nbocc + Leaves_x'
22.       Ext_mα = Ext_mα ∪ {label_b' for b' leaving x'} if minerr = e, Σ otherwise
Fig. 6. Procedure Treat, used by the algorithm for spelling models corresponding to repeated motifs, when gaps as well as mismatches are allowed
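The dynamic-programming view mentioned in this section can be illustrated independently of the procedure Treat. The sketch below uses the classical technique of carrying one row of the edit-distance matrix down each path of an uncompacted suffix trie, in the spirit of [20]; it is not a transcription of Figs. 5 and 6, and it fixes the model rather than building models incrementally. The function names and the terminator `"$"` are assumptions.

```python
def build_suffix_trie(s, end="$"):
    """Uncompacted suffix trie of s + end, as nested dicts (symbol -> child)."""
    root, text = {}, s + end
    for i in range(len(s)):
        node = root
        for c in text[i:]:
            node = node.setdefault(c, {})
    return root

def levenshtein_hits(s, m, e, end="$"):
    """Distinct words of s at Levenshtein distance at most e from m.

    row[j] is the edit distance between the prefix m[:j] and the word
    spelled by the path from the root to the current node.
    """
    root = build_suffix_trie(s, end)
    hits = []

    def walk(node, word, row):
        if word and row[-1] <= e:
            hits.append((word, row[-1]))
        if min(row) > e:                      # no extension can come back under e
            return
        for b, child in node.items():
            if b == end:
                continue
            new = [row[0] + 1]                # word grew by b: one more symbol to delete
            for j in range(1, len(m) + 1):
                new.append(min(row[j - 1] + (m[j - 1] != b),   # match / substitution
                               row[j] + 1,                     # b is extra in the word
                               new[j - 1] + 1))                # m[j-1] is missing
            walk(child, word + b, new)

    walk(root, "", list(range(len(m) + 1)))
    return sorted(hits)
```

Each trie node is visited at most once for a given model, and a branch is abandoned as soon as the smallest entry of its row exceeds `e`, which mirrors the way a descent stops once too many errors have been accumulated along a path.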
6 Future Work
It is not too difficult to see that the new approach to approximate motif extraction presented in this paper may be extended to deal with the special kinds of alphabets required for protein sequence analysis [18] and with combinatorial alphabets such as those introduced in [17]. This will be explored and formalized in a future paper.

The present algorithm should also help deal with models that, although different, have sets of real occurrences (or, equivalently, of node-occurrences) that are identical to or included in the set of another model. This is a problem often encountered and it may be appropriate to get rid of included sets. It is not completely trivial how to do it in an efficient way.

Acknowledgments. The author would like to thank Maxime Crochemore for his constant encouragement and all the visitors of the last eight months to the Institut Gaspard Monge, in particular Veronique Bruyere, Clelia de Felice, Gene Myers, Antonio Restivo, Paul Schupp and Thomas Wilke, for having helped create an even more stimulating environment in which to work. Thanks are also due to Dan Gusfield for being such a good advocate of the merits of suffix trees. Finally, I would like to thank all the referees for their comments and suggestions, in particular the anonymous referee who very kindly sent me his/her implementation in Perl of the first algorithm presented in this paper.
References

1. R. Baeza-Yates and G. H. Gonnet. A new approach to text searching. Commun. ACM, 35:74-82, 1992.
2. P. Bieganski, J. Riedl, J. V. Carlis, and E. M. Retzel. Generalized suffix trees for biological sequence data: applications and implementations. In Proc. of the 27th Hawaii Int. Conf. on Systems Sci., pages 35-44. IEEE Computer Society Press, 1994.
3. B. Clift, D. Haussler, R. McConnell, T. D. Schneider, and G. D. Stormo. Sequence landscapes. Nucleic Acids Res., 14:141-158, 1986.
4. A. L. Cobbs. Fast identification of approximately matching substrings. In Z. Galil and E. Ukkonen, editors, Combinatorial Pattern Matching, volume 937 of Lecture Notes in Computer Science, pages 41-54. Springer-Verlag, 1995.
5. M. Crochemore. An optimal algorithm for computing the repetitions in a word. Inf. Proc. Letters, 12:244-250, 1981.
6. M. Crochemore and W. Rytter. Text Algorithms. Oxford University Press, 1994.
7. D. J. Galas, M. Eggert, and M. S. Waterman. Rigorous pattern-recognition methods for DNA sequences. Analysis of promoter sequences from Escherichia coli. J. Mol. Biol., 186:117-128, 1985.
8. D. Gusfield. Algorithms on Strings, Trees, and Sequences. Computer Science and Computational Biology. Cambridge University Press, 1997.
9. L. C. K. Hui. Color set size problem with applications to string matching. In A. Apostolico, M. Crochemore, Z. Galil, and U. Manber, editors, Combinatorial Pattern Matching, volume 644 of Lecture Notes in Computer Science, pages 230-243. Springer-Verlag, 1992.
10. C. E. Lawrence and A. A. Reilly. An expectation maximization (EM) algorithm for the identification and characterization of common sites in unaligned biopolymer sequences. Proteins: struct., funct., and genetics, 7:41-51, 1990.
11. C. Lefevre and J.-E. Ikeda. A fast word search algorithm for the representation of sequence similarity in genomic DNA. Nucleic Acids Res., 22:404-411, 1994.
12. E. M. McCreight. A space-economical suffix tree construction algorithm. J. ACM, 23:262-272, 1976.
13. E. W. Myers. A sublinear algorithm for approximate keyword searching. Algorithmica, 12:345-374, 1994.
14. E. W. Myers. 1997. Personal communication.
15. M.-F. Sagot, V. Escalier, A. Viari, and H. Soldano. Searching for repeated words in a text allowing for mismatches and gaps. In R. Baeza-Yates and U. Manber, editors, Second South American Workshop on String Processing, pages 87-100, Vinas del Mar, Chili, 1995. University of Chili.
16. M.-F. Sagot and E. W. Myers. Identifying satellites in nucleic acid sequences. 1998. Submitted to RECOMB 1998.
17. M.-F. Sagot and A. Viari. A double combinatorial approach to discovering patterns in biological sequences. In D. Hirschberg and G. Myers, editors, Combinatorial Pattern Matching, volume 1075 of Lecture Notes in Computer Science, pages 186-208. Springer-Verlag, 1996.
18. M.-F. Sagot, A. Viari, and H. Soldano. Multiple comparison: a peptide matching approach. Theoret. Comput. Sci., 180:115-137, 1997. Presented at Combinatorial Pattern Matching 1995.
19. E. Ukkonen. Constructing suffix trees on-line in linear time. Pages 484-492. IFIP'92, 1992.
20. E. Ukkonen. Approximate string matching over suffix trees. In Z. Galil, A. Apostolico, M. Crochemore, and U. Manber, editors, Combinatorial Pattern Matching, volume 684 of Lecture Notes in Computer Science, pages 228-242. Springer-Verlag, 1993.
21. M. S. Waterman. Multiple sequence alignments by consensus. Nucleic Acids Res., 14:9095-9102, 1986.
22. M. S. Waterman. Consensus patterns in sequences. In M. S. Waterman, editor, Mathematical Methods for DNA Sequences, pages 93-116. CRC Press, 1989.
23. M. S. Waterman, R. Arratia, and D. J. Galas. Pattern recognition in several sequences: consensus and alignment. Bull. Math. Biol., 46:515-527, 1984.
24. S. Wu and U. Manber. Agrep - a fast approximate pattern-matching tool. Pages 153-162, San Francisco, CA, 1992. USENIX Technical Conference.
25. S. Wu and U. Manber. Fast text searching allowing errors. Commun. ACM, 35:83-91, 1992.