Theory of Codes
This is a volume in PURE AND APPLIED MATHEMATICS A Series of Monographs and Textbooks Editors: SAMUEL EILENBERG AND HYMAN BASS
A list of recent titles in this series appears at the end of this volume.
THEORY OF CODES
JEAN BERSTEL
Université Paris VI

DOMINIQUE PERRIN
Université Paris VII
1985
ACADEMIC PRESS, INC. (Harcourt Brace Jovanovich, Publishers)
Orlando San Diego New York London Toronto Montreal Sydney Tokyo
COPYRIGHT © 1985, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC. Orlando, Florida 32887
United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1 7DX
Library of Congress Cataloging in Publication Data

Berstel, Jean.
  Theory of codes.
  (Pure and applied mathematics)
  Bibliography: p.
  1. Coding theory. I. Perrin, Dominique. II. Title.
  III. Series: Pure and applied mathematics (Academic Press)
  ISBN 0-12-093420-5 (alk. paper)   84-12445

PRINTED IN THE UNITED STATES OF AMERICA

85 86 87 88    9 8 7 6 5 4 3 2 1
to
Mahar
Contents

PREFACE

CHAPTER 0  Preliminaries
0. Introduction
1. Notation
2. Monoids
3. Words
4. Automata
5. Ideals in a Monoid
6. Semirings and Matrices
7. Formal Series
8. Permutation Groups
9. Notes

CHAPTER 1  Codes
0. Introduction
1. Definitions
2. Codes and Submonoids
3. A Test for Codes
4. Measure of a Code
5. Complete Sets
6. Composition
7. Exercises
8. Notes

CHAPTER 2  Prefix Codes
0. Introduction
1. Prefix Codes
2. The Automaton of a Prefix Code
3. Maximal Prefix Codes
4. Operations on Prefix Codes
5. Semaphore Codes
6. Synchronous Codes
7. Average Length
8. Deciphering Delay
9. Exercises
10. Notes

CHAPTER 3  Biprefix Codes
0. Introduction
1. Parses
2. Maximal Biprefix Codes
3. Degree
4. Kernel
5. Finite Maximal Biprefix Codes
6. Exercises
7. Notes

CHAPTER 4  Automata
0. Introduction
1. Automata
2. Flower Automaton
3. Monoids of Unambiguous Relations
4. Rank and Minimal Ideal
5. Very Thin Codes
6. Group and Degree of a Code
7. Synchronization of Semaphores
8. Exercises
9. Notes

CHAPTER 5  Groups of Biprefix Codes
0. Introduction
1. Group Codes
2. Automata of Biprefix Codes
3. Depth
4. Groups of Finite Biprefix Codes
5. Examples
6. Exercises
7. Notes

CHAPTER 6  Densities
0. Introduction
1. Densities
2. Probabilities over a Monoid
3. Contexts
4. Exercises
5. Notes

CHAPTER 7  Conjugacy
0. Introduction
1. Circular Codes
2. Limited Codes
3. Length Distributions
4. Factorizations of Free Monoids
5. Finite Factorizations
6. Exercises
7. Notes

CHAPTER 8  Polynomial of a Code
0. Introduction
1. Factorizing Codes
2. Determinants
3. State Space
4. Evaluation of the Polynomials
5. Proof of the Theorem
6. Commutative Equivalence
7. Complete Reducibility
8. Exercises
9. Notes

REFERENCES
INDEX
Preface
The theory of codes takes its origin in the theory of information devised by Shannon in the 1950s. In the past thirty years, this theory has developed in several directions, one of which is the theory of entropy, a branch of probability theory. Another is the theory of error-correcting codes which is one of the beautiful applications of commutative algebra. This volume presents a comprehensive study of the theory of variable length codes, and provides a systematic exposition of the topic. The theory of codes, as presented here, uses both combinatorial and algebraic methods. Due to the nature of the questions raised and solved, this theory is now clearly a part of theoretical computer science and is strongly related to combinatorics on words, automata theory, formal languages, and the theory of semigroups. The object of the theory of codes is, from an elementary point of view, the study of the properties concerning factorizations of words into a sequence of words taken from a given set. Thus it is related to syntactic analysis, even though the connection seems loose at first glance. One of the basic techniques used in this book is constructing special automata that perform certain kinds of parsing. It is quite remarkable that the problem of encoding as treated here admits a rather simple mathematical formulation: it is the study of embeddings of one free monoid into another. We may consider this to be a basic problem of algebra. There are related problems in other algebraic structures. For instance, if we replace free monoids by free groups, the study of codes reduces to that of subgroups of a free group. However, the situation is quite different at the very beginning. In fact, according to the Nielsen-Schreier theorem, any subgroup of a free group is
itself free, whereas the corresponding statement is false for free monoids. Nevertheless the relationship between codes and groups is more than an analogy, and we shall see in this book how the study of a group associated with a code can reveal some of its properties. It was M.-P. Schützenberger's discovery that coding theory is closely related to classical algebra. He has been the main architect of this theory. The main basic results are due to him and most further developments were stimulated by his conjectures. The aim of the theory of codes is to give a structural description of the codes in a way that allows their construction. This is easily accomplished for prefix codes, as shown in Chapter 2. The case of biprefix codes is already much more difficult, and the complete structural description given in Chapter 3 is one of the highlights of the theory. However, the structure of general codes (neither prefix nor suffix) still remains unknown to a large extent. For example, no systematic method is known for constructing all finite codes. The result given in Chapter 8 about the factorization of the commutative polynomial of a code must be considered (despite the difficulty of its proof) as an intermediate step toward the understanding of codes. Many of the results given in this book are concerned with extremal properties, the interest in which comes from the interconnection that appears between different concepts. But it also goes back to the initial investigations on codes considered as communication tools. Indeed, these extremal properties in general reflect some optimization in the encoding process. Thus a maximal code uses, in this sense, the whole capacity of the transmission channel. Primarily, two types of methods are used in this book: direct methods on words and syntactic methods. Direct methods consist of a more or less refined analysis of the sequencing of letters and factors within a word.
Syntactic methods, as used in Chapters 4-8, include the study of special automata associated with codes and of the corresponding monoids of relations. As usual during the preparation of a book, we had to make choices about the material to be covered. We have included mainly results concerning codes themselves. We have neglected in particular their relationship with varieties of languages and semigroups and also with free Lie algebras. We have not treated the developments leading toward ergodic theory. On the other hand, we have included results of combinatorial nature such as the description of comma-free codes or bisections and trisections of free monoids. There were other choices to be made, beyond terminology and notation which of course vary through the research papers. These concern the degree of generality of the exposition. It appears that many facts discovered for finite codes remain true for recognizable codes and even for a larger class of codes (thin codes) encountered in the literature. In general, the transition from finite to recognizable codes does not imply major changes in the proof. However, changing to thin codes may imply some rather delicate computations. This is clearly demonstrated in Chapters 4 and 6, where the summations to be made become infinite when the
codes are no longer recognizable. But this approach leads to a greater generality and, as we believe, to a better understanding by focusing attention on the main argument. Moreover, the characterization of the monoids associated with thin codes given in Chapter 4 may be considered to be a justification of our choice. The organization of the book is as follows: A preliminary chapter (Chapter 0) is intended mainly to fix notation and should be consulted only when necessary. The book is composed of three major parts: part one consisting of Chapters 1-3, part two of Chapters 4-6, and part three of Chapters 7 and 8, which are independent of each other. Chapters 1-3 constitute an elementary introduction to the theory of codes in the sense that they primarily make use of direct methods. Chapter 1 contains the definition, the relationship with submonoids, the first results on measure, and the introduction of the notions of complete, maximal, and thin codes. Chapter 2 is devoted to a systematic study of prefix codes, developed at a very elementary level. Indeed, this is the most intuitive and easy part of the theory of codes and certainly deserves considerable discussion. We believe that its interest largely goes beyond the theory of codes. The chapter ends with the well-known theorem on deciphering delay. Chapter 3 is also elementary, although more dense. Its aims are to describe the structure of maximal biprefix codes and to give methods for constructing the finite ones. The use of formal power series is of great help. The presentation of the results, frequently scattered or unpublished, appears to be new. The next three chapters contain what is known about codes but can be proved only by syntactic methods. Chapter 4 is devoted to these techniques, using a more systematic treatment. Instead of the frequently encountered monoids of functions we study monoids of unambiguous relations which do not favor left or right.
Chapter 4 concludes with a first illustration of these techniques: the proof of the theorem of synchronization of semaphore codes. Chapter 5 deals with the groups of biprefix codes. Chapter 6 shows how to compute the density of the submonoid generated by a code by transferring the computation into the associated monoid of unambiguous relations. The formula of densities, linking together the density of the submonoids, the degree of the code, and the densities of the contexts, is the most striking result. As previously mentioned, Chapter 7 is very combinatorial in nature. It contains for circular codes a systematic theory that leads to the study of the well-known comma-free codes. It is followed by the study of factorizations of a free monoid and more importantly of the characterization of the codes that may appear as factors. Chapter 8 contains the proof and discussion of the theorem of the factorization of the polynomial of a finite maximal code. Many of the results of the preceding chapters are used in the proof of this theorem which contains the most current detailed information about the structure of general codes. The book ends with the connection between maximal biprefix codes and semisimple algebras. The book is written at an elementary level. In particular, the knowledge required is covered by a basic mathematical culture. Complete proofs are given and the necessary results of automata theory or theory of semigroups are presented in Chapter 0. Each chapter is followed by an exercise section which frequently complements the material covered in the text. As is customary, difficult exercises are starred or double starred. The chapters end with notes containing references for the more important results, bibliographic discussions, complementary material, and references for the exercises. It seems impossible to cover the whole text in a one-year course; however, the book contains enough material for several courses, at various levels, in undergraduate or graduate curricula. Moreover, several chapters are largely independent and can be lectured on separately. As an example, we taught a course based solely on Chapter 7. In addition, a one-year lecture at the undergraduate level may be composed of Chapter 1, Chapter 2 without the last section, Chapter 3, and the first two sections of Chapter 4. The methods used there are all elementary and semigroup theory is not employed. Because of the extensive use of trees, Chapter 2 by itself might constitute an interesting complement to a course in the theory of programming. Chapters 4 and 5, which rely on the structure of monoids of unambiguous relations, are an excellent illustration for a course in algebra. Similarly, Chapter 6 can be used as an adjunct to a course on probability theory. During the four years of preparation of this book, we have received help and collaboration from many people. It is a pleasure for us to express our thanks to them. First, several colleagues who offered to read preliminary versions of several chapters discovered errors and suggested improvements and complements. We are greatly indebted to A. de Luca, G. Hansel, J.-E. Pin, C. Reutenauer, and A. Restivo. We also appreciated the invaluable help of S. W. Margolis and P. E. Schupp.
Needless to say, we are greatly indebted to M.-P. Schützenberger. The project of writing this book stems from him and he has encouraged us constantly in many discussions. In addition, the typing of the manuscript, first in French then in English, involved the talents of Claudine Beaujean, Anne Berstel, Maryse Brochot, Zoé Civange, Martine Chaumard, Rosa De Marchi, and Arlette Dupont. Their help is gratefully acknowledged. Finally, this work was accomplished at the Laboratoire d'Informatique et Programmation, which provided moral and technical support.
CHAPTER 0

Preliminaries
0. INTRODUCTION
In this preliminary chapter, we give a short but complete account of some basic notions which will be used throughout the book. This chapter is not designed for a systematic reading but rather as a reference. The first three sections contain our notation and basic vocabulary. Each of the four subsequent sections is an introduction to a topic which is not completely treated in this book. These sections are concerned mainly with the theory of automata. Kleene's theorem is given and we show how to construct a minimal automaton from a given automaton. Syntactic monoids are defined. These concepts and results will be discussed in another context in Chapter 4.
1. NOTATION

As usual, N, Z, Q, R, and C denote the sets of nonnegative integers, integers, and rational, real, and complex numbers, respectively. By convention, 0 ∈ N. We set

R₊ = {x ∈ R | x ≥ 0}.
Next,

(n choose p) = n! / (p!(n − p)!)

denotes the binomial coefficient of n and p. Given two subsets X, Y of a set Z, we define

X − Y = {z ∈ Z | z ∈ X, z ∉ Y}.

An element x and the singleton set {x} will usually not be distinguished. The set of all subsets of a set X is denoted by 𝔓(X). The function symbols are usually written on the left of their arguments, but with some exceptions: when we consider the composition of actions on a set, the action is written on the right. In particular, permutations are written on the right. A partition of a set X is a family (X_i)_{i∈I} of nonempty subsets of X such that

(i) X = ⋃_{i∈I} X_i,
(ii) X_i ∩ X_j = ∅  (i ≠ j).

We usually define a partition as follows: "Let X = ⋃_{i∈I} X_i be a partition of X." We denote the cardinality of a set X by Card(X).

2. MONOIDS
A monoid is a set M equipped with an associative binary operation and a neutral element. The operation is usually written multiplicatively. The neutral element is unique and is denoted by 1_M, or simply by 1. For any monoid M, the set 𝔓(M) is given a monoid structure by defining, for X, Y ⊂ M,

XY = {xy | x ∈ X, y ∈ Y}.

The neutral element is {1}. A submonoid of M is a subset N which is stable under the operation and which contains the neutral element of M:

NN ⊂ N,  (2.1)
1_M ∈ N.  (2.2)

Note that a subset N of M satisfying (2.1) does not always satisfy 1_N = 1_M, and therefore may be a monoid without being a submonoid of M. A morphism from a monoid M into a monoid N is a function

φ: M → N

which satisfies, for all m, m′ ∈ M,

φ(mm′) = φ(m)φ(m′),

and furthermore φ(1_M) = 1_N.
A congruence on a monoid M is an equivalence relation θ on M such that, for all m, m′ ∈ M and u, v ∈ M,

m ≡ m′ mod θ  ⇒  umv ≡ um′v mod θ.

Let φ be a morphism from M onto N. The equivalence θ defined by m ≡ m′ mod θ iff φ(m) = φ(m′) is a congruence. It is called the nuclear congruence induced by φ. Conversely, if θ is a congruence on the monoid M, the set M/θ of the equivalence classes of θ is equipped with a monoid structure, and the canonical function from M onto M/θ is a monoid morphism. An idempotent of a monoid M is an element e of M such that e = e². For each idempotent e of a monoid M, the set

M(e) = eMe
is a monoid contained in M. It is easily seen that it is the largest monoid contained in M having e as a neutral element. It is called the monoid localized at e. It will sometimes be convenient to use the term semigroup to denote a set equipped with an associative operation but without the requirement of the existence of a neutral element. The notions of subsemigroup and semigroup morphism are then defined in the same way as the corresponding notions for monoids. Let M be a monoid. The set of (left and right) invertible elements of M is a group called the group of units of M. A cyclic monoid is a monoid with just one generator, i.e.,

M = {a^n | n ∈ N}

with a^0 = 1. If M is infinite, it is isomorphic to the additive monoid N of nonnegative integers. If M is finite, the index of M is the smallest integer i ≥ 0 such that there exists an integer r ≥ 1 with

a^{i+r} = a^i.  (2.3)

The smallest integer r such that (2.3) holds is called the period of M. The pair composed of index i and period p determines a monoid having i + p elements,

M_{i,p} = {1, a, a², ..., a^{i−1}, a^i, ..., a^{i+p−1}}.

Its multiplication is conveniently represented in Fig. 0.1.
Fig. 0.1 The monoid M_{i,p}.
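The exponent arithmetic behind M_{i,p} can be sketched in a few lines of Python (the names `cyclic_monoid`, `norm`, and `mul` are ours, not the book's): elements are represented by exponents 0, ..., i + p − 1 of the generator a, and products are folded back using relation (2.3).

```python
def cyclic_monoid(i, p):
    """The finite cyclic monoid M_{i,p} with index i and period p.
    An element a^m is represented by the exponent m in 0..i+p-1;
    exponents >= i+p are reduced using a^{i+p} = a^i."""
    def norm(m):
        return m if m < i + p else i + (m - i) % p
    def mul(m, n):
        return norm(m + n)
    return norm, mul

norm, mul = cyclic_monoid(i=3, p=4)
assert norm(3 + 4) == 3                 # relation (2.3): a^{i+p} = a^i
# the two idempotents: 1 = a^0 and a^j with j in {3,...,6} a multiple of 4
assert [m for m in range(3 + 4) if mul(m, m) == m] == [0, 4]
```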
The monoid M_{i,p} contains two idempotents (provided a ≠ 1). Indeed, assume that a^j = a^{2j}. Then either j = 0, or j ≥ i and j and 2j have the same residue mod p, hence j ≡ 0 mod p. Conversely, if j ≥ i and j ≡ 0 mod p, then a^j = a^{2j}. Consequently, the unique idempotent e ≠ 1 in M_{i,p} is e = a^j, where j is the unique integer in {i, i + 1, ..., i + p − 1} which is a multiple of p.
Let M be a monoid. For x, y ∈ M, we define

x⁻¹y = {z ∈ M | xz = y}

and

xy⁻¹ = {z ∈ M | x = zy}.

For subsets X, Y of M, this notation is extended to

X⁻¹Y = ⋃_{x∈X, y∈Y} x⁻¹y

and

XY⁻¹ = ⋃_{x∈X, y∈Y} xy⁻¹.

The following identities hold for subsets X, Y, Z of M:

(XY)⁻¹Z = Y⁻¹(X⁻¹Z)

and

X⁻¹(YZ⁻¹) = (X⁻¹Y)Z⁻¹.
Given a subset X of a monoid M, we define

F(X) = M⁻¹XM⁻¹

to be the set of factors of elements in X. We have

F(X) = {m ∈ M | ∃u, v ∈ M : umv ∈ X}.

We sometimes use the notation F̄(X) to denote the complement of F(X) in M,

F̄(X) = M − F(X).

3. WORDS
Let A be a set, which we call an alphabet. A word w on the alphabet A is a finite sequence of elements of A:

w = (a₁, a₂, ..., a_n),  a_i ∈ A.
The set of all words on the alphabet A is denoted by A* and is equipped with the associative operation defined by the concatenation of two sequences:

(a₁, ..., a_n)(b₁, ..., b_m) = (a₁, ..., a_n, b₁, ..., b_m).

This operation is associative. This allows us to write

w = a₁a₂⋯a_n

instead of w = (a₁, a₂, ..., a_n), by identifying each element a ∈ A with the sequence (a). An element a ∈ A is called a letter. The empty sequence is called the empty word and is denoted by 1. It is the neutral element for concatenation. Thus the set A* of words is equipped with the structure of a monoid. The monoid A* is called the free monoid on A. The set of nonempty words on A is denoted by A⁺. Thus we have A⁺ = A* − 1. The length |w| of the word w = a₁a₂⋯a_n with a_i ∈ A is the number n of letters in w. Clearly, |1| = 0. The function w ↦ |w| is a morphism from A* onto the additive monoid N. For n ≥ 0, we use the notation

A^(n) = {w ∈ A* | |w| ≤ n − 1}

and also

A^[n] = {w ∈ A* | |w| ≤ n}.

In particular, A^(0) = ∅ and A^[0] = {1}. For a subset B of A, we let |w|_B denote the number of letters of w which are in B. Thus

|w|_B = Σ_{b∈B} |w|_b.
For a word w ∈ A*, the set alph(w) = {a ∈ A | |w|_a > 0} is the set of all letters occurring at least once in w. For a subset X of A*, we set

alph(X) = ⋃_{x∈X} alph(x).

A word w ∈ A* is a factor of a word x ∈ A* if there exist u, v ∈ A* such that x = uwv. The relation is a factor of is a partial order on A*. A factor w of x is proper if w ≠ x. A word w ∈ A* is a left factor of a word x ∈ A* if there is a word u ∈ A* such that x = wu. The factor w is called proper if w ≠ x. The relation is a left factor of is again a partial order on A*, called the prefix ordering. We write w ≤ x when w is a left factor of x, and w < x whenever w ≤ x and w ≠ x. This order has the following fundamental property. If for some x, w ≤ x and w′ ≤ x,
then w and w′ are comparable, i.e., w ≤ w′ or w′ ≤ w.
In other words, if uw = u′w′, then either there exists s ∈ A* such that u = u′s (and also sw = w′), or there exists t ∈ A* such that u′ = ut (and then w = tw′). In an entirely symmetric manner we define a right factor w of a word x by x = uw for some u ∈ A*. A set P ⊂ A* is called prefix-closed if it contains the left factors of its elements: uv ∈ P ⇒ u ∈ P. A suffix-closed set is defined symmetrically. The reverse w̃ of a word w = a₁a₂⋯a_n, a_i ∈ A, is

w̃ = a_n⋯a₂a₁.

The notations w̃ and (w)~ are equivalent. Note that for all u, v ∈ A*,

(uv)~ = ṽũ.

The reverse X̃ of a set X ⊂ A* is the set X̃ = {x̃ | x ∈ X}.
Thereuersezof a s e t X c A * i s t h e s e t z = {XIXEX}. A factorization of a word w E A* is a sequence(ul,u 2 , ...,u,) of n 2 0 words in A* such that w = U1U2”‘U,. For a subset X of A*, we denote by X* the submonoid generated by X,
x* = {x1x2”’x,Jn 2 O,XiEX}. Similarly, we denote by X + the subsemigroup generated by X,
X + = { x l x z * * - x , I n2 l,xiEX}. We have .+=r*-1
ifl#X, otherwise.
X*
By definition, each word w in X* admits at least one factorization (x₁, x₂, ..., x_n) whose elements are all in X. Such a factorization is called an X-factorization. We frequently use the pictorial representation of an X-factorization given in Fig. 0.2. A word x ∈ A* is called primitive if it is not a power of another word. Thus x is primitive iff x = yⁿ with n ≥ 0 implies x = y. Observe that the empty word is not primitive.

Fig. 0.2 An X-factorization of w.
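For a finite set X, an X-factorization of a word can be found by a simple dynamic program over the prefixes of w. The sketch below is ours (it is not a construction from the book), and assumes X is given as a finite set of strings:

```python
def x_factorization(w, X):
    """Return one X-factorization of w as a list of words of X,
    or None if w is not in X*."""
    # back[i] = length of a last factor ending at position i, for each
    # prefix w[:i] lying in X*; back[0] = 0 marks the empty factorization.
    back = {0: 0}
    for i in range(1, len(w) + 1):
        for x in X:
            if i - len(x) in back and w[i - len(x):i] == x:
                back[i] = len(x)
                break
    if len(w) not in back:
        return None
    factors, i = [], len(w)
    while i > 0:
        factors.append(w[i - back[i]:i])
        i -= back[i]
    return factors[::-1]

assert x_factorization("abba", {"a", "ab", "ba"}) == ["ab", "ba"]
assert x_factorization("bb", {"a", "ab", "ba"}) is None
```

Note that a word may admit several X-factorizations; this procedure merely exhibits one of them, which is exactly the distinction that separates codes from arbitrary sets (Chapter 1).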
Fig. 0.3 Two conjugate words.
Two words x, y are called conjugate if there exist words u, v such that x = uv, y = vu (see Fig. 0.3). We frequently say that y is a conjugate of x. Two conjugate words are obtained from each other by a cyclic permutation. More precisely, let γ be the function from A* into itself defined by

γ(1) = 1 and γ(au) = ua  (3.1)

for a ∈ A, u ∈ A*. It is clearly a bijection from A* onto itself. Two words x and y are conjugate iff there exists an integer n ≥ 0 such that

y = γⁿ(x).

This easily implies that the conjugacy relation is an equivalence relation. A conjugacy class is a class of this equivalence relation.
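The permutation γ of (3.1) and the resulting conjugacy test translate directly into code (a sketch with our names, assuming words are Python strings):

```python
def gamma(w):
    """The cyclic permutation of (3.1): gamma(au) = ua, gamma(1) = 1."""
    return w[1:] + w[:1]

def conjugate(x, y):
    """x and y are conjugate iff y = gamma^n(x) for some n >= 0."""
    if len(x) != len(y):
        return False
    w = x
    for _ in range(len(x) + 1):   # gamma^{|x|}(x) = x, so |x| steps suffice
        if w == y:
            return True
        w = gamma(w)
    return False

assert conjugate("abc", "cab")
assert not conjugate("ab", "aa")
```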
PROPOSITION 3.1 Each nonempty word is a power of a unique primitive word.

Proof Let x ∈ A⁺ and let δ be the restriction of the function γ defined by (3.1) to the conjugacy class of x. Then δᵏ = 1 iff x is a power of some word of length k. Let p be the order of δ, i.e., the g.c.d. of the integers k such that δᵏ = 1. Since δᵖ = 1, there exists a word r of length p such that x = rᵉ with e ≥ 1. The word r is primitive; otherwise there would be a word s of length q dividing p such that r ∈ s*, which in turn implies that x ∈ s*, contrary to the definition of p. This proves the existence of the primitive word. To show unicity, consider a word t ∈ A⁺ such that x ∈ t* and let k = |t|. Since δᵏ = 1, the integer k is a multiple of p. Consequently t ∈ r*. Thus if t is primitive, we have t = r. □

Let x ∈ A⁺. The unique primitive word r such that x = rⁿ for some integer n is called the root of x. The integer n is the exponent of x. We sometimes write r = √x.
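The root and exponent can be computed by brute force over the divisors of |x| (our sketch; it sidesteps the proof's argument on the order of δ and simply tests each candidate period):

```python
def root_and_exponent(x):
    """Return (r, n) with x = r^n and r primitive (Proposition 3.1)."""
    assert x, "the empty word has no root"
    length = len(x)
    for p in range(1, length + 1):
        # the smallest p with p | length and x = (x[:p])^{length/p}
        # yields the primitive root
        if length % p == 0 and x[:p] * (length // p) == x:
            return x[:p], length // p

assert root_and_exponent("abab") == ("ab", 2)
assert root_and_exponent("aba") == ("aba", 1)   # "aba" is primitive
```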
PROPOSITION 3.2 Two nonempty conjugate words have the same exponent and their roots are conjugate.

Proof Let x, y ∈ A⁺ be two conjugate words, and let i be an integer such that y = γⁱ(x). Set r = √x and s = √y, and let x = rⁿ. Then y = γⁱ(rⁿ) = (γⁱ(r))ⁿ.
TABLE 0.1 The Number l_n(k) of Primitive Conjugacy Classes over a k-Letter Alphabet

  n      1   2   3   4   5    6    7    8    9   10   11   12
l_n(2)   2   1   2   3   6    9   18   30   56   99  186  335
l_n(3)   3   3   8  18  48  116
l_n(4)   4   6  20
This shows that γⁱ(r) ∈ s*. Interchanging the roles of x and y, we have γʲ(s) ∈ r* for some integer j. It follows that γⁱ(r) = s and γʲ(s) = r. Thus r and s are conjugate and consequently x and y have the same exponent. □
PROPOSITION 3.3 All words in a conjugacy class have the same exponent. If C is a conjugacy class of words of length n with exponent e, then

Card(C) = n/e.

Proof Let x ∈ Aⁿ and let C be its conjugacy class. Let δ be the restriction of γ to C and p be the order of δ. The root of x is the word r of length p such that x = rᵉ. Thus n = pe. Now C = {x, δ(x), ..., δ^{p−1}(x)}. These elements are distinct since p is the order of δ. Thus Card(C) = p = n/e. □

We now compute the number of conjugacy classes of words of given length over a finite alphabet. Let A be an alphabet with k letters. For all n ≥ 1, the number of conjugacy classes of primitive words in A* of length n is denoted by l_n(k). The notation is justified by the fact that this number depends only on k and not on A. The first values of this function, for k = 2, 3, 4, are given in Table 0.1. Clearly l₁(1) = 1, and l_n(1) = 0 otherwise. Now for n ≥ 1,
kⁿ = Σ_{d|n} d l_d(k),  (3.2)

where d runs over the divisors of n. Indeed, every word of length n belongs to exactly one conjugacy class of words of length n. Each class has d = n/e elements, where e is the exponent of its words. Since there are as many classes whose words have exponent n/e as there are classes of primitive words of length d = n/e, the result follows. □
We can obtain an explicit expression for the numbers l_n(k) by using the classical technique of Möbius inversion, which we now recall. The Möbius function is the function

μ: N − {0} → Z

defined by μ(1) = 1 and

μ(n) = (−1)^i if n is the product of i distinct prime numbers,
μ(n) = 0 otherwise.
PROPOSITION 3.4 (Möbius Inversion Formula) Let α, β be two functions from N − {0} into N. Then

β(n) = Σ_{d|n} α(d)  (n ≥ 1)  (3.3)

if and only if

α(n) = Σ_{d|n} μ(d) β(n/d)  (n ≥ 1).  (3.4)

Proof Define a product on the set ℱ of functions from N − {0} into N by setting, for f, g ∈ ℱ,

f * g(n) = Σ_{n=de} f(d)g(e).

It is easily verified that ℱ is a commutative monoid for this product. Its neutral element is the function 1 taking the value 1 for n = 1 and 0 elsewhere. Let ι ∈ ℱ be the constant function with value 1. Let us verify that

ι * μ = 1.  (3.5)

Indeed ι * μ(1) = 1; for n ≥ 2, let n = p₁^{ν₁} p₂^{ν₂} ⋯ p_m^{ν_m} be the prime decomposition of n. If d divides n, then μ(d) ≠ 0 iff

d = p₁^{l₁} p₂^{l₂} ⋯ p_m^{l_m}

with all l_i = 0 or 1. Then μ(d) = (−1)^t with t = Σ_{i=1}^{m} l_i. It follows that

ι * μ(n) = Σ_{d|n} μ(d) = Σ_{t=0}^{m} (m choose t)(−1)^t = (1 − 1)^m = 0.

Now let α, β ∈ ℱ. Formula (3.3) is equivalent to β = ι * α, and formula (3.4) is equivalent to α = μ * β. By (3.5) these two formulas are equivalent. □

PROPOSITION 3.5 The number of conjugacy classes of primitive words of length n over an alphabet with k letters is

l_n(k) = (1/n) Σ_{d|n} μ(n/d) k^d.

Proof This is immediate from formula (3.2) by Möbius inversion. □
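Propositions 3.4 and 3.5 are easy to check numerically. The sketch below (our code; μ is computed by trial division) reproduces the first row of Table 0.1 and verifies formula (3.2):

```python
def mobius(n):
    """mu(n): (-1)^i if n is a product of i distinct primes, 0 otherwise."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:      # a square factor: mu vanishes
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def l(n, k):
    """l_n(k) = (1/n) sum over d | n of mu(n/d) k^d (Proposition 3.5)."""
    return sum(mobius(n // d) * k ** d
               for d in range(1, n + 1) if n % d == 0) // n

# the first values of Table 0.1 for a binary alphabet
assert [l(n, 2) for n in range(1, 13)] == \
       [2, 1, 2, 3, 6, 9, 18, 30, 56, 99, 186, 335]
# formula (3.2): k^n = sum over d | n of d * l_d(k)
assert all(k ** n == sum(d * l(d, k) for d in range(1, n + 1) if n % d == 0)
           for k in (2, 3, 4) for n in range(1, 11))
```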
A word w ∈ A⁺ is called unbordered if no proper nonempty left factor of w is a right factor of w. In other words, w is unbordered iff

w ∈ uA⁺ ∩ A⁺u implies u = 1.
Fig. 0.4 The word uab^{|u|}.
If w is unbordered, then wA* ∩ A*w = wA*w ∪ w. The following property holds.
PROPOSITION 3.6 Let A be an alphabet with at least two letters. For each word u ∈ A⁺, there exists v ∈ A* such that uv is unbordered.

Proof Let a be the first letter of u, and let b ∈ A − a. Let us verify that the word w = uab^{|u|} is unbordered (Fig. 0.4). A nonempty left factor t of w starts with the letter a. It cannot be a right factor of w unless |t| > |u|. But then we have t = sab^{|u|} for some s ∈ A*, and also t = uab^{|s|}. Thus |s| = |u|, hence t = w. □

Let A be an alphabet. The free group A° on A is defined as follows. Let Ā be an alphabet in bijection with A and disjoint from A. Denote by

a ↦ ā

the bijection from A onto Ā. This notation is extended by setting, for all a ∈ A ∪ Ā,

ā̄ = a.

Let δ be the symmetric relation defined for u, v ∈ (A ∪ Ā)* and a ∈ A ∪ Ā by

uaāv ≡ uv mod δ.

Let ρ be the reflexive and transitive closure of δ. Then ρ is a congruence. The quotient

A° = (A ∪ Ā)*/ρ

is a group. Indeed, for all a ∈ A ∪ Ā,

aā ≡ 1 mod ρ.

Thus the images of the generators are invertible in A°. The group A° is called the free group on A.

4. AUTOMATA
Let A be an alphabet. An automaton over A is composed of a set Q (the set of states), a subset I of Q (the initial states), a subset T of Q (the terminal or final states), and a set

ℱ ⊂ Q × A × Q
called the set of edges. The automaton is denoted by

𝒜 = (Q, I, T).

The automaton is finite when the set Q is finite. A path in the automaton 𝒜 is a sequence c = (f₁, f₂, ..., f_n) of consecutive edges

f_i = (q_i, a_i, q_{i+1}),  1 ≤ i ≤ n.

The integer n is called the length of the path c. The word w = a₁a₂⋯a_n is the label of the path c. The state q₁ is the origin of c, and the state q_{n+1} the end of c. A useful notation is

c: q₁ →^w q_{n+1}.

By convention, there is, for each state q ∈ Q, a path of length 0 from q to q. Its label is the empty word. A path c: i → t is successful if i ∈ I and t ∈ T. The set recognized by 𝒜, denoted by L(𝒜), is defined as the set of labels of successful paths. A state q ∈ Q is accessible (resp. coaccessible) if there exists a path c: i → q with i ∈ I (resp. a path c: q → t with t ∈ T). An automaton is trim if each state is both accessible and coaccessible. Let P be the set of accessible and coaccessible states, and let 𝒜₀ = (P, I ∩ P, T ∩ P). Then it is easy to see that 𝒜₀ is trim and L(𝒜) = L(𝒜₀). An automaton 𝒜 = (Q, I, T) is deterministic if Card(I) = 1 and if

(p, a, q), (p, a, r) ∈ ℱ  ⇒  q = r.
Thus for each p ∈ Q and a ∈ A, there is at most one state q in Q such that p →^a q. For p ∈ Q and a ∈ A, define p · a = q if (p, a, q) ∈ ℱ. The partial function

Q × A → Q

defined in this way is extended to words by setting, for all p ∈ Q,

p · 1 = p,

and, for w ∈ A* and a ∈ A,

p · wa = (p · w) · a.

This function is called the transition function or next-state function of 𝒜. With this notation, we have, with I = {i},

L(𝒜) = {w ∈ A* | i · w ∈ T}.
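A deterministic automaton and its next-state function can be sketched as a dictionary of edges. The example below is ours, not the book's: over A = {a, b}, with states {0, 1}, initial state 0 and terminal set T = {1}, it recognizes the words ending with the letter b.

```python
# edges of a deterministic automaton: (state, letter) -> state
edges = {(0, 'a'): 0, (0, 'b'): 1, (1, 'a'): 0, (1, 'b'): 1}

def dot(p, w):
    """The next-state function p . w extended to words; None if undefined."""
    for a in w:
        if (p, a) not in edges:
            return None
        p = edges[(p, a)]
    return p

def recognized(w):
    # L(A) = {w in A* | i . w in T}, here with i = 0 and T = {1}
    return dot(0, w) in {1}

assert recognized("abb") and not recognized("ba") and not recognized("")
```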
An automaton is complete if for all p ∈ Q, a ∈ A, there exists at least one q ∈ Q such that p →^a q.
PROPOSITION 4.1 For each automaton 𝒜, there exists a complete deterministic automaton ℬ such that L(𝒜) = L(ℬ). If 𝒜 is finite, then ℬ can be chosen to be finite.

Proof Set 𝒜 = (Q, I, T). Define ℬ = (R, u, V) by setting

R = 𝔓(Q),  u = I,  V = {S ⊂ Q | S ∩ T ≠ ∅}.

Define the transition function of ℬ, for S ∈ R, a ∈ A, by

S · a = {q ∈ Q | ∃s ∈ S : s →^a q}.
The automaton ℬ is complete and deterministic. It is easily seen that L(𝒜) = L(ℬ). □

Let 𝒜 = (Q, i, T) be a deterministic automaton. For each q ∈ Q, let

L_q = {w ∈ A* | q · w ∈ T}.

Two states p, q ∈ Q are called inseparable if L_p = L_q; they are separable otherwise. A deterministic automaton is reduced if two distinct states are always separable.
Let X be a subset of A*. We define a special automaton 𝒜(X) = (Q, i, T) in the following way. The states of 𝒜(X) are the nonempty sets
u-'x for u E A*. The initial state is X = l-'X, and the final states are those containing the empty word. The transition function is defined for a state Y = u- ' X and a letter a E A by Yea = a-'X
Observe that this indeed defines a partial function. We have
L(d(X)) = x. Indeed, an easy induction shows that for w E A*
x
w = w-'x.
Consequently
WEL(d(X))0 1EX.W
0
lEw-'X
0
WEX.
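For a finite set X, the automaton built from the nonempty left quotients u⁻¹X can be computed directly, as in this sketch (the encoding and the example set X are ours):

```python
# Sketch: the minimal automaton of a finite set X, with states the
# nonempty left quotients u^{-1}X, exactly as described in the text.

def quotient(X, a):
    """a^{-1}X for a letter a."""
    return frozenset(w[1:] for w in X if w.startswith(a))

def minimal_automaton(X, alphabet):
    X = frozenset(X)
    states, todo, delta = {X}, [X], {}
    while todo:
        Y = todo.pop()
        for a in alphabet:
            Z = quotient(Y, a)
            if Z:                      # only nonempty quotients are states
                delta[(Y, a)] = Z
                if Z not in states:
                    states.add(Z)
                    todo.append(Z)
    finals = {Y for Y in states if "" in Y}
    return X, states, delta, finals

X0, states, delta, finals = minimal_automaton({"ab", "aab", "bb"}, "ab")
```

Here the four states are X itself, a⁻¹X = {b, ab}, b⁻¹X = {b}, and {1}, the last being the unique final state.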
The automaton 𝒜(X) is reduced. Indeed, for Y = u⁻¹X,

L_Y = {v ∈ A* | Y · v ∈ T} = {v ∈ A* | uv ∈ X}.

Thus L_Y = Y. The automaton 𝒜(X) is called the minimal automaton of X. This terminology is justified by the following proposition.
PROPOSITION 4.2 Let 𝒜 = (Q, i, T) be a trim deterministic automaton and let X = L(𝒜). Let 𝒜(X) = (P, j, S) be the minimal automaton of X. The function φ from Q into P defined by

φ: q ↦ L_q

is surjective and satisfies

φ(q · a) = φ(q) · a.

Proof Let q ∈ Q and let u ∈ A* be such that i · u = q. Then

L_q = {w ∈ A* | q · w ∈ T} = u⁻¹X.

Since 𝒜 is trim, L_q ≠ ∅. This shows that L_q ∈ P. Thus φ is a function from Q into P. Next, let us show that φ is surjective. Let u⁻¹X ∈ P. Then u⁻¹X ≠ ∅, so that i · u is defined. Setting q = i · u, we have L_q = u⁻¹X = φ(q). Thus φ is surjective. Finally, for q = i · u,

φ(q · a) = L_{q·a} = (ua)⁻¹X = (u⁻¹X) · a = L_q · a = φ(q) · a. □
Assume furthermore that the automaton 𝒜 in the proposition is reduced. Then the function φ is a bijection, which identifies 𝒜 with the minimal automaton. In this sense, there exists just one reduced automaton recognizing a given set.

Let 𝒜 = (Q, i, T) be a deterministic automaton. Consider the set ℱ of partial functions from Q into Q. These functions are written on the right: if q ∈ Q and m ∈ ℱ, then the image of q by m is denoted by qm. Composition is defined by

q(mn) = (qm)n.

Thus ℱ has a monoid structure. Let φ be the function which to a word w ∈ A* associates the partial function φ_w from Q into Q defined by

qφ_w = q · w.
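A sketch (ours) of the transition monoid φ(A*): partial functions from Q into Q are stored as tuples, with None marking an undefined image, and the monoid is generated by the functions φ_a for the letters a. The example automaton is an assumption for the sketch:

```python
# Sketch: computing the transition monoid of a deterministic automaton
# whose states are numbered 0..n-1 (so a state equals its index).

def transition_monoid(Q, delta, alphabet):
    Q = sorted(Q)
    n = len(Q)
    def fn(a):
        return tuple(delta.get((q, a)) for q in Q)
    def compose(f, g):                 # functions act on the right: f then g
        return tuple(None if f[i] is None else g[f[i]] for i in range(n))
    identity = tuple(Q)
    monoid, todo = {identity}, [identity]
    gens = [fn(a) for a in alphabet]
    while todo:
        f = todo.pop()
        for g in gens:
            h = compose(f, g)
            if h not in monoid:
                monoid.add(h)
                todo.append(h)
    return monoid

# Complete automaton on Q = {0, 1}: 'a' swaps the two states, 'b' fixes them.
delta = {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 1}
M = transition_monoid({0, 1}, delta, 'ab')
```

For this automaton the transition monoid has exactly two elements, the identity and the swap.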
The function φ is a morphism from A* into the monoid ℱ. The submonoid φ(A*) of ℱ is called the transition monoid of the automaton 𝒜. Observe that, setting X = L(𝒜), we have

φ⁻¹φ(X) = X.  (4.1)

Indeed, w ∈ φ⁻¹φ(X) iff φ_w ∈ φ(X), which is equivalent to iφ_w ∈ T, i.e., to w ∈ X.

A morphism φ from A* onto a monoid M is said to recognize a subset X of A* if

φ⁻¹φ(X) = X.

Let X be a subset of A*. For w ∈ A*, set

Γ(w) = {(u, v) ∈ A* × A* | uwv ∈ X}.

The syntactic congruence of X is the equivalence relation σ_X on A* defined by

w ≡ w' mod σ_X  ⟺  Γ(w) = Γ(w').

It is easily verified that σ_X is a congruence. The quotient of A* by σ_X is, by definition, the syntactic monoid of X. We denote it by A(X), and we denote by φ_X the canonical morphism from A* onto A(X).
PROPOSITION 4.3 Let X be a subset of A*, and let φ: A* → M be a surjective morphism. If φ recognizes X, then there exists a morphism ψ from M onto the syntactic monoid A(X) such that φ_X = ψ ∘ φ.

Proof It suffices to show that

φ(w) = φ(w')  ⟹  φ_X(w) = φ_X(w').  (4.2)

Indeed, if (4.2) holds, then for an element m ∈ M, ψ(m) is defined as the unique element of φ_X(φ⁻¹(m)). To show (4.2), we consider (u, v) ∈ Γ(w). Then uwv ∈ X. Thus φ(u)φ(w)φ(v) ∈ φ(X). From φ(w) = φ(w'), it follows that φ(u)φ(w')φ(v) ∈ φ(X). Since φ recognizes X, this implies that uw'v ∈ X. Thus (u, v) ∈ Γ(w'). □

PROPOSITION 4.4 Let X be a subset of A*. The syntactic monoid of X is isomorphic to the transition monoid of the minimal automaton 𝒜(X).

Proof Let M be the transition monoid of the automaton 𝒜(X) = (Q, i, T) and let φ: A* → M be the canonical morphism. By (4.1), the morphism φ recognizes X. By Proposition 4.3, there exists a morphism ψ from M onto the syntactic monoid A(X) such that φ_X = ψ ∘ φ.
It suffices to show that ψ is injective. For this, consider m, m' ∈ M such that ψ(m) = ψ(m'). Let w, w' ∈ A* be such that φ(w) = m, φ(w') = m'. Then φ_X(w) = φ_X(w'). To prove that φ(w) = φ(w'), we consider a state p ∈ Q, and let u ∈ A* be such that p = u⁻¹X. Then

pφ(w) = p · w = (uw)⁻¹X = {v ∈ A* | (u, v) ∈ Γ(w)}.

Since Γ(w) = Γ(w'), we have pφ(w) = pφ(w'). Thus φ(w) = φ(w'), i.e., m = m'. □

We now turn to the study of properties which are specific to finite automata.
THEOREM 4.5 Let X ⊂ A*. The following conditions are equivalent.

(i) The set X is recognized by a finite automaton.
(ii) The minimal automaton 𝒜(X) is finite.
(iii) The family of sets u⁻¹X, for u ∈ A*, is finite.
(iv) The syntactic monoid A(X) is finite.
(v) The set X is recognized by a morphism from A* onto a finite monoid.

Proof (i) ⟹ (ii) Let 𝒜 be a finite automaton recognizing X. By Proposition 4.1, we can assume that 𝒜 is deterministic. By Proposition 4.2, the minimal automaton 𝒜(X) also is finite.

(ii) ⟺ (iii) is clear. (ii) ⟹ (iv) holds by Proposition 4.4 and by the fact that the transition monoid of a finite automaton is always finite. (iv) ⟹ (v) is clear. (v) ⟹ (i) Let φ: A* → M be a morphism onto a finite monoid M, and suppose that φ recognizes X. Let 𝒜 = (M, 1, φ(X)) be the deterministic automaton with transition function defined by

m · a = mφ(a).

Then 1 · w ∈ φ(X) iff φ(w) ∈ φ(X), thus iff w ∈ X. Consequently L(𝒜) = X. □
A subset X of A* is called recognizable if it satisfies one of the equivalent conditions of Theorem 4.5.
PROPOSITION 4.6 The family of recognizable subsets of A* is closed under all boolean operations: union, intersection, complementation.

Proof Let X, Y ⊂ A* be two recognizable subsets of A*. Let 𝒜 = (P, i, S) and ℬ = (Q, j, T) be complete deterministic automata such that X = L(𝒜),
Y = L(ℬ). Let 𝒞 = (P × Q, (i, j), R) be the complete deterministic automaton defined by

(p, q) · a = (p · a, q · a).

For R = (S × Q) ∪ (P × T), we have L(𝒞) = X ∪ Y. For R = S × T, we have L(𝒞) = X ∩ Y. Finally, for R = S × (Q − T), we have L(𝒞) = X − Y. □

Consider now a slight generalization of the notion of automaton. An asynchronous automaton on A is an automaton 𝒜 = (Q, I, T), the edges of which may be labeled by either a letter or the empty word. Therefore the set of its edges satisfies ℰ ⊂ Q × (A ∪ 1) × Q. The notions of a path or a successful path extend in a natural way, so that the notion of the set recognized by the automaton is clear.
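Returning to Proposition 4.6, the product automaton and the choice of terminal sets can be sketched as follows; the two example automata (even number of a's, even number of b's) are our own:

```python
# Sketch: the product of two complete deterministic automata, with the
# terminal set R chosen according to the desired boolean operation.

def product(delta1, delta2, states1, states2, alphabet):
    delta = {}
    for p in states1:
        for q in states2:
            for a in alphabet:
                delta[((p, q), a)] = (delta1[(p, a)], delta2[(q, a)])
    return delta

def run(delta, state, w):
    for a in w:
        state = delta[(state, a)]
    return state

# X = words with an even number of a's, Y = words with an even number of b's.
d1 = {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 1}
d2 = {(0, 'b'): 1, (1, 'b'): 0, (0, 'a'): 0, (1, 'a'): 1}
dp = product(d1, d2, {0, 1}, {0, 1}, 'ab')

def in_intersection(w):                # R = S x T with S = {0}, T = {0}
    return run(dp, (0, 0), w) == (0, 0)
```

Choosing R = (S × Q) ∪ (P × T) instead would test membership in the union, with the same product transitions.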
PROPOSITION 4.7 For any finite asynchronous automaton 𝒜, there exists a finite automaton ℬ such that L(𝒜) = L(ℬ).

Proof Let 𝒜 = (Q, I, T) be an asynchronous automaton. Let ℬ = (Q, I, T) be the automaton, the edges of which are the triples (p, a, q) with a ∈ A such that there exists a path from p to q labeled a in 𝒜. We have

L(𝒜) ∩ A⁺ = L(ℬ) ∩ A⁺.

If I ∩ T ≠ ∅, both sets L(𝒜) and L(ℬ) contain the empty word and are therefore equal. Otherwise, the sets are equal up to the empty word and the result follows from Proposition 4.6 since the set {1} is recognizable. □

The notion of an asynchronous automaton is useful to prove the following result.
PROPOSITION 4.8 If X ⊂ A* is recognizable, then X* is recognizable. If X, Y ⊂ A* are recognizable, then XY is recognizable.

Proof Since X* = (X − 1)* and since X − 1 is recognizable, we may suppose that 1 ∉ X. Let 𝒜 = (Q, I, T) be a finite automaton recognizing X. Let ℰ be the set of its edges. Since 1 ∉ X, we have I ∩ T = ∅. Let ℬ = (Q, I, T) be the asynchronous automaton with edges

ℰ ∪ (T × {1} × I).

Then L(ℬ) = X⁺, and consequently X* = 1 ∪ X⁺ is recognizable. In fact, the inclusion X⁺ ⊂ L(ℬ) is clear. Conversely, let c: i → j be a successful path in ℬ with label w. By the definition of ℬ, this path has the
form of a concatenation of paths c₁: i₁ → t₁ labeled w₁, c₂: i₂ → t₂ labeled w₂, ..., cₙ: iₙ → tₙ labeled wₙ, linked by edges (tₖ, 1, iₖ₊₁), with i = i₁, j = tₙ, and where no path cₖ contains an edge labeled by the empty word. Then w₁, w₂, ..., wₙ ∈ X and therefore w = w₁w₂⋯wₙ ∈ X⁺.

Now let 𝒜 = (P, I, S) and ℬ = (Q, J, T) be two finite automata with sets of edges ℰ and ℱ, respectively. Let X = L(𝒜) and let Y = L(ℬ). One may assume that P ∩ Q = ∅. Let 𝒞 = (P ∪ Q, I, T) be the asynchronous automaton with edges

ℰ ∪ ℱ ∪ (S × {1} × J).

Then L(𝒞) = XY, as we may easily check. □

We shall now give another characterization of recognizable subsets of A*. Let M be a monoid. The family of rational subsets of M is the smallest family ℛ of subsets of M such that

(i) any finite subset of M is in ℛ;
(ii) if X, Y ∈ ℛ, then X ∪ Y ∈ ℛ and XY ∈ ℛ;
(iii) if X ∈ ℛ, then X* ∈ ℛ.
THEOREM 4.9 Let A be a finite alphabet. A subset of A* is recognizable iff it is rational.

Proof Denote by Rec(A*) the family of recognizable subsets of A* and by Rat(A*) that of rational subsets of A*. Let us first prove the inclusion Rat(A*) ⊂ Rec(A*). In fact, any finite subset X of A* is clearly recognizable. Moreover, Propositions 4.6 and 4.8 show that the family Rec(A*) satisfies conditions (ii) and (iii) of the definition of Rat(A*). This proves the inclusion.

To show that Rec(A*) ⊂ Rat(A*), let us consider a recognizable subset X of A*. Let 𝒜 = (Q, I, T) be a finite automaton recognizing X. Set Q = {1, 2, ..., n} and, for 1 ≤ i, j ≤ n,

X_{i,j} = {w ∈ A* | there is a path from i to j labeled w}.

We have

X = ⋃_{i∈I} ⋃_{j∈T} X_{i,j}.

It is therefore enough to prove that each X_{i,j} is rational. For k ∈ {0, 1, ..., n}, denote by X^(k)_{i,j} the set of those w ∈ A* such that there exists a path c: i → j labeled w passing only through states ≤ k except perhaps for i, j. In other words, we have w ∈ X^(k)_{i,j} iff w = a₁a₂⋯aₘ with

c: i → i₁ → i₂ → ⋯ → i_{m−1} → j

labeled a₁, a₂, ..., aₘ, and i₁ ≤ k, ..., i_{m−1} ≤ k. We have the formulas

X^(0)_{i,j} ⊂ A ∪ 1,  (4.3)
X_{i,j} = X^(n)_{i,j},  (4.4)
X^(k+1)_{i,j} = X^(k)_{i,j} ∪ X^(k)_{i,k+1} (X^(k)_{k+1,k+1})* X^(k)_{k+1,j}  (0 ≤ k < n).  (4.5)

Since A is finite, X^(0)_{i,j} ∈ Rat(A*) by (4.3). Then (4.5) shows by induction on k ≥ 0 that X^(k)_{i,j} ∈ Rat(A*). Therefore X_{i,j} ∈ Rat(A*) by (4.4). □
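The recursion (4.3)-(4.5) can be exercised on finite approximations of the languages X^(k)_{i,j}, truncating all words to a fixed maximum length so that every set stays finite. The truncation and the two-state example automaton are our own devices for illustration:

```python
# Sketch: formulas (4.3)-(4.5) evaluated on word sets truncated to a fixed
# maximum length, so that each X^(k)_{i,j} is a finite set of strings.

MAXLEN = 4

def cat(U, V):
    return {u + v for u in U for v in V if len(u + v) <= MAXLEN}

def star(U):
    result, layer = {""}, {""}
    while True:
        layer = cat(layer, U) - result
        if not layer:
            return result
        result |= layer

def kleene(edges, n):
    # X^(0)_{i,j}: the direct edge labels, plus the empty word when i == j.
    X = {(i, j): {a for (p, a, q) in edges if (p, q) == (i, j)}
         for i in range(1, n + 1) for j in range(1, n + 1)}
    for i in range(1, n + 1):
        X[(i, i)] = X[(i, i)] | {""}
    for k in range(1, n + 1):          # allow intermediate states <= k
        X = {(i, j): X[(i, j)] | cat(X[(i, k)], cat(star(X[(k, k)]), X[(k, j)]))
             for (i, j) in X}
    return X

# Two-state automaton with edges 1 --a--> 2 and 2 --b--> 2.
X = kleene([(1, 'a', 2), (2, 'b', 2)], 2)
```

Within the length bound, X_{1,2} comes out as the words a, ab, abb, abbb, matching the rational expression a b*.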
5. IDEALS IN A MONOID
Let M be a monoid. A right ideal of M is a nonempty subset R of M such that RM ⊂ R, or equivalently such that for all r ∈ R and all m ∈ M, we have rm ∈ R. Since M is a monoid, we then have RM = R because M contains a neutral element. A left ideal of M is a nonempty subset L of M such that ML ⊂ L. A two-sided ideal (also called an ideal) is a nonempty subset I of M such that

MIM ⊂ I.

A two-sided ideal is therefore both a left and a right ideal. In particular, M itself is an ideal of M. An element 0 of M is a zero if 0 ≠ 1 and for all m ∈ M,

0m = m0 = 0.

If M contains a zero, it is unique, and the set {0} is a two-sided ideal which is contained in any ideal of M. An ideal I (resp., a left, right ideal) is called minimal if for any ideal J (resp., left, right ideal),

J ⊂ I  ⟹  J = I.

If M contains a minimal two-sided ideal, it is unique because any nonempty intersection of ideals is again an ideal. If M contains a 0, the set {0} is the minimal two-sided ideal of M. An ideal I ≠ {0} (resp., a left, right ideal) is then called 0-minimal if for any ideal J (resp., left, right ideal),

J ⊂ I  ⟹  J = {0} or J = I.

For any m ∈ M, the set

R = mM
is a right ideal. It is the smallest right ideal containing m. In the same way, the set

L = Mm

is the smallest left ideal containing m, and the set

I = MmM

is the smallest two-sided ideal containing m. We now define in a monoid M four equivalence relations ℛ, ℒ, 𝒥, and ℋ as follows:

m ℛ m'  iff  mM = m'M,
m ℒ m'  iff  Mm = Mm',
m 𝒥 m'  iff  MmM = Mm'M,
m ℋ m'  iff  mM = m'M and Mm = Mm'.

Therefore, we have for instance m ℛ m' iff there exist u, u' ∈ M such that

m' = mu,  m = m'u'.

We have ℛ ⊂ 𝒥, ℒ ⊂ 𝒥, and ℋ = ℛ ∩ ℒ.
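A sketch (ours) computing the relations ℛ, ℒ, 𝒥, ℋ in a small finite monoid given by its multiplication table; the three-element example monoid {1, a, 0} with a·a = 0 is hypothetical:

```python
# Sketch: Green's relations in a finite monoid, each computed by comparing
# the principal right, left and two-sided ideals of the elements.

def green(elements, mul):
    right_ideal = {m: frozenset(mul(m, x) for x in elements) for m in elements}
    left_ideal = {m: frozenset(mul(x, m) for x in elements) for m in elements}
    two_sided = {m: frozenset(mul(mul(x, m), y)
                              for x in elements for y in elements)
                 for m in elements}
    R = {(m, n) for m in elements for n in elements
         if right_ideal[m] == right_ideal[n]}
    L = {(m, n) for m in elements for n in elements
         if left_ideal[m] == left_ideal[n]}
    J = {(m, n) for m in elements for n in elements
         if two_sided[m] == two_sided[n]}
    H = R & L                          # H is R intersected with L
    return R, L, J, H

# Monoid {1, a, 0} with a*a = 0 and 0 a zero.
table = {('1', '1'): '1', ('1', 'a'): 'a', ('1', '0'): '0',
         ('a', '1'): 'a', ('a', 'a'): '0', ('a', '0'): '0',
         ('0', '1'): '0', ('0', 'a'): '0', ('0', '0'): '0'}
R, L, J, H = green(['1', 'a', '0'], lambda x, y: table[(x, y)])
```

In this example all principal ideals are distinct, so every class of every relation is a singleton; the inclusion ℛ ⊂ 𝒥 can be observed directly.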
PROPOSITION 5.1 The two equivalences ℛ and ℒ commute:

ℛℒ = ℒℛ.

Proof Let m, n ∈ M be such that m ℛℒ n. There exists p ∈ M such that m ℛ p and p ℒ n (see Fig. 0.5). There exist, by the definitions, u, u', v, v' ∈ M such that

p = mu,  m = pu',  n = vp,  p = v'n.

Fig. 0.5 ℛℒ = ℒℛ.
Set q = vm. We then have

q = vm = v(pu') = (vp)u' = nu',
n = vp = v(mu) = (vm)u = qu.

This shows that q ℛ n. Furthermore, we have

m = pu' = (v'n)u' = v'(nu') = v'q.

Since q = vm by the definition of q, we obtain m ℒ q. Therefore m ℒ q ℛ n and consequently m ℒℛ n. This proves the inclusion ℛℒ ⊂ ℒℛ. The proof of the converse inclusion is symmetrical. □

Since ℛ and ℒ commute, the relation

𝒟 = ℛℒ = ℒℛ

is an equivalence relation. We have the inclusions ℋ ⊂ ℛ, ℒ ⊂ 𝒟 ⊂ 𝒥. The classes of the relation 𝒟, called 𝒟-classes, can be represented by a schema called an "egg-box" as in Fig. 0.6. The ℛ-classes are figured by rows and the ℒ-classes by columns. The squares at the intersection of an ℛ-class and an ℒ-class are the ℋ-classes. We denote by L(m), R(m), D(m), H(m), respectively, the ℒ, ℛ, 𝒟, and ℋ classes of an element m ∈ M. We have

H(m) = R(m) ∩ L(m)  and  R(m), L(m) ⊂ D(m).
PROPOSITION 5.2 Let M be a monoid. Let m, m' ∈ M be ℛ-equivalent. Let u, u' ∈ M be such that

m' = mu,  m = m'u'.

Fig. 0.6 A 𝒟-class.
The mappings

ρ_u: q ↦ qu,  ρ_{u'}: q' ↦ q'u'

are bijections from L(m) onto L(m') and from L(m') onto L(m), inverse to each other. Moreover, they preserve ℛ-classes.

Proof We first verify that ρ_u maps L(m) into L(m'). If q ∈ L(m), then Mq = Mm and therefore Mqu = Mmu = Mm'. Hence qu = ρ_u(q) is in L(m'). Analogously, ρ_{u'} maps L(m') into L(m). Let q ∈ L(m) and compute ρ_{u'}(ρ_u(q)). Since q ℒ m, there exist v, v' ∈ M such that q = vm, m = v'q (see Fig. 0.7). We have

ρ_{u'}(ρ_u(q)) = quu' = (vm)uu' = v(mu)u' = vm'u' = vm = q.

This shows that ρ_{u'} ∘ ρ_u is the identity on L(m). We show in the same way that ρ_u ∘ ρ_{u'} is the identity on L(m'). Finally, since quu' = q for all q ∈ L(m), the elements q and ρ_u(q) are in the same ℛ-class. □

Proposition 5.2 has the following consequence, which justifies the regular shape of Fig. 0.6.
PROPOSITION 5.3 Any two ℋ-classes contained in the same 𝒟-class have the same cardinality. □

We now come to the point of locating the idempotents in an ideal. The first result describes the ℋ-class of an idempotent.

Fig. 0.7 The reciprocal bijections.
PROPOSITION 5.4 Let M be a monoid and let e ∈ M be an idempotent. The ℋ-class of e is the group of units of the monoid M(e) = eMe.

Proof Let m ∈ H(e). Then we have, for some u, u', v, v' ∈ M,

e = mu,  m = eu',  e = vm,  m = v'e.

Therefore em = e(eu') = eu' = m, and in the same way me = m. This shows that m ∈ M(e). Since

m(eue) = mue = e,  (eve)m = evm = e,

the element m is both right and left invertible in M(e). Hence m belongs to the group of units of M(e). Conversely, if m ∈ M(e) is right and left invertible, we have mu = um = e for some u ∈ M(e). Since m = em = me, we obtain m ℋ e. □
PROPOSITION 5.5 An ℋ-class of a monoid M is a group iff it contains an idempotent.

Proof Let H be an ℋ-class of M. If H contains an idempotent e, then H = H(e) is a group by Proposition 5.4. The converse is obvious. □

PROPOSITION 5.6 Let M be a monoid and m, n ∈ M. Then mn ∈ R(m) ∩ L(n) iff R(n) ∩ L(m) contains an idempotent.

Proof If R(n) ∩ L(m) contains an idempotent e, then

e = nu,  n = eu',  e = vm,  m = v'e

for some u, u', v, v' ∈ M. Hence

mnu = m(nu) = me = (v'e)e = v'e = m,

so that mn ℛ m. We show in the same way that mn ℒ n. Hence mn ∈ R(m) ∩ L(n). Conversely, if mn ∈ R(m) ∩ L(n), then m ℛ mn and n ℒ mn. By Proposition 5.2, the multiplication on the right by n is a bijection from L(m) onto L(mn). Since n ∈ L(mn), this implies the existence of e ∈ L(m) such that en = n. Since right multiplication by n preserves ℛ-classes, we have additionally e ∈ R(n). Hence there exists u ∈ M such that e = nu. Hence

nunu = enu = nu,

and e = nu is an idempotent in R(n) ∩ L(m). □
PROPOSITION 5.7 Let M be a monoid and let D be a 𝒟-class of M. The following conditions are equivalent.

(i) D contains an idempotent.
(ii) Each ℛ-class of D contains an idempotent.
(iii) Each ℒ-class of D contains an idempotent.

Proof Obviously, only (i) ⟹ (ii) requires a proof. Let e ∈ D be an idempotent. Let R be an ℛ-class of D and let n ∈ L(e) ∩ R (see Fig. 0.8). Since n ℒ e, there exist v, v' ∈ M such that

n = ve,  e = v'n.

Let m = ev'. Then mn = e because

mn = (ev')n = e(v'n) = ee = e.

Moreover, we have m ℛ e since mn = e and m = ev'. Therefore, e = mn is in R(m) ∩ L(n). This implies, by Proposition 5.6, that R contains an idempotent. □

A 𝒟-class D satisfying one of the conditions of Proposition 5.7 is called regular.
PROPOSITION 5.8 Let M be a monoid and let H be an ℋ-class of M. The two following conditions are equivalent.

(i) There exist h, h' ∈ H such that hh' ∈ H.
(ii) H is a group.

Proof (i) ⟹ (ii) If hh' ∈ H, then by Proposition 5.6, H contains an idempotent. By Proposition 5.5, it is a group. (ii) ⟹ (i) is obvious. □

We now study the minimal and 0-minimal ideals in a monoid. Recall that if M contains a minimal ideal, it is unique. However, it may contain several 0-minimal ideals.

Fig. 0.8 Finding an idempotent in R.
Let M be a monoid containing a zero. We say that M is prime if for any m, n ∈ M − 0 there exists u ∈ M such that mun ≠ 0.

PROPOSITION 5.9 Let M be a prime monoid.

1. If M contains a 0-minimal ideal, it is unique.
2. If M contains a 0-minimal right (resp., left) ideal, then M contains a 0-minimal ideal; this ideal is the union of all 0-minimal right (resp., left) ideals of M.
3. If M contains both a 0-minimal right ideal and a 0-minimal left ideal, its 0-minimal ideal is composed of a regular 𝒟-class and zero.

Proof 1. Let I, J be two 0-minimal ideals of M. Let m ∈ I − 0 and let n ∈ J − 0. Since M is prime, there exists u ∈ M such that mun ≠ 0. Then mun ∈ I ∩ J implies I ∩ J ≠ 0. Since I ∩ J is an ideal, we obtain I ∩ J = I = J.

2. Let R be a 0-minimal right ideal. We first show that for all m ∈ M, either mR = {0} or the set mR is a 0-minimal right ideal. In fact, mR is clearly a right ideal. Suppose mR ≠ {0} and let R' ≠ {0} be a right ideal contained in mR. Set S = {r ∈ R | mr ∈ R'}. Then R' = mS and S ≠ {0} since R' ≠ {0}. Moreover, S is a right ideal because R' is a right ideal. Since S ⊂ R, the fact that R is a 0-minimal right ideal implies the equality S = R. This shows that mR = R' and consequently that mR is a 0-minimal right ideal.

Let I be the union of all the 0-minimal right ideals. It is a right ideal, and by the preceding discussion, it is also a left ideal. Let J ≠ {0} be an ideal of M. Then for any 0-minimal right ideal R of M,

RJ ⊂ R ∩ J ⊂ R.

We have RJ ≠ {0}, since for any r ∈ R − 0 and m ∈ J − 0 there exists u ∈ M such that rum ≠ 0, whence rum ∈ RJ − {0}. Since R is a 0-minimal right ideal and R ∩ J is a right ideal distinct from {0}, we have R ∩ J = R. Hence R ⊂ J. This shows that I ⊂ J. Hence I is contained in any nonzero ideal of M and therefore is the 0-minimal ideal of M.

3. Let I be the 0-minimal ideal of M. Let m, n ∈ I − {0}. By 2, the right ideal mM and the left ideal Mn are 0-minimal. Since M is prime, there exists u ∈ M such that mun ≠ 0. The right ideal mM being 0-minimal, we have mM = munM and therefore m ℛ mun. In the same way, mun ℒ n. Hence we have m 𝒟 n. This shows that I − {0} is contained in a 𝒟-class. Conversely, if m ∈ I − {0}, n ∈ M and m 𝒟 n, there exists k ∈ M such that mM = kM and Mk = Mn. Hence MmM = MkM = MnM, and this implies n ∈ I − {0}. This shows that I − {0} is a 𝒟-class.

Let us show that I − {0} is a regular 𝒟-class. By Proposition 5.7, it is enough to prove that I − {0} contains an idempotent. Let m, n ∈ I − {0}.
Since M is prime, there exists u ∈ M such that mun ≠ 0. Since the right ideal mM is 0-minimal and since mun ≠ 0, we have mM = muM = munM, whence mun ∈ R(mu). Symmetrically, since Mn is a 0-minimal left ideal, we have Mn = Mun = Mmun, whence mun ∈ L(n). Therefore mun ∈ R(mu) ∩ L(n) and, by Proposition 5.6, this implies that R(n) ∩ L(mu) contains an idempotent. This idempotent belongs to the 𝒟-class of m and n and therefore to I − {0}. □

COROLLARY 5.10 Let M be a prime monoid. If M contains a 0-minimal right ideal and a 0-minimal left ideal, then M contains a unique 0-minimal ideal I which is the union of all the 0-minimal right (resp., left) ideals. This ideal is composed of a regular 𝒟-class and 0. Moreover, we have the following computational rules.

1. For m ∈ I − {0} and n ∈ M such that mn ≠ 0, we have m ℛ mn.
2. For m ∈ I − {0} and n ∈ M such that nm ≠ 0, we have m ℒ nm.
3. For any ℋ-class H ⊂ I − {0} we have H² = H or H² = {0}.

Proof The first group of statements is an easy consequence of Proposition 5.9. Let us prove 1. We have mnM ⊂ mM. Since mM is a 0-minimal right ideal and mn ≠ 0, this forces the equality mnM = mM. The proof of 2 is symmetrical. Finally, to prove 3, let us suppose H² ≠ {0}. Let h, h' ∈ H be such that hh' ≠ 0. Then, by 1 and 2, h ℛ hh' and h' ℒ hh'. Since h ℒ h' and h' ℒ hh', we have h ℒ hh'. Therefore hh' ∈ H, and H is a group by Proposition 5.8. □
We finally give the statements corresponding to Proposition 5.9 and Corollary 5.10 for minimal ideals instead of 0-minimal ideals. This is of course of interest only in the case where the monoid does not have a zero.

PROPOSITION 5.11 Let M be a monoid.

1. If M contains a minimal right (resp., left) ideal, then M contains a minimal ideal which is the union of all the minimal right (resp., left) ideals.
2. If M contains a minimal right ideal and a minimal left ideal, its minimal ideal I is a 𝒟-class. All the ℋ-classes in I are groups.

Proof Let 0 be an element that does not belong to M and let

M⁰ = M ∪ 0

be the monoid whose law extends that of M in such a way that 0 is a zero. The monoid M⁰ is prime.
An ideal I (resp., a right ideal R, a left ideal L) of M is minimal iff I ∪ {0} (resp., R ∪ {0}, L ∪ {0}) is a 0-minimal ideal (resp., right ideal, left ideal) of M⁰. Moreover, the restrictions to M of the relations ℛ, ℒ, 𝒥, ℋ in M⁰ coincide with the corresponding relations in M. Therefore statements 1 and 2 can be deduced from Proposition 5.9 and Corollary 5.10. □

COROLLARY 5.12 Let M be a monoid containing a minimal right ideal and a minimal left ideal. Then M contains a minimal ideal which is the union of all the minimal right (resp., left) ideals. This ideal is a 𝒟-class and all its ℋ-classes are groups. □

6. SEMIRINGS AND MATRICES
A semiring K is a set equipped with two operations, denoted + and ·, satisfying the following axioms:

(i) The set K is a commutative monoid for + with a neutral element denoted 0.
(ii) The set K is a monoid for multiplication with a neutral element denoted 1.
(iii) Multiplication is distributive over addition.
(iv) For all x ∈ K, 0 · x = x · 0 = 0.

Clearly, any ring with unit is a semiring. Other examples of semirings are as follows. The boolean semiring 𝔅 is composed of two elements 0 and 1. The axioms imply

0 + 0 = 0,  0 + 1 = 1 + 0 = 1,  0 · 1 = 1 · 0 = 0 · 0 = 0,  1 · 1 = 1.

The semiring 𝔅 is specified by 1 + 1 = 1. The other possibility, viz. 1 + 1 = 0, defines the field ℤ/2ℤ. ℕ is the semiring of natural integers and ℝ₊ is the semiring of nonnegative real numbers. For any monoid M, the set 𝒫(M) is a semiring for the operations of union and set product.

A semiring K is called ordered if it is given with a partial order ≤ satisfying the following properties:

(i) 0 is the smallest element of K;
(ii) the following implications hold:

x ≤ y  ⟹  x + z ≤ y + z,
x ≤ y  ⟹  xz ≤ yz,  zx ≤ zy.
The semirings 𝔅, ℕ, ℝ₊ are ordered by the usual ordering, for which

x ≤ y  ⟺  y = x + z for some z.

An ordered semiring is said to be complete if any subset X of K admits a least upper bound in K. It is the unique element k of K such that

(i) x ∈ X ⟹ x ≤ k;
(ii) if x ≤ k' for all x ∈ X, then k ≤ k'.
We denote k = sup(X), or k = sup{x | x ∈ X}, or k = sup_{x∈X}(x). The semiring 𝔅 is complete. The semirings ℕ, ℝ₊ are not complete. They may be completed according to the following general procedure. Let K be an ordered semiring whose order is total. Let

K̄ = K ∪ ∞,

where ∞ ∉ K. The operations of K are extended to K̄ by setting, for x ∈ K,

(i) x + ∞ = ∞ + x = ∞;
(ii) if x ≠ 0, then x∞ = ∞x = ∞;
(iii) ∞∞ = ∞, 0∞ = ∞0 = 0.

Extending the order of K to K̄ by x ≤ ∞ for all x ∈ K, the set K̄ becomes a totally ordered semiring. It is a complete semiring because any subset has an upper bound and therefore also a least upper bound. We define

ℕ̄ = ℕ ∪ ∞,  ℝ̄₊ = ℝ₊ ∪ ∞

to be the complete semirings obtained by applying this construction to ℕ and ℝ₊. If K is a complete semiring, the sum of an infinite family (x_i)_{i∈I} of elements of K is defined by

∑_{i∈I} x_i = sup{ ∑_{j∈J} x_j | J ⊂ I, J finite }.  (6.1)
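The completion rules (i)-(iii) and the sup-definition (6.1) can be sketched for ℕ̄, letting Python's float('inf') stand for ∞; the helper names are our own:

```python
# Sketch: arithmetic in the completed semiring N-bar = N ∪ {∞}, following
# rules (i)-(iii) of the completion procedure.

INF = float('inf')

def add(x, y):
    return x + y                      # (i): x + ∞ = ∞ + x = ∞ holds for floats

def mul(x, y):
    if x == 0 or y == 0:              # (iii): 0∞ = ∞0 = 0, unlike IEEE floats
        return 0
    return x * y                      # (ii): x∞ = ∞x = ∞ for x ≠ 0

def sup_sum(xs):
    """Sum of a family as in (6.1): the sup of its finite partial sums."""
    total = 0
    for x in xs:
        total = add(total, x)
    return total
```

The special case in `mul` is the one place where the semiring convention departs from floating-point arithmetic, which would give 0 · ∞ = NaN.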
In the case of the semiring ℝ̄₊, this gives the usual notion of a summable family: a family (x_i)_{i∈I} of elements in ℝ₊ is summable if the sum (6.1) is finite. In particular, for a sequence (x_n)_{n≥0} of elements of a complete semiring, we have

∑_{n≥0} x_n = sup_{n≥0} { ∑_{i≤n} x_i },  (6.2)

since any finite subset of ℕ is contained in some interval {0, 1, ..., n}. Moreover, if I = ⋃_{j∈J} I_j is a partition of I, then

∑_{i∈I} x_i = ∑_{j∈J} ( ∑_{i∈I_j} x_i ).  (6.3)
Consider, for instance, a mapping π: A* → ℝ₊. Let X be a subset of A* and let

X_n = {x ∈ X | |x| ≤ n}.

Then

∑_{x∈X} π(x) = sup_{n≥0} { ∑_{x∈X_n} π(x) }.

Formulas (6.1)-(6.3) lead to results of the following kind:
PROPOSITION 6.1 Let (a_n)_{n≥0} be a sequence of elements of ℝ₊. Let, for r ∈ ℝ̄₊,

α(r) = ∑_{n≥0} a_n rⁿ.

If α(r) < ∞ for r ∈ [0, 1[, then sup{α(r) | r ∈ [0, 1[} = α(1).

Proof The mapping α: r ↦ α(r) from ℝ̄₊ into ℝ̄₊ is nondecreasing. By successive applications of (6.2), (6.3), and then (6.2), we have

sup{α(r) | r ∈ [0, 1[} = sup_{r∈[0,1[} sup_{n≥0} { ∑_{i≤n} a_i rⁱ }
 = sup_{n≥0} sup_{r∈[0,1[} { ∑_{i≤n} a_i rⁱ }
 = sup_{n≥0} { ∑_{i≤n} a_i } = ∑_{n≥0} a_n = α(1). □
Let P, Q be two sets and let K be a semiring. A P × Q-matrix with coefficients in K is a mapping

m: P × Q → K.

We denote indistinctly by

(p, m, q)  or  m_{p,q}

the value of m on (p, q) ∈ P × Q. We also say that m is a K-relation between P and Q. If P = Q, we say that it is a K-relation over Q. The set of all K-relations between P and Q is denoted by K^{P×Q}. Let m ∈ K^{P×Q} be a K-relation between P and Q. For p ∈ P, the row of index p of m is denoted by m_{p*}. It is the element of K^Q defined by

(m_{p*})_q = m_{p,q}.
Similarly, the column of index q of m is denoted by m_{*q}. It is an element of K^P. Let P, Q, R be three sets and let K be a complete semiring. For m ∈ K^{P×Q} and n ∈ K^{Q×R}, the product mn is defined as the following element of K^{P×R}. Its value on (p, r) ∈ P × R is

(p, mn, r) = ∑_{q∈Q} (p, m, q)(q, n, r).

When P = Q = R, we thus obtain an associative multiplication which turns K^{Q×Q} into a monoid. Its identity is denoted I_Q. A monoid of K-relations over Q is a submonoid of K^{Q×Q}. It contains in particular the identity I_Q.
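A sketch (ours) of the K-relation product over the boolean semiring, with relations stored as dicts indexed by pairs; `any` plays the role of the sum and `and` of the product:

```python
# Sketch: product of K-relations in the boolean semiring; the coefficient
# (p, mn, r) is the boolean sum over q of (p, m, q)(q, n, r).

def mat_mul(m, n, P, Q, R):
    return {(p, r): int(any(m[(p, q)] and n[(q, r)] for q in Q))
            for p in P for r in R}

P = Q = R = [0, 1]
m = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 0}
n = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
mn = mat_mul(m, n, P, Q, R)
```

Over 𝔅 this is exactly composition of relations: (p, r) is related by mn iff some q is reachable from p by m and leads to r by n.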
7. FORMAL SERIES

Let A be an alphabet and let K be a semiring. A formal series (or just series) over A with coefficients in K is a mapping

σ: A* → K.

The value of σ on w ∈ A* is denoted (σ, w). We indifferently denote by

K^{A*}  or  K⟨⟨A⟩⟩

the set of formal series over A. We denote by K⟨A⟩ the set of formal series σ ∈ K⟨⟨A⟩⟩ such that (σ, w) = 0 for all but a finite number of w ∈ A*. An element of K⟨A⟩ is called a polynomial. A series σ ∈ K⟨⟨A⟩⟩ can be extended to a linear function from K⟨A⟩ into K by linearity. For p ∈ K⟨A⟩,

(σ, p) = ∑_{w∈A*} (σ, w)(p, w).
This definition makes sense because p is a polynomial. Let σ, τ ∈ K⟨⟨A⟩⟩ and k ∈ K. We define the formal series σ + τ, στ, and kσ by

(σ + τ, w) = (σ, w) + (τ, w),  (7.1)
(στ, w) = ∑_{uv=w} (σ, u)(τ, v),  (7.2)
(kσ, w) = k(σ, w).  (7.3)

In (7.2), the sum runs over the 1 + |w| pairs (u, v) such that w = uv. It is therefore a finite sum. The set K⟨⟨A⟩⟩ contains two special elements denoted 0 and 1, defined by

(0, w) = 0,  (1, w) = 1 if w = 1 and 0 otherwise.

As usual, we denote σⁿ = σσ⋯σ (n times) and σ⁰ = 1. With the operations
defined by (7.1) and (7.2), the set K⟨⟨A⟩⟩ is a semiring. It may be verified that when K is complete, K⟨⟨A⟩⟩ is also complete. The support of a series σ ∈ K⟨⟨A⟩⟩ is the set

supp(σ) = {w ∈ A* | (σ, w) ≠ 0}.

The mapping σ ↦ supp(σ) is an isomorphism from 𝔅⟨⟨A⟩⟩ onto 𝒫(A*). A family (σ_i)_{i∈I} of series is said to be locally finite if for all w ∈ A*, the set

{i ∈ I | (σ_i, w) ≠ 0}

is finite. In this case, a series σ, denoted

σ = ∑_{i∈I} σ_i,

can be defined by

(σ, w) = ∑_{i∈I} (σ_i, w).  (7.4)

This notation makes sense because in the sum (7.4) all but a finite number of terms are different from 0. We easily check that, for a locally finite family (σ_i)_{i∈I}, these sums are compatible with the operations (7.1)-(7.3).
Let σ ∈ K⟨⟨A⟩⟩ be a series such that

(σ, 1) = 0.

Then the family (σⁿ)_{n≥0} is locally finite. In fact, the support of σⁿ does not contain words of length less than n. We denote

σ* = ∑_{n≥0} σⁿ,  σ⁺ = ∑_{n≥1} σⁿ.

Then

σ* = 1 + σ⁺,  σ*σ = σσ* = σ⁺.

PROPOSITION 7.1 Let K be a ring with unit and let σ ∈ K⟨⟨A⟩⟩ be a series such that (σ, 1) = 0. Then 1 − σ is invertible and σ* = (1 − σ)⁻¹.

Proof We have

1 = σ* − σ⁺ = σ* − σ*σ = σ*(1 − σ).  (7.5)

Symmetrically, 1 = (1 − σ)σ*, hence the result. □
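A sketch (ours) of the series operations, truncated to words of bounded length so that σ* = ∑ σⁿ becomes a finite computation; applied to the characteristic series of a finite set X, the coefficients of the star count the factorizations of each word into words of X:

```python
# Sketch: N-series as dicts word -> coefficient, truncated to length
# MAXLEN; product follows (7.2), and star sums the powers of the series,
# which is legitimate degree by degree since (σ, 1) = 0.

MAXLEN = 4

def s_add(s, t):
    return {w: s.get(w, 0) + t.get(w, 0) for w in set(s) | set(t)}

def s_mul(s, t):
    out = {}
    for u, cu in s.items():
        for v, cv in t.items():
            if len(u + v) <= MAXLEN:
                out[u + v] = out.get(u + v, 0) + cu * cv
    return out

def s_star(s):
    assert s.get("", 0) == 0          # (σ, 1) = 0: the powers are locally finite
    result, power = {"": 1}, {"": 1}
    for _ in range(MAXLEN):           # σ^n contributes nothing beyond length n
        power = s_mul(power, s)
        result = s_add(result, power)
    return result

# Characteristic series of X = {a, ab, b}.
X = {"a": 1, "ab": 1, "b": 1}
star = s_star(X)
```

For instance the word ab has the two factorizations (ab) and (a)(b), so its coefficient in the star is 2.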
For X ⊂ A*, we denote by X̲ the characteristic series of X, defined by

(X̲, w) = 1 if w ∈ X and 0 otherwise.

We consider the characteristic series X̲ of X as an element of ℕ⟨⟨A⟩⟩. When X = {x} we usually write x instead of x̲. In particular, since the family (x)_{x∈X} is locally finite, we have X̲ = ∑_{x∈X} x. More generally, we have for any series σ ∈ K⟨⟨A⟩⟩,

σ = ∑_{w∈A*} (σ, w)w.

PROPOSITION 7.2 Let X, Y ⊂ A*. Then

(X̲ + Y̲, w) = 0 if w ∉ X ∪ Y,
(X̲ + Y̲, w) = 1 if w ∈ (X − Y) ∪ (Y − X),
(X̲ + Y̲, w) = 2 if w ∈ X ∩ Y.

In particular, with Z = X ∪ Y,

X̲ + Y̲ = Z̲  iff  X ∩ Y = ∅. □
Given two sets X, Y ⊂ A*, the product XY is said to be unambiguous if any word w ∈ XY has only one factorization w = xy with x ∈ X, y ∈ Y.

PROPOSITION 7.3 Let X, Y ⊂ A*. Then

(X̲Y̲, w) = Card{(x, y) ∈ X × Y | w = xy}.

In particular, with Z = XY,

Z̲ = X̲Y̲  iff  the product XY is unambiguous. □

The following proposition approaches very closely the main subject of this book.

PROPOSITION 7.4 For X ⊂ A⁺, we have

((X̲)*, w) = Card{(x₁, ..., xₙ) | n ≥ 0, x_i ∈ X, w = x₁x₂⋯xₙ}.  (7.6)

Proof By the definition of (X̲)*, we have

(X̲)* = ∑_{k≥0} (X̲)ᵏ.

Applying Proposition 7.3 repeatedly, we obtain

((X̲)ᵏ, w) = Card{(x₁, x₂, ..., x_k) | x_i ∈ X, w = x₁x₂⋯x_k},

whence formula (7.6). □
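The coefficient of Proposition 7.3 and the resulting unambiguity test can be sketched as follows; the example sets X and Y are our own:

```python
# Sketch: (X Y, w) counts the pairs (x, y) in X x Y with w = xy; the
# product XY is unambiguous iff no word admits two such pairs.

def product_coeff(X, Y, w):
    return sum(1 for k in range(len(w) + 1) if w[:k] in X and w[k:] in Y)

def unambiguous(X, Y, words):
    return all(product_coeff(X, Y, w) <= 1 for w in words)

X, Y = {"a", "ab"}, {"b", "bb"}
# "abb" = a . bb = ab . b has two factorizations, so here XY is ambiguous.
```

This is the coefficientwise reading of Z̲ = X̲Y̲: the product of characteristic series equals the characteristic series of XY exactly when every coefficient stays at most 1.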
EXAMPLE 7.1 A̲* = (1 − A̲)⁻¹ = ∑_{w∈A*} w and (A̲* A̲*, w) = 1 + |w|.

We now define the Hadamard product of two series σ, τ ∈ K⟨⟨A⟩⟩ as the series σ ⊙ τ given by

(σ ⊙ τ, w) = (σ, w)(τ, w).

This product is distributive over addition, viz.

σ ⊙ (τ + τ') = σ ⊙ τ + σ ⊙ τ'.

If the semiring K satisfies xy = 0 ⟹ x = 0 or y = 0, then

supp(σ ⊙ τ) = supp(σ) ∩ supp(τ).

In particular, for X, Y ⊂ A* and Z = X ∩ Y,

Z̲ = X̲ ⊙ Y̲.

Given two series σ, τ ∈ ℕ⟨⟨A⟩⟩, we write σ ≤ τ when (σ, w) ≤ (τ, w) for all w ∈ A*.

8. PERMUTATION GROUPS
In this section we give some elementary results and definitions on permutation groups. Let G be a group and let H be a subgroup of G. The right cosets of H in G are the sets Hg for g ∈ G. The equality Hg = Hg' holds iff gg'⁻¹ ∈ H. Hence the right cosets of H in G form a partition of G. When G is finite, [G : H] denotes the index of H in G. This number is both equal to Card(G)/Card(H) and to the number of right cosets of H in G.

Let Q be a set. The symmetric group over Q, composed of all the permutations of Q, is denoted by 𝔖_Q. For Q = {1, 2, ..., n} we write 𝔖_n instead of 𝔖_{1,2,...,n}. A permutation is written to the right of its argument. Thus for g ∈ 𝔖_Q and q ∈ Q the image of q by g is denoted by qg. A permutation group over Q is any subgroup of 𝔖_Q. For instance, the alternating group over {1, 2, ..., n}, denoted by 𝔄_n, is the permutation group composed of all even permutations.

Let G be a permutation group over Q. The stabilizer of q ∈ Q is the subgroup of G composed of all permutations of G fixing q,

H = {h ∈ G | qh = q}.

A permutation group G over Q is transitive if for all p, q ∈ Q there exists g ∈ G such that pg = q.

PROPOSITION 8.1 1. Let G be a group and let H be a subgroup of G. Let Q be the set of right cosets of H in G,

Q = {Hg | g ∈ G}.
Let φ be the mapping from G into 𝔖_Q defined for g ∈ G and Hk ∈ Q by

(Hk)φ(g) = H(kg).

The mapping φ is a morphism from G into 𝔖_Q and the permutation group φ(G) is transitive. Moreover, the subgroup φ(H) is the stabilizer of the point H ∈ Q.

2. Conversely, let G be a transitive permutation group over Q, let q ∈ Q, and let H be the stabilizer of q. The mapping γ from G into Q defined by

γ: g ↦ qg

induces a bijection α from the set of right cosets of H onto Q, and for all k ∈ G, g ∈ G,

α(Hk) · g = α(Hkg).

Proof 1. The mapping φ is well defined because Hk = Hk' implies Hkg = Hk'g. It is a morphism since φ(1) = 1 and

(Hk)φ(g)φ(g') = (Hkg)φ(g') = Hkgg' = (Hk)φ(gg').

The permutation group φ(G) is transitive since for k, k' ∈ G, we have (Hk)φ(k⁻¹k') = Hk'. Finally, for all h ∈ H, φ(h) fixes the coset H, and conversely, if φ(g) (g ∈ G) fixes H, then Hg = H and thus g ∈ H.

2. Assume that Hg = Hg'. Then gg'⁻¹ ∈ H, thus qgg'⁻¹ = q, showing that qg = qg'. Thus γ(g) = γ(g'). This shows that we can define a function α by setting α(Hg) = γ(g). Since G is transitive, γ is surjective and therefore also α is surjective. To show that α is injective, assume that α(Hg) = α(Hg'). Then qg = qg', whence qgg'⁻¹ = q. Thus gg'⁻¹ fixes q. Consequently gg'⁻¹ ∈ H, whence Hg = Hg'. The last formula is a direct consequence of the fact that both sides are equal to qkg. □
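Proposition 8.1 can be illustrated with a small sketch (ours): G = 𝔖₃ acting on the three right cosets of the stabilizer of a point, permutations being encoded as tuples:

```python
# Sketch: right cosets Hg and the action (Hk)phi(g) = H(kg), with
# permutations of {0, 1, 2} written as tuples acting on the right.

import itertools

def compose_perm(g, h):               # the permutation "g then h", i.e. gh
    return tuple(h[g[q]] for q in range(len(g)))

def right_coset(H, g):
    return frozenset(compose_perm(h, g) for h in H)

G = [tuple(p) for p in itertools.permutations(range(3))]   # S3
H = [g for g in G if g[0] == 0]                            # stabilizer of 0
cosets = {right_coset(H, g) for g in G}
```

As Proposition 8.2 predicts, the number of cosets equals the index [G : H] = Card(G)/Card(H) = 3, and each g ∈ G permutes the cosets.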
Let G be a transitive permutation group over a finite set Q. By definition, the degree of G is the number Card(Q).
PROPOSITION 8.2 Let G be a transitive permutation group over a finite set Q. Let q ∈ Q and let H be the stabilizer of q. The degree of G is equal to the index of H in G.

Proof The function α: Hg ↦ qg of Proposition 8.1(2) is a bijection from the set of right cosets of H onto Q. Consequently Card(Q) = [G : H]. ∎

Two permutation groups G over Q and G' over Q' are called equivalent if there exists a bijection α from Q onto Q' and an isomorphism φ from G onto G' such that for all q ∈ Q and g ∈ G,

α(qg) = α(q)φ(g)
or equivalently, for q' ∈ Q' and g ∈ G,

q'φ(g) = α(α^{-1}(q')g).
As an example, consider a permutation group G over Q and let H be the stabilizer of some q in Q. According to Proposition 8.1(2), this group is equivalent to the permutation group over the set of right cosets of H obtained by the action of G on the cosets of H. Another example concerns any two stabilizers H and H' of two points q and q' in a transitive permutation group G over Q. Then H and H' are equivalent. Indeed, since G is transitive, there exists g ∈ G such that qg = q'. Then g defines a bijection α from Q onto itself by α(p) = pg. The function φ: H → H' given by φ(h) = g^{-1}hg is an isomorphism and for all p ∈ Q, h ∈ H,

α(ph) = α(p)φ(h).
Let G be a transitive permutation group over Q. An imprimitivity equivalence of G is an equivalence relation θ over Q that is stable for the action of G, i.e., such that for all g ∈ G,

p ≡ q mod θ ⇒ pg ≡ qg mod θ.
The partition associated with an imprimitivity equivalence is called an imprimitivity partition. Let θ be an imprimitivity equivalence of G. The action of G on the classes of θ defines a transitive permutation group denoted by G^θ, called the imprimitivity quotient of G for θ. For any element q in Q, denote by [q] the equivalence class of q mod θ, and let K_q be the transitive permutation group over [q] formed by the restrictions to [q] of the permutations g that globally fix [q], i.e., such that [q]g = [q]. The group K_q is the group induced by G on the class [q]. We prove that the groups K_q, q ∈ Q, all are equivalent. Indeed, let q, q' ∈ Q and g ∈ G be such that qg = q'. The restriction α of g to [q] is a bijection from [q] onto [q']. Clearly, α is injective. It is surjective since if p ≡ q' mod θ, then pg^{-1} ≡ q mod θ and α(pg^{-1}) = p. Let φ be the isomorphism from K_q onto K_{q'} defined for k ∈ K_q by p'φ(k) = α(α^{-1}(p')k). This shows that the groups K_q and K_{q'} are equivalent. In particular, all equivalence classes mod θ have the same number of elements. Any of the equivalent groups K_q is called the induced group of G on the classes of θ and is denoted by G_θ. Let d = Card(Q) be the degree of G, e the degree of G^θ, and f the degree of G_θ. Then

d = ef.
Indeed, e is the number of classes of θ and f is the common cardinality of the classes mod θ.
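The formula d = ef can be illustrated on a small instance of our choosing: the cyclic group C_6 acting on Q = {0, ..., 5}, with θ the congruence mod 3 (three classes of two elements each):

```python
# The cyclic group C_6 acting on Q = {0,...,5}, with the imprimitivity
# equivalence p = q mod theta iff p = q (mod 3).
n = 6
Q = range(n)
G = [tuple((q + k) % n for q in Q) for k in range(n)]   # C_6 as shifts

classes = [frozenset(q for q in Q if q % 3 == r) for r in range(3)]

def image(block, g):
    return frozenset(g[q] for q in block)

# theta is an imprimitivity equivalence: every g maps classes to classes.
assert all(image(c, g) in classes for c in classes for g in G)

d = n                 # degree of G
e = len(classes)      # degree of the imprimitivity quotient
f = len(classes[0])   # degree of the induced group on a class
assert (d, e, f) == (6, 3, 2) and d == e * f
```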
Let G be a transitive permutation group over Q. Then G is called primitive if the only imprimitivity equivalences of G are the equality relation and the universal relation over Q.
PROPOSITION 8.3 Let G be a transitive permutation group over Q. Let q ∈ Q and H be the stabilizer of q. Then G is primitive iff H is a maximal subgroup of G.

Proof Assume first that G is primitive. Let K be a subgroup of G such that H ⊂ K ⊂ G. Consider the family of subsets of Q having the form qKg for g ∈ G. Any two of these subsets are either disjoint or identical. Suppose indeed that for some k, k' ∈ K and g, g' ∈ G, we have qkg = qk'g'. Then qkgg'^{-1}k'^{-1} = q, showing that kgg'^{-1}k'^{-1} ∈ H ⊂ K. Thus gg'^{-1} ∈ K, whence Kg = Kg' and consequently qKg = qKg'. Thus the sets qKg form a partition of Q which is clearly an imprimitivity partition. Since G is primitive, this implies that either qK = {q} or qK = Q. The first case means that K = H. In the second case, K = G, since for any g ∈ G there is some k ∈ K with qk = qg, showing that gk^{-1} ∈ H ⊂ K, which implies g ∈ K. This proves that H is a maximal subgroup.

Conversely, let H be a maximal subgroup of G and let θ be an imprimitivity equivalence of G. Let K be the subgroup

K = {k ∈ G | qk ≡ q mod θ}.

Then H ⊂ K ⊂ G, which implies that K = H or K = G. If K = H, then the class of q is reduced to q and θ is therefore the equality relation. If K = G, then the class of q is equal to Q and θ is the universal equivalence. Thus G is primitive. ∎

Let G be a transitive permutation group on Q. Then G is said to be regular if no element of G − {1} has a fixed point. It is easily verified that in this case Card(G) = Card(Q).
PROPOSITION 8.4 Let G be a transitive permutation group over Q and let q ∈ Q. The group G is regular iff the stabilizer of q is a singleton. ∎

Let k ≥ 1 be an integer. A permutation group G over Q is called k-fold transitive (or k-transitive) if for all k-tuples (p_1, p_2, ..., p_k) ∈ Q^k and (q_1, q_2, ..., q_k) ∈ Q^k composed of distinct elements, there is a g ∈ G such that p_1 g = q_1, p_2 g = q_2, ..., p_k g = q_k. The 1-transitive groups are just the transitive groups. Any k-transitive group for k ≥ 2 is clearly also (k − 1)-transitive. The group S_n is n-fold transitive.
PROPOSITION 8.5 Let k ≥ 2 be an integer. A permutation group G over Q is k-transitive iff G is transitive and the restriction to the set Q − {q} of the stabilizer of q ∈ Q is (k − 1)-transitive.
Proof The condition is clearly necessary. Conversely, assume that the condition is satisfied and let (p_1, p_2, ..., p_k) ∈ Q^k and (q_1, q_2, ..., q_k) ∈ Q^k be k-tuples composed of distinct elements. Since G is transitive, there exists a g ∈ G such that p_1 g = q_1. Let H be the stabilizer of q_1. Since the restriction of H to the set Q − {q_1} is (k − 1)-fold transitive, there is an h ∈ H such that p_2 gh = q_2, ..., p_k gh = q_k. Since p_1 gh = q_1, the permutation g' = gh satisfies p_1 g' = q_1, ..., p_k g' = q_k. This shows that G is k-transitive. ∎
A 2-transitive group is also called doubly transitive. PROPOSITION 8.6
A doubly transitive permutation group is primitive.
Proof Let G be a doubly transitive permutation group over Q and consider an imprimitivity equivalence θ of G. If θ is not the equality on Q, then there are two distinct elements q, q' ∈ Q such that q ≡ q' mod θ. Let q'' ∈ Q be distinct from q. Since G is 2-fold transitive, there exists g ∈ G such that qg = q and q'g = q''. Since θ is an imprimitivity equivalence, we have q ≡ q'' mod θ. Thus θ is the universal relation on Q. This shows that G is primitive. ∎
The converse of Proposition 8.6 is false. Indeed, for any prime number p, the cyclic group generated by the permutation (1 2 3 ··· p) is primitive but is not doubly transitive. An interesting case where the converse of Proposition 8.6 is true is described in a famous theorem of Schur that will be stated in Chapter 5.

NOTES
Each of the subjects treated in this chapter is part of a theory that we have considered only very superficially. A more complete exposition about words can be found in Lothaire (1983). For automata (Sect. 4) we follow the notation of Eilenberg (1974). Theorem 4.9 is due to S. Kleene. Our presentation of ideals in monoids (Sect. 5) is developed in more detail in Clifford and Preston (1961) or Lallement (1979). The notion of a prime monoid is not classical but it is well fitted to the situation that we shall find in Chapter IV. The 0-minimal ideals of prime monoids are usually called completely 0-simple semigroups. For semirings and formal series see Eilenberg (1974) or Berstel and Reutenauer (1984). Our definition of a complete semiring is less general than that of Eilenberg (1974) but it will be enough for our purposes. A classical textbook on permutation groups is Wielandt (1964).
CHAPTER
I
Codes
0. INTRODUCTION
The definitions and some important general properties of codes are presented in this chapter. The first two sections contain several equivalent definitions of codes and free submonoids. In Section 3 we give a method for verifying that a given set of words is a code. A more appropriate framework to answer this question will be given in Chapter IV, where automata with multiplicities will be considered. In Section 4 we introduce the notion of a Bernoulli distribution over an alphabet. This allows us to give a necessary condition for a set to be a code (Theorem 4.2). The questions about probabilities raised in this and in the following section will be developed in more depth in Chapter VI. Section 5 introduces the concept of a complete set. This is in some sense a notion dual to that of a code. The main result of this chapter (Theorem 5.10) describes complete codes by using results on Bernoulli distributions developed previously. In the last section the operation of composition of codes is introduced and several properties of this operation are established.

1. DEFINITIONS
This section contains the definitions of the notions of code, prefix (suffix, biprefix) code, maximal code, and coding morphism, and gives examples.
Let A be an alphabet. A subset X of the free monoid A* is a code over A if for all n, m ≥ 1 and x_1, ..., x_n, x'_1, ..., x'_m ∈ X, the condition

x_1 x_2 ··· x_n = x'_1 x'_2 ··· x'_m   (1.1)

implies

n = m and x_i = x'_i for i = 1, ..., n.   (1.2)

In other words, a set X is a code if any word in X^+ can be written uniquely as a product of words in X, that is, has a unique factorization in words in X. Since 1·1 = 1, a code never contains the empty word 1. It is clear that any subset of a code is a code. In particular, the empty set is a code. The definition of a code can be rephrased as follows:
PROPOSITION 1.1 If a subset X of A* is a code, then any morphism β: B* → A* which induces a bijection of some alphabet B onto X is injective. Conversely, if there exists an injective morphism β: B* → A* such that X = β(B), then X is a code.

Proof Let β: B* → A* be a morphism such that β is a bijection of B onto X. Let u, v ∈ B* be words such that β(u) = β(v). If u = 1, then v = 1; indeed, β(b) ≠ 1 for each letter b ∈ B since X is a code. If u ≠ 1 and v ≠ 1, set u = b_1 ··· b_n, v = b'_1 ··· b'_m, with n, m ≥ 1, b_1, ..., b_n, b'_1, ..., b'_m ∈ B. Since β is a morphism, we have

β(b_1) ··· β(b_n) = β(b'_1) ··· β(b'_m).

But X is a code and β(b_i), β(b'_j) ∈ X. Thus n = m and β(b_i) = β(b'_i) for i = 1, ..., n. Now β is injective on B. Thus b_i = b'_i for i = 1, ..., n, and u = v. This shows that β is injective.

Conversely, if β: B* → A* is an injective morphism, and if

x_1 ··· x_n = x'_1 ··· x'_m   (1.3)

for some n, m ≥ 1, x_1, ..., x_n, x'_1, ..., x'_m ∈ X = β(B), then we consider the letters b_i, b'_j in B such that β(b_i) = x_i, β(b'_j) = x'_j, i = 1, ..., n, j = 1, ..., m. Since β is injective, Eq. (1.3) implies that b_1 ··· b_n = b'_1 ··· b'_m. Thus n = m and b_i = b'_i, whence x_i = x'_i for i = 1, ..., n. ∎
A morphism β: B* → A* which is injective and such that X = β(B) is called a coding morphism for X. For any code X ⊂ A*, the existence of a coding morphism for X is straightforward: it suffices to take any bijection of a set B onto X and to extend it to a morphism from B* into A*. Proposition 1.1 is the origin of the terminology, since the words in X encode the letters of the set B. The coding procedure consists of associating to a word b_1 b_2 ··· b_n (b_i ∈ B), which is the text in plain language, an encoded or enciphered message β(b_1) ··· β(b_n) by the use of the coding morphism β. The fact that β is
injective ensures that the coded text can be deciphered in a unique way to get back the original text.
EXAMPLE 1.1 For any alphabet A, the set X = A is a code. More generally, if p ≥ 1 is an integer, then X = A^p is a code, called the uniform code of words of length p. Indeed, if elements of X satisfy Eq. (1.1), then the constant length of words in X implies the conclusion (1.2).

EXAMPLE 1.2 Over an alphabet A consisting of a single letter a, a nonempty subset X ⊂ a* is a code iff X is a singleton distinct from 1 (= a^0).

EXAMPLE 1.3 The set X = {aa, baa, ba} over A = {a, b} is a code. Indeed, suppose the contrary. Then there exists a word w in X^+, of minimal length, that has two distinct factorizations,

w = x_1 x_2 ··· x_n = x'_1 x'_2 ··· x'_m   (n, m ≥ 1, x_i, x'_j ∈ X).

Since w is of minimal length, we have x_1 ≠ x'_1. Thus x_1 is a proper left factor of x'_1 or vice versa. Assume that x_1 is a proper left factor of x'_1. By inspection of X, this implies that x_1 = ba, x'_1 = baa. This in turn implies that x_2 = aa, x'_2 = aa (see Fig. 1.1). Thus x'_1 = x_1 a, x'_1 x'_2 = x_1 x_2 a, and if we assume that x'_1 x'_2 ··· x'_p = x_1 x_2 ··· x_p a, it necessarily follows that x_{p+1} = aa and x'_{p+1} = aa. Thus x'_1 x'_2 ··· x'_{p+1} = x_1 x_2 ··· x_{p+1} a. But this contradicts the existence of the two factorizations.
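Unique factorization can be checked mechanically for short words. The sketch below (the function name is ours) counts the factorizations of a word over a finite set X; it exhibits a double factorization over {a, ab, ba}, and verifies exhaustively, for words of length at most 10, that none appears over the set X of Example 1.3 (a finite check, not a proof):

```python
from functools import lru_cache
from itertools import product

def factorizations(w, X):
    """Number of distinct factorizations of w in words of X."""
    @lru_cache(maxsize=None)
    def count(i):                 # factorizations of the suffix w[i:]
        if i == len(w):
            return 1
        return sum(count(i + len(x)) for x in X if w.startswith(x, i))
    return count(0)

# The word aba has two factorizations over {a, ab, ba}: (ab)a and a(ba).
assert factorizations("aba", ("a", "ab", "ba")) == 2

# For X = {aa, baa, ba}, no word of length <= 10 has two factorizations.
X = ("aa", "baa", "ba")
for m in range(1, 11):
    for w in map("".join, product("ab", repeat=m)):
        assert factorizations(w, X) <= 1
```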
EXAMPLE 1.4 The set X = {a, ab, ba} is not a code since the word w = aba has two distinct factorizations w = (ab)a = a(ba).

The following corollary to Proposition 1.1 is useful.

COROLLARY 1.2 Let α: A* → C* be an injective morphism. If X is a code over A, then α(X) is a code over C. If Y is a code over C, then α^{-1}(Y) is a code over A.
Proof Let β: B* → A* be a coding morphism for X. Then α(β(B)) = α(X), and since α ∘ β: B* → C* is an injective morphism, Proposition 1.1 shows that α(X) is a code.
Fig. 1.1 A double factorization starting.
Conversely, let X = α^{-1}(Y), and let n, m ≥ 1, x_1, ..., x_n, x'_1, ..., x'_m ∈ X be such that

x_1 ··· x_n = x'_1 ··· x'_m.

Then

α(x_1) ··· α(x_n) = α(x'_1) ··· α(x'_m).

Now Y is a code; therefore n = m and α(x_i) = α(x'_i) for i = 1, ..., n. The injectivity of α implies that x_i = x'_i for i = 1, ..., n, showing that X is a code. ∎

COROLLARY 1.3 If X ⊂ A* is a code, then X^n is a code for all integers n > 0.

Proof Let β: B* → A* be a coding morphism for X. Then X^n = β(B^n). But B^n is a code. Thus the conclusion follows from Corollary 1.2. ∎

EXAMPLE 1.5 We show that the product of two codes is not a code in general. To do this, we consider the sets X = {a, ba} and Y = {a, ab}, which are easily seen to be codes over the alphabet A = {a, b}. Then
Z = XY = {aa, aab, baa, baab}.
The word w = aabaab has two distinct factorizations,
w = (aa)(baab) = (aab)(aab).

Thus Z is not a code. An important class of codes is the class of prefix codes to be introduced now. Let X be a subset of A*. Then X is a prefix set if no element of X is a proper left factor of another element in X. In an equivalent manner, X is prefix if for all x, x' in X,

x ≤ x' ⇒ x = x'.   (1.4)

This may be rephrased as: two distinct elements in X are incomparable in the prefix ordering. It follows immediately from (1.4) that a prefix set X containing the empty word just consists of the empty word. Suffix sets are defined in a symmetric way. A subset X of A* is suffix if no word in X is a proper right factor of another word in X. A set is biprefix if it is both prefix and suffix. Clearly, a set of words X is suffix iff its reversal X̃ is prefix.

PROPOSITION 1.4 Any prefix (suffix, biprefix) set of words X ≠ {1} is a code.
Proof If X is not a code, then there is a word w of minimal length having two factorizations

w = x_1 x_2 ··· x_n = x'_1 x'_2 ··· x'_m   (x_i, x'_j ∈ X).
Both x_1, x'_1 are nonempty, and since w has minimal length, they are distinct. But then x_1 < x'_1 or x'_1 < x_1, contradicting the fact that X is prefix. Thus X is a code. The same argument holds for suffix sets. ∎
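Condition (1.4) and its suffix analogue are straightforward to test for finite sets; a small sketch (the function names are ours):

```python
def is_prefix_set(X):
    """No element of X is a proper left factor of another (1.4)."""
    return not any(x != y and y.startswith(x) for x in X for y in X)

def is_suffix_set(X):
    # X is suffix iff the set of reversed words is prefix.
    return is_prefix_set({x[::-1] for x in X})

assert not is_prefix_set({"ba", "baa"})          # ba < baa
assert is_prefix_set({"b", "ab", "aab"})         # part of a*b: prefix
assert not is_suffix_set({"b", "ab", "aab"})     # but not suffix
# Sample words of {a^n b^n}: both prefix and suffix, hence biprefix.
assert is_prefix_set({"ab", "aabb"}) and is_suffix_set({"ab", "aabb"})
```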
A prefix code (suffix code, biprefix code) is a prefix set (suffix set, biprefix set) which is a code, that is, which is distinct from {1}.

EXAMPLE 1.6 Uniform codes are biprefix. The sets X and Y of Example 1.5 are a prefix and a suffix code, respectively.

EXAMPLE 1.7 The sets X = a*b and Y = {a^n b^n : n ≥ 1} over A = {a, b} are prefix, thus prefix codes. The set Y is suffix, thus biprefix, but X is not. This example shows the existence of infinite codes over a finite alphabet.

A code X is maximal over A if X is not properly contained in any other code over A, that is, if

X ⊂ X', X' a code ⇒ X = X'.
The maximality of a code depends on the alphabet over which it is given. Indeed, if X ⊂ A* and A ⊊ B, then X ⊂ B* and X is certainly not maximal over B, even if it is a maximal code over A. The definition of a maximal code gives no algorithm that allows us to verify that it is satisfied. However, maximality is decidable, at least for recognizable codes (see Section 5).
EXAMPLE 1.8 Uniform codes A^n are maximal over A. Suppose the contrary. Then there is a word u ∈ A^+ − A^n such that Y = A^n ∪ {u} is a code. The word w = u^n belongs to Y*, and it is also in (A^n)* because its length is a multiple of n. Thus w = u^n = x_1 x_2 ··· x_m for some x_1, ..., x_m ∈ A^n. Now u ∉ A^n. Thus the two factorizations are distinct, Y is not a code, and A^n is maximal.

PROPOSITION 1.5 Any code X over A is contained in some maximal code over A.

Proof Let F be the set of codes over A containing X, ordered by set inclusion. To show that F contains a maximal element, it suffices to demonstrate, in view of Zorn's lemma, that any chain (i.e., any totally ordered subset) in F admits a least upper bound in F. Consider a chain C of codes containing X; then
Ŷ = ∪_{Z ∈ C} Z

is the least upper bound of C. It remains to show that Ŷ is a code. For this, let n, m ≥ 1, y_1, ..., y_n, y'_1, ..., y'_m ∈ Ŷ be such that

y_1 ··· y_n = y'_1 ··· y'_m.
Each of the y_i, y'_j belongs to a code of the chain C, and this determines n + m elements (not necessarily distinct) of C. One of them, say Z, contains all the others. Thus y_1, ..., y_n, y'_1, ..., y'_m ∈ Z, and since Z is a code, we have n = m and y_i = y'_i for i = 1, ..., n. This shows that Ŷ is a code. ∎
Proposition 1.5 is no longer true if we restrict ourselves to finite codes. There exist finite codes which are contained in no finite maximal code. An example of such a code will be given in Section 5 (Example 5.11). The fact that a set X ⊂ A* is a code admits a very simple expression in the terminology of formal power series.
PROPOSITION 1.6 Let X be a subset of A^+, and let M = X* be the submonoid generated by X. Denote by X̲ and M̲ the characteristic series of X and M. Then X is a code iff M̲ = (X̲)*, or equivalently M̲ = (1 − X̲)^{-1}.

Proof According to Proposition 0.7.3, the coefficient ((X̲)*, w) of a word w in (X̲)* is equal to the number of distinct factorizations of w in words in X. By definition, X is a code iff this number takes only the values 0 and 1 for any word in A*. But this is equivalent to saying that (X̲)* is the characteristic series of its support, that is, (X̲)* = M̲. ∎

2. CODES AND SUBMONOIDS
The submonoid X* generated by a code X is sometimes easier to handle than the code itself. The fact that X is a code (prefix code, biprefix code) is equivalent to the property that X* is a free monoid (a right unitary, a biunitary submonoid). These latter properties may be verified directly on the submonoid without any explicit description of its base. Thus we can prove that sets are codes by knowing only the submonoid they generate. We start with a general property.
PROPOSITION 2.1 Let A be an alphabet. Any submonoid M of A* has a unique minimal set of generators X = (M − 1) − (M − 1)^2.

Proof Set Q = M − 1. First, we verify that X generates M, i.e., that X* = M. Since X ⊂ M, we have X* ⊂ M. We prove the opposite inclusion by induction on the length of words. Of course, 1 ∈ X*. Let m ∈ Q. If m ∉ Q^2, then m ∈ X. Otherwise m = m_1 m_2 with m_1, m_2 ∈ Q both strictly shorter than m. Therefore m_1, m_2 belong to X* and m ∈ X*. Now let Y be a set of generators of M. We may suppose that 1 ∉ Y. Then each x ∈ X is in Y* and therefore can be written as

x = y_1 y_2 ··· y_n   (y_i ∈ Y, n ≥ 0).

The facts that x ≠ 1 and x ∉ Q^2 force n = 1 and x ∈ Y. This shows that X ⊂ Y. Thus X is a minimal set of generators, and such a set is unique. ∎
EXAMPLE 2.1 Let A = {a, b} and let M = {w ∈ A* : |w|_a ≡ 0 mod 2}. Then we compute X = (M − 1) − (M − 1)^2 = b ∪ ab*a.

We now turn to the study of the submonoid generated by a code. By definition, a submonoid M of A* is free if there exists an isomorphism

α: B* → M

of a free monoid B* onto M.

PROPOSITION 2.2 If M is a free submonoid of A*, then its minimal set of generators is a code. Conversely, if X ⊂ A* is a code, then the submonoid X* of A* is free and X is its minimal set of generators.

Proof Let α: B* → M be an isomorphism. Then α, considered as a morphism from B* into A*, is injective. By Proposition 1.1, the set X = α(B) is a code. Next M = α(B*) = (α(B))* = X*. Thus X generates M. Furthermore B = B^+ − B^+B^+ and α(B^+) = M − 1. Consequently X = (M − 1) − (M − 1)^2, showing that X is the minimal set of generators of M.

Conversely, assume that X ⊂ A* is a code and consider a coding morphism α: B* → A* for X. Then α is injective and α is a bijection from B onto X. Thus α is a bijection from B* onto α(B*) = X*. Consequently X* is free. Now α is a bijection, thus B = B^+ − B^+B^+ implies X = X^+ − X^+X^+, showing by Proposition 2.1 that X is the minimal set of generators of M. ∎

The code X which generates a free submonoid M of A* is called the base of M.
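The formula of Proposition 2.1 can be evaluated directly on a finite fragment of a submonoid. A sketch for the M of Example 2.1, truncated at length 5 (the truncation is our choice; both halves of any split of a word of length at most 5 stay within the fragment, so the result is exact up to that length):

```python
from itertools import product

# M = {w in {a,b}* : |w|_a even}; compute X = (M - 1) - (M - 1)^2
# among the words of length <= 5.
maxlen = 5
Mplus = {"".join(w) for n in range(1, maxlen + 1)
         for w in product("ab", repeat=n) if w.count("a") % 2 == 0}

def in_square(w):                # w in (M - 1)^2 ?
    return any(w[:i] in Mplus and w[i:] in Mplus for i in range(1, len(w)))

X = {w for w in Mplus if not in_square(w)}

# Example 2.1 predicts X = b + ab*a.
expected = {"b"} | {"a" + "b" * k + "a" for k in range(maxlen - 1)}
assert X == expected
```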
COROLLARY 2.3 Let X and Y be codes over A. If X* = Y*, then X = Y.

EXAMPLE 2.1 (continued) The set X is a (biprefix) code, thus M is a free submonoid of A*.

According to Proposition 2.2, we can distinguish two cases where a set X is not a code. First, when X is not the minimal set of generators of M = X*, that is, there exists an equality

x = x_1 x_2 ··· x_n

with x, x_i ∈ X and n ≥ 2. Note that despite this fact, M might be free. The other case holds when X is the minimal set of generators, but M is not free (this is the case of Example 1.4). We now give a characterization of free submonoids of A* which is intrinsic in the sense that it does not rely on the bases. Another slightly different characterization is given in the exercises. Let M be a monoid. A submonoid N of M is stable (in M) if for all u, v, w ∈ M,

u, v, uw, wv ∈ N ⇒ w ∈ N.   (2.1)
The hypotheses of (2.1) may be written as

w ∈ N^{-1}N ∩ NN^{-1},

thus the condition for stability becomes

N^{-1}N ∩ NN^{-1} ⊂ N

or simply

N^{-1}N ∩ NN^{-1} = N,   (2.2)

since 1 ∈ N and therefore N ⊂ N^{-1}N ∩ NN^{-1}. Figure 1.2 gives a pictorial representation of condition (2.1) when the elements u, v, w are words. The membership in N is represented by an arch. Stable submonoids have interesting properties. As a result, they appear in almost all of the chapters in this book. A reason for this is Proposition 2.4, which yields a remarkable characterization of free submonoids of a free monoid. As a practical application, the proposition is used to prove that some submonoids are free, and consequently that their bases are codes, without knowing the bases.
PROPOSITION 2.4 A submonoid N of A* is stable iff it is free.
Proof Assume first that N is stable. Set X = (N − 1) − (N − 1)^2. To prove that X is a code, suppose the contrary. Then there is a word z ∈ N of minimal length having two distinct factorizations in words of X,

z = x_1 x_2 ··· x_n = y_1 y_2 ··· y_m

with x_1, ..., x_n, y_1, ..., y_m ∈ X. We may suppose |x_1| < |y_1|. Then y_1 = x_1 w for some nonempty word w. It follows that

x_1,   y_2 ··· y_m,   x_1 w = y_1,   w y_2 ··· y_m = x_2 ··· x_n

are all in N. Since N is stable, w is in N. Consequently y_1 = x_1 w ∉ X, which yields the contradiction. Thus X is a code.

Conversely, assume that N is free and let X be its base. Let u, v, w ∈ A* and suppose that

u, v, uw, wv ∈ N.
Fig. 1.2 Representation of stability.
Set

u = x_1 ··· x_k,   uw = y_1 ··· y_l,
wv = x_{k+1} ··· x_r,   v = y_{l+1} ··· y_s,

with x_i, y_j in X. The equality u(wv) = (uw)v implies

x_1 ··· x_k x_{k+1} ··· x_r = y_1 ··· y_l y_{l+1} ··· y_s.

Thus r = s and x_i = y_i (i = 1, ..., s) since X is a code. Moreover, l ≥ k because |uw| ≥ |u|, showing that

uw = x_1 ··· x_k x_{k+1} ··· x_l = u x_{k+1} ··· x_l,

hence w = x_{k+1} ··· x_l ∈ N. Thus N is stable. ∎

For a prefix code X, the submonoid X* is free, as we have seen, but it has a stronger property. Those submonoids which are generated by prefix codes can also be characterized by a condition which is independent of the base. Let M be a monoid and let N be a submonoid of M. Then N is right unitary (in M) if for all u, v ∈ M,

u, uv ∈ N ⇒ v ∈ N.

In a symmetric way, N is left unitary if for all u, v ∈ M,

u, vu ∈ N ⇒ v ∈ N.
The conditions may be rewritten as follows: N is right unitary iff N^{-1}N = N, and N is left unitary iff NN^{-1} = N. The submonoid N of M is biunitary if it is both left and right unitary. The four properties stable, left unitary, right unitary, and biunitary are of the same nature. Their relationships can be summarized as follows: a biunitary submonoid (N^{-1}N = NN^{-1} = N) is both left unitary (NN^{-1} = N) and right unitary (N^{-1}N = N), and each of these two properties in turn implies stability (N^{-1}N ∩ NN^{-1} = N).
EXAMPLE 2.1 (continued) The submonoid M is biunitary. Indeed, if u, uv ∈ M, then |u|_a and |uv|_a = |u|_a + |v|_a are even numbers; consequently |v|_a is even and v ∈ M. Thus M is right unitary, and symmetrically M is left unitary.

EXAMPLE 2.2 In group theory, the concepts stable, unitary, and biunitary collapse and coincide with the notion of subgroup. Indeed, let H be a stable submonoid of a group G. For all h ∈ H, both hh^{-1} and h^{-1}h are in H. Stability implies that h^{-1} is in H. Thus H is a subgroup. If H is a subgroup, then conversely HH^{-1} = H^{-1}H = H, showing that H is biunitary.
The following proposition shows the relationship between the submonoids we defined and codes.

PROPOSITION 2.5 A submonoid M of A* is right unitary (resp. left unitary, biunitary) iff its minimal set of generators is a prefix code (suffix code, biprefix code). In particular, a right unitary (left unitary, biunitary) submonoid of A* is free.

Proof Let M ⊂ A* be a submonoid, Q = M − 1 and let X = Q − Q^2 be its minimal set of generators. Suppose M is right unitary. To show that X is prefix, let x, xu be in X for some u ∈ A*. Then x, xu ∈ M and thus u ∈ M. If u ≠ 1, then u ∈ Q; but then xu ∈ Q^2, contrary to the assumption. Thus u = 1 and X is prefix.

Conversely, suppose X is prefix. Let u, v ∈ A* be such that u, uv ∈ M = X*. Then

u = x_1 ··· x_n,   uv = y_1 ··· y_m

for some x_1, ..., x_n, y_1, ..., y_m ∈ X. Consequently

x_1 x_2 ··· x_n v = y_1 ··· y_m.

Since X is prefix, neither x_1 nor y_1 is a proper left factor of the other. Thus x_1 = y_1, and for the same reason x_2 = y_2, ..., x_n = y_n. This shows that m ≥ n and v = y_{n+1} ··· y_m belongs to M. Thus M is right unitary. ∎

Let M be a submonoid of A*. Then M is maximal if M ≠ A* and M is not properly contained in any other submonoid except A*.

PROPOSITION 2.6 If M is a maximal free submonoid of A*, then its base X is a maximal code.

Proof Let Y be a code on A with X ⊊ Y. Then X* ⊂ Y* and X* ≠ Y*, since otherwise X = Y by Corollary 2.3. Now X* is maximal. Thus Y* = A* and Y = A. Thus X ⊊ A. Let b ∈ A − X. The set Z = X ∪ {b^2} is a code and M ⊊ Z* ⊊ A* since b^2 ∉ M and b ∉ Z*. This contradicts the maximality of M. ∎
Note that the converse of the proposition is false, since uniform codes A^n (n ≥ 1) are maximal. But if k, n ≥ 2, we have (A^{nk})* ⊊ (A^n)* ⊊ A*, showing that (A^{nk})* is not maximal. We now introduce a family of biprefix codes called group codes which have interesting properties. Before we give the definition, let us consider the following situation. Let G be a group, H be a subgroup of G, and

φ: A* → G   (2.3)
be a morphism. The submonoid

M = φ^{-1}(H)   (2.4)

is biunitary. Indeed, if, for instance, p, pq ∈ M, then φ(p), φ(pq) ∈ H, therefore φ(p)^{-1}φ(pq) = φ(q) ∈ H and q ∈ M. The same proof shows that M is left unitary. Thus the base, say X, of M is a biprefix code. The definition of the submonoid M in (2.4) is equivalent to a description as the intersection of A* with a subgroup of the free group A^⊛ on A. Indeed, the morphism φ in (2.3) factorizes in a unique way as φ = ψ ∘ ι, with ι the canonical injection of A* into A^⊛ and ψ a group morphism from A^⊛ into G. Setting Q = ψ^{-1}(H), we have

M = Q ∩ A*.

Conversely, if Q is a subgroup of A^⊛ and M = Q ∩ A*, then

M = ι^{-1}(Q).

A group code is the base X of a submonoid M = φ^{-1}(H), where φ is a morphism given by (2.3) which, moreover, is supposed to be surjective. Then X is a biprefix code and X is a maximal code. Indeed, if M = A*, then X = A is maximal. Otherwise take w ∈ A* − M and, setting Y = X ∪ {w}, let us verify that Y is not a code. Set m = φ(w). Since φ is surjective, there is a word l ∈ A* such that φ(l) = m^{-1}. The words u = wl, v = lw both are in M, and wlw = uw = wv ∈ Y*. This word has two distinct factorizations in words in Y, namely uw, formed of words in X followed by a word in Y, and wv, which is composed the other way round. Thus Y is not a code and X is maximal. We give now three examples of group codes.
EXAMPLE 2.3 Let A = {a, b} and consider the set

M = {w ∈ A* : |w|_a ≡ 0 mod 2}

of Example 2.1. We have M = φ^{-1}(0), where

φ: A* → Z/2Z

is the morphism given by φ(a) = 1, φ(b) = 0. Thus the base of M, namely the code X = b ∪ ab*a, is a group code, hence maximal.
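The double-factorization construction used in the maximality argument can be instantiated with the morphism of Example 2.3 (the concrete choices w = a and l = a are ours):

```python
# phi counts |w|_a mod 2; M = phi^{-1}(0) as in Example 2.3, with base
# X = b + ab*a. Adjoining a word w outside M destroys unique
# factorization, following the construction in the text.
def phi(w):
    return w.count("a") % 2

w = "a"                     # phi(w) = 1, so w is not in M
l = "a"                     # phi(l) is the inverse of phi(w) in Z/2Z
u, v = w + l, l + w         # u = wl and v = lw both lie in M
assert phi(u) == 0 and phi(v) == 0

# wlw = uw = wv has two factorizations over Y = X + w:
# (aa)(a), a word of X* followed by w, and (a)(aa) the other way round.
assert u + w == w + v == "aaa"
```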
EXAMPLE 2.4 The uniform code A^n over A is a group code. The monoid (A^n)* is indeed the kernel of the morphism of A* onto Z/nZ mapping all letters to the number 1.

EXAMPLE 2.5 Let A = {a, b}, and consider now the submonoid

M = {w ∈ A* : |w|_a = |w|_b}   (2.5)

composed of the words on A having as many a's as b's. Let

δ: A* → Z

be the morphism defined by δ(a) = 1, δ(b) = −1. Clearly

δ(w) = |w|_a − |w|_b

for all w ∈ A*. Thus the set (2.5) is equal to δ^{-1}(0). The base of δ^{-1}(0) is denoted by D or D_1, the submonoid itself by D* or D_1*. Words in D are called Dyck-primes; D is the Dyck code over A. The set D* is the Dyck set over A.
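Membership in D* and Dyck-primality are easy to compute; a sketch (the predicate names are ours). Note that with δ(a) = 1, δ(b) = −1, both ab and ba are Dyck-primes, in contrast with ordinary balanced parentheses:

```python
from itertools import product

def delta(w):                        # delta(w) = |w|_a - |w|_b
    return w.count("a") - w.count("b")

def is_dyck_prime(w):
    # w in D iff delta(w) = 0 and no proper nonempty prefix of w
    # already lies in delta^{-1}(0)  (D is the base of D*).
    return (len(w) > 0 and delta(w) == 0
            and all(delta(w[:i]) != 0 for i in range(1, len(w))))

primes = sorted(w for m in (2, 4)
                for w in map("".join, product("ab", repeat=m))
                if is_dyck_prime(w))
assert primes == ["aabb", "ab", "ba", "bbaa"]
```

Of the six words of length 4 with two a's and two b's, only aabb and bbaa are primes; the other four (e.g. abba = (ab)(ba)) factor into shorter Dyck-primes.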
EXAMPLE 2.6 More generally, let A = B ∪ B̄ (B ∩ B̄ = ∅) be an alphabet with 2n letters, and let δ: A* → B^⊛ be the morphism of A* onto the free group B^⊛ defined by δ(b) = b, δ(b̄) = b^{-1} for b ∈ B, b̄ ∈ B̄. The base of the submonoid δ^{-1}(1) is denoted by D_n and is called the Dyck code over A, or over n letters. We now turn to a slightly different topic and consider the free submonoids of A* containing a given set of words. We start with the following observation, which easily follows from Proposition 2.4.
PROPOSITION 2.7 The intersection of an arbitrary family of free submonoids of A* is a free submonoid.

Proof Let (M_i)_{i∈I} be a family of free submonoids of A*, and set M = ∩_{i∈I} M_i. Clearly M is a submonoid, and it suffices to show that M is stable. If

u, v, uw, wv ∈ M,

then these four words belong to each of the M_i. Each M_i being stable, w is in M_i for each i ∈ I. Thus w ∈ M. ∎

Proposition 2.7 leads to the following considerations. Let X be a subset of A*. As we have just seen, the intersection of all free submonoids of A* containing X is again a free submonoid. It is the smallest free submonoid of A* containing X. We call it the free hull of X. If X* is a free submonoid, then it coincides of course with the free hull of X. Let X be a subset of A*, let N be its free hull and let Y be the base of N. If X is not a code, then X ≠ Y. The following result, known as the defect theorem, gives an interesting relationship between X and Y.
THEOREM 2.8 Let X be a subset of A*, and let Y be the base of the free hull of X. If X is not a code, then Card(Y) ≤ Card(X) − 1.

The following result is a consequence of the theorem. It can be proved directly as well (Exercise 2.1).

COROLLARY 2.9 Let X = {x_1, x_2}. Then X is not a code iff x_1 and x_2 are powers of the same word: x_1 = y^p, x_2 = y^q for some y ∈ A^+, p, q ≥ 0.

Note that this corollary entirely describes the codes with two elements. The case of sets with three words is already much more complicated. For the proof of Theorem 2.8, we first show the following result.
PROPOSITION 2.10 Let X ⊂ A* and let Y be the base of the free hull of X. Then

Y ⊂ X(Y*)^{-1} ∩ (Y*)^{-1}X,

i.e., each word in Y appears as the first (resp. last) factor in the factorization of some word x ∈ X in words belonging to Y.

Proof Suppose that a word y ∈ Y is not in (Y*)^{-1}X. Then X ⊂ 1 ∪ Y*(Y − y). Setting

Z = Y*(Y − y),

we have Z^+ = Y*(Y − y), thus X ⊂ Z*. Now Z* is free. Indeed, any word z ∈ Z^+ has a unique factorization

z = y_1 y_2 ··· y_n,   y_1, ..., y_n ∈ Y,

and therefore can be written uniquely as

z = y^{p_1} z_1 y^{p_2} z_2 ··· y^{p_r} z_r,   z_1, ..., z_r ∈ Y − y,   p_i ≥ 0.

Now X ⊂ Z* ⊊ Y*, showing that Y* is not the free hull of X. This yields the contradiction. ∎
if
x~yY*.
This mapping is uniquely defined since Y is a code; it is everywhere defined since X c Y*. In view of Proposition 2.10, the function a is surjective. If X is
not a code, then there exists a relation

x1 x2 ··· xn = x'1 x'2 ··· x'm,   xi, x'j ∈ X,

with x1 ≠ x'1. However, Y is a code, and by (2.6) we have α(x1) = α(x'1).
Thus α is not injective. This proves the inequality. □

3. A TEST FOR CODES
It is not always easy to verify that a given set of words is a code. The test described in this section is not based on any new property of codes but consists merely in a systematic organization of the computations required to verify that a set of words satisfies the definition of a code. In the case where X is finite, or more generally if X is recognizable, the amount of computation is finite. In other words, it is effectively decidable whether a finite (recognizable) set is a code. Before starting the description of the algorithm, let us consider

EXAMPLE 3.1 Let A = {a, b}, and X = {b, abb, abbba, bbba, baabb}. This set is not a code. For instance, the word

w = abbbabbbaabb

has two factorizations (see Fig. 1.3):

w = (abbba)(bbba)(abb) = (abb)(b)(abb)(baabb).

These two factorizations define a sequence of left factors of w, each one corresponding to an attempt at a double factorization. We give this list, together with the attempt at a double factorization:

(abbba) = (abb)ba
(abbba) = (abb)(b)a
(abbba)bb = (abb)(b)(abb)
(abbba)(bbba) = (abb)(b)(abb)ba
(abbba)(bbba)abb = (abb)(b)(abb)(baabb)
(abbba)(bbba)(abb) = (abb)(b)(abb)(baabb)

Each but the last one of these attempts fails, because of the remainder — the right factor written outside the parentheses — which remains after the factorization. The algorithm presented here computes all the remainders in all attempts at a double factorization. It discovers a double factorization by the fact that the empty word is one of the remainders.
Fig. 1.3 Two factorizations.
Formally, the computations are organized as follows. Let X be a subset of A^+, and let

U1 = X^{-1}X − 1,
U_{n+1} = X^{-1}U_n ∪ U_n^{-1}X   for n ≥ 1.

Then we have the following result:

THEOREM 3.1 The set X ⊂ A^+ is a code iff none of the sets U_n defined above contains the empty word.

EXAMPLE 3.1 (continued) The word ba is in U1, next a ∈ U2, then bb ∈ U3 and ba ∈ U4, finally abb ∈ U5, and since 1 ∈ U6, the set X is not a code, according to Theorem 3.1.

The proof of Theorem 3.1 is based on the following lemma.

LEMMA 3.2 Let X ⊂ A^+ and let (U_n)_{n≥1} be defined as above. For all n ≥ 1 and k ∈ {1,...,n}, we have 1 ∈ U_n iff there exist a word u ∈ U_k and integers i, j ≥ 0 such that

u X^i ∩ X^j ≠ ∅,   i + j + k = n.   (3.1)
Proof We prove the statement for all n by descending induction on k. Assume first k = n. If 1 ∈ U_n, then taking u = 1, i = j = 0, Eq. (3.1) is satisfied. Conversely, if (3.1) holds with k = n, then i = j = 0. This implies u = 1 and consequently 1 ∈ U_n. Now let n > k ≥ 1, and suppose that the equivalence holds for n, n − 1, ..., k + 1. If 1 ∈ U_n, then by the induction hypothesis there exist v ∈ U_{k+1} and two integers i, j ≥ 0 such that

v X^i ∩ X^j ≠ ∅,   i + j + k + 1 = n.
Thus there are words x ∈ X^i, y ∈ X^j such that vx = y. Now v ∈ U_{k+1}, and there are two cases. Either there is a word z ∈ X such that

z v = u ∈ U_k,

or there exist z ∈ X, u ∈ U_k such that

z = u v.
Fig. 1.4 The two cases.
In the first case [see Fig. 1.4(a)], one has ux = zy, thus

u X^i ∩ X^{j+1} ≠ ∅,   u ∈ U_k.

In the second case [see Fig. 1.4(b)], one has zx = uvx = uy, thus

u X^j ∩ X^{i+1} ≠ ∅,   u ∈ U_k.
In both cases, formula (3.1) is satisfied.

Conversely, assume that there are u ∈ U_k and i, j ≥ 0 with

u X^i ∩ X^j ≠ ∅,   i + j + k = n.

Then

u x1 x2 ··· xi = y1 y2 ··· yj

for some x1,...,xi, y1,...,yj in X. If j = 0, then u x1 ··· xi = 1, whence i = 0 and k = n, contrary to k < n. Thus j ≥ 1. Once more, we distinguish two cases, according to the length of u compared to the length of y1.

If u = y1 v for some v ∈ A^+, then v ∈ X^{-1}U_k ⊂ U_{k+1} and further

v x1 x2 ··· xi = y2 ··· yj.

Thus v X^i ∩ X^{j−1} ≠ ∅ and, by the induction hypothesis, 1 ∈ U_n.

If y1 = u v for some v ∈ A^+, then v ∈ U_k^{-1}X ⊂ U_{k+1} and

x1 x2 ··· xi = v y2 ··· yj,

showing that X^i ∩ vX^{j−1} ≠ ∅. Thus again 1 ∈ U_n, by the induction hypothesis. This concludes the proof. □

It may be noteworthy that the nature of U1 did not play any role in the proof of the lemma. In other words, the lemma remains true if U1 is replaced by any set Y, provided that for n > 1 the sets U_n are defined as before. However, the definition of U1 is essential in the proof of Theorem 3.1 itself.

Proof of Theorem 3.1 If X is not a code, then there is a relation

x1 x2 ··· xp = y1 y2 ··· yq,   xi, yj ∈ X,   x1 ≠ y1.
Assume |y1| < |x1|. Then x1 = y1 u for some u ∈ A^+. But then

u ∈ U1   and   u X^{p−1} ∩ X^{q−1} ≠ ∅.

According to the lemma, 1 ∈ U_{p+q−1}. Conversely, if 1 ∈ U_n, take k = 1 in the lemma. There exist u ∈ U1 and integers i, j ≥ 0 such that uX^i ∩ X^j ≠ ∅. Now u ∈ U1 implies that xu = y for some x, y ∈ X. Furthermore x ≠ y since u ≠ 1. It follows from xuX^i ∩ xX^j ≠ ∅ that yX^i ∩ xX^j ≠ ∅, showing that X is not a code. This establishes the theorem. □

Proposition 3.3 shows that Theorem 3.1 provides an algorithm for testing whether a recognizable set is a code.
PROPOSITION 3.3 If X ⊂ A^+ is a recognizable set, then the set of all U_n (n ≥ 1) is finite.

This statement is straightforward for a finite set X, since each U_n is composed of right factors of words in X.

Proof Let σ be the syntactic congruence of X, defined by

w ≡ w' mod σ   iff   (∀u, v ∈ A*:  uwv ∈ X ⟺ uw'v ∈ X).

Let ρ be the congruence of A* with the two classes {1} and A^+, and let τ = σ ∩ ρ. We use the following general fact: if L ⊂ A* is a union of equivalence classes of a congruence θ, then for any subset Y of A*, Y^{-1}L is a union of equivalence classes mod θ. (Indeed, let z ∈ Y^{-1}L and z' ≡ z mod θ. Then yz ∈ L for some y ∈ Y, whence yz' ∈ L. Thus z' ∈ Y^{-1}L.)

We prove that each U_n is a union of equivalence classes of τ by induction on n ≥ 1. For n = 1, X is a union of classes of σ, thus X^{-1}X also is a union of classes of σ, and finally U1 = X^{-1}X − 1 is a union of classes of τ. Next, if U_n is a union of classes of τ, then by the previous fact both U_n^{-1}X and X^{-1}U_n are unions of classes of τ. Thus U_{n+1} is a union of classes of τ. The fact that X is recognizable implies that τ has finite index. The result follows. □
EXAMPLE 3.1 (continued) For X = {b, abb, abbba, bbba, baabb}, we obtain

U1 = {ba, bba, aabb};
X^{-1}U1 = {a, ba};   U1^{-1}X = {abb};
U2 = {a, ba, abb};
X^{-1}U2 = {a, 1};   U2^{-1}X = {bb, bbba, abb, 1, ba}.

Thus 1 ∈ U3 and X is not a code.
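The computation of the sets U_n is easy to mechanize. The following Python sketch (function names are ours) implements the test of Theorem 3.1 for a finite set of nonempty words; since every U_n consists of right factors of words of X, the sequence of sets must eventually repeat, at which point X is a code.

```python
def quotient(S, T):
    # S^{-1} T = { w : s w is in T for some s in S }
    return {t[len(s):] for s in S for t in T if t.startswith(s)}

def is_code(X):
    """Sardinas-Patterson style test (Theorem 3.1) for a finite set X
    of nonempty words over a finite alphabet."""
    X = set(X)
    if "" in X:
        return False
    U = quotient(X, X) - {""}                 # U_1 = X^{-1}X - 1
    seen = set()
    while "" not in U:                        # 1 in U_n <=> double factorization
        key = frozenset(U)
        if key in seen:
            return True                       # the sets U_n cycle: X is a code
        seen.add(key)
        U = quotient(X, U) | quotient(U, X)   # U_{n+1}
    return False
```

For instance, is_code({"b", "abb", "abbba", "bbba", "baabb"}) returns False, in accordance with Example 3.1.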
EXAMPLE 3.2 Let X = {a, ab, ba} and A = {a, b}. We have

U1 = {b};   U2 = {a};   U3 = {1, b};   U4 = X;   U5 = U3.

The set U3 contains the empty word. Thus X is not a code.

EXAMPLE 3.3 Let X = {aa, ba, bb, baa, bba} and A = {a, b}. We obtain U1 = {a}, U2 = U1. Thus U_n = {a} for all n ≥ 1 and X is a code.

If X ⊂ A^+ is prefix (thus a code), then U1 = X^{-1}X − 1 = ∅. Thus the algorithm ends immediately for such codes. On the other hand, if X is suffix, the algorithm does not stop after one step. This asymmetrical behavior is due of course to the definition of the U_n, which favors computing from left to right.

4. MEASURE OF A CODE
In this section, we give a precise formulation to the idea that a code has only few words, by introducing the notion of the measure of a code.

A Bernoulli distribution on A* is a function

π: A* → ℝ₊

which is a morphism into the multiplicative monoid ℝ₊ of nonnegative real numbers, and which moreover satisfies

Σ_{a∈A} π(a) = 1.

A distribution is positive if π(a) > 0 for all a ∈ A. It follows from the definition that

π(1) = 1

and that

Σ_{w∈A^n} π(w) = 1,   n ≥ 0.

Indeed, for u ∈ A^n, we have

Σ_{a∈A} π(ua) = π(u) Σ_{a∈A} π(a) = π(u),

thus

Σ_{w∈A^{n+1}} π(w) = Σ_{u∈A^n} π(u).

Thus π defines a probability distribution on each set A^n. We extend π to 𝔓(A*) by setting, for L ⊂ A*,

π(L) = Σ_{w∈L} π(w).
Then π becomes a function

π: 𝔓(A*) → ℝ₊ ∪ {∞}

and satisfies the following properties:

π(L) ≥ 0 for all L ⊂ A*;   π(∅) = 0;

for any family (L_i)_{i∈I} of subsets of A*,

π(∪_{i∈I} L_i) ≤ Σ_{i∈I} π(L_i);   (4.1)

if the sets L_i are pairwise disjoint, then

π(∪_{i∈I} L_i) = Σ_{i∈I} π(L_i).   (4.2)

The value π(L) is the measure of the set L relative to π. It is a nonnegative number or +∞. There is a special case of these formulas which is useful for the computation of the measure of a set. Let L ⊂ A*, and for n ≥ 0 set

s_n = π{w ∈ L : |w| ≤ n}.

Then the following formula holds:

π(L) = sup_{n≥0} s_n.   (4.3)
EXAMPLE 4.1 Let A be a finite nonempty set, and let

π(a) = 1/Card(A),   a ∈ A.

Then π defines a Bernoulli distribution on A* called the uniform distribution.
EXAMPLE 4.2 Let A = {a, b} and X = {a, ba, bb}. Let π: A* → ℝ₊ be a distribution. Setting as usual

π(a) = p,   π(b) = q = 1 − p,

we have

π(X) = p + pq + q^2 = p + pq + (1 − p)q = p + q = 1.

If L, M are subsets of A*, then

LM = ∪_{l∈L, m∈M} {lm}.   (4.4)
It follows from (4.1) that

π(LM) ≤ π(L)π(M).   (4.5)

The inequality in (4.5) may be strict if the union in (4.4) is not disjoint. For any subset X of A*, Eqs. (4.1) and (4.5) imply that

π(X*) ≤ Σ_{n≥0} π(X^n) ≤ Σ_{n≥0} π(X)^n.   (4.6)

Thus if π(X) < 1, then π(X*) < ∞. Proposition 4.1 shows that for codes, the converse implication holds, and that the inequalities in (4.6) are equalities.
PROPOSITION 4.1 Let X ⊂ A^+ and let π be a Bernoulli distribution on A*.

(1) If X is a code, then

π(X^n) = π(X)^n   for n ≥ 1;   π(X*) = Σ_{n≥0} π(X)^n;   (4.7)

and in particular

π(X*) < ∞   iff   π(X) < 1.

(2) Conversely, if π is positive, if π(X) is finite, and if

π(X^n) = π(X)^n   for all n ≥ 1,   (4.8)

then X is a code.

Proof For n ≥ 1, let S_n be the n-fold Cartesian product of X: S_n = X × X × ··· × X.
(1) Assume that X is a code. Then the function

(x1, ..., xn) ↦ x1 ··· xn

is a bijection of S_n onto X^n. Consequently

π(X)^n = Σ_{(x1,...,xn)∈S_n} π(x1) ··· π(xn) = Σ_{x∈X^n} π(x) = π(X^n).

This shows the first equality in (4.7). Next, the sets (X^n)_{n≥0} are pairwise disjoint. Thus

π(X*) = Σ_{n≥0} π(X^n).

This together with the first equality proves the second one. The last claim is an immediate consequence.

(2) Assume that X is not a code. Then there is a word u ∈ X* having two distinct factorizations in words in X, say

u = x1 x2 ··· xn = x'1 x'2 ··· x'm,
with n, m ≥ 1 and xi, x'j ∈ X. The word w = uu has two factorizations

w = x1 ··· xn x'1 ··· x'm = x'1 ··· x'm x1 ··· xn

of the same length k = n + m. It follows that

π(X)^k = Σ_{(y1,...,yk)∈S_k} π(y1 ··· yk) ≥ π(X^k) + π(w).

But the finiteness of π(X) and condition (4.8) imply π(w) ≤ 0, hence π(w) = 0. This contradicts the hypothesis that π is positive. □
THEOREM 4.2 Let X be a code over A. For all Bernoulli distributions π on A*, we have π(X) ≤ 1.

In the case where the alphabet A is finite and where the distribution π is uniform, we obtain

COROLLARY 4.3 Let X be a code over an alphabet with k letters; then

Σ_{x∈X} k^{−|x|} ≤ 1.

Proof of Theorem 4.2 We prove the statement first in the case where the numbers |x|, for x ∈ X, are bounded. (This is the case of a finite code, but a code satisfying the condition is not necessarily finite when the alphabet is infinite.) Thus we assume that there is some integer k ≥ 1 with

X ⊂ A ∪ A^2 ∪ ··· ∪ A^k.

It follows that for n ≥ 1, X^n ⊂ A ∪ A^2 ∪ ··· ∪ A^{nk}, hence

π(X^n) ≤ nk.   (4.9)
Arguing by contradiction, assume now that π(X) > 1, that is, π(X) = 1 + ε for some ε > 0. Then by Eq. (4.7), for n ≥ 1, π(X^n) = (1 + ε)^n. In view of (4.9), we have for all n ≥ 1

(1 + ε)^n ≤ kn,

which is impossible. Thus π(X) ≤ 1.

If X is an arbitrary code, set for n ≥ 1, X_n = {x ∈ X : |x| ≤ n}. The set X_n is a code satisfying the conditions of the first part of the proof. Consequently

π(X_n) ≤ 1.
This shows, in view of Eq. (4.3), that π(X) = sup_{n≥1} π(X_n) ≤ 1. □
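For a finite code and the uniform distribution, Corollary 4.3 is a quick sanity check (a necessary condition only, as Example 4.4 below shows). A sketch, with names of our choosing:

```python
from fractions import Fraction

def kraft_sum(X, k):
    # uniform measure of X over a k-letter alphabet: the sum of k^(-|x|)
    return sum(Fraction(1, k ** len(x)) for x in X)

# {aa, ba, bb, baa, bba} is a code (Example 3.3): the sum cannot exceed 1
assert kraft_sum({"aa", "ba", "bb", "baa", "bba"}, 2) <= 1
```

A sum strictly greater than 1 certifies that the set is not a code; for instance kraft_sum({"a", "b", "ab"}, 2) exceeds 1.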
EXAMPLE 4.3 Let A = {a, b}, and X = {b, ab, ba}. Define π by π(a) = 1/3, π(b) = 2/3. Then

π(X) = 2/3 + 2/9 + 2/9 = 10/9 > 1,

thus X is not a code. Note that for π(a) = π(b) = 1/2, we get π(X) = 1. Thus it is impossible to conclude that X is not a code from the second distribution. The following example shows that the converse of Theorem 4.2 is false.
EXAMPLE 4.4 Let A = {a, b}, and X = {ab, aba, aab}. The set X is not a code, since

(aba)(ab) = (ab)(aab).

However, any Bernoulli distribution π gives π(X) < 1. Indeed, let π(a) = p, π(b) = q. Then

π(X) = pq + 2p^2 q.

It is easily seen that we always have pq ≤ 1/4 and also p^2 q ≤ 4/27, provided that p + q = 1. Consequently

π(X) ≤ 1/4 + 8/27 < 1.
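The bound of Example 4.4 is easy to confirm numerically; the grid scan below is an illustration, not a proof:

```python
# pi(X) = p*q + 2*p^2*q with q = 1 - p, for X = {ab, aba, aab}
def pi_X(p):
    q = 1.0 - p
    return p * q + 2 * p * p * q

# exact bounds: p*q <= 1/4 and p^2*q <= 4/27, so pi_X(p) <= 1/4 + 8/27 < 1
worst = max(pi_X(i / 1000.0) for i in range(1, 1000))
assert worst < 1.0
```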
This example gives a good illustration of the limits of Theorem 4.2 in its use for testing whether a set is a code. Indeed, the set X of Example 4.4, where the test fails, is obtained from the set of Example 4.3, where the test is successful, simply by replacing b by ab. This shows that the counting argument represented by a Bernoulli distribution takes into account the lengths as well as the number of words. In other words, Theorem 4.2 allows us to conclude that X is not a code only if there are "too many too short words."

Proposition 4.4 is very useful for proving that a code is maximal. The direct method for proving maximality, based on the definition, is indeed usually much more complicated than the verification of the conditions of the proposition. A more precise statement, holding for a large class of codes, will be given in the next section (Theorem 5.10).

PROPOSITION 4.4 Let X be a code over A. If there exists a positive Bernoulli distribution π on A* such that π(X) = 1, then the code X is maximal.

Proof Assume that X is not maximal. Then there is some word y ∉ X such that Y = X ∪ y is a code. By Theorem 4.2, we have π(Y) ≤ 1. On the
other hand,

π(Y) = π(X) + π(y) = 1 + π(y).

Thus π(y) = 0, which is impossible since π is positive. □
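Propositions 4.1 and 4.4 suggest a concrete check. In the Python sketch below (our notation), the measure of a finite set is computed exactly with rationals; for the prefix code X = {a, ba, bb} of Example 4.2 the measure is 1 for every choice of π(a), so X is maximal.

```python
from fractions import Fraction

def measure(X, pi):
    # pi(X): the sum, over the words of X, of the product of the
    # letter probabilities (a Bernoulli distribution)
    total = Fraction(0)
    for w in X:
        m = Fraction(1)
        for c in w:
            m *= pi[c]
        total += m
    return total

for p in (Fraction(1, 2), Fraction(1, 3), Fraction(9, 10)):
    pi = {"a": p, "b": 1 - p}
    assert measure({"a", "ba", "bb"}, pi) == 1   # X is a maximal code
```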
EXAMPLE 4.2 (continued) Since π(X) = 1 and X is prefix, X is a maximal code.

EXAMPLE 4.5 We examine again the Dyck code D over A = {a, b} of Example 2.5. Let π: A* → ℝ₊ be a positive Bernoulli distribution, and let

π(a) = p,   π(b) = q,   d_n = π(D ∩ A^n),   d_n^(*) = π(D* ∩ A^n),   n ≥ 0.

In view of computing the measure of D, consider the formal power series

d(t) = Σ_{n≥0} d_n t^n,   d^(*)(t) = Σ_{n≥0} d_n^(*) t^n.

We have for n ≥ 0

d_{2n}^(*) = C(2n, n) p^n q^n,   d_{2n+1}^(*) = 0,

where C(m, k) denotes a binomial coefficient. On the other hand, π(L ∩ A^n) ≤ 1 for any L ⊂ A* and for all n. In particular, d_n^(*) ≤ 1 for all n ≥ 0, and consequently the series Σ_{n≥0} d_n^(*) z^n converges in the region {z ∈ ℂ : |z| < 1}. In fact

d^(*)(z) = 1/√(1 − 4pqz^2)   for |z| < 1,   (4.10)

since in this region the (generalized) binomial formula gives

1/√(1 − 4pqz^2) = Σ_{n≥0} (−1)^n C(−1/2, n) 4^n p^n q^n z^{2n},

and an elementary verification shows that

C(2n, n) = (−1)^n C(−1/2, n) 4^n.   (4.11)

Each word in D* ∩ A^n has a unique factorization into words in D. Thus

d_n^(*) = Σ_{k≥0} Σ_{m1+···+mk=n} d_{m1} d_{m2} ··· d_{mk}.

For the corresponding generating functions, this gives

d^(*)(t) = 1/(1 − d(t)).   (4.12)
Consequently we have, for |z| < 1,

d(z) = 1 − √(1 − 4pqz^2).   (4.13)

It follows that

π(D) = Σ_{n≥0} d_n = lim_{z→1, z∈[0,1[} Σ_{n≥0} d_n z^n = lim_{z→1, z∈[0,1[} (1 − √(1 − 4pqz^2)) = 1 − √(1 − 4pq).

It is easily seen that 1 − 4pq = (p − q)^2, whence √(1 − 4pq) = |p − q|. Thus we get

π(D) = 1 − |p − q| = 1 − |π(a) − π(b)|.

For π(a) = π(b) = 1/2, we have π(D) = 1. This gives another proof that D is a maximal code (Example 2.5). Note that π(D) < 1 for any other Bernoulli distribution.

By Formula (4.13), we have [using again the binomial formula and Formula (4.11)]

d_{2n} = (1/(2n − 1)) C(2n, n) p^n q^n = (2/n) C(2n − 2, n − 1) p^n q^n.   (4.14)

This gives as a byproduct the following formula:

Card(D ∩ A^{2n}) = (2/n) C(2n − 2, n − 1).

Let D' = D ∩ aA* = D ∩ A*b. The set D'* is exactly the set of well-parenthesized strings when letter a stands for '(' and letter b for ')'. The previous formulas allow us to compute easily its length distribution. Let D'' = D ∩ bA* = D ∩ A*a, and let

d'_n = π(D' ∩ A^n),   d''_n = π(D'' ∩ A^n),

and let d'(t), d''(t) be the series d'(t) = Σ_{n≥0} d'_n t^n, d''(t) = Σ_{n≥0} d''_n t^n. The set D'' is the reversal of D', and d'(t) = d''(t). Since d(t) = d'(t) + d''(t), we obtain d'(t) = (1/2) d(t). Let d'^(*)_n = π(D'* ∩ A^n) and d'^(*)(t) = Σ_{n≥0} d'^(*)_n t^n. Then, as for (4.12),

d'^(*)(t) = 1/(1 − d'(t)).
Since d'(t) = (1/2) d(t), we obtain from (4.13)

d'^(*)(z) = 2/(1 + √(1 − 4pqz^2)) = (1 − √(1 − 4pqz^2))/(2pqz^2).

Therefore, by (4.14),

d'^(*)_{2n} = (1/(n + 1)) C(2n, n) p^n q^n,

and also

Card(D'* ∩ A^{2n}) = (1/(n + 1)) C(2n, n).   (4.15)

The numbers γ_n = (1/(n + 1)) C(2n, n), for n ≥ 0, are called the Catalan numbers. Their first values are given below:

n     0   1   2   3   4    5
γ_n   1   1   2   5   14   42
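The counting formulas of this example are easy to check by brute force for small n. The sketch below (our encoding, with letters 'a' and 'b') enumerates all words of length 2n and counts those in D and those in D'*:

```python
from itertools import product
from math import comb

def in_dyck_code(w):
    # w in D: equal numbers of a's and b's, and no proper nonempty
    # prefix of w is already balanced
    bal = 0
    for i, c in enumerate(w):
        bal += 1 if c == "a" else -1
        if bal == 0 and i < len(w) - 1:
            return False
    return len(w) > 0 and bal == 0

def in_dyck_star_prime(w):
    # w in D'*: reading a as '(' and b as ')', the word is well parenthesized
    bal = 0
    for c in w:
        bal += 1 if c == "a" else -1
        if bal < 0:
            return False
    return bal == 0

for n in range(1, 7):
    words = ["".join(t) for t in product("ab", repeat=2 * n)]
    # Card(D  n A^{2n}) = (2/n) C(2n-2, n-1)
    assert sum(map(in_dyck_code, words)) == 2 * comb(2 * n - 2, n - 1) // n
    # Card(D'* n A^{2n}) = Catalan number (4.15)
    assert sum(map(in_dyck_star_prime, words)) == comb(2 * n, n) // (n + 1)
```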
EXAMPLE 4.6 The set X = ∪_{n≥0} a^n b A^n is prefix, and therefore is a code over A = {a, b}. It is a maximal code. Let indeed π be a positive Bernoulli distribution, and set p = π(a). Then

π(a^n b A^n) = p^n (1 − p),

hence

π(X) = Σ_{n≥0} p^n (1 − p) = (1/(1 − p))(1 − p) = 1.
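The geometric sum of Example 4.6 can be checked exactly with rationals; the partial sum over n < N differs from 1 by exactly p^N (the chosen value of p is arbitrary):

```python
from fractions import Fraction

p = Fraction(2, 5)            # any 0 < p < 1 would do
q = 1 - p
# pi(a^n b A^n) = p^n * q, since the trailing factor A^n has measure 1
partial = sum(p ** n * q for n in range(50))
assert 1 - partial == p ** 50   # the partial sums converge to 1
```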
5. COMPLETE SETS
Any subset of a code is itself a code. Consequently, it is important to know the structure of maximal codes. Many of the results contained in this book are about maximal codes. The notion of complete sets introduced in this section is in some sense dual to that of a code. For instance, any set containing a complete set is itself complete. Even if the duality is not perfectly balanced, it allows us to formulate maximality in terms of completeness, thus replacing an extremal property by a combinatorial one.

Let M be a monoid and let P be a subset of M. An element m ∈ M is completable in P if there exist u, v in M such that umv ∈ P.
It is equivalent to say that P meets the two-sided ideal MmM:

MmM ∩ P ≠ ∅,

or, in other words, that m ∈ F(P) = M^{-1}PM^{-1}. An element which is not completable in P is incompletable. The set of elements completable in P is of course F(P); the set F̄(P) = M − F(P) of incompletable elements is a two-sided ideal of M which is disjoint from P. A subset P of M is dense in M if all elements of M are completable in P, thus if F(P) = M or, in an equivalent way, if P meets all (two-sided) ideals in M. Clearly, each superset of a dense set is dense. The use of the adjective dense is justified by the fact that dense subsets of M are exactly the dense sets relative to some topology on M (see Exercise 5.3).
EXAMPLE 5.1 Let A = {a}. The dense subsets of A* are the infinite subsets.
EXAMPLE 5.2 In a group G, any nonempty subset is dense, since GmG = G for m in G.

EXAMPLE 5.3 The Dyck code D over A = {a, b} is dense in A*. Indeed, if w ∈ A*, set n = |w| + 1 and m = n + |w|_a − |w|_b; then the word u = a^n w b^m is easily seen to be in D*. Furthermore, no proper nonempty left factor of u is in D*. Thus u is in D, showing that w is completable in D.

It is useful to have a special term for codes X such that the submonoid X* is dense. A subset P of M is called complete in M if the submonoid generated by P is dense. Every dense set is also complete. Next, a subset X of A* is complete iff
F(X*) = A*.

EXAMPLE 5.4 Any nonempty subset of a^+ is complete, since it generates an infinite submonoid.

The following theorem is of fundamental importance.

THEOREM 5.1 Any maximal code is complete.
To prove this result, we first describe a method for embedding any code in a complete code.

PROPOSITION 5.2 Let X ⊂ A^+ be a code. Let y ∈ A* be an unbordered word such that A*yA* ∩ X* = ∅, and let

U = A* − X* − A*yA*.

Then the set

Y = X ∪ y(Uy)*

is a complete code.
Proof Let V = A* − A*yA*.
Then by assumption X* ⊂ V and U = V − X*. Let us first observe that the set

Z = Vy

is a prefix code. Assume indeed that uy is a proper left factor of u'y for two words u and u' in V. Since y is unbordered, uy must be a left factor of u'. But then u' is in A*yA*, a contradiction. Thus Z is prefix.

Now we show that Y is a code. Assume the contrary and consider a relation

y1 y2 ··· yn = y'1 y'2 ··· y'm

with y1,...,yn, y'1,...,y'm ∈ Y and y1 ≠ y'1. The set X being a code, one of these words must be in Y − X. Assume that one of y1,...,yn is in Y − X, and let p be the smallest index such that y_p ∈ y(Uy)*. From y ∉ F(X*), it also follows that y_p ∉ F(X*). Consequently one of y'1,...,y'm is in y(Uy)*. Let q be the smallest index such that y'_q ∈ y(Uy)*. Then

y1 ··· y_{p−1} y ∈ Z,   y'1 ··· y'_{q−1} y ∈ Z

are left factors of the same word, whence y1 ··· y_{p−1} = y'1 ··· y'_{q−1} since Z is prefix. The set X is a code; thus from y1 ≠ y'1 it follows that p = q = 1. Set

y1 = y u1 y ··· u_k y,   y'1 = y u'1 y ··· u'_{k'} y,

with u1,...,u_k, u'1,...,u'_{k'} ∈ U. Assume that y1 is a proper left factor of y'1. Since Z is prefix, the set Z* is right unitary. From U ⊂ V, it follows that each u_i y, u'_i y is in Z. Consequently

u1 = u'1, ..., u_k = u'_k.

Let t = u'_{k+1} y ··· y u'_{k'} y. We have

y2 ··· yn = t y'2 ··· y'm.

The word y is a factor of t, and thus occurs also in y2 ··· yn. This shows that one of y2,...,yn, say y_r, is in y(Uy)*. Suppose r is chosen minimal. Then y2 ··· y_{r−1} y ∈ Z and u'_{k+1} y ∈ Z are left factors of the same word. With the set Z being prefix, we have

u'_{k+1} = y2 ··· y_{r−1}.

Thus u'_{k+1} ∈ X*, in contradiction with the hypothesis u'_{k+1} ∈ U. This shows that Y is a code.

Finally, let us show that Y is complete. Let w ∈ A* and set

w = u1 y u2 y ··· y u_{n−1} y u_n
with n ≥ 1 and u_i ∈ A* − A*yA*. Then ywy ∈ Y*. Indeed, let u_{i1}, u_{i2}, ..., u_{ik} be those u_i's which are in X*. Then

ywy = (y u1 y ··· y u_{i1−1} y) u_{i1} (y u_{i1+1} y ··· y u_{i2−1} y) u_{i2} ··· u_{ik} (y u_{ik+1} y ··· y u_n y).

Each of the parenthesized words is in Y, and each u_{ij} is in X*. Thus the whole word is in Y*. □
Proof of Theorem 5.1 Let X ⊂ A^+ be a code which is not complete. If Card(A) = 1, then X = ∅, and X is not maximal. If Card(A) ≥ 2, consider a word u ∉ F(X*). According to Proposition 3.6 of Chapter 0, there is a word v ∈ A* such that y = uv is unbordered. We still have y ∉ F(X*). It follows then from Proposition 5.2 that X ∪ y is a code. Thus X is not maximal. □
EXAMPLE 5.5 Let A = {a, b} and X = {bb, bbab, babb}. The word y = aba is incompletable in X*. However, X ∪ y is not a code, since

(bb)(aba)(babb) = (bbab)(aba)(bb).

This example shows that Proposition 5.2 is false without the assumption that y is unbordered.
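The failure in Example 5.5 is a one-line string computation:

```python
# Example 5.5: y = aba is incompletable in X*, yet X with aba adjoined
# is not a code (note that aba is bordered, with border a)
left = "bb" + "aba" + "babb"       # (bb)(aba)(babb)
right = "bbab" + "aba" + "bb"      # (bbab)(aba)(bb)
assert left == right == "bbabababb"
```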
EXAMPLE 5.6 We are now able to verify one of the claims made in Section 1, namely that there do exist finite codes which are not contained in a maximal finite code. Let X = {a^5, ba^2, ab, b}. It is a code over A = {a, b}. Any maximal code containing X is infinite. Indeed, let Y be a maximal code over A containing X, and assume Y finite. Set m = max{|y| : y ∈ Y} and let

u = b^m a^{4+5m} b^m.

The maximality of Y implies its completeness. Thus u is a factor of a word in Y*. Neither b^m nor a^{4+5m} can be proper factors of a word in Y. Thus there exist y, y' ∈ Y ∪ 1 and integers p, q, r ≥ 0 such that

u = b^p y a^q y' b^r

with a^q ∈ Y* (see Fig. 1.5). The word a^5 is the only word in Y which does not contain b; thus q is a multiple of 5; this implies that |y|_a + |y'|_a ≡ 4 mod 5. Let y = b^h a^{5s+i} and y' = a^{j+5t} b^k with 0 ≤ i, j ≤ 4. We have i + j ≡ 4 mod 5, whence i + j = 4. We will show that any choice of i, j leads to the conclusion that Y is not a code. This yields the contradiction.

If i = 0, j = 4, then k ≥ 1 and we have ba^2 · a^{4+5t} b^k = b · a^{5(t+1)} · ab · b^{k−1}.

If i = 1, j = 3, then b^h a^{5s+1} · b = b^h · a^{5s} · ab.
Fig. 1.5 The factorization of b^m a^{4+5m} b^m in words in Y.
If i = 2, j = 2, then b · a^{2+5t} b^k = ba^2 · a^{5t} · b^k.

If i = 3, j = 1, then h ≥ 1 and b^h a^{5s+3} · b = b^{h−1} · ba^2 · a^{5s} · ab.

Finally, if i = 4, j = 0, then b^h a^{5s+4} · ab = b^h · a^{5(s+1)} · b.

This example is a particular case of a general construction (see Exercise 5.2).

The converse of Theorem 5.1 is false (see Example 5.7). However, it is true under an additional assumption that relies on the following definition. A subset P of a monoid M which is not dense is called thin. If P is thin, there is at least one element m in M which is incompletable in P, i.e., such that MmM ∩ P = ∅, or equivalently F(P) ≠ M. The use of the adjective thin is justified by results like Propositions 5.3 or 5.6.

PROPOSITION 5.3 Let M be a monoid and P, Q ⊂ M. Then the set P ∪ Q is thin iff P and Q are thin. If R is dense and P is thin, then R − P is dense.

Proof
If P and Q are thin, then there exist m, n ∈ M such that

MmM ∩ P = ∅,   MnM ∩ Q = ∅.

Then mn is incompletable in P ∪ Q, and therefore P ∪ Q is thin. Conversely, if P ∪ Q is thin, there exists m ∈ M which is incompletable in P ∪ Q and therefore incompletable in P and also in Q. Hence P and Q are thin. If R is dense in M and P is thin, then R − P cannot be thin, since otherwise R = (R − P) ∪ P would also be thin by the above statement. □

Thin subsets of a free monoid have additional properties. In particular, any finite subset of A* is clearly thin. Furthermore, if X, Y are thin subsets of A*, then the set XY is thin. In fact, if u ∉ F(X), v ∉ F(Y), then uv ∉ F(XY).

EXAMPLE 5.7 The Dyck code D over A = {a, b} is dense (see Example 5.3). It is a maximal code since it is a group code (see Example 2.5). For each x ∈ D, the code D − x remains dense, in view of Proposition 5.3, and thus remains complete. But of course D − x is no longer a maximal code. This example shows that the converse of Theorem 5.1 does not hold in general.
Theorem 5.1 admits a converse in the case of codes which are both thin and complete. Before going on to prove this, we give some useful properties of these sets.

PROPOSITION 5.4 Let X ⊂ A* be a thin and complete set, and let w be a word which is incompletable in X. Then

A* = ∪_{d∈D, g∈G} d^{-1}X*g^{-1} = D^{-1}X*G^{-1},   (5.3)

where D and G are the sets of right (resp. left) factors of w. Note that the set D × G is finite.

Proof Let z ∈ A*. Since X* is dense, the word wzw is completable in X*; thus for some u, v ∈ A*,

u w z w v ∈ X*.

Now w is not a factor of a word in X. Thus there exist two factorizations w = g1 d = g d1 such that

u g1,  d z g,  d1 v ∈ X*.

This shows that z ∈ d^{-1}X*g^{-1}. □
PROPOSITION 5.5 Let X be a thin and complete subset of A*. For any positive Bernoulli distribution π on A*, we have

π(X) ≥ 1.

Proof We have π(A*) = ∞. In view of Eq. (5.3), there exists a pair (d, g) ∈ D × G such that π(d^{-1}X*g^{-1}) = ∞. Now

d(d^{-1}X*g^{-1})g ⊂ X*.

This implies

π(d) π(d^{-1}X*g^{-1}) π(g) ≤ π(X*).

The positivity of π shows that π(dg) ≠ 0. Thus π(X*) = ∞. Now

π(X*) ≤ Σ_{n≥0} π(X^n) ≤ Σ_{n≥0} (π(X))^n.

Assuming π(X) < 1, we get π(X*) < ∞. Thus π(X) ≥ 1. □

Note the following property showing, as already claimed before, that a thin set has only few words.
PROPOSITION 5.6 Let X ⊂ A* be a thin set. For any positive Bernoulli distribution on A*, we have π(X) < ∞.

Proof Let w be a word which is not a factor of a word in X: w ∉ F(X). Set n = |w|. We have n ≥ 1. For 0 ≤ i ≤ n − 1, consider

X_i = {x ∈ X : |x| ≡ i mod n}.

It suffices to show that π(X_i) is finite for i = 0,...,n − 1. Now

X_i ⊂ A^i (A^n − w)*.

Since A^n − w is a code, we have

π[(A^n − w)*] = Σ_{k≥0} (π(A^n − w))^k = Σ_{k≥0} (1 − π(w))^k.

The positivity of π implies π(w) > 0, and consequently

π[(A^n − w)*] = 1/π(w).

Thus π(X_i) ≤ 1/π(w). □

We are now ready to prove
THEOREM 5.7 Any thin and complete code is maximal.

Proof Let X be a thin, complete code and let π be a positive Bernoulli distribution. By Proposition 5.5, π(X) ≥ 1, and by Theorem 4.2, we have π(X) ≤ 1. Thus π(X) = 1. But then Proposition 4.4 shows that X is maximal. □

Theorems 5.1 and 5.7 can be grouped together to give

THEOREM 5.8 Let X be a code over A. Then X is complete iff X is dense or maximal.

Proof Assume X is complete. If X is not dense, then it is thin, and consequently X is maximal by the previous theorem. Conversely, a dense set is complete, and a maximal code is complete by Theorem 5.1. □

Before giving other consequences of these statements, let us present a first application of the combinatorial characterization of maximality.
PROPOSITION 5.9 Let X ⊂ A* be a finite maximal code. For any nonempty subset B of A, the code X ∩ B* is a maximal code over B. In particular, for each letter a ∈ A, there is an integer n such that a^n ∈ X.

Proof The second claim results from the first one by taking B = {a}. Let n = max{|x| : x ∈ X} be the maximal length of words in X, and let ∅ ≠ B ⊂ A. To show that Y = X ∩ B* is a maximal code over B, it suffices to show,
Fig. 1.6 The factorization of u b^{n+1} w b^{n+1} v.
in view of Theorem 5.7, that Y is complete (in B*). Let w ∈ B* and b ∈ B. Consider the word w' = b^{n+1} w b^{n+1}. The completeness of X gives words u, v ∈ A* such that

u w' v = x1 x2 ··· xk

for some x1, x2, ..., xk ∈ X. But by the definition of n, there exist two integers i, j (1 ≤ i < j ≤ k) such that

xi x_{i+1} ··· xj = b^r w b^s

for some r, s ∈ {1,...,n} (see Fig. 1.6). But then xi, x_{i+1},...,xj ∈ X ∩ B* = Y. This shows that w is completable in Y*. □

Let X ⊂ A^+ be a finite maximal code, and let a ∈ A be a letter. The (unique) integer n such that a^n ∈ X is called the order of a relative to X.

THEOREM 5.10 Let X be a thin code. The following conditions are equivalent:

(i) X is a maximal code.
(ii) There exists a positive Bernoulli distribution π with π(X) = 1.
(iii) For any positive Bernoulli distribution π, we have π(X) = 1.
(iv) X is complete.

Proof (i) ⇒ (iv) is Theorem 5.1. (iv) ⇒ (iii) is a consequence of Theorem 4.2 and Proposition 5.5. (iii) ⇒ (ii) is not very hard, and (ii) ⇒ (i) is Proposition 4.4. □
Theorem 5.10 gives a surprisingly simple method to test whether a thin code is maximal. It suffices to take any positive Bernoulli distribution and to check whether the measure of the code equals 1.

EXAMPLE 5.8 The Dyck code D over A = {a, b} is maximal and complete, but satisfies π(D) = 1 for only one Bernoulli distribution (see Example 4.5). Thus the conditions (i) + (ii) + (iv) do not imply (iii) for dense codes.

EXAMPLE 5.9 The prefix code X = ∪_{n≥0} a^n b A^n over A = {a, b} is dense, since for all w ∈ A*, a^{|w|} b w ∈ X. It satisfies (iii), as we have seen in Example 4.6. Thus X satisfies the four conditions of the theorem without being thin.
THEOREM 5.11 Let X be a thin subset of A^+, and let π be a positive Bernoulli distribution. Any two among the three following conditions imply the third:

(i) X is a code,
(ii) π(X) = 1,
(iii) X is complete.

Proof (i) + (ii) ⇒ (iii). The condition π(X) = 1 implies that X is a maximal code, by Proposition 4.4. Thus by Theorem 5.1, X is complete.

(i) + (iii) ⇒ (ii). Theorem 4.2 and condition (i) imply that π(X) ≤ 1. Now X is thin and complete; in view of Proposition 5.5, we have π(X) ≥ 1.

(ii) + (iii) ⇒ (i). Let n ≥ 1 be an integer. First, we verify that X^n is thin and complete. To see completeness, let u ∈ A*, and let v, w ∈ A* be such that vuw ∈ X*. Then vuw ∈ X^k for some k ≥ 0. Thus (vuw)^n ∈ (X^n)^k ⊂ (X^n)*. This shows that u is completable in (X^n)*. Further, since X is thin and because the product of two thin sets is again thin, the set X^n is thin. Thus, X^n is thin and complete. Consequently, π(X^n) ≥ 1 by Proposition 5.5. On the other hand, we have π(X^n) ≤ π(X)^n = 1, and thus π(X^n) = 1. Thus for all n ≥ 1,

π(X^n) = π(X)^n.
Proposition 4.1 shows that X is a code. □

Thin codes constitute a very important class of codes. They will be characterized by some finiteness condition in Chapter IV. We anticipate these results by proving a particular case which shows that the class of thin codes is quite a large one.

PROPOSITION 5.12 Any recognizable code is thin.

Proof Let X ⊂ A* be a recognizable code, and let 𝒜 = (Q, i, T) be a deterministic complete automaton recognizing X. Associate to a word w the number

ρ(w) = Card(Q·w) = Card{q·w : q ∈ Q}.

We have ρ(w) ≤ Card Q, and for all u, v,

ρ(uwv) ≤ ρ(w).

Let J be the set of words w in A* with minimal ρ(w). The inequality above shows that J is a two-sided ideal of A*. Let w ∈ J, and let P = Q·w. Then P·w = P. Indeed P·w ⊂ Q·w = P, and on the other hand, P·w = Q·w^2. Thus Card(P·w) = ρ(w^2). Since ρ(w) is minimal, ρ(w^2) = ρ(w), whence the equality. This shows that the mapping p ↦ p·w from P onto P is a bijection. It follows that there is some integer n with the property that the mapping p ↦ p·w^n is the identity mapping on P.
Since P = Q·w, we have

q·w = q·w^{n+1}   for all q ∈ Q.

To show that X is thin, it suffices to show that X does not meet the two-sided ideal J. Assume that J ∩ X ≠ ∅ and let x ∈ X ∩ J. Then i·x = t ∈ T. Next, x ∈ J and, by the previous discussion, there is some integer n ≥ 1 such that i·x^{n+1} = t. This implies that x^{n+1} ∈ X. But this is impossible, since X is a code. □

The converse of Proposition 5.12 is false:

EXAMPLE 5.10 The code X = {a^n b^n : n ≥ 1} is thin (e.g., ba is not a factor of X), but X is not recognizable.
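The key inequality ρ(uwv) ≤ ρ(w) of the proof of Proposition 5.12 holds for any complete deterministic automaton, since images of state sets can only shrink under further transitions. A toy check (the transition table is an arbitrary example of ours, not an automaton from the book):

```python
from itertools import product

def rank(delta, states, w):
    # rho(w) = Card(Q . w) for a complete deterministic automaton
    image = set(states)
    for c in w:
        image = {delta[q, c] for q in image}
    return len(image)

Q = range(4)
# hypothetical complete transition table over the alphabet {a, b}
delta = {(0, "a"): 1, (0, "b"): 2, (1, "a"): 1, (1, "b"): 3,
         (2, "a"): 0, (2, "b"): 2, (3, "a"): 2, (3, "b"): 1}

for w in ("ab", "ba", "abba"):
    for u, v in product(("", "a", "b", "ab"), repeat=2):
        assert rank(delta, Q, u + w + v) <= rank(delta, Q, w)
```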
EXAMPLE 5.11 In one interesting case, the converse of Proposition 5.12 holds: any thin group code is recognizable. Indeed, let X ⊂ A* be a group code. Let φ: A* → G be a surjective morphism onto a group G, and let H be a subgroup of G such that X* = φ⁻¹(H). By assumption, X is thin. Let m be a word that is incompletable in X. We show that H has finite index in G, and more precisely that

G = ⋃_p Hφ(p)⁻¹
(where p runs over the left factors of m). Indeed, let g ∈ G and w ∈ φ⁻¹(g). Let u ∈ A* be such that φ(u) is the group inverse of gφ(m). Then φ(wmu) = gφ(m)φ(u) = 1, whence wmu ∈ X*. Now m is incompletable in X. Thus m is not a factor of a word in X, and consequently there is a factorization m = pq such that wp, qu ∈ X*. But then h = φ(wp) ∈ H. Since h = gφ(p), we have g ∈ Hφ(p)⁻¹. This proves the formula.

The formula shows that there are finitely many right cosets of H in G. Thus the permutation group on the right cosets, say K, is also finite. Let α: G → K be the canonical morphism defined by Hr·α(g) = Hrg (see Section 0.8). Then, setting N = {k ∈ K | H·k = H}, we have H = α⁻¹(N) = α⁻¹(α(H)). Thus X* = ψ⁻¹ψ(X*), where ψ = α ∘ φ. Since K is finite, this shows that X* is recognizable. Consequently, X is also recognizable (Exercise 2.7).

Remark We have used in the preceding paragraphs arguments which rely essentially on two techniques: measures on the one hand, which allowed us in particular to prove Theorem 5.7, and direct combinatorial arguments on words on the other (as in the proof of Theorem 5.1). It is interesting to note that some of the proofs can be done by using just one of the two techniques. A careful analysis shows that all preceding statements, with the exception of those involving maximality, can be established by using only arguments on measures. As an example, the implication (ii) ⇒ (iv)
in Theorem 5.10 can be proved as follows, without using the maximality of X. If X is not complete, then X* is thin. Thus, by Proposition 5.6, π(X*) < ∞, which implies π(X) < 1 by Proposition 4.1. Conversely, there exist, for some of the results given here, combinatorial proofs which do not rely on measures. This is the case for Theorem 5.8, where the proof given relies heavily on arguments about measures. Another proof of this result will be given in Chapter IV (Corollary 5.5). This proof is based on the fact that if X ⊂ A⁺ is a thin complete code, then all words w ∈ A* satisfy
(X*wX*)⁺ ∩ X* ≠ ∅.
This implies Theorem 5.8 because, according to this formula, X ∪ {w} is not a code for w ∉ X, and thus X is a maximal code.

6. COMPOSITION
We now introduce a partial binary operation on codes called composition. This operation associates to two codes Y and Z satisfying a certain compatibility condition a third code, denoted by Y ∘ Z. There is a twofold interest in this operation. First, it gives a useful method for constructing more complicated codes from simpler ones. Thus we will see that the composition of a prefix code and a suffix code can result in a code that is neither prefix nor suffix. Second, and this constitutes the main interest of composition, the converse notion of decomposition allows us to study the structure of codes. If a code X decomposes into two codes Y and Z, then these codes are generally simpler.

Let Z ⊂ A* and Y ⊂ B* be two codes with B = alph(Y). The codes Y and Z are composable if there is a bijection from B onto Z. If β is such a bijection, then Y and Z are called composable through β. Then β defines a morphism B* → A* which is injective since Z is a code (Proposition 1.1). The set
X = β(Y) ⊂ Z* ⊂ A*   (6.1)

is obtained by composition of Y and Z (by means of β). We denote it by

X = Y ∘_β Z   or   X = Y ∘ Z,
when the context permits it. Since β is injective, X and Y are related by a bijection, and in particular Card X = Card Y. The words in X are obtained just by replacing, in the words of Y, each letter b by the word β(b) ∈ Z. The injectivity of β, Corollary 1.2, and (6.1) show the following result.
PROPOSITION 6.1 If Y and Z are two composable codes, then X = Y ∘ Z is a code. ∎
EXAMPLE 6.1 Let A = {a, b}, B = {c, d, e}, and let

Z = {a, ba, bb} ⊂ A*,   Y = {cc, d, dc, e, ec} ⊂ B*.

The code Z is prefix, and Y is suffix. Further Card B = Card Z. Thus Y and Z are composable, in particular by means of the morphism β: B* → A* defined by

β(c) = a,   β(d) = ba,   β(e) = bb.

Then X = Y ∘ Z = {aa, ba, baa, bb, bba}. This code X is neither prefix nor suffix. Now choose β′: B* → A* with

β′(c) = ba,   β′(d) = a,   β′(e) = bb.

Then X′ = Y ∘_β′ Z = {baba, a, aba, bb, bbba}. This example shows that the composed code Y ∘_β Z depends essentially on the mapping β.

Let X ⊂ A* be a code, with A = alph(X). Since A is a code over itself and X = ι(X), with ι: A* → A* the identity mapping, we have
X = X ∘_ι A.

Let β: B* → A* be a coding morphism for X. Then X = β(B) and thus

X = B ∘_β X.
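Concretely, forming X = Y ∘_β Z amounts to applying the substitution β letter by letter to the words of Y. A minimal sketch (the function name is ours), using the data of Example 6.1:

```python
def compose(Y, beta):
    """X = Y o Z: replace each letter b of every word of Y by the word beta[b] in Z."""
    return {"".join(beta[b] for b in y) for y in Y}

Y = {"cc", "d", "dc", "e", "ec"}           # suffix code over B = {c, d, e}
beta = {"c": "a", "d": "ba", "e": "bb"}    # bijection from B onto Z = {a, ba, bb}

X = compose(Y, beta)
print(sorted(X))   # ['aa', 'ba', 'baa', 'bb', 'bba']
```

Changing the bijection, as with β′ above, changes the resulting code.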
These examples show that every code is obtained in at least two ways as a composition of codes. The two expressions X = X ∘ A and X = B ∘ X are exactly the particular cases obtained by replacing one of the two codes by the alphabet in the expression X = Y ∘ Z. Indeed, if Y = B, then Z = β(B) = X; if now Z = A, then B can be identified with A, and Y can be identified with X. Notice also the formula

X = Y ∘_β Z  ⇒  Xⁿ = Yⁿ ∘_β Z,   n ≥ 2.

Indeed, Yⁿ is a code (Corollary 1.3) and Yⁿ ∘ Z = β(Yⁿ) = Xⁿ.
PROPOSITION 6.2 Let X ⊂ C*, Y ⊂ B*, and Z ⊂ A* be three codes, and assume that X and Y are composable through γ and that Y and Z are composable through β. Then

(X ∘_γ Y) ∘_β Z = X ∘_{β∘γ} (Y ∘_β Z).
Proof We may suppose that C = alph(X) and B = alph(Y). By hypothesis, the injective morphisms

γ: C* → B*,   β: B* → A*

satisfy γ(C) = Y and β(B) = Z. Let δ: D* → C* be a coding morphism for X; thus δ(D) = X. This gives a chain of morphisms

D* → C* → B* → A*   (by δ, γ, and β, respectively),

and βγδ(D) = βγ(X) = X ∘_{βγ} βγ(C) = X ∘_{βγ} (Y ∘_β Z), and also

βγδ(D) = β(γδ(D)) = γδ(D) ∘_β β(B) = (X ∘_γ Y) ∘_β Z. ∎
Some of the properties of codes are preserved under composition.
PROPOSITION 6.3 Let Y and Z be composable codes, and let X = Y ∘ Z.

1. If Y and Z are prefix (suffix) codes, then X is a prefix (suffix) code.
2. If Y and Z are complete, then X is complete.
3. If Y and Z are thin, then X is thin.
The proof of (3) uses Lemma 6.4, which cannot be established before Chapter IV, where more powerful tools will be available.

LEMMA 6.4 Let Z be a thin complete code over A. For each word u ∈ Z* there exists a word w ∈ Z*uZ* having the following property: if mwn ∈ Z*, then there exists a factorization w = sut with ms, tn ∈ Z*.

Proof of Proposition 6.3 Let Y ⊂ B*, Z ⊂ A*, and let β: B* → A* be an injective morphism with β(B) = Z. Thus X = β(Y) = Y ∘_β Z.
1. Assume Y and Z are prefix codes. Consider x, xu ∈ X with u ∈ A*. Since X ⊂ Z*, we have x, xu ∈ Z*, and since Z* is right unitary, this implies u ∈ Z*. Let y = β⁻¹(x) and v = β⁻¹(u) ∈ B*. Then y, yv ∈ Y and Y is prefix; thus v = 1 and consequently u = 1. This shows that X is prefix. The case of suffix codes is handled in the same way.

2. Let w ∈ A*. The code Z is complete, thus uwv ∈ Z* for some u, v ∈ A*. Let h = β⁻¹(uwv) ∈ B*. There exist, by the completeness of Y, two words s̄, t̄ ∈ B* with s̄ht̄ ∈ Y*. But then β(s̄)uwvβ(t̄) ∈ X*. This proves the completeness of X.

3. If Z is not complete, then F(X) ⊂ F(Z*) ≠ A* and X is thin. Assume now that Z is complete. The code Y is thin; consequently F(Y) ≠ B*. Let v̄ ∈ B* − F(Y), and set u = β(v̄). Let w be the word associated to u in Lemma 6.4. Then w ∉ F(X). Indeed, assuming the contrary, there exist words m, n ∈ A* such that
x = mwn ∈ X ⊂ Z*.
In view of Lemma 6.4, x = msutn with ms, tn ∈ Z* = β(B*).
Setting p̄ = β⁻¹(ms) and q̄ = β⁻¹(tn), we have p̄v̄q̄ ∈ Y. Thus v̄ ∈ F(Y), contrary to the assumption. This shows that w is not in F(X), and thus X is thin. ∎

We now consider the second aspect of the composition operation, namely the decomposition of a code into simpler ones. For this, it is convenient to extend the notation alph in the following way: let Z ⊂ A* be a code, and let X ⊂ A*. Then

alph_Z(X) = {z ∈ Z | ∃ u, v ∈ Z* : uzv ∈ X}.

In other words, alph_Z(X) is the set of words of Z which appear at least once in a factorization of a word of X as a product of words of Z. Of course, alph_A = alph. The following proposition describes the condition for the existence of a decomposition.
PROPOSITION 6.5 Let X, Z ⊂ A* be codes. There exists a code Y such that X = Y ∘ Z iff

X ⊂ Z*   and   alph_Z(X) = Z.   (6.2)

The second condition in (6.2) means that all words in Z appear in at least one factorization of a word in X as a product of words in Z.
Proof Let X = Y ∘_β Z, where β: B* → A* is an injective morphism, Y ⊂ B*, and B = alph(Y). Then X = β(Y) ⊂ β(B*) = Z*, and further β(B) = alph_{β(B)}(β(Y)), that is, Z = alph_Z(X). Conversely, let β: B* → A* be a coding morphism for Z, and set Y = β⁻¹(X). Then X ⊂ β(B*) = Z* and β(Y) = X. By Corollary 1.2, Y is a code. Next, alph(Y) = B since Z = alph_Z(X). Thus Y and Z are composable and X = Y ∘_β Z. ∎
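For finite codes, condition (6.2) can be tested mechanically: since Z is a code, every word of X ⊂ Z* has a unique factorization into words of Z, and alph_Z(X) is the set of factors that occur. A sketch (helper names are ours; a simple backtracking search suffices for small examples):

```python
def factorize(word, Z):
    """Return a factorization of word over Z as a list, or None if none exists."""
    if word == "":
        return []
    for z in Z:
        if word.startswith(z):
            rest = factorize(word[len(z):], Z)
            if rest is not None:
                return [z] + rest
    return None

def alph_Z(X, Z):
    """Set of words of Z occurring in the Z-factorizations of the words of X."""
    used = set()
    for x in X:
        factors = factorize(x, Z)
        assert factors is not None, "X is not contained in Z*"
        used.update(factors)
    return used

Z = {"a", "ba", "bb"}
X = {"aa", "ba", "baa", "bb", "bba"}   # the composed code of Example 6.1
print(alph_Z(X, Z) == Z)               # True: X decomposes over Z
```

When Z is a code, the factorization found is the unique one, so the result does not depend on the search order.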
We have already seen that there are two distinguished decompositions of a code X ⊂ A* as X = Y ∘ Z, namely X = B ∘ X and X = X ∘ A. They are obtained by taking Z = X and Z = A in Proposition 6.5 [and assuming A = alph(X)]. These decompositions are not interesting. We will call indecomposable a code which admits no other decomposition. Formally, a code X ⊂ A* with A = alph(X) is called indecomposable if X = Y ∘ Z and B = alph(Y) imply Y = B or Z = A. If X is decomposable, and if Z is a code such that X = Y ∘ Z and Z ≠ X, A, then we say that X decomposes over Z.
EXAMPLE 6.1 (continued) The code X decomposes over Z. On the contrary, the code Z = {a, ba, bb} is indecomposable. Indeed, let T be a code such that
Z ⊂ T*, and suppose T ≠ A. Necessarily, a ∈ T. Thus b ∉ T. But then ba, bb ∈ T, whence Z ⊂ T. Now Z is a maximal code (Example 4.2), thus Z = T.
PROPOSITION 6.6 For any finite code X, there exist indecomposable codes Z₁, ..., Zₙ such that

X = Z₁ ∘ ⋯ ∘ Zₙ.

To prove this proposition, we introduce a notation. Let X be a finite code, and let

ℓ(X) = ∑_{x∈X} (|x| − 1) = ∑_{x∈X} |x| − Card(X).
For each x ∈ X, we have |x| ≥ 1. Thus ℓ(X) ≥ 0, and moreover ℓ(X) = 0 iff X is a subset of the alphabet.
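The integer ℓ(X) is immediate to compute for a finite code; the sketch below (function name is ours) evaluates it on the codes of Example 6.1 and observes the inequality of Proposition 6.7:

```python
def ell(X):
    """l(X) = sum over x in X of (|x| - 1)."""
    return sum(len(x) - 1 for x in X)

X = {"aa", "ba", "baa", "bb", "bba"}   # composed code of Example 6.1
Y = {"cc", "d", "dc", "e", "ec"}
Z = {"a", "ba", "bb"}

print(ell(X), ell(Y), ell(Z))       # 7 3 2
print(ell(X) >= ell(Y) + ell(Z))    # True, as asserted by Proposition 6.7
```

Since ℓ strictly decreases on both factors of a nontrivial decomposition, it drives the induction in the proof of Proposition 6.6.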
PROPOSITION 6.7 If X, Z ⊂ A* and Y ⊂ B* are finite codes such that X = Y ∘ Z, then

ℓ(X) ≥ ℓ(Y) + ℓ(Z).
Proof Let β: B* → A* be the injective morphism such that X = Y ∘_β Z. From Card(X) = Card(Y) it follows that

ℓ(X) = ∑_{y∈Y} |β(y)| − Card(Y) = ∑_{b∈B} |β(b)| ∑_{y∈Y} |y|_b − Card(Y),

where |y|_b denotes the number of occurrences of the letter b in y. By assumption B = alph(Y), whence ∑_{y∈Y} |y|_b ≥ 1 for all b in B. Further, |β(b)| ≥ 1 for b ∈ B by the injectivity of β. Thus

ℓ(X) − ℓ(Y) − ℓ(Z) = ∑_{b∈B} (|β(b)| − 1)(∑_{y∈Y} |y|_b − 1) ≥ 0. ∎
Proof of Proposition 6.6 The proof is by induction on ℓ(X). If ℓ(X) = 0, then X is composed of letters, and thus is indecomposable. If ℓ(X) > 0 and X is decomposable, then X = Y ∘ Z for some codes Y, Z. Further, Y and Z are not formed of letters only, and thus ℓ(Y) > 0 and ℓ(Z) > 0. By Proposition 6.7, we have
ℓ(Y) < ℓ(X) and ℓ(Z) < ℓ(X). Thus Y and Z are compositions of indecomposable codes, and X also is such a composition. ∎

Proposition 6.6 shows the existence of a decomposition of codes. This decomposition need not be unique. This is shown in

EXAMPLE 6.2 Consider the codes X = {aa, ba, baa, bb, bba} and

Y = {cc, d, dc, e, ec},   Z = {a, ba, bb}
of Example 6.1. As we have seen, X = Y ∘ Z. There is also a decomposition
X = Y′ ∘_γ Z′

with

Y′ = {cc, d, cd, e, ce},   Z′ = {aa, b, ba},

and γ: B* → A* defined by

γ(c) = b,   γ(d) = aa,   γ(e) = ba.
The code Z is indecomposable; the code Z′ is obtained from Z by interchanging a and b, and by then taking the reverse. These operations do not change indecomposability.

EXAMPLE 6.3 This example shows that in decompositions of a code into indecomposable codes, even the number of components need not be unique. For X = {a³b}, we have

X = {cd} ∘ {a², ab} = {cd} ∘ {c², d} ∘ {a, ab}

and also
X = {cd} ∘ {a³, b}.
This gives two decompositions, of length 3 and 2, respectively.

The code X in Example 6.2 is neither prefix nor suffix, but is composed of such codes. We may ask whether any (finite) code can be obtained by composition of prefix and suffix codes. This is not the case, as shown in

EXAMPLE 6.4 The code X = {b, ba, a²b, a³ba⁴} does not decompose over a prefix or a suffix code. Assume the contrary. Then X ⊂ Z* for some prefix (or suffix) code Z ≠ A. Thus Z* is right unitary (resp. left unitary). From b, ba ∈ Z*, it follows that a ∈ Z*, whence A = {a, b} ⊂ Z* and A = Z. Assuming Z* left unitary, b,
a²b ∈ Z* implies a² ∈ Z*. Since a³ba⁴ ∈ Z* and a⁴ = (a²)² ∈ Z*, it follows that a³b ∈ Z*, whence a³ ∈ Z* and finally a ∈ Z*. Thus again Z = A.

We now give a list of properties of codes which are inherited by the factors of a decomposition. Proposition 6.8 is in some sense dual to Proposition 6.3.
PROPOSITION 6.8 Let X, Y, Z be codes with X = Y ∘ Z.

1. If X is prefix (suffix), then Y is prefix (suffix).
2. If X is maximal, then Y and Z are maximal.
3. If X is complete, then Z is complete.
4. If X is thin, then Z is thin.
Proof We assume that X, Z ⊂ A*, Y ⊂ B*, and that β: B* → A* is an injective morphism with β(B) = Z, β(Y) = X.

1. Let y, yu ∈ Y. Then β(y), β(y)β(u) ∈ X, and since X is prefix, β(u) = 1. Now β is injective, whence u = 1.

2. If Y is not maximal, let Y′ = Y ∪ {y} be a code for some y ∉ Y. Then β(Y′) = X ∪ β(y) is a code which is distinct from X by the injectivity of β. Thus X is not maximal. Assume now that Z is not maximal. Set Z′ = Z ∪ {z} for some z ∉ Z such that Z′ is a code. Extend B to B′ = B ∪ {b} (b ∉ B) and extend β to β̄ over B′* by setting β̄(b) = z. Then β̄ is injective by Proposition 1.1, because Z′ is a code. Further, Y′ = Y ∪ {b} is a code, and consequently β̄(Y′) = X ∪ {z} is a code, showing that X is not maximal.

3. This is clear from X* ⊂ Z*.

4. Any word in Z is a factor of a word in X. Thus F(Z) ⊂ F(X). By assumption, F(X) ≠ A*. Thus F(Z) ≠ A* and Z is thin. ∎

PROPOSITION 6.9 Let X, Y, Z be three codes such that X = Y ∘ Z. Then X is thin and complete iff Y and Z are thin and complete.
Proof By Proposition 6.3, the code X is thin and complete provided Y and Z are. Assume conversely that X is thin and complete. Proposition 6.8 shows that Z is thin and complete. In view of Theorem 5.8, X is a maximal code. By Proposition 6.8, Y is maximal, and thus Y is complete (Theorem 5.2). It remains to show that Y is thin. With the notations of the proof of Proposition 6.8, consider a word u ∉ F(X). Since Z* is dense, sut ∈ Z* for some words s, t ∈ A*. Thus sut = β(w) for some w ∈ B*. But now w is not completable in Y, since otherwise hwk ∈ Y for some h, k ∈ B*, giving β(h)sutβ(k) ∈ X, whence u ∈ F(X). Thus Y is thin. ∎

By Proposition 6.9, for thin codes Y, Z, the code Y ∘ Z is maximal iff Y and Z are maximal. We have no example showing that this becomes false without the assumption that Y and Z are thin.
PROPOSITION 6.10 Let X be a maximal code over A. For any code Z ⊂ A*, the following equivalence holds: X decomposes over Z iff X* ⊂ Z*. In particular, X is indecomposable iff X* is a maximal submonoid of A*.

Proof If X decomposes over Z, then X* ⊂ Z*. Conversely, assume X* ⊂ Z* and let Z′ = alph_Z(X). Then X ⊂ Z′*, and of course Z′ = alph_{Z′}(X). By Proposition 6.5, X decomposes over Z′. In view of Proposition 6.8, the code Z′ is maximal. Since Z′ ⊂ Z, we have Z′ = Z. ∎
EXAMPLE 6.5 Let A be an alphabet. We show that the uniform code Aⁿ decomposes over Z iff Z = Aᵐ and m divides n. In particular, Aⁿ is indecomposable for n prime and for n = 1. Indeed, let Aⁿ = X = Y ∘_β Z, where Y ⊂ B* and β: B* → A*. The code X is maximal and biprefix, and thus Y also is maximal and biprefix, and Z is maximal. Let y ∈ Y be a word of maximal length, and set y = ub with b ∈ B. Then Y ∪ uB is prefix. Indeed, let y′ = ub′ with b′ ∈ B. Any proper left factor of y′ is also a proper left factor of y, and therefore is not in Y ∪ uB; next, if y′ is a left factor of some y″ in Y ∪ uB, then by the maximality of the length of y, we have |y′| = |y″| and y′ = y″. Thus Y ∪ uB is a code. Hence Y ∪ uB = Y, because Y is maximal. It follows that β(uB) = β(u)Z ⊂ X. Now X is a uniform code, thus all words in Z have the same length, say m. Since Z is maximal, Z = Aᵐ. It follows that n = m|y|. ∎

EXERCISES

SECTION 1

1.1.
Let n ≥ 1 be an integer. Let I, J be two sets of nonnegative integers such that for i, i′ ∈ I and j, j′ ∈ J,

i + j ≡ i′ + j′ (mod n)

implies i = i′, j = j′. Let

Y = {aⁱbaʲ | i ∈ I, j ∈ J}

and X = Y ∪ aⁿ. Show that X is a code.
SECTION 2

2.1. Show directly (i.e., without using Theorem 2.8) that a set X = {x, y} is a code iff x and y are not powers of a single word.
2.2. Let K be a field and A an alphabet. Denote by K⟨A⟩ the subalgebra of K⟨⟨A⟩⟩ formed by the series σ ∈ K⟨⟨A⟩⟩ such that (σ, w) = 0 for all
but a finite number of words w ∈ A*. Let X ⊂ A⁺ be a code and β: B* → A* a coding morphism for X. Extend β by linearity to a morphism from K⟨B⟩ into K⟨A⟩. Show that β defines an isomorphism between K⟨B⟩ and the subalgebra of K⟨A⟩ generated by the elements of X.
2.3. Show that a submonoid N of a monoid M is stable iff for all m, n ∈ M we have

nm, n, mn ∈ N  ⇒  m ∈ N.

2.4. Let M be a commutative monoid. Show that a submonoid of M is stable iff it is biunitary.
2.5. For X ⊂ A⁺ let Y be the base of the smallest right unitary submonoid containing X. (a) Show that Y ⊂ (Y*)⁻¹X. (b) Deduce that Card(Y) ≤ Card(X), and give an example showing that equality might hold.
*2.6. Let X be a subset of A⁺. Define a sequence (S_n)_{n≥0} of subsets of A* by setting

S_0 = X*,   S_{n+1} = [S_n⁻¹S_n ∩ S_nS_n⁻¹]*.

Set S(X) = ⋃_{n≥0} S_n. Show that S(X) is the free hull of X. Show that when X is recognizable, the free hull of X is recognizable.
2.7. Let M be a submonoid of A* and let

X = (M − 1) − (M − 1)²

be its minimal set of generators. Show that X is recognizable iff M is recognizable.
2.8. Let M be a monoid. Show that M is free iff it satisfies the following conditions: (i) there is a morphism λ: M → ℕ into the additive monoid ℕ such that λ⁻¹(0) = 1; (ii) for all x, y, z, t ∈ M, we have xy = zt iff there exists u ∈ M such that xu = z, y = ut, or x = zu, uy = t.
SECTION 3

3.1.
Let X be a subset of A⁺ such that X ∩ XX⁺ = ∅. Define a relation ρ ⊂ A* × A* by (u, v) ∈ ρ iff there exists x ∈ X* such that

uxv ∈ X,   ux ≠ 1,   uv ≠ 1,   xv ≠ 1.
Show that X is a code iff (1, 1) ∉ ρ⁺, where ρ⁺ denotes the transitive closure of ρ.

SECTION 4

4.1.
Let D be the Dyck code over A = {a, b}. (a) Show that the following equality holds:

D = aD_a*b + bD_b*a,   with D_a = D ∩ aA*b, D_b = D ∩ bA*a.

(b) Derive from the previous equality another proof of

d(t) = 1 − √(1 − 4t²),
where d(t) is the series defined in Example 4.5.
4.2. Let n ≥ 1 be an integer and I, J two subsets of {0, 1, ..., n − 1} such that for each integer p in {0, 1, ..., n − 1} there exists a unique pair (i, j) ∈ I × J such that

p ≡ i + j (mod n).

Let

V = {i + j − n | i ∈ I, j ∈ J, i + j ≥ n}.

For a set K of integers, let a^K = {aᵏ | k ∈ K}. Let X ⊂ {a, b}* be the set

X = aᴵ(baⱽ)*baᴶ ∪ aⁿ.
Show that X is a maximal code.
SECTION 5

5.1. Show that the set X = {a³, b, ab, ba², aba²} is complete and that no proper subset of X is complete. Show that X is not a code.
*5.2. Let p ≥ 5 be a prime number. Let I, J be two subsets of {0, 1, ..., p − 2} such that for each integer q in {0, 1, ..., p − 2} there exists a unique pair (i, j) ∈ I × J such that q = i + j. (a) Show that X = aᵖ ∪ {aⁱbaʲ | i ∈ I, j ∈ J} is a code. (b) Show that if Card(I) ≥ 2 and Card(J) ≥ 2, then X is not contained in any finite maximal code.
5.3. Let M be a monoid. Let 𝒥 be the family of subsets of M which are two-sided ideals of M or empty.
(a) Show that there is a topology on M for which 𝒥 is the family of open sets. (b) Show that a subset P of M is dense in M with respect to this topology iff F(P) = M, i.e., iff P is dense in the sense of the definition given in Section 5.
5.4. With the notations of Proposition 5.2, and V = A* − A*yA*, show successively that
A* = (Vy)*V = (Uy)*(X*y(Uy)*)*V = (Uy)*V + (Uy)*(X*y(Uy)*)⁺V.
(Use the identity (σ + τ)* = τ*(στ*)* = (σ*τ)*σ* for two power series σ, τ having no constant term.) Derive directly from these equations the fact that Y is a code and that Y is complete.
5.5. Show that each recognizable code is contained in a maximal recognizable code.
5.6. Show that each thin code is contained in a maximal thin code.

SECTION 6

6.1.
Let ψ: A* → G be a morphism from A* onto a group G. Let H be a subgroup of G and let X be the group code defined by X* = ψ⁻¹(H). Show that X is indecomposable iff H is a maximal subgroup of G.
NOTES
The concept of a code originated in the theory of communication initiated by C. Shannon in the late 1940s. In this framework, the development of coding theory led to a detailed study of constant-length codes in connection with problems of error detection and correction. An account of this research can be found in MacWilliams and Sloane (1977) or van Lint (1982). A classical book on information and communication theory is Ash (1965). See also Csiszár and Körner (1981) and McEliece (1977). Variable-length codes were investigated in depth for the first time by Schützenberger (1955) and also by Gilbert and Moore (1959). The direction followed by Schützenberger consists in linking the theory of codes with classical noncommutative algebra. The results presented in this book represent this point of view. An early account of it can be found in Nivat (1966).
The notion of a stable submonoid appears for the first time in Schützenberger (1955), which contains Proposition 2.4. The same result is also given in Shevrin (1960), Cohn (1962), and Blum (1965). A detailed proof of Proposition 2.7 appears in Tilson (1972). The defect theorem (Theorem 2.8) has been proved in several formulations in Lentin (1972), Makanin (1976), and
Ehrenfeucht and Rozenberg (1978). Some generalizations are discussed in Berstel et al. (1979). For related questions see also Spehner (1976). The test for codes given in Section 3 goes back to Sardinas and Patterson (1953) and is in fact usually known as the Sardinas and Patterson algorithm. The proof of correctness is surprisingly involved and has motivated a number of papers: Bandyopadhyay (1963), Levenshtein (1964), Riley (1967), and De Luca (1976). The design of an efficient algorithm is described in Spehner (1976). See also Rodeh (1982) and Apostolico and Giancarlo (1984). The problem of testing whether a recognizable set is a code is a special case of a well-known problem in automata theory, namely testing whether a given rational expression is unambiguous. Standard decision procedures exist for this question (see Eilenberg, 1974, and Aho et al., 1975). These techniques will be used in Chapter IV. The connection between codes and rational expressions has been pointed out in Brzozowski (1967). Further, a characterization of those codes whose coding morphism preserves the star height of rational expressions is given in Hashiguchi and Honda (1976a). The results of Section 4 are well known in information theory. The full statement of the Kraft-McMillan theorem (McMillan, 1956) includes a converse of Corollary 4.3 which will be proved in Chapter VII (Proposition VII.3.1). The main results of Section 5 are from Schützenberger (1955). Our presentation is slightly more general. Proposition 5.2 and Exercise 5.5 are due to Ehrenfeucht and Rozenberg (1983). They answer a question of Restivo (1977). Theorem 5.12 appears in Boë et al. (1980). Example 5.11 is a special case of a construction due to Restivo (1977). Exercise 2.6 is from Berstel et al. (1979), Exercise 2.8 is known as Levi's lemma (Levi, 1944), and Exercise 3.1 is from Spehner (1975). Exercise 5.2 is from Restivo (1977); it exhibits a class of codes which are not contained in any finite maximal code.
Further results in this direction can be found in De Felice and Restivo (1984).
CHAPTER II

Prefix Codes
0. INTRODUCTION
Undoubtedly the prefix codes are the easiest to construct. The verification that a given set of words is a prefix code is straightforward. However, most of the interesting problems on codes can be raised for prefix codes. In some sense, these codes form a family of models of codes: frequently, it is easier to gain intuition about prefix codes than about general codes. However, we can observe that the reasoning used for prefix codes is often valid in the general case. For this reason we now present a chapter on prefix codes. In the first section, we comment on their definition and give some elementary properties. We also show how to draw the picture of a prefix code as a tree (the literal representation of prefix codes). In Section 2, a construction of the automata associated to prefix codes is given. These automata are deterministic, and we will see in Chapter IV how to extend their construction to general codes. The third section deals with maximal prefix codes. Characterizations in terms of completeness are given. Section 4 presents the usual operations on prefix codes. Most of them have an easy interpretation as operations on trees. An important family of prefix codes is introduced in Section 5. They have many combinatorial properties which illustrate the notions presented
previously. The synchronization of prefix codes is defined in Section 6. In fact, this notion will be generalized to general codes in Chapter IV, where the relationship with groups will be established. The notion of measure of a prefix code can be amplified by a definition of average length. This is done in Section 7, where a combinatorial equivalent of average length is given. The last section contains an important result. The deciphering delay is introduced to extend the notion of prefix code to more general codes. Prefix codes are codes with deciphering delay zero. We give a combinatorial proof of the fact that for a finite maximal code the deciphering delay is either infinite or zero, which means that in the latter case the code is prefix.

1. PREFIX CODES
This introductory section contains equivalent formulations of the definition of a prefix code together with the description of the tree associated to a prefix code. We then show how any prefix code induces in a natural way a factorization of the free monoid. Of course, all results in this chapter transpose to suffix codes by using the reverse operation. It is convenient to have a shorthand for the proper left factors (resp. proper right factors) of the words of a set X. For this we use

XA⁻ = X(A⁺)⁻¹   and   A⁻X = (A⁺)⁻¹X.
A subset X of A* is prefix if X ∩ XA⁺ = ∅. If X contains the empty word 1, then X = {1}. In the other cases, X is a code (Proposition 1.1.4). There is a series of equivalent definitions, all of which will be useful.
PROPOSITION 1.1 For any subset X of A*, the following conditions are equivalent:

(i) X ∩ XA⁺ = ∅;
(ii) X ∩ XA⁻ = ∅;
(iii) XA⁻, X, XA⁺ are pairwise disjoint;
(iv) if x, xu ∈ X, then u = 1;
(v) if xu = x′u′ and x, x′ ∈ X, then x = x′ and u = u′. ∎

The following proposition can be considered as describing a way to construct prefix codes. It also shows a useful relationship between prefix codes and right ideals.
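Condition (iv) of Proposition 1.1 yields an obvious procedure for testing whether a finite set is a prefix code: check that no word is a proper left factor of another. A minimal sketch (function name is ours):

```python
def is_prefix_code(X):
    """True iff no word of X is a proper left factor of another word of X."""
    return not any(
        x != y and y.startswith(x) for x in X for y in X
    )

print(is_prefix_code({"aa", "ab", "baa", "bb"}))   # True
print(is_prefix_code({"a", "ab"}))                 # False: a is a left factor of ab
```

The quadratic scan is enough here; sorting the words first would let one compare only neighbors.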
PROPOSITION 1.2 1. Let L be a subset of A*, and let

X = L − LA⁺.
Then X is prefix, and it is a code iff 1 ∉ L. Moreover

XA* = LA*,   (1.2)

that is, X and L generate the same right ideal.

2. Let X be a prefix set, and let M be a subset of A* such that XA* = MA*. Then X = M − MA⁺. In particular X ⊂ M, and X is the minimal set of generators of the right ideal MA*.

Proof 1. From X ⊂ L, it follows that XA⁺ ⊂ LA⁺, whence X ∩ XA⁺ ⊂ X ∩ LA⁺ = ∅. Further, 1 ∈ X iff 1 ∈ L. This proves the first statement. Next, we claim that

L ⊂ XA*.   (1.3)
Indeed, let u ∈ L. If u ∈ X, then u ∈ XA*. Otherwise u ∈ LA⁺, whence u = u′w for some u′ ∈ L, w ∈ A⁺. Arguing by induction on the length, we get that u′ ∈ XA*. Thus u also is in XA*. From (1.3), it follows that LA* ⊂ XA*. The reverse inclusion is clear. This proves (1.2).

2. Let x be an element of X. Then x = mu for some m ∈ M, u ∈ A*. Similarly, m = x′v for some x′ ∈ X, v ∈ A*. It follows that x = x′vu, whence vu = 1, since X is prefix. Thus X ⊂ M. Next, (XA*)A = (MA*)A, showing that XA⁺ = MA⁺. This implies that X ⊂ M − MA⁺. To show the converse inclusion, let m ∈ M − MA⁺. Then m = xu for some x ∈ X, u ∈ A*. Assuming u ≠ 1, it follows that m ∈ XA⁺ = MA⁺, which is not the case. Thus u = 1 and m ∈ X. This completes the proof. ∎

The set X = L − LA⁺ is called the initial part of L, or also the base of the right ideal LA*.
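For a set L given by a membership predicate, the initial part L − LA⁺ can be computed by enumeration up to a length bound: it consists of the words of L having no proper left factor in L. A sketch (all names are ours, and the length bound is arbitrary), recovering the set b*a of Example 1.1 below:

```python
from itertools import product

def words_up_to(alphabet, n):
    """All words over the alphabet of length at most n."""
    for k in range(n + 1):
        for tup in product(alphabet, repeat=k):
            yield "".join(tup)

def initial_part(L, alphabet, n):
    """X = L - LA+ : words of L (length <= n) with no proper left factor in L."""
    members = {w for w in words_up_to(alphabet, n) if L(w)}
    return {
        w for w in members
        if not any(w[:i] in members for i in range(len(w)))
    }

# L = A*aA*: words containing at least one a; its initial part is b*a.
X = initial_part(lambda w: "a" in w, "ab", 4)
print(sorted(X, key=len))   # ['a', 'ba', 'bba', 'bbba']
```

By Proposition 1.2, this X is the minimal set of generators of the right ideal LA*.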
COROLLARY 1.3 Let X and Y be prefix subsets of A*. If XA* = YA*, then X = Y. ∎

EXAMPLE 1.1 Let A = {a, b} and let L = A*aA* be the set of words containing at least one occurrence of the letter a. Then X = L − LA⁺ = b*a.
EXAMPLE 1.2 Let A = {a, b} and let X = A*ab − A*abA⁺. Clearly X is a prefix code. It is the set of words ending with ab that do not contain another occurrence of the factor ab. Thus, X = b*a*ab. This code, as does the previous one, belongs to the family of semaphore codes that are studied in Section 5. We now give a useful graphic representation of codes, especially of prefix codes over an alphabet with 2 or 3 letters. It consists of associating a tree with
each prefix code in such a way that the leaves of the tree represent the words in the code. First, we associate an infinite tree with the set A* of words over an alphabet A as follows. The alphabet is totally ordered, and words of equal length are ordered lexicographically. Each node of the tree represents a word in A*. Words of small length are to the left of words of greater length, and words of equal length are disposed vertically according to the lexicographic ordering. There is an edge from u to v iff v = ua for some letter a ∈ A. The tree obtained in this way is the literal representation of A* (see Figs. 2.1 and 2.2).
Fig. 2.1 The literal representation of {a, b}*.

Fig. 2.2 The literal representation of {a, b, c}*.
To a given subset X of A* we associate a subtree of the literal representation of A* as follows. We keep just the nodes corresponding to the words in X and all the nodes on the paths from the root to these nodes. Nodes corresponding to words in X are marked if necessary. The tree obtained in this way is the literal representation of X. Figures 2.3-2.6 give several examples.
Fig. 2.3 Literal representation of X = {a, ba, baa}.
Fig. 2.4 Literal representation of X = {aa, ab, baa, bb}.
Fig. 2.5 Literal representation of X = {a, b, ca, cbba, cbcc}.
Fig. 2.6 Literal representation of X = a*b.
It is easily seen that a code X is prefix iff, in the literal representation of X, the nodes corresponding to words in X are all leaves of the tree. The advantage of the literal representation, compared to simple enumeration, lies in its easy readability. Contrary to what might seem to happen, it allows a compact representation of relatively big codes (see Fig. 2.7). After this digression, let us state the factorization theorem announced at the beginning of this section.

THEOREM 1.4 Let X be a prefix code over A, and let
R = A* - XA*.
Then R is prefix closed and nonempty. Further,
X − 1 = R(A − 1),   (1.4)

1 + RA = R + X,   (1.5)
A* = X*R.   (1.6)

Conversely, if R is a prefix-closed subset of A*, then X = RA − R is a prefix code. If R ≠ ∅, then it is the unique prefix code such that R = A* − XA*.

Proof If a word w has no left factor in X, then each of the left factors of w shares this property. Thus, R is prefix-closed. The set R is nonempty because X ≠ {1}. The product of X and A* is unambiguous in view of Proposition 1.1(v). Thus XA* = X·A* as series, and

R = A* − XA* = A* − X·A* = (1 − X)A*.   (1.7)

Fig. 2.7 A code with 26 elements.
From this equation and (1 − A)A* = 1, we get, by multiplying (1.7) on the right by 1 − A,

R(1 − A) = 1 − X,

which is formula (1.4). This also gives

RA − R = X − 1,

from which (1.5) follows. Finally, note that

X* = (1 − X)⁻¹

by Proposition 6.1 of Chapter 0. Multiplication of (1.7) on the left by X* gives

X*R = X*(1 − X)A* = A*.
This proves (1.6).

Conversely, let R be a prefix-closed subset of A*. If R = ∅, then X = RA − R = ∅ is a prefix code; thus we may assume R nonempty. Let X = RA − R. Then X is prefix. Indeed, if x ∈ X, then x = ra for some r ∈ R, a ∈ A. Each proper left factor u of x is a left factor of r; thus it is in R and therefore not in X. This shows that X ∩ XA⁻ = ∅. With R being prefix-closed, 1 belongs to R and
X = RA − R + 1 (as series). Indeed, the product RA is unambiguous, and furthermore 1 ∉ RA. It successively follows from this formula that
X - 1 = R(A - 1),

R = (1 - X)A* = A* - X·A* = A* - XA*.
The last equality holds because X is prefix. Assume finally that there is a prefix code Y such that R = A* - YA* = A* - XA*.
Then YA* = XA*, whence Y = X by Corollary 1.3. This proves uniqueness. □

Note the following combinatorial interpretations of formulas (1.5) and (1.6). The first says that a word in R followed by a letter is either in R or in X, and that each word in X is composed of a word in R followed by a letter. The second formula says that each word w ∈ A* admits a unique factorization

w = x₁x₂···xₙu,  x₁, …, xₙ ∈ X,  u ∈ R.
Fig. 2.8 A prefix code.
Fig. 2.9 Another prefix code.
EXAMPLE 1.3 Let X = {a, baa, bab, bb} ⊂ A⁺ be the code represented in Fig. 2.8. Here R = {1, b, ba} = XA⁻, and X - 1 = (1 + b + ba)(A - 1). The equality between R and XA⁻ characterizes maximal prefix codes, as we will see in Section 3.

EXAMPLE 1.4 Let X = (b²)*{a²b, ba}, as given in Fig. 2.9. Here R = R₁ ∪ R₂, where R₁ = XA⁻ = (b²)*(1 ∪ a ∪ b ∪ a²) is the set of proper left factors of X, and R₂ = R - R₁ = (b²)*(abA* ∪ a³A*). Thus Eq. (1.4) now becomes

X - 1 = (b²)*(1 + a + b + a² + abA* + a³A*)(A - 1).
2. THE AUTOMATON OF A PREFIX CODE
The literal representation yields an easy method for verifying whether a word w is in X* for some fixed prefix code X. It suffices to follow the path starting at the root through the successive letters of w. Whenever a leaf is reached, the corresponding factor of w is split away and the procedure is restarted. We will consider several automata derived from the literal representation and relate them to the minimal automaton. The study which is initiated here will be generalized to arbitrary codes in Chapter IV. The particular case of prefix codes is interesting in itself because it is the origin of most of the general results of Chapter IV.
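For a finite prefix code, this left-to-right procedure can be sketched directly; the trie-based helpers below are our own illustration, not the book's.

```python
# Decoding a word over a finite prefix code X by walking its literal
# representation (a trie): follow letters from the root; whenever a
# leaf (a code word) is reached, split that factor off and restart.
# Assumes X is prefix, so code words are exactly the leaves.

def make_trie(code):
    """Build the literal representation of a finite prefix code."""
    root = {}
    for word in code:
        node = root
        for letter in word:
            node = node.setdefault(letter, {})
    return root

def decode(w, code):
    """Return the factorization of w over X, or None if w is not in X*."""
    trie = make_trie(code)
    factors, node, current = [], trie, ""
    for letter in w:
        if letter not in node:
            return None
        node = node[letter]
        current += letter
        if not node:              # leaf: one code word completed
            factors.append(current)
            node, current = trie, ""
    return factors if current == "" else None

print(decode("abbabbb", {"ab", "bab", "bb"}))  # ['ab', 'bab', 'bb']
```

Because X is prefix, the walk never backtracks: the first leaf reached is the only possible factor.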
Recall (Chapter 0) that for any subset X ⊂ A*, we denote by 𝒜(X) the minimal deterministic automaton recognizing X.
PROPOSITION 2.1 Let X be a subset of A*. The following conditions are equivalent:
(i) X is prefix;
(ii) the minimal automaton 𝒜(X) either is empty or has a single final state t and t·A = ∅;
(iii) there exists a deterministic automaton 𝒜 = (Q, i, T) recognizing X with T·A = ∅.

Proof (i) ⇒ (ii) Suppose that X is nonempty. Set 𝒜(X) = (Q, i, T). First, we claim that for q ∈ T, we have {w ∈ A* | q·w ∈ T} = {1}. Indeed, let x ∈ X and w ∈ A* be words such that i·x = q (remember that q ∈ T) and q·w ∈ T. Then xw ∈ X, whence w = 1. This shows the claim. Thus, two final states are not separable, and from the minimality of 𝒜(X) it follows that 𝒜(X) has just one final state, say t. Assume that t·A ≠ ∅, say t·a = p for some letter a ∈ A and some state p. Since p is coaccessible, we have p·u = t for some u ∈ A*. Thus t·au = t, whence au = 1, a contradiction.
(ii) ⇒ (iii) is clear.
(iii) ⇒ (i) From T·A = ∅, it follows that T·A⁺ = ∅. Thus, if x ∈ X and w ∈ A⁺, then i·xw = ∅ and xw ∉ X. Thus X ∩ XA⁺ = ∅. □
It is easy to construct an automaton for a prefix code by starting with the literal representation. This automaton, called the literal automaton of a prefix code X, is the deterministic automaton 𝒜 = (XA⁻ ∪ X, 1, X) defined by

u·a = ua if ua ∈ XA⁻ ∪ X,
u·a = ∅ otherwise.

Since XA⁻ ∪ X is prefix-closed, we immediately see that 1·u ∈ X iff u ∈ X, that is, L(𝒜) = X. The pictorial representation of a literal automaton corresponds, of course, to the literal representation of the code.
EXAMPLE 2.1 The code X = {ab, bab, bb} over A = {a, b} has the literal representation given in Fig. 2.10a and the literal automaton given in Fig. 2.10b.

The literal automaton 𝒜 of a prefix code X is trim but is not minimal in general. For infinite codes, it is always infinite. Let us consider two states of 𝒜. It is equivalent to consider the two left factors of words of X, say u and v, leading to these states. These two states are unseparable iff

u⁻¹X = v⁻¹X.
Fig. 2.10 (a) Literal representation of X; (b) literal automaton of X.
Note that this equality means, on the literal representation of X, that the two subtrees with roots u and v, respectively, are the same. This yields an easy procedure for the computation of the minimal automaton: first, all final states are labeled, say with label 0. If labels up to i are defined, we consider subtrees such that all nodes except the roots are labeled. Then roots are labeled identically iff the (labeled) subtrees are isomorphic. Taking the labels as states, we obtain the minimal automaton. The procedure is described in Examples 2.1-2.3.
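For a finite code, the labeling procedure can be sketched as follows (an illustrative implementation in our own notation: two nodes receive the same label iff their labeled subtrees are isomorphic, i.e. iff the corresponding left factors u, v satisfy u⁻¹X = v⁻¹X).

```python
# Bottom-up labeling of the trie of a finite prefix code. Taking the
# labels as states yields the (trim) minimal automaton of X; all
# leaves share label 0, exactly as in the procedure of the text.

def minimize(code):
    trie = {}
    for word in code:
        node = trie
        for letter in word:
            node = node.setdefault(letter, {})

    labels = {}                  # subtree signature -> label

    def label_of(node):
        # signature: sorted transitions (letter, label of child subtree)
        sig = tuple(sorted((a, label_of(child)) for a, child in node.items()))
        if sig not in labels:
            labels[sig] = len(labels)
        return labels[sig]

    root = label_of(trie)
    return root, labels

# X = {ab, bab, bb}: the nodes a and ba get the same label, since both
# subtrees consist of a single edge b leading to a leaf.
root, labels = minimize({"ab", "bab", "bb"})
print(len(labels))               # number of states of the minimal automaton
```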
EXAMPLE 2.1 (continued) The three terminal states are unseparable by Proposition 2.1. The states a and ba are unseparable since a⁻¹X = (ba)⁻¹X = b. No other relation exists; thus the minimal automaton is as given in Fig. 2.11.
Fig. 2.11 The minimal automaton of X = {ab, bab, bb}.
Fig. 2.12 A literal automaton.
Fig. 2.13 Minimal automaton corresponding to Fig. 2.12.
EXAMPLE 2.2 The literal automaton of X = (b²)*{a²b, ba} is given in Fig. 2.12. Clearly the final states are equivalent, and also the predecessors of final states, and their predecessors. On the main diagonal, however, the states are only equivalent with a step of 2. This gives the minimal automaton of Fig. 2.13.

EXAMPLE 2.3 In Fig. 2.14 the labeling procedure has been carried out for the 26-element code of Fig. 2.7. This gives the minimal automaton of Fig. 2.15.

We now consider automata recognizing the submonoid X* generated by a prefix code X. Recall that X* is right unitary (Proposition 1.2.5). Proposition 2.2 is the analogue of Proposition 2.1.
Fig. 2.14 The computation of a minimal automaton.
Fig. 2.15 A minimal automaton.
PROPOSITION 2.2 Let P be a subset of A*. The following conditions are equivalent:
(i) P is a right unitary submonoid;
(ii) the minimal automaton 𝒜(P) has a unique final state, namely the initial state;
(iii) there exists a deterministic automaton recognizing P having the initial state as the unique final state.

Proof (i) ⇒ (ii) The states in 𝒜(P) are the nonempty sets u⁻¹P, for u ∈ A*. Now if u ∈ P, then u⁻¹P = P because uv ∈ P iff v ∈ P. Thus, there is only one final state in 𝒜(P), namely P, which is also the initial state.
(ii) ⇒ (iii) is clear.
(iii) ⇒ (i) Let 𝒜 = (Q, i, i) be the automaton recognizing P. The set P then is a submonoid since the final state and the initial state are the same. Further, let u, uv ∈ P. Then i·u = i and i·uv = i. This implies that i·v = i because 𝒜 is deterministic. Thus, v ∈ P, showing that P is right unitary. □

If 𝒜 = (Q, i, T) is any deterministic automaton over A, the stabilizer of a state q is the submonoid
Stab(q) = {w ∈ A* | q·w = q}.
PROPOSITION 2.3 The stabilizer of a state of a deterministic automaton is a right unitary submonoid. Every right unitary submonoid is the stabilizer of a state of some deterministic automaton.

Proof It is an immediate consequence of the proof of Proposition 2.2. □
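On a finite automaton the stabilizer can be explored by brute-force enumeration; the sketch below (the automaton and the helper names are ours, purely for illustration) lists the short words of Stab(q).

```python
from itertools import product

# Stabilizer of a state q in a deterministic (partial) automaton given
# as a transition dict: Stab(q) = { w | q . w = q }. By Proposition 2.3
# it is a right unitary submonoid, hence generated by a prefix code.

def stab(delta, q, alphabet, max_len):
    """All words w with |w| <= max_len such that q . w = q."""
    words = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            state = q
            for a in letters:
                state = delta.get((state, a))
                if state is None:
                    break
            if state == q:
                words.append("".join(letters))
    return words

# Example automaton with states 0, 1 over {a, b}: a loop a on state 0
# and a cycle b: 0 -> 1 -> 0.  Here Stab(0) = {a, bb}*.
delta = {(0, "a"): 0, (0, "b"): 1, (1, "b"): 0}
S = stab(delta, 0, "ab", 4)
print(S)
```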
This proposition shows the importance of right unitary submonoids and of prefix codes in automata theory. Proposition 2.4 presents a method for deriving the minimal automaton 𝒜(X*) of X* from the minimal automaton 𝒜(X) of the prefix code X.
PROPOSITION 2.4 Let X be a nonempty prefix code over A, and let 𝒜(X) = (Q, i, t) be the minimal automaton of X. Then the minimal automaton of X* is

𝒜(X*) = (Q, t, t)     if Stab(i) ≠ 1,
𝒜(X*) = (Q - i, t, t)   if Stab(i) = 1;   (2.2)

the action of 𝒜(X*), denoted by ∘, is given by

q ∘ a = q·a  for q ≠ t,   (2.3)
t ∘ a = i·a.   (2.4)
Proof Let ℬ = (Q, t, t) be the automaton obtained from 𝒜(X), defining the action ∘ by (2.3) and (2.4). Then clearly

L(ℬ) = {w | t ∘ w = t} = X*.

Let us verify that the automaton ℬ is reduced. For this, consider two distinct states p and q. Since 𝒜(X) is reduced, there is a word u in A* separating p and q, i.e., such that, say,

p·u = t,  q·u ≠ t.   (2.5)

It follows that p ∘ u = t, and furthermore p ∘ v ≠ t for all v < u. If q ∘ u ≠ t, then u separates p and q in the automaton ℬ also. Otherwise, there is a smallest left factor v of u such that q ∘ v = t. For this v, we have q·v = t. In view of (2.5), v ≠ u. Thus v < u. But then q ∘ v = t and p ∘ v ≠ t, showing that p and q are separated by v.
Each state in ℬ is coaccessible because this is the case in 𝒜(X). From 1 ∉ X, we have i ≠ t. The state i is accessible in ℬ iff the set {w | t ∘ w = i} is nonempty, thus iff Stab(i) ≠ 1. If this holds, ℬ is the minimal automaton of X*. Otherwise, the accessible part of ℬ is its restriction to Q - i. □
The automaton 𝒜(X*) always has the form given by (2.2) if X is finite. In this case, it is obtained by identifying the initial and the final state. The classification of prefix codes according to the form of the minimal automaton 𝒜(X*) goes through the notion of a chain. A prefix code X is a chain if there exist disjoint nonempty sets Y and Z such that Y ∪ Z is prefix and

X = Y*Z.

PROPOSITION 2.5 Let X be a nonempty prefix code over A, and let 𝒜(X) = (Q, i, t) be the minimal automaton of X. The following conditions are equivalent:
(i) Stab(i) ≠ 1;
(ii) X is a chain;
(iii) there exists a word u ∈ A⁺ such that u⁻¹X = X.
Proof (i) ⇒ (ii) By Proposition 2.3, Stab(i) is a right unitary submonoid. Its base, say Y, is a prefix code which is nonempty because Stab(i) ≠ 1. Let Z be the set of words defined as follows: z ∈ Z iff i·z = t and i·z′ ≠ i for all nonempty left factors z′ of z. From t·A = ∅, it follows that Z is a prefix code. Further, Y ∩ Z = ∅, by i ≠ t. Finally X = Y*Z. It remains to verify that V = Y ∪ Z is prefix. A proper left factor of a word in Z is neither in Z nor in Y, the latter by definition. A proper left factor w of a word y in Y cannot be in Z, since otherwise i·w = t, whence i·y = ∅. Thus V is prefix and X is a chain.
(ii) ⇒ (iii) Assume that X = Y*Z with V = Y ∪ Z prefix and Y ∩ Z = ∅. Consider a word u ∈ Y. The code V being prefix, we have u⁻¹Z = ∅. Thus

u⁻¹X = u⁻¹(Y*Z) = (u⁻¹Y*)Z = Y*Z = X.

(iii) ⇒ (i) The automaton 𝒜(X) being minimal, the states of 𝒜(X) are in bijective correspondence with the nonempty sets u⁻¹X, where u runs over A*. The bijection is given by associating the state i·u to u⁻¹X. Thus, the equality u⁻¹X = X expresses precisely that i·u = i. Consequently u ∈ Stab(i). □
EXAMPLE 2.1 (continued) The minimal automaton of X* is given in Fig. 2.16. The finite code X is not a chain.
Fig. 2.16 The minimal automaton of X*.
EXAMPLE 2.2 (continued) This code is a chain. The automaton 𝒜(X*) is obtained without suppressing the initial state of 𝒜(X). See Fig. 2.17.
Fig. 2.17 The minimal automaton of X*, with X = (b²)*{a²b, ba}.
EXAMPLE 2.4 Consider the code X = ba*b over A = {a, b}. Its minimal automaton is given in Fig. 2.18a. The stabilizer of the initial state is just 1. The minimal automaton 𝒜(X*) given in Fig. 2.18b is derived from formula (2.2).
Fig. 2.18 (a) The minimal automaton of X = ba*b, and (b) of X*.
A construction which is analogous to that of Proposition 2.4 allows us to define the literal automaton of X* for a prefix code X. It is the automaton 𝒜 = (XA⁻, 1, 1) whose states are the proper left factors of words in X, and with the action given by

u·a = ua if ua ∈ XA⁻,
u·a = 1  if ua ∈ X,   (2.6)
u·a = ∅  otherwise.

This automaton is obtained from the literal automaton for X by identifying all final states of the latter with the initial state 1. It is immediate that this automaton recognizes X*.
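For a finite prefix code, the action (2.6) can be tabulated directly; the following sketch (identifiers are ours, not the book's) builds this automaton and uses it to test membership in X*.

```python
# Literal automaton of X* for a finite prefix code X, following
# Eq. (2.6): states are the proper left factors of words of X; a
# letter either extends the current left factor or, when a code word
# is completed, returns to the initial state 1 (here the empty word).

def literal_automaton_star(code):
    prefixes = {w[:k] for w in code for k in range(len(w))}   # XA^-
    delta = {}
    for u in prefixes:
        for a in set("".join(code)):
            ua = u + a
            if ua in prefixes:
                delta[(u, a)] = ua        # ua in XA^-
            elif ua in code:
                delta[(u, a)] = ""        # ua in X: back to state 1
    return delta                          # missing pairs: action is empty

def accepts(delta, w):
    """True iff w is in X*."""
    state = ""
    for a in w:
        if (state, a) not in delta:
            return False
        state = delta[(state, a)]
    return state == ""

delta = literal_automaton_star({"ab", "bab", "bb"})
print(accepts(delta, "abbb"), accepts(delta, "ba"))   # True False
```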
3. MAXIMAL PREFIX CODES
A subset X of A* is a maximal prefix set if it is prefix and if it is properly contained in no other prefix subset of A*, that is, if X ⊂ Y ⊂ A* and Y prefix imply X = Y. As for maximal codes, a reference to the underlying alphabet is necessary for the definition to make sense.
The set {1} is a maximal prefix set. Every other maximal prefix set is a code. A maximal code which is prefix is always maximal prefix. The converse does not hold: there exist maximal prefix codes which are not maximal as codes. However, under mild assumptions, namely for thin codes, we will show that maximal prefix codes are maximal codes.
The study of maximal prefix codes uses a left-to-right oriented version of dense and complete codes. Let M be a monoid, and let N be a subset of M. An element m ∈ M is right completable in N if mw ∈ N for some w in M. It is equivalent to say that N meets the right ideal mM. A subset N is right dense if every m ∈ M is right completable in N, i.e., if N meets all right ideals. The set N is right complete if the submonoid generated by N is right dense. The set N is right thin if it is not right dense. Of course, all these definitions make sense if right is replaced by left.
The following implications hold for a subset N of a monoid M:

N right dense ⇒ N dense,
N right complete ⇒ N complete,
N thin ⇒ N right thin.
In the case of a free monoid A*, a subset N of A* is right dense iff every word in A* is a left factor of some word in N. Thus every (nonempty) left ideal is right dense. Similarly, N is right complete if every word w in A* can be written as
w = m₁m₂···mᵣp

for some r ≥ 0, m₁, …, mᵣ ∈ N, and p a left factor of some word in N.
PROPOSITION 3.1 For any subset X ⊂ A*, the following conditions are equivalent:
(i) XA* is right dense;
(ii) A* = XA⁻ ∪ X ∪ XA⁺;
(iii) for all w ∈ A*, there exist u, v ∈ A*, x ∈ X with wu = xv.

Proof (i) ⇒ (iii) Let w ∈ A*. Since XA* is right dense, it meets the right ideal wA*. Thus wu = xv for some u, v ∈ A*, and x ∈ X.
(iii) ⇒ (ii) If wu = xv, then w ∈ XA⁻, w ∈ X, or w ∈ XA⁺ according to w < x, w = x, or w > x.
(ii) ⇒ (i) The set of left factors of XA* is XA⁻ ∪ X ∪ XA⁺. □

PROPOSITION 3.2 Let X ⊂ A⁺ be a subset that does not contain the empty word. Then XA* is right dense iff X is right complete.
Proof Suppose first that XA* is right dense and consider a word w ∈ A*. If w ∈ XA⁻ ∪ X, then wu ∈ X for some u ∈ A*. Otherwise w ∈ XA⁺ by Proposition 3.1. Thus, w = xw′ for some x ∈ X, w′ ∈ A⁺. Since x ≠ 1, we have |w′| < |w|. Arguing by induction, w′u ∈ X* for some u in A*. Thus, w is a left factor of some word in X*.
Conversely, let w ∈ A*, and assume that wu ∈ X* for some u ∈ A*. Multiplying if necessary by some word in X, we may assume that wu ≠ 1. Then wu ∈ X⁺ ⊂ XA*. □

Note that Proposition 3.2 does not hold for X = {1}. In this case, XA* = A* is right dense, but X* = {1} is, of course, not.

THEOREM 3.3 Let X be a prefix subset of A*. The following conditions are equivalent:
(i) X is a maximal prefix set, (ii) XA* is right dense.
Further, if X ≠ {1}, they are equivalent to
(iii) X is right complete.
Observe that the equivalence of prefix maximality and right completeness given in this statement is stronger than the result for general codes (Theorem 1.5.10). In the general case, the restriction to thin codes is necessary.

Proof By Proposition 3.2, conditions (ii) and (iii) are equivalent. Let us show that (i) and (ii) are equivalent. Let u ∈ A⁺. The set X ∪ u is prefix iff (X ∪ u) ∩ (X ∪ u)A⁺ = ∅, thus iff

(u ∩ XA⁺) ∪ (X ∩ uA⁺) = ∅,

or iff u ∉ XA⁺ and u ∉ XA⁻. Denote by U the set of all words u such that X ∪ u is prefix. Then

U = A* - (XA⁺ ∪ XA⁻).   (3.1)

Clearly X is maximal prefix iff X = U; thus, by (3.1) and in view of Proposition 3.1, this is equivalent to XA* being right dense. □

The following corollary appears to be useful.
COROLLARY 3.4 Let L ⊂ A⁺ and let X = L - LA⁺. Then L is right complete iff X is a maximal prefix code.

Proof
L is right complete iff LA* is right dense (Proposition 3.2). From XA* = LA* (Proposition 1.2) and Theorem 3.3, the statement follows. □

We now give the statement corresponding to the first part of Theorem 1.4, but for maximal prefix codes.

THEOREM 3.5 Let X be a prefix code over A, and let P = XA⁻ be the set of proper left factors of words in X. Then X is maximal prefix iff one of the following equivalent conditions holds:

X - 1 = P(A - 1),   (3.2)
1 + PA = P + X,   (3.3)
A* = X*P.   (3.4)
Proof Set R = A* - XA*. If X is maximal prefix, then XA* is right dense and R = P by Proposition 3.1. The conclusion then follows directly from Theorem 1.4. Conversely, if X - 1 = P(A - 1), then by Eq. (1.4),

P(A - 1) = R(A - 1).

Since A - 1 is invertible, we get P = R, showing that XA* is right dense. □
In the case of a finite maximal prefix code, Eq. (3.2) gives a factorization of X - 1 into two polynomials. Formula (3.3) has an interpretation on the literal representation of a code X which makes the verification of maximality very easy: if p is a node which is not in X, then for each a ∈ A, there must exist a node pa in the literal representation of X. The following statement corresponds to the other part of Theorem 1.4 and characterizes those prefix-closed sets which define maximal prefix codes.
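For a finite code over a finite alphabet, this criterion is immediate to check on the tree; the small sketch below is our own illustration of formula (3.3), not the book's.

```python
# Maximality test for a finite prefix code via formula (3.3): in the
# literal representation, every node p not in X must have, for each
# letter a of the alphabet, a child pa (either an internal node or a
# code word). Assumes the input set is already prefix.

def is_maximal_prefix(code, alphabet):
    nodes = {w[:k] for w in code for k in range(len(w))}   # internal nodes
    return all(p + a in nodes or p + a in code
               for p in nodes for a in alphabet)

print(is_maximal_prefix({"a", "baa", "bab", "bb"}, "ab"))  # Example 1.3: True
print(is_maximal_prefix({"ab", "bab", "bb"}, "ab"))        # False
```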
PROPOSITION 3.6 Let R be a nonempty prefix-closed subset of A*, and let X = RA - R. Then X is a maximal prefix code iff R contains no right ideal. If this is the case, R = XA⁻.

Proof By Theorem 1.4, X is prefix and R = A* - XA*. From this, it follows that R contains no right ideal iff XA* meets all right ideals, i.e., iff XA* is right dense. Thus the equivalence follows from Theorem 3.3. The last observation is a consequence of A* = XA* ∪ XA⁻. □

We now come to the relation existing between maximal prefix codes and maximal codes which are prefix.

THEOREM 3.7 Let X be a thin subset of A⁺. The following conditions are equivalent:
(i) X is a maximal prefix code;
(ii) X is prefix and a maximal code;
(iii) X is right complete and a code.

Proof (ii) ⇒ (i) is clear. (i) ⇒ (iii) follows from Theorem 3.3. It remains to prove (iii) ⇒ (ii). Let Y = X - XA⁺. By Proposition 1.2, YA* = XA*. Thus Y is right complete. Consequently Y is complete. The set Y is also thin, since Y ⊂ X. Thus Y is a maximal code by Theorem 1.5.7. From the inclusion Y ⊂ X, we have X = Y. □
EXAMPLE 3.1 This example shows that Theorem 3.7 does not hold without the assumption that the code is thin. Let X = {uba^|u| | u ∈ A*} over A = {a, b} be the reversal of the code given in Example 1.4.6. It is a maximal code which is right dense, whence right complete. However, X is not prefix. From Corollary 3.4, it follows that Y = X - XA⁺ is a maximal prefix code. Of course, Y ≠ X, and thus, Y is not a maximal code.
PROPOSITION 3.8 Let X be a thin subset of A⁺. The following conditions are equivalent:
(i) X is a maximal prefix code.
(ii) X is prefix, and there exists a positive Bernoulli distribution π such that π(X) = 1.
(iii) X is prefix, and π(X) = 1 for all positive Bernoulli distributions π.
(iv) X is right complete, and there exists a positive Bernoulli distribution π such that π(X) = 1.
(v) X is right complete, and π(X) = 1 for all positive Bernoulli distributions π.

Proof It is an immediate consequence of Theorem 3.7 and of Theorem 1.5.10. □
In the previous section, we gave a description of prefix codes by means of the bases of the stabilizers in a deterministic automaton. Now we consider maximal prefix codes. Let us introduce the following definition. A state q of a deterministic automaton 𝒜 = (Q, i, T) over A is recurrent if for all u ∈ A*, there is a word v ∈ A* such that q·uv = q. This implies in particular that q·u ≠ ∅ for all u in A*.

PROPOSITION 3.9 Let X be a prefix code over A. The following conditions are equivalent:
(i) X is maximal prefix.
(ii) The minimal automaton of X* is complete.
(iii) All states of the minimal automaton of X* are recurrent.
(iv) The initial state of the minimal automaton of X* is recurrent.
(v) X* is the stabilizer of a recurrent state in some deterministic automaton.
Proof (i) ⇒ (ii) Let 𝒜(X*) = (Q, i, i) be the minimal automaton of X*. Let q ∈ Q, a ∈ A. There is some word u ∈ A* such that i·u = q. The code X being right complete, uav ∈ X* for some word v. Thus i = i·uav = (q·a)·v, showing that q·a ≠ ∅. Thus 𝒜(X*) is complete.
(ii) ⇒ (iii) Let q ∈ Q, u ∈ A*; then q′ = q·u ≠ ∅ since 𝒜(X*) is complete. 𝒜(X*) being minimal, q′ is coaccessible, and q is accessible. Thus q′·v = q for some v ∈ A*, showing that q is recurrent.
(iii) ⇒ (iv) ⇒ (v) is clear.
(v) ⇒ (i) Let 𝒜 = (Q, i, T) be a deterministic automaton and q ∈ Q be a recurrent state such that X* = Stab(q). For all u ∈ A* there is a word v ∈ A* with q·uv = q, thus uv ∈ X*. This shows that X is right complete. The set X being prefix, the result follows from Theorem 3.3. □
4. OPERATIONS ON PREFIX CODES
Prefix codes are closed under some simple operations. We start with a general result which will be used several times.
PROPOSITION 4.1 Let X, (Yᵢ)_{i∈I} be nonempty subsets of A* and let (Xᵢ)_{i∈I} be a partition of X. Set

Z = ⋃_{i∈I} XᵢYᵢ.

1. If X and the Yᵢ's are prefix (maximal prefix), then Z is prefix (maximal prefix).
2. If Z is prefix, then all Yᵢ are prefix.
3. If X is prefix and Z is maximal prefix, then X and the Yᵢ's are maximal prefix.

Proof 1. Assume that z, zu ∈ Z. Then

z = xy,  zu = x′y′

for some i, j ∈ I, x ∈ Xᵢ, y ∈ Yᵢ, x′ ∈ Xⱼ, y′ ∈ Yⱼ. From

xyu = x′y′

it follows that x = x′, because X is prefix, whence i = j and y = y′. Thus, u = 1 and Z is prefix.
Assume now that XA* and the YᵢA* are right dense. Let w ∈ A*. Then ww′ = xu for some w′, u ∈ A*, x ∈ X. Let x belong to Xᵢ. Since YᵢA* is right dense, uv′ ∈ YᵢA* for some v′ ∈ A*. Thus ww′v′ ∈ XᵢYᵢA*, whence ww′v′ ∈ ZA*. Thus Z is maximal prefix.
2. Let y, yu ∈ Yᵢ and x ∈ Xᵢ. Then xy, xyu ∈ Z, implying that u = 1.
3. From ZA* ⊂ XA* we get that XA* is right dense. Consequently X is maximal prefix. To show that YᵢA* is right dense, let w ∈ A*. For any x ∈ Xᵢ, xw is right completable in ZA*: xwv = zw′ for some v, w′ ∈ A* and z ∈ Z. Setting z = x′y′ with x′ ∈ Xⱼ, y′ ∈ Yⱼ gives xwv = x′y′w′. The code X being prefix, we get x = x′, whence i = j and wv = y′w′, showing that wv ∈ YᵢA*. Thus YᵢA* is right dense. □

For Card(I) = 1, we obtain, in particular,

COROLLARY 4.2 If X and Y are prefix codes (maximal prefix), then XY is a prefix code (maximal prefix). □
The converse of this corollary holds only under rather restrictive conditions and will be given in Proposition 4.10.

COROLLARY 4.3 Let X ⊂ A⁺, and n ≥ 1. Then X is (maximal) prefix iff Xⁿ is (maximal) prefix.
Proof By Corollary 4.2, Xⁿ is maximal prefix for a maximal prefix code X. Conversely, setting Z = Xⁿ = Xⁿ⁻¹X, it follows from Proposition 4.1(2) that X is prefix. Writing Z = XXⁿ⁻¹, we see by Proposition 4.1(3) that X (and Xⁿ⁻¹) are maximal prefix if Z is. □

Corollary 4.3 is a special case of Proposition 4.8, to be proved later.
COROLLARY 4.4 Let X and Y be prefix codes, and let X = X₁ ∪ X₂ be a partition. Then Z = X₁ ∪ X₂Y is a prefix code, and Z is maximal prefix iff X and Y are maximal prefix.

Proof With Y′ = {1}, we have Z = X₁Y′ ∪ X₂Y. The result follows from Proposition 4.1: Z is prefix, and Z is maximal prefix iff X and Y are. □
There is a special case of this corollary which deserves attention. It constitutes an interesting operation on codes viewed as trees.

COROLLARY 4.5 Let X and Y be prefix codes, and x ∈ X. Then Z = (X - x) ∪ xY is prefix, and Z is maximal prefix iff X and Y are. □

The operation performed on X and Y is sketched in Fig. 2.19. We now turn to the converse operation.

Fig. 2.19 Combining codes X and Y: Z = (X - x) ∪ xY.
PROPOSITION 4.6 Let Z be a prefix code, and let p ∈ ZA⁻. Then

Yₚ = p⁻¹Z  and  X = (Z - pYₚ) ∪ {p}   (4.1)

are prefix sets. Further, if Z is maximal prefix, then Yₚ and X are maximal prefix also.

The operation described in (4.1) can be drawn as shown in Fig. 2.20. Proposition 4.6 is a special case of the following result.

Fig. 2.20 Separating Z and Yₚ: X = (Z - pYₚ) ∪ {p}.

PROPOSITION 4.7 Let Z be a prefix code, and let Q be a prefix subset of ZA⁻. For each p ∈ Q, the set Yₚ = p⁻¹Z is a prefix code; further

X = (Z - ⋃_{p∈Q} pYₚ) ∪ Q

is a prefix set. If Z is maximal prefix, then X and the Yₚ (p ∈ Q) are maximal prefix.
Proof Set X₀ = Z - ⋃_{p∈Q} pYₚ, Y₀ = {1}, and Xₚ = {p} for each p ∈ Q; then

Z = X₀Y₀ ∪ ⋃_{p∈Q} XₚYₚ.
Thus, to derive the result from Proposition 4.1, it suffices to show that X is prefix. Let x, xu ∈ X with u ∈ A⁺. These words cannot both be in the prefix set Z, nor can they both be in the prefix set Q. Since Q ⊂ ZA⁻, we have x ∈ Q and xu ∈ Z. Thus u ∈ Yₓ and xu is not in X. □

Propositions 4.1 and 4.7 can be used to enumerate some maximal prefix sets. Let us illustrate the computation in the case of A = {a, b}. If Z is maximal prefix and Z ≠ {1}, then both

X = a⁻¹Z,  Y = b⁻¹Z

are maximal prefix, and

Z = aX ∪ bY.   (4.2)

Conversely, if X and Y are maximal prefix, then so is Z. Thus, Eq. (4.2) defines a bijection from maximal prefix sets Z ≠ {1} onto pairs (X, Y) of maximal prefix sets. Further, Card(Z) = Card(X) + Card(Y). Let aₙ be the number of maximal prefix sets with n elements. Then by Eq. (4.2), for n ≥ 2,
aₙ = a₁aₙ₋₁ + a₂aₙ₋₂ + ··· + aₙ₋₁a₁.

These numbers are related to the Catalan numbers already encountered in Section 1.4 (see Exercise 4.1):

aₙ₊₁ = (2n)! / (n!(n + 1)!).
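The recurrence and the closed form can be checked numerically; the sketch below (the function name is ours) computes the first values.

```python
from math import comb

# a_n = number of maximal prefix sets with n elements over a two-letter
# alphabet: a_1 = 1 and, by Eq. (4.2), a_n = sum of a_k * a_{n-k} for
# k = 1, ..., n-1; these are the Catalan numbers, a_{n+1} = comb(2n, n)//(n+1).

def count_maximal_prefix_sets(n_max):
    vals = [0, 1]                 # vals[1] = a_1 = 1
    for n in range(2, n_max + 1):
        vals.append(sum(vals[k] * vals[n - k] for k in range(1, n)))
    return vals

values = count_maximal_prefix_sets(8)
print(values[1:])                 # [1, 1, 2, 5, 14, 42, 132, 429]
assert all(values[n + 1] == comb(2 * n, n) // (n + 1) for n in range(1, 8))
```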
PROPOSITION 4.8 Let Y, Z be composable codes and X = Y ∘ Z. Then X is a maximal prefix and thin code iff Y and Z are maximal prefix and thin codes.

Proof Assume first that X is thin and maximal prefix. Then X is right complete by Theorem 3.3. Thus X is thin and complete. By Proposition 1.6.9, both Y and Z are thin and complete. Further, Y is prefix (Proposition 1.6.8(i)). Thus Y (being thin, prefix, and complete) is a maximal prefix code. Next, X* is right dense and X* ⊂ Z*. Thus Z* is right dense. Consequently Z is a right complete, thin code. By Theorem 3.7, Z is maximal prefix.
Conversely, Y and Z being prefix, X is prefix by Proposition 1.6.3, and Y, Z being both thin and complete, X is also thin and complete by Proposition 1.6.9. Thus X is a maximal prefix code. □
PROPOSITION 4.9 Let Z be a prefix code over A, and let Z = X ∪ Y be a partition. Then T = X*Y is a prefix code, and further T is maximal prefix iff Z is a maximal prefix code.

Proof Let B be an alphabet bijectively associated with Z, and let B = C ∪ D be the partition of B induced by the partition Z = X ∪ Y. Then

T = C*D ∘ Z.

The code C*D clearly is prefix. Thus, T is prefix by Proposition 1.6.3. Next, T* = 1 ∪ Z*Y, showing that T is right complete iff Z is right complete. The second part of the statement thus results from Theorem 3.3. □

Note that the codes of this proposition are the chains of Section 2. We conclude this section with the proof of a converse to Corollary 4.2. This result will be of use later (Section 8).

PROPOSITION 4.10 Let X and Y be finite nonempty subsets of A* such that the product XY is unambiguous. If XY is a maximal prefix code, then X and Y are maximal prefix codes.
The following example shows that the conclusion fails for infinite codes. However, it is not known whether the hypothesis of unambiguity is necessary.
EXAMPLE 4.2 Consider X = {1, a} and Y = (a²)*b over A = {a, b}. Here X is not prefix, and Y is not maximal prefix. However, XY = a*b is maximal prefix and the product is unambiguous.

Proof Let Z = XY and n = max{|y| : y ∈ Y}. The proof is by induction on n. For n = 0, we have Y = {1} and Z = X. Thus, the conclusion clearly holds. Assume n ≥ 1 and set
T = {y ∈ Y : |y| = n},  Q = {q ∈ YA⁻ : qA ∩ T ≠ ∅}.
By construction, T ⊂ QA. In fact, T = QA. Indeed, let q ∈ Q, a ∈ A, and let x ∈ X be a word of maximal length. Then xq is a left factor of a word in Z, and xqa is right completable in ZA*. The code Z being prefix, no proper left factor of xqa is in Z. Consequently

xqau = x′y′

for some x′ ∈ X, y′ ∈ Y, and u ∈ A*. Now n = |qa| ≥ |y′|, and |x| ≥ |x′|. Thus x = x′, y′ = qa, u = 1. Consequently qa ∈ Y and T = QA.
Now let Y’ =(Y - T)u Q,
Z‘=XY’. We verify that Z‘ is prefix. Assume the contrary. Then xy‘u = x’”’
for some x, x’ E X, y’, y” E Y’, u # 1. Let a be the first letter of u. Then either y’ or y’a is in Y.Similarly either y” or y”b (for any b in A ) is in Y. Assume y’ E Y.Then xy’ E Z is a proper left factor of x’y” or x‘yl’b, one of them being in 2. This contradicts the fact that Z is prefix. Thus y’a E Y.As before, xy’a is not a proper left factor of x’y” or xry% Thus necessarily u = a and y” E Y, and we have xy’a = x’y”
with y’a, y” E Y. The unambiguity of the product X Y shows that x = x’, y’a = y”. But then y” # Y’. This yields the contradiction.
To see that Z’ is maximal prefix, observe that Z c 2’u Z’A. Thus ZA* c Z’A* and the result follows from Theorem 3.3. Finally, it is easily seen that the product XY‘ is unambiguous: if xy‘ = x’y’’ with x, x’ E X,y’, y“ E Y’, then either y’, y” E Y - T or y’, y” E Q, the third case being ruled out by the prefix character of Z. Of course,max{lyl I y E Y‘} = n - 1. By theinduction hypothesis,Xand Y are maximal prefix. Since Y = (Y’- Q)u QA, the set Y is maximal prefix by Corollary 4.4. 0 It is also possible to give a completely different proof of Proposition 4.10 using the fact that, under the hypotheses of the Proposition, we have n:(X)n(Y) = 1 for all Bernoulli distributions n: (see Exercise VIII.4.2). 5. SEMAPHORE CODES
This section contains a detailed study of semaphore codes which constitute
an interesting subclass of the prefix codes. This investigation also illustrates the techniques introduced in the preceding sections.
PROPOSITION 5.1 For any nonempty subset S of A⁺, the set

X = A*S - A*SA⁺   (5.1)

is a maximal prefix code.
Proof The set L = A*S is a left ideal, and thus is right dense. Consequently, L is right complete, and by Corollary 3.4, the set X = L - LA⁺ is maximal prefix. □

A code X of the form given in Eq. (5.1) is called a semaphore code, the set S being a set of semaphores for X. The terminology stems from the following observation: a word is in X iff it ends with a semaphore, but none of its proper left factors ends with a semaphore. Thus, reading a word from left to right, the first appearance of a semaphore gives a "signal" indicating that what has been read up to now is in the code X.

EXAMPLE 5.1 Let A = {a, b} and S = {a}. Then X = A*a - A*aA⁺, whence X = b*a. Note that we also have X = A*a⁺ - A*a⁺A⁺, showing that a semaphore code admits several sets of semaphores (see also Proposition 5.8).

EXAMPLE 5.2 For A = {a, b} and S = {aa, ab}, we have A*S = A*aA. Thus A*S - A*SA⁺ = b*aA.

The following proposition characterizes semaphore codes among prefix codes.
PROPOSITION 5.2 Let X ⊂ A⁺. Then X is a semaphore code iff X is prefix and

A*X ⊂ XA*.   (5.2)

Proof Let X = A*S - A*SA⁺ be a semaphore code. Then X is prefix, and it remains to show (5.2). Let w ∈ A*X. Since w ∈ A*S, w has a factor in S. Let w′ be the shortest left factor of w which is in A*S. Then w′ is in X. Consequently, w ∈ XA*.
Conversely, assume that a prefix code X satisfies (5.2). Set M = XA*. In view of Proposition 1.2, and by the fact that X is prefix, we have

X = M - MA⁺.

Equation (5.2) implies that A*M = A*XA* ⊂ XA* = M; thus M = A*M and

X = A*M - A*MA⁺. □
EXAMPLE 5.3 The code Y = {a², aba, ab², b} is a maximal prefix code over A. However, Y is not a semaphore code, since ab ∈ A*Y but ab ∉ YA*.
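The defining formula (5.1) is easy to experiment with on finite truncations; the sketch below (our own helper, not from the book) lists the words of X = A*S - A*SA⁺ up to a given length for Examples 5.1 and 5.2.

```python
from itertools import product

# Words of the semaphore code X = A*S - A*SA+ up to a given length:
# w is in X iff w ends with a semaphore but no proper left factor of w
# does -- the "signal" reading of Eq. (5.1).

def ends_with_semaphore(w, S):
    return any(w.endswith(s) for s in S)

def semaphore_code(S, alphabet, max_len):
    X = []
    for n in range(1, max_len + 1):
        for tup in product(alphabet, repeat=n):
            w = "".join(tup)
            if ends_with_semaphore(w, S) and not any(
                    ends_with_semaphore(w[:k], S) for k in range(1, n)):
                X.append(w)
    return X

# Example 5.1: S = {a} over {a, b} gives X = b*a.
print(semaphore_code({"a"}, "ab", 4))        # ['a', 'ba', 'bba', 'bbba']
# Example 5.2: S = {aa, ab} gives X = b*aA.
print(semaphore_code({"aa", "ab"}, "ab", 3))  # ['aa', 'ab', 'baa', 'bab']
```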
A semaphore code is maximal prefix, thus, right complete. The following proposition describes those right complete sets which are semaphore codes. PROPOSITION5.3 Let X c A + . Then X is a semaphore code ig X is right complete and X n A*XA+ = 0. (5.3) Proof A semaphore code is maximal prefix, thus also right complete. Further, by (5.2) A*XA" c X A + thus X n A * X A + c X n X A + = 0, showing Eq. (5.3). Conversely, if a set X satisfies (5.3), then X is prefix. To show that X is semaphore, we verify that (5.2) holds. Let w = ux E A*X with u E A*, x E X . The code X being right complete, we have uxu = x'yu for some x' E X , y E X*,u E A*. Now Eq. (5.3) shows that ux is not a proper left factor of x'. Thus ux E x'A*. 0 COROLLARY 5.4 Let X c A+ be a semaphore.code and let P = X A - . Then PX c X P u x2. Proof (See Fig. 2.21) Let p E P, x E X . By Eq. (5.2), px = yu for some y E X , u E A*. The code X is prefix, thus IpI < Iyl. Consequently, u is a right factor of x, and by (5.3), u 4 X A + . The code X is maximal prefix, therefore UEXA-uX. 0 V
Fig. 2.21 Proof of Corollary 5.4.
Formula (5.3) expresses a property of semaphore codes which is stronger than the prefix condition: for a semaphore code X and two elements x, x' ∈ X, the only possible way for x to occur as a factor of x' is to be a right factor of x'. We now use this fact to characterize semaphore codes among maximal prefix codes.
PROPOSITION 5.5 Let X ⊂ A⁺, and let P = XA⁻ be the set of proper left factors of words in X. Then X is a semaphore code iff X is a maximal prefix code and P is suffix-closed.
II. PREFIX CODES
Of course, P is always prefix-closed. Thus P is suffix-closed iff it contains the factors of its elements.
Proof Let X be a semaphore code. Then X is a maximal prefix code (Proposition 5.1). Next, let p = uq ∈ P with u, q ∈ A*. Let v ∈ A⁺ be a word such that pv ∈ X. Then q ∉ XA*, since otherwise pv = uqv ∈ X ∩ A*XA⁺, violating Proposition 5.3. Thus q ∈ XA⁻ = P.
Conversely, assume that X is maximal prefix and that X ∩ A*XA⁺ ≠ ∅. Let x ∈ X ∩ A*XA⁺. Then x = ux'v for some u ∈ A*, x' ∈ X, v ∈ A⁺. It follows that ux' ∈ P and, P being suffix-closed, also x' ∈ P, which is impossible. Thus X is a semaphore code by Proposition 5.3. □

Another consequence of Proposition 5.3 is
PROPOSITION 5.6 Any semaphore code is thin.
Proof By Formula (5.3), no word in XA⁺ is a factor of a word in X. □
COROLLARY 5.7 Any semaphore code is a maximal code.

Proof A semaphore code is a maximal prefix code and thin, by Propositions 5.1 and 5.6. Thus, by Theorem 3.7, such a code is a maximal code. □

Now we determine the sets of semaphores giving the same semaphore code.
PROPOSITION 5.8 Two nonempty subsets S and T of A⁺ define the same semaphore code iff A*SA* = A*TA*. For each semaphore code X, there exists a unique minimal set of semaphores, namely T = X − A⁺X.

Proof Let X = A*S − A*SA⁺, Y = A*T − A*TA⁺. By Proposition 1.2, we have XA* = A*SA*, YA* = A*TA*, and by Corollary 1.3, X = Y iff A*SA* = A*TA*.
Next, let X = A*S − A*SA⁺ be a semaphore code. By the definition of T = X − A⁺X, we may apply to T the dual of Proposition 1.2. Thus A*T = A*X. Since A*TA* = A*XA* = A*SA*, the set T is indeed a set of semaphores for X.
Finally, let us verify that T ⊂ S. Let t ∈ T; then t = usv for some u, v ∈ A*, s ∈ S, and s = u't'v' for some u', v' ∈ A*, t' ∈ T. Thus t = uu't'v'v. Note that T ⊂ X. Thus formula (5.3) applies, showing that v'v = 1. Since T is a suffix code, we have uu' = 1. Thus t = s ∈ S. □

We now study some operations on semaphore codes.
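The minimal set T = X − A⁺X of Proposition 5.8 can be checked by brute force on a small instance (a sketch of ours; helper names are assumptions):

```python
from itertools import product

def semaphore_code(S, alphabet='ab', max_len=3):
    """All words of length <= max_len in A*S - A*SA+ (brute force)."""
    def member(w):
        for i in range(1, len(w) + 1):
            if any(w[:i].endswith(s) for s in S):
                return i == len(w)
        return False
    return {w for n in range(1, max_len + 1)
              for w in map(''.join, product(alphabet, repeat=n)) if member(w)}

X = semaphore_code({'aa', 'ba', 'bb'})            # the code of Example 5.6
T = {x for x in X if not any(x[i:] in X for i in range(1, len(x)))}
assert X == {'aa', 'ba', 'bb', 'aba', 'abb'}
assert T == {'aa', 'ba', 'bb'}                    # the minimal semaphores
```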
PROPOSITION 5.9 If X and Y are semaphore codes, then XY is a semaphore code. Conversely, if XY is a semaphore code and if X is a prefix code, then X is a semaphore code.
Fig. 2.22 Proof of Proposition 5.9.
Proof If X, Y are semaphore codes, then by Corollary 4.2, XY is a prefix code. Further, by Proposition 5.2, A*XY ⊂ XA*Y ⊂ XYA*; thus XY is a semaphore code.
Assume now that XY is a semaphore code, and that X is a prefix code. We show that A*X ⊂ XA*. For this, let w = ux ∈ A*X, with u ∈ A*, x ∈ X, and let y be a word in Y of minimal length. Then

wy = uxy = x'y'u'

for some x' ∈ X, y' ∈ Y, u' ∈ A* (see Fig. 2.22). By the choice of y, we have |y| ≤ |y'| ≤ |y'u'|; thus |ux| ≥ |x'|, showing that ux ∈ XA*. □

The following example shows that if XY is a semaphore code, then Y need not be semaphore, even if it is maximal prefix.

EXAMPLE 5.4 Over A = {a, b}, let X = a*b and Y = {a², aba, ab², b}. Then X is a semaphore code, and Y is a maximal prefix code. However, Y is not semaphore (Example 5.3). On the other hand, the code Z = XY is semaphore.
Indeed, Z is maximal prefix, and the set

P = ZA⁻ = a*{1, b, ba, bab}

is suffix-closed. The conclusion follows from Proposition 5.5 (see Fig. 2.23).
Fig. 2.23 The code a*b{a², aba, ab², b}.
COROLLARY 5.10 For any X ⊂ A⁺ and n ≥ 1, the set X is a semaphore code iff Xⁿ is a semaphore code.
Proof If Xⁿ is a semaphore code, then X is prefix by Corollary 4.3 and X is a semaphore code by Proposition 5.9. The converse follows directly from Proposition 5.9. □
EXAMPLE 5.5 The code X = {a, baa, baba, bab², b²} represented in Fig. 2.24 is a maximal prefix code but not semaphore. Indeed, the word a has an inner occurrence in bab², contradicting formula (5.3). However, X decomposes into two semaphore codes:

X = Y ∘ Z with Y = {c, dc, d², de, e} and Z = {a, ba, b²}.
Fig. 2.24 The code X.
Given a semaphore code

X = A*S − A*SA⁺,

it is natural to consider

Y = SA* − A⁺SA*.

The code Y is a maximal suffix code; its reversal Ỹ = A*S̃ − A*S̃A⁺ is a semaphore code with semaphores S̃. The following result shows a strong relation between X and Y.

PROPOSITION 5.11 Let S ⊂ A⁺. There exists a bijection β from X = A*S − A*SA⁺ onto Y = SA* − A⁺SA* such that, for each x ∈ X, β(x) is a conjugate of x.
Proof First, consider J = A*SA*. Then J is a two-sided ideal, and further

X = J − JA⁺,  Y = J − A⁺J.

Indeed, J = JA* = (A*S)A*. Thus A*S and J generate the same right ideal. By Proposition 1.2, J − JA⁺ = X. The same argument holds for Y.
Now we define, for each x ∈ X,

D(x) = {d ∈ A⁺ | there is some g ∈ A* with x = gd and dg ∈ J}.
Thus D(x) is composed of nonempty right factors of x. Further, D(x) is nonempty, since x is in D(x). Thus each D(x) contains a shortest element. This will be used to define β as follows. For x ∈ X,

β(x) = dg,    (5.4)

where d is the shortest word in D(x) and g is such that

x = gd.    (5.5)
Thus β(x) is a conjugate of x, and β(x) ∈ J. We show that β(x) ∈ J − A⁺J = Y. Assume the contrary. Then

β(x) = dg = uj    (5.6)

for some u ∈ A⁺, j ∈ J. Next, g is a proper left factor of x. Consequently, g ∉ J, since J = JA* = XA*, and thus any word in J has a left factor in X. This shows that |g| < |j|, since otherwise g would belong to the ideal generated by j, thus g ∈ J. It follows from this and from (5.6) that |d| > |u|; thus d = ud' for some d' ∈ A⁺. Moreover, d' ∈ D(x), since d'(gu) = ju ∈ J and (gu)d' = gd = x ∈ X. This yields a contradiction by the fact that d' is strictly shorter than d. Thus β(x) ∈ Y.
Consider the converse mapping γ from Y into X defined by considering, for y in Y,

G(y) = {e ∈ A⁺ | y = eh and he ∈ J}

and by setting γ(y) = he, with e ∈ G(y) of minimal length. If y = β(x) = dg is given by (5.4) and (5.5), and if γ(y) = he with e ∈ G(y), eh = y, then

dg = β(x) = eh.    (5.7)

Note that gd = x ∈ J. Thus d ∈ G(y). Consequently, |d| ≥ |e|. Now the word e is not a proper left factor of d. Otherwise, setting d = eu, ug = h in (5.7) with u ∈ A⁺, we get

geu = gd = x,  uge = he ∈ J,

showing that u ∈ D(x) and contradicting the minimality of |d|. Thus d = e, g = h, and γ(β(x)) = x. An analogous proof shows that β(γ(y)) = y for y in Y. Thus β and γ are reciprocal bijections from X onto Y. □
TABLE 2.1 The X-Y Correspondence

x      D(x)          β(x)
aa     a, aa         aa
aba    a, ba, aba    aab
abb    b, bb, abb    bab
ba     ba            ba
bb     b, bb         bb
EXAMPLE 5.6 Let us illustrate the construction of Proposition 5.11 by considering, over A = {a, b}, the set of semaphores S = {a², ba, b²}. Then

X = A*S − A*SA⁺ = {a², ba, b², aba, ab²},
Y = SA* − A⁺SA* = {a², a²b, ba, bab, b²}.

Table 2.1 lists on each row an element x ∈ X, the corresponding set D(x), and the element β(x) ∈ Y.
Proposition 5.11 shows that any semaphore code can be transformed into a suffix code by a bijection which conjugates words. This property does not hold for arbitrary prefix codes, as shown by
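The construction of β from (5.4) and (5.5) is easy to script (a brute-force sketch of ours; `beta` and `in_J` are assumed names):

```python
def beta(x, S):
    """Conjugate beta(x) = dg for the shortest suffix d of x with dg in J = A*SA*."""
    in_J = lambda w: any(s in w for s in S)   # w in A*SA* iff some semaphore occurs in w
    for i in range(len(x) - 1, -1, -1):       # shortest nonempty suffix first
        g, d = x[:i], x[i:]
        if in_J(d + g):
            return d + g
    raise ValueError("x not in J")

S = {'aa', 'ba', 'bb'}
# the rows of Table 2.1
assert beta('aa', S) == 'aa'
assert beta('aba', S) == 'aab'
assert beta('abb', S) == 'bab'
assert beta('ba', S) == 'ba'
assert beta('bb', S) == 'bb'
```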
EXAMPLE 5.7 Let X = {ab, ba, c, ac, bca}. Assume that there exists a bijection β which transforms X into a suffix code Y by conjugating the words in X. Then Y necessarily contains c, ab, and ba. Further, Y contains ca (with c and ac, Y would not be suffix). All the words conjugate to bca now have a right factor equal to one of c, ab, ba, ca. Thus Y is not suffix. In fact, X cannot be completed into a semaphore code, since c is a factor of bca.
We end this section with the following result, which shows that biprefix codes are not usually semaphore codes.
PROPOSITION 5.12 Let X be a biprefix semaphore code. Then X = Aⁿ for some n ≥ 1.
Proof It is sufficient to show that X ⊂ Aⁿ for some n. Let x, y ∈ X. For each right factor q of x, we have qy ∈ A*X ⊂ XA*. Thus there is, in view of Proposition 5.3, a left factor p of y such that qp ∈ X. In this way we define a mapping from the set of right factors of x into the set of left factors of y. The set X being suffix, the mapping is injective; it follows that |x| ≤ |y|. Interchanging x and y, we get |y| ≤ |x|. Thus all words in X have the same length. □
For semaphore codes, the above result is a particular case of the theorem of synchronization of semaphores, to be stated in the next section. For biprefix codes, it is an easy consequence of Theorem III.4.2.
6. SYNCHRONOUS CODES
Let X be a maximal prefix code over A, and let x be a word. The word x is said to synchronize a word u ∈ A* if ux ∈ X*. The word x ∈ A⁺ is called synchronizing for X if it synchronizes every word in A*, i.e., iff

A*x ⊂ X*.    (6.1)

It follows from (6.1) that a synchronizing word x is in X*, and further that xy (with y ∈ X*) is synchronizing if x is synchronizing. A maximal prefix code X is synchronous iff there exists a word which is synchronizing. Otherwise the code is called asynchronous.
EXAMPLE 6.1 The code X = b*a is synchronous. Indeed, a is a synchronizing word, since A*a ⊂ X*.

EXAMPLE 6.2 A biprefix code X over A is asynchronous if X ≠ A. Assume indeed that x ∈ X* is synchronizing. For any u ∈ A*, we have ux ∈ X*. The monoid X* being left unitary, it follows that u ∈ X*. Thus A* = X*.

The terminology is derived from the following observation: let w be a word which has to be factorized into words of some prefix code X. The appearance, in the middle of the word w, of some synchronizing word x, i.e., the existence of a factorization

w = uxv,

implies that ux is in X*. Thus we may start the decoding at the beginning of the word v. Since X* is right unitary, we have indeed w ∈ X* iff v ∈ X*. That means that the whole word is in X* iff the final part can be decoded.
Note that any code X over A satisfying (6.1) is maximal prefix. Indeed, let y, yu ∈ X. Then ux ∈ X*, and y(ux), (yu)x are two X-factorizations which are distinct iff u ≠ 1. Thus u = 1. Next, (6.1) shows that X is right complete.
Any synchronous code is thin. Indeed, if x is a synchronizing word for a code X, then x² is not a factor of a word in X, since otherwise uxxv ∈ X for some u, v ∈ A*. From ux ∈ X⁺, it would follow that X is not prefix.
The fact that a code X is synchronous is well reflected by the automata recognizing X*. Let us give a definition. Let

𝒜 = (Q, i, T)
be a deterministic automaton on A. The rank of a word x ∈ A* in 𝒜 is defined by

rank_𝒜(x) = Card(Q · x).

It is an integer or +∞. Clearly, rank_𝒜(uxv) ≤ rank_𝒜(x).
PROPOSITION 6.1 Let X be a maximal prefix code over A, and let x ∈ X*. The following conditions are equivalent:

(i) x is synchronizing for X.
(ii) For the minimal automaton 𝒜(X*), we have rank_𝒜(X*)(x) = 1.
(iii) There exists a deterministic automaton 𝒜 recognizing X* such that rank_𝒜(x) = 1.

Proof (i) ⇒ (ii) Let 𝒜(X*) = (Q, i, i). Let q ∈ Q, and let u ∈ A* with i · u = q. From ux ∈ X*, it follows that q · x = i. Thus Q · x = {i}, showing that rank_𝒜(X*)(x) = 1.
(ii) ⇒ (iii) is clear.
(iii) ⇒ (i) Let 𝒜 = (Q, i, T) and let {q₀} = Q · x. Then q₀ ∈ T. We verify that x is synchronizing. Let u ∈ A*, and set q = i · u. From q · x = q₀ ∈ T, we have ux ∈ X*. □
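Proposition 6.1 suggests a direct test: build the literal automaton of X* and compute ranks (a sketch under naming of ours; the empty string plays the role of the state 1):

```python
X = {'a', 'ba', 'bb'}                                     # a finite maximal prefix code
P = {''} | {x[:i] for x in X for i in range(1, len(x))}   # states: proper prefixes

def step(p, c):
    # literal automaton of X*: p.c = pc if pc is a proper prefix, 1 if pc is in X
    return p + c if p + c in P else ''

def run(p, w):
    for c in w:
        p = step(p, c)
    return p

def rank(w):
    return len({run(p, w) for p in P})

assert P == {'', 'b'}
assert rank('a') == 1    # 'a' is in X*, hence synchronizing (Proposition 6.1)
```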
PROPOSITION 6.2 Let X be a thin maximal prefix code over A, and let P = XA⁻. Then X is synchronous iff for all p ∈ P there exists x ∈ X* such that px ∈ X*.

Note that the existence of a unique word synchronizing all left factors is not required in the hypotheses of the proposition.

Proof The condition is clearly necessary. It is also sufficient. Let w ∈ A* be such that w is not a factor of a word in X. Let K = {p₁, p₂, ..., pₙ} be the set of right factors of w which are in P. The set K is not empty, since 1 ∈ K. Define a sequence x₁, ..., xₙ of words in X* as follows: x₁ is a word in X* such that p₁x₁ ∈ X*. If x₁, ..., x_{i−1} are defined, there exist, because X is right complete, words y ∈ X*, p ∈ P with

pᵢx₁···x_{i−1} = yp.

Then xᵢ is defined to be a word in X* such that pxᵢ ∈ X*. It follows that, for i = 1, ..., n,

pᵢx₁···x_{i−1}xᵢ ∈ X*.    (6.2)

The word z = wx₁x₂···xₙ is synchronizing for X. Indeed, let u ∈ A*. Then

uw = y'p'
for some y' ∈ X* and p' ∈ P. The word p' is a right factor of w, since otherwise w would be a factor of a word in X. Thus p' = pᵢ for some pᵢ ∈ K. Using (6.2), we obtain

uz = y'(pᵢx₁···xᵢ)(x_{i+1}···xₙ) ∈ X*. □
PROPOSITION 6.3 Let X, Y, Z be maximal prefix codes with X = Y ∘ Z. Then X is synchronous iff Y and Z are synchronous.

Proof Let Y ⊂ B*, X, Z ⊂ A*, and let β: B* → A* be such that

X = Y ∘_β Z.

First, assume that Y and Z are synchronous, and let y ∈ Y*, z ∈ Z* be synchronizing words. Then B*y ⊂ Y* and A*z ⊂ Z*, whence

A*zβ(y) ⊂ Z*β(y) = β(B*y) ⊂ β(Y*) = X*,

showing that zβ(y) is a synchronizing word for X.
Conversely, assume that A*x ⊂ X* for some x ∈ X*. Then x ∈ Z* and X* ⊂ Z*; thus x is also synchronizing for Z. Next, let y = β⁻¹(x) ∈ Y*. Then

β(B*y) = Z*x ⊂ A*x ⊂ X* = β(Y*).

The mapping β being injective, it follows that B*y ⊂ Y*. Consequently, Y is synchronous. □

EXAMPLE 6.3 The code X = (A² − b²) ∪ b²A² is asynchronous, since it decomposes over the code A², which is asynchronous (Example 6.2). It is also directly clear that a word x ∈ X* can never synchronize words of odd length.

EXAMPLE 6.4 For any maximal prefix code Z and n ≥ 2, the code X = Zⁿ is asynchronous. Indeed, such a code has the form X = Bⁿ ∘ Z for some alphabet B, and Bⁿ is synchronous only for n = 1 (Example 6.2).
We now give a result on prefix codes which will be generalized when other techniques become available (Theorem III.3.6). The present proof is elementary.
Recall from Chapter I that, for a finite maximal code X, the order of a letter a is the integer n such that aⁿ is in X. The existence of the order of a results from Proposition I.5.9. Note that for a finite maximal prefix code, it is an immediate consequence of the inclusion a⁺ ⊂ X*P, with P = XA⁻.

THEOREM 6.4 Let X ⊂ A⁺ be a finite maximal prefix code. If the orders of the letters a ∈ A are relatively prime, then X is synchronous.

Proof Let P = XA⁻ and let 𝒜 = (P, 1, 1) be the literal automaton of X*. This automaton is complete, since X is maximal prefix. Recall that its action is
given by

p · a = pa  if pa ∈ P,
p · a = 1   if pa ∈ X.
For all w ∈ A*, define Q(w) = P · w.
Then, for w, w' ∈ A*,

Q(w'w) ⊂ Q(w),  Card Q(w'w) ≤ Card Q(w').    (6.3)
Observe that, for all w ∈ A*, Card(Q(w)) = rank_𝒜(w). Let u ∈ A* be a word such that Card(Q(u)) is minimal. The code X being right complete, there exists v ∈ A* such that w = uv ∈ X⁺. By (6.3), Card(Q(w)) is minimal. Further, w ∈ X⁺ implies

1 ∈ Q(w).    (6.4)

We will show that Card(Q(w)) = 1. This proves the theorem in view of Proposition 6.1. Let a ∈ A be a fixed letter, and let n be the positive integer such that aⁿ ∈ X. We define two sets of integers I and K by

I = {i ∈ ℕ | Q(w)aⁱ ∩ X ≠ ∅},
K = {k ∈ {0, ..., n−1} | aᵏw ∈ X*}.

First, we show that

Card I = Card Q(w).    (6.5)
Indeed, consider a word p ∈ Q(w) (⊂ P). There is an integer i such that paⁱ ∈ X, since X is finite and maximal. This integer is unique, since otherwise X would not be prefix. Thus there is a mapping which associates to each p in Q(w) the integer i such that paⁱ ∈ X. This is clearly a surjective mapping onto I. We verify that it is also injective. Assume the contrary. Then paⁱ ∈ X and p'aⁱ ∈ X for p, p' ∈ Q(w), p ≠ p'. This implies Card(Q(waⁱ)) < Card(Q(w)), contradicting the minimality of Card(Q(w)). Thus the mapping is bijective. This proves (6.5).
Next, set

m = max{i + k | i ∈ I, k ∈ K}.

Clearly m ≤ max I + max K ≤ max I + n − 1. Let

R = {m, m+1, ..., m+n−1}.
We shall construct a bijection between R and I × K. For this, let r ∈ R and, for each p ∈ Q(w), let

ν(p) = p · aʳw.
Then

ν(p) = (p · aʳ) · w ∈ P · w = Q(w).

Thus ν(Q(w)) ⊂ Q(w), and ν(Q(w)) = Q(w) by the minimality of Card Q(w). Thus ν is a bijection from Q(w) onto itself. It follows by (6.4) that there exists a unique p_r ∈ Q(w) such that p_r aʳw ∈ X*. Let i_r be the unique integer such that p_r a^{i_r} ∈ X. Then i_r ∈ I, whence i_r ≤ m ≤ r. Set

r = i_r + ln + k_r,    (6.6)

with l ∈ ℕ and 0 ≤ k_r < n. This uniquely defines k_r, and we have

p_r aʳw = (p_r a^{i_r})(aⁿ)^l (a^{k_r}w).

Since p_r a^{i_r} ∈ X and X* is right unitary, we have (aⁿ)^l (a^{k_r}w) ∈ X* and also a^{k_r}w ∈ X*. Thus k_r ∈ K. The preceding construction defines a mapping
r ↦ (i_r, k_r)    (6.7)

from R into I × K, first by determining i_r, then by computing k_r by means of (6.6). This mapping is injective. Indeed, if r ≠ r', then either i_r ≠ i_{r'}, or it follows from (6.6) and from r ≢ r' (mod n) that k_r ≠ k_{r'}. We now show that the mapping (6.7) is surjective. Let (i, k) ∈ I × K, and let l ∈ ℕ be such that

r = i + ln + k ∈ R.
By definition of I, there is a (unique) q ∈ Q(w) such that qaⁱ ∈ X, and by the definition of K, we have

qaʳw = (qaⁱ)(aⁿ)^l (aᵏw) ∈ X*.

Thus q = p_r, i = i_r, k = k_r, showing the surjectivity. It follows from the bijection that

n = Card(R) = Card(I) · Card(K).

This in turn implies, by (6.5), that Card Q(w) divides the integer n. Thus Card Q(w) divides the order of each letter of the alphabet. Since these orders are relatively prime, necessarily Card(Q(w)) = 1. The proof is complete. □

We will prove later (Section IV.7) the following important theorem.

THEOREM 6.5 Let X be a semaphore code. Then there exists a synchronous semaphore code Z and an integer d such that
X = Zᵈ.

This result admits Proposition 5.12 as a special case. Consider indeed a biprefix semaphore code X ⊂ A⁺. Then, according to Theorem 6.5, we have
TABLE2.2
The Transitions of d ( X * ) e
l
2
3
4
5
a b
2 4
3 6
1 7
1 5
3 1
6 8
7 9
4
8 3
1
9 .
5
1 1
X = Zᵈ with Z synchronous. The code X being biprefix, Z is also biprefix (Proposition 4.9); but a biprefix synchronous code is trivial by Example 6.2. Thus Z = A and X = Aᵈ.
Theorem 6.5 describes in a simple manner the structure of asynchronous semaphore codes. Indeed, if X = Zᵈ with Z synchronous, then X is asynchronous iff d ≥ 2 (see also Example 6.4). We may ask whether such a description exists for general maximal prefix codes: is it true that an indecomposable maximal prefix code X is either biprefix or synchronous? Unfortunately, this is not the case, even when X is finite, as shown by

EXAMPLE 6.5 Let A = {a, b}, and let X be the prefix code with automaton 𝒜(X*) = (Q, 1, 1) whose transitions are given in Table 2.2. The automaton 𝒜(X*) is complete, thus X is maximal prefix. In fact, X is finite; it is given in Fig. 2.25.
Fig. 2.25 An asynchronous indecomposable code.
Fig. 2.26
To show that X is asynchronous, observe that the action of the letters a and b globally preserves the sets of states

{1,2,3}, {1,4,5}, {4,6,7}, {1,8,9},

as shown in Fig. 2.26. This implies that X is asynchronous. Assume indeed that x ∈ X* is a synchronizing word. Then by definition A*x ⊂ X*, whence q · x = 1 for all states q ∈ Q. Thus, for each of the three-element subsets I above, we would have I · x = {1}; this is impossible, since every word maps each of these subsets onto one of them, hence onto a set with three elements. Further, X is not biprefix, since b³, ab⁴ ∈ X. Finally, the inspection of Fig. 2.25 shows that X is indecomposable.
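As a check of this argument, a subset exploration over the transitions of Table 2.2 (as reconstructed here; the dictionaries below restate them and are thus an assumption to verify against the book's table) confirms that the rank never drops below 3:

```python
# transitions of Table 2.2, state -> state, for each letter
a = {1: 2, 2: 3, 3: 1, 4: 1, 5: 3, 6: 8, 7: 9, 8: 3, 9: 1}
b = {1: 4, 2: 6, 3: 7, 4: 5, 5: 1, 6: 4, 7: 1, 8: 5, 9: 1}

def image(S, t):
    return frozenset(t[q] for q in S)

# explore all images Q.w of the full state set Q = {1,...,9}
seen, todo = set(), [frozenset(range(1, 10))]
while todo:
    S = todo.pop()
    if S in seen:
        continue
    seen.add(S)
    todo += [image(S, a), image(S, b)]

assert min(len(S) for S in seen) == 3   # no word has rank 1: X is asynchronous
```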
7. AVERAGE LENGTH

The results of Chapter I concerning the measure of a code apply of course to prefix codes. However, for these codes, considerable extensions exist in two directions. First, the properties proved in Chapter I hold for measures which are much more general than those defined by Bernoulli distributions. Second, there exists a remarkable combinatorial interpretation of the average length of a prefix code by means of the measure of its proper left factors (Proposition 7.2).
A function π: A* → ]0,1] is a (positive) cylindric measure if the two following conditions are satisfied:

π(1) = 1    (7.1)

and, for all w ∈ A*,

π(w) = Σ_{a∈A} π(wa).    (7.2)
The function π is extended to subsets of A* by setting, for L ⊂ A*,

π(L) = Σ_{u∈L} π(u).
Thus, (7.2) may be written as π(w) = π(wA). Note that this implies, in view of (7.1), that π(A) = 1. More generally, for n ≥ 0,

π(Aⁿ) = 1,

since

π(Aⁿ⁺¹) = Σ_{w∈Aⁿ} π(wA) = Σ_{w∈Aⁿ} π(w) = π(Aⁿ).
A positive Bernoulli distribution is of course a cylindric measure. The converse is false. Example 7.1 shows that cylindric measures are quite general.
EXAMPLE 7.1 Let A = {a, b}, and let (αₙ)_{n≥0} be a sequence of positive real numbers such that the series Σ αₙ converges and

Σ_{n≥0} αₙ = α < 1.    (7.3)

Define a sequence βₙ by β₀ = 1 and, for n ≥ 1,

βₙ = 1 − Σ_{i=0}^{n−1} αᵢ.

Then by construction

0 < βₙ < 1 (n ≥ 1),  αₙ + βₙ₊₁ = βₙ (n ≥ 0).

We define a cylindric measure on A* by setting, for n ≥ 0,

π(bⁿa) = αₙ,  π(bⁿ) = βₙ,

and, for w = bⁿau with n ≥ 0 and u ∈ A⁺,

π(w) = αₙ 2^{−|u|}.

It is easily verified that π is indeed a cylindric measure. Let X = b*a. The real numbers αₙ, βₙ are reported on the literal representation of X (Fig. 2.27). Note that

π(X) = Σ_{n≥0} αₙ = α.
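The construction of Example 7.1 can be sketched with a concrete choice of the αₙ (ours: αₙ = 2^{−(n+2)}, so that α = 1/2):

```python
def alpha(n):
    return 2.0 ** -(n + 2)

def beta(n):
    return 1.0 - sum(alpha(i) for i in range(n))   # beta_0 = 1

def pi(w):
    """Cylindric measure of Example 7.1 on {a, b}*: w = b^n, b^n a, or b^n a u."""
    n = len(w) - len(w.lstrip('b'))                # number of leading b's
    rest = w[n:]
    if rest == '':
        return beta(n)
    u = rest[1:]                                   # rest starts with 'a'
    return alpha(n) * 2.0 ** -len(u)

# check the cylindric condition pi(w) = pi(wa) + pi(wb) on a few words
for w in ['', 'b', 'bb', 'ba', 'bab', 'a', 'ab']:
    assert abs(pi(w) - (pi(w + 'a') + pi(w + 'b'))) < 1e-12
assert pi('') == 1.0
```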
Fig. 2.27 Some values of the cylindric measure.

EXAMPLE 7.2 Let π: A* → ]0,1] be a cylindric measure, and let u ∈ A*.
Define a function ρ: A* → ]0,1] by

ρ(w) = π(uw)/π(u).

Then ρ is a cylindric measure, since ρ(1) = 1 and, for all w ∈ A*,

Σ_{a∈A} ρ(wa) = π(uwA)/π(u) = π(uw)/π(u) = ρ(w).
PROPOSITION 7.1 Let π be a cylindric measure on A*.

1. For any prefix code X, we have π(X) ≤ 1.
2. For any finite maximal prefix code X, we have π(X) = 1.

For Bernoulli distributions, this follows from Theorems I.4.2 and I.5.10 and from the fact that a finite set is thin.

Proof We first show 2. If Card(A) = 1, or X = A, then the result is clear. Otherwise, let x ∈ X be a word of maximal length, and let t ∈ A*, a ∈ A be such that x = ta. Then, by the maximality of x, tA ⊂ X. If t = 1, then X = A. Thus t ≠ 1. The set Y = (X − tA) ∪ t is a maximal prefix code (Proposition 4.6). Next, Y has strictly fewer elements than X; this comes from Card(A) ≥ 2. Further,

π(Y) = π(X) − π(tA) + π(t) = π(X).

This shows that π(X) = 1, by induction on the cardinality.
To prove statement 1, let X be an arbitrary prefix code and let

Xₙ = {x ∈ X | |x| ≤ n}.

According to what we just proved, we have π(Xₙ) ≤ 1. Thus,

π(X) = sup_{n≥0} π(Xₙ) ≤ 1. □
EXAMPLE 7.1 (continued) The code X = b*a is maximal prefix. Nevertheless, π(X) = α < 1. This shows that the second part of Proposition 7.1 does not hold for infinite codes. In addition, since any number α ∈ ]0,1[ can be written in the form given by (7.3), the cylindric measure of the infinite maximal prefix code X can take any value in the interval ]0,1].

Given a prefix code X ⊂ A⁺ and a cylindric measure π on A* such that π(X) = 1, the average length of X (relative to π) is defined by

λ(X) = Σ_{x∈X} |x| π(x).
It is a nonnegative real number or infinite. The context always indicates which is the underlying cylindric measure; we therefore omit the reference to it in the notation. The quantity λ(X) is in fact the mean of the random variable assigning to each x ∈ X its length |x|.

PROPOSITION 7.2 Let X be a prefix code, and let P = XA⁻. Further, let π be a cylindric measure such that π(X) = 1. Then

λ(X) = π(P).
Proof We start the proof by showing that each p ∈ P satisfies

π(pA* ∩ X) = π(p).    (7.4)

Consider indeed the function ρ: A* → ]0,1] defined by

ρ(w) = π(pw)/π(p).

In view of Example 7.2, it is a cylindric measure. On the other hand,

pA* ∩ X = p(p⁻¹X),

and by Proposition 4.6, the set p⁻¹X is a prefix code. Proposition 7.1 therefore implies that ρ(p⁻¹X) ≤ 1. From

ρ(p⁻¹X) = π(pA* ∩ X)/π(p)

it follows that

π(p) ≥ π(pA* ∩ X).    (7.5)
Next, let

Y = (X − (pA* ∩ X)) ∪ p.

By the same Proposition 4.6, the set Y is a prefix code; according to Proposition 7.1, π(Y) ≤ 1. Thus,

π(X) − π(pA* ∩ X) + π(p) ≤ 1.

By assumption, π(X) = 1. Consequently, π(p) ≤ π(pA* ∩ X). Together with (7.5), this yields (7.4).
Now, we observe the following identity concerning formal power series:

Σ_{x∈X} |x| x = Σ_{p∈P} (pA* ∩ X).    (7.6)

Indeed, the multiplicity of a word x ∈ X on the right-hand side of (7.6) is the number of its left factors in P. This number is precisely its length. It follows from (7.6) that

Σ_{x∈X} |x| π(x) = Σ_{p∈P} π(pA* ∩ X).

The left-hand side is λ(X), and by (7.4), we obtain

λ(X) = Σ_{p∈P} π(p) = π(P). □
For a finite maximal prefix code X, we have π(X) = 1. Thus,

COROLLARY 7.3 Let X be a finite maximal prefix code and P = XA⁻. For any cylindric measure π on A*, the following holds: λ(X) = π(P). □
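Corollary 7.3 is easy to check numerically on the code of Example 5.5, here with the uniform distribution p = q = 1/2 (a sketch; variable names are ours):

```python
X = ['a', 'baa', 'baba', 'babb', 'bb']                  # Example 5.5, b2 written out
P = {x[:i] for x in X for i in range(len(x))}           # proper left factors
pi = lambda w: 0.5 ** len(w)                            # uniform Bernoulli distribution

avg_len = sum(len(x) * pi(x) for x in X)
assert abs(avg_len - sum(pi(p) for p in P)) < 1e-12     # lambda(X) = pi(P)
assert abs(avg_len - 1.875) < 1e-12
```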
For a Bernoulli distribution, the finiteness condition can be replaced by the condition of being thin.

COROLLARY 7.4 Let X be a thin maximal prefix code, and P = XA⁻. For any positive Bernoulli distribution π on A*, we have

λ(X) = π(P).

Further, the average length λ(X) is finite.

Proof The code X being maximal, Theorem I.5.11 shows that π(X) = 1; thus the equality λ(X) = π(P) follows from Proposition 7.2. Moreover, P is thin, since each factor of a word in P is also a factor of a word in X. By Proposition I.5.6, π(P) is finite. □
We shall see in Chapter VI that the average length is still finite in the more general case of thin maximal codes.
EXAMPLE 7.3 Let A = {a, b} and X = a*b. For any positive Bernoulli distribution π, let π(a) = p, π(b) = q. Then P = a*, and

λ(X) = π(a*) = 1/q.

EXAMPLE 7.4 Let D be the Dyck code over A = {a, b} (see Example I.4.5). We have seen that for a Bernoulli distribution given by π(a) = p, π(b) = q, the following formula holds:

π(D) = 1 − |p − q|.

Further, setting dₙ = π(D ∩ Aⁿ), we have by formula (I.4.14)

d_{2n} = (2/n) C(2n−2, n−1) (pq)ⁿ.

It follows that, for p = q = 1/2, we have π(D) = 1 and

λ(D) = Σ_{n≥1} 2n d_{2n} = Σ_{n≥1} 4 C(2n−2, n−1) 4⁻ⁿ.

This series diverges since, according to Stirling's formula, C(2n−2, n−1) ~ 4ⁿ⁻¹/√(π(n−1)). Thus, the average length of D is infinite for p = q = 1/2.

The next example concerns the computation of the average length of semaphore codes. We start with an interesting identity.

PROPOSITION 7.5 Let X ⊂ A⁺ be a semaphore code, P = XA⁻, and let S be the minimal set for which X = A*S − A*SA⁺. For s, t ∈ S, let

X_s = X ∩ A*s  and  R_{s,t} = {w ∈ A* | sw ∈ A*t and |w| < |t|}.
Then, for all t ∈ S,

Pt = Σ_{s∈S} X_s R_{s,t}.    (7.7)

Proof First, we observe that each product X_s R_{s,t} is unambiguous, since X_s is prefix. Further, any two terms of the sum are disjoint, since X = ⋃ X_s is prefix. Thus, it suffices to show that

Pt = ⋃_{s∈S} X_s R_{s,t}.
Fig. 2.28 Factorizations of pt.
First, let p ∈ P, and let x be the shortest left factor of pt which is in A*S. Then x ∈ X and

pt = xw

for some w ∈ A*. Next, x ∈ X_s for some s ∈ S. Set x = us. The word p being in P, we have |p| < |x|, whence |w| < |t| (see Fig. 2.28). Now p cannot be a proper left factor of u, since otherwise s would be a proper factor of t, contradicting Proposition 5.8 and the minimality of S. Thus u is a left factor of p and sw ∈ A*t, showing that w ∈ R_{s,t}.
Conversely, let x ∈ X_s and w ∈ R_{s,t} for some s, t ∈ S. Then x = us, and sw = lt for a proper left factor l of s. Then ul is a proper left factor of us = x; thus ul ∈ P and xw = ult ∈ Pt. □

COROLLARY 7.6 With the notations of Proposition 7.5, for any Bernoulli distribution π we have the following system of equations:

λ(X)π(t) = Σ_{s∈S} π(X_s)π(R_{s,t})  (t ∈ S),
Σ_{s∈S} π(X_s) = 1.    (7.8)
□

For the case of a finite set S, the system (7.8) is a set of 1 + Card(S) linear equations in the 1 + Card(S) unknown variables π(X_s) and λ(X). The sets R_{s,t} being finite, the coefficients π(R_{s,t}) are easily computed. This yields the desired expression of λ(X). In the special case where S is a singleton, we get
COROLLARY 7.7 Let s ∈ A⁺, let X = A*s − A*sA⁺, and let

R = {w ∈ A* | sw ∈ A*s and |w| < |s|}.

Then, for any positive Bernoulli distribution π, we have

λ(X) = π(R)/π(s). □
EXAMPLE 7.5 Let A = {a, b} and consider s = aba. The corresponding set R is R = {1, ba}. If π(a) = p and π(b) = q = 1 − p, then for X = A*aba − A*abaA⁺ we get

λ(X) = (1 + pq)/(p²q).
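The computation of Corollary 7.7 can be scripted for any single semaphore s (a sketch; `avg_wait` is our name):

```python
from itertools import product

def avg_wait(s, p=0.5):
    """lambda(X) = pi(R)/pi(s) for X = A*s - A*sA+ over A = {a, b}."""
    q = 1.0 - p
    pi = lambda w: p ** w.count('a') * q ** w.count('b')
    R = [''.join(t) for n in range(len(s))
         for t in product('ab', repeat=n) if (s + ''.join(t)).endswith(s)]
    return sum(pi(w) for w in R) / pi(s)

assert avg_wait('aba') == 10.0
assert avg_wait('baa') == 8.0
```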
Now choose s' = baa. The corresponding set R' is R' = {1}. Thus, for X' = A*baa − A*baaA⁺, we have

λ(X') = 1/(qp²).

For p = q = 1/2, this gives λ(X) = 10, λ(X') = 8. This is an interesting paradox: we have to wait longer for the first appearance of aba than for the first appearance of baa!

8. DECIPHERING DELAY
Let X be a subset of A⁺. Then X is said to have finite deciphering delay if there is an integer d ≥ 0 such that, for all x, x' ∈ X, y ∈ Xᵈ, u ∈ A*,

xyu ∈ x'X*  ⟹  x = x'.    (8.1)
If (8.1) holds for some integer d, then it holds for all d' ≥ d. If X has finite deciphering delay, then the smallest integer d satisfying (8.1) is the deciphering delay of X. This notion is clearly oriented from left to right; it is straightforward to define a dual notion (working from right to left).
The terminology is justified by the following consideration: let w ∈ A* be a word having two left factors in X⁺, such that the shorter one is in X^{1+d}. If (8.1) holds, then the two X-factorizations start with the same word in X. Thus the "delay" between the moment when, reading from left to right, a possible factor of an X-factorization is discovered, and the moment when these factors are definitively valid, is bounded.
It follows from the definition that the sets with delay d = 0 are the prefix codes. This is the reason why prefix codes are also called instantaneous codes. In this sense, codes with finite delay are a natural generalization of prefix codes.
PROPOSITION 8.1 A subset X of A⁺ which has a finite deciphering delay is a code.
Proof Let d be the deciphering delay of X. Assume that X is not a code. Then there is an equality

w = x₁x₂···xₙ = y₁y₂···yₘ,  x₁ ≠ y₁,

with n, m ≥ 1, x₁, ..., xₙ, y₁, ..., yₘ ∈ X. Let z ∈ X. Then wzᵈ ∈ y₁X*. By (8.1), we have x₁ = y₁. Thus X is a code. □
EXAMPLE 8.1 The suffix code X = {aa, ba, b} has an infinite deciphering delay. Indeed, for all d ≥ 0, the word b(aa)ᵈ ∈ X^{1+d} is a left factor of (ba)(aa)ᵈ, with ba ≠ b.
We now study maximal codes with finite deciphering delay. The following result is similar to Proposition I.5.2. However, it is less general, and we do not know of a simple construction to embed a code with finite deciphering delay into a complete one.

PROPOSITION 8.2 Let X be a subset of A⁺ which has finite deciphering delay. If y ∈ A⁺ is an unbordered word such that

X*yA* ∩ X* = ∅,

then Y = X ∪ y has finite deciphering delay.

Proof Consider the set V = X*y. It is a prefix code. Indeed, assume that v = xy and v' = x'y with x, x' ∈ X*, and v < v'. Then necessarily v ≤ x', since y is unbordered. But then x' ∈ X*yA*, a contradiction. Note also that

V⁺A* ∩ X* = ∅,

since V⁺A* ⊂ VA*.
Let d be the deciphering delay of X, and let

e = d + |y|.
,,
with y,, ... , y e + y;, .. .,y; E Y , u E A* and, arguing by contradiction, assume that y , # y’,. First, let us verify that one of y,, . . ., y e + is equal to y. Assume the contrary. Then y, . . * y d +E, X d + ’ . Let 4 be the smallest integer such that (Fig. 2.29) yl*“Yd+l
I Y’,”*yb.
The delay of X being d, and y, # yi, one among y’, ,. . .,y; must be equal to v2 Ydtl
Vf,
Fig. 2.29
Yet1
11. PREFIX CODES
130
y. We cannot have y; = y for an index i < q, since otherwise y , - * - y d + E V + A * n X*. Thus, yb = y and y ; ...yb E V . Note that fl*'*yb-1
Yl'**yd+l*
Next (yd+2.-.ye+l( 2 e - d = Iyl. It follows that Y;*-Yb ~ Y l . ' . Y e + l But then y , ye+ E X* n X*yA*, which is impossible. This shows the claim, namely, that one of yl, . ..,ye+ is equal to y. It follows that w has a left factor y1y2 y, in V with y,, . .. ,y p- E X and y p = y. By the hypothesis, one of y;, . . .,y; must be equal to y. Thus w has also a left factor y;y; * * yb in V with y', ,.. .,yb- E X and yb = y. The code V being prefix, we have Y l Y 2 '.Y p - 1 = Y;Y; * * * Yb- 1* Since X is a code, this and the assumption yl # y; imply that p = q = 1. But then y p = y = yq. This yields the final contradiction. 0
-
,
,
--,
,
Proposition 8.2 has the following interesting consequence.

THEOREM 8.3 Let X be a thin subset of A⁺. If X has finite deciphering delay, then the following conditions are equivalent:
(i) X is a maximal code, (ii) X is maximal in the family of codes with finite deciphering delay.
Proof  The case where A has just one letter is clear. Thus, we suppose that Card(A) ≥ 2. It suffices to prove (ii) ⇒ (i). For this, it is enough to show that X is complete. Assume the contrary and consider a word u which is not a factor of a word in X*. According to Proposition 0.3.6, there exists v ∈ A* such that y = uv is unbordered. But then A*yA* ∩ X* = ∅ and, by Proposition 8.2, X ∪ y has finite deciphering delay. This yields the contradiction. □
We now state and prove the main result of this section.

THEOREM 8.4  Any finite maximal code with finite deciphering delay is prefix.

In an equivalent manner, a finite maximal code either is prefix or has infinite deciphering delay. The proof of the theorem is rather long. It requires new concepts which will be illustrated in a different way in Chapter VI.

Let X be a subset of A+. A word s ∈ A* is simplifying (for X) if for all x ∈ X*, v ∈ A*,

    xsv ∈ X*  ⇒  sv ∈ X*.
The set of simplifying words is denoted by S(X) or S. It is a right ideal. Further, X* is right unitary iff 1 ∈ S, thus iff S = A*.

A word p is extendable (for X) if, for all u ∈ A*, there exists v ∈ A* such that puv ∈ X*. Clearly, an extendable word is right completable. The set of extendable words is denoted by E(X) or E. As for S, it is a right ideal. Further, X is right complete iff 1 ∈ E, thus iff E = A*.

Of course, the notions of simplifying and extendable words are left-to-right oriented, just as the notion of deciphering delay.
PROPOSITION 8.5  Let X ⊂ A+ be a code. If both E and S are nonempty, then E = S.

Proof  Let us show first the inclusion S ⊂ E. Let s ∈ S, p ∈ E. Note that pt ∈ X* for some word t, and that pt still is extendable. Thus, we may assume that p ∈ E ∩ X*. Consider any word u ∈ A*. Since p ∈ E, the word psu can be completed: there is a word v ∈ A* such that psuv ∈ X*. But p is in X* and s is simplifying. Thus, suv ∈ X*, showing that s is extendable.

Conversely, let s ∈ S, p ∈ E. To show that p is simplifying, let x ∈ X*, u ∈ A* be such that xpu ∈ X*. Since the word pus is right completable, we have pusw ∈ X* for some w ∈ A*. But then xpusw ∈ X* also and, since s is simplifying, we have sw ∈ X*. Thus, finally, the four words x, x(pu), (pu)(sw), and sw are in X*. The set X* is stable, thus pu ∈ X*. This shows that p is simplifying. □
Let X ⊂ A+ be a code and let w ∈ A*. A right context of w is a word u ∈ A* such that there exist x1, ..., xn ∈ X with

1. wu = x1 x2 ... xn,
2. u is a proper right factor of xn.

The set of right contexts of w is denoted by C_r(w). This set is nonempty iff w is right completable in X*. Further, 1 ∈ C_r(w) iff w ∈ X*.

Recall that if S(X) ≠ ∅, then it is a right ideal. Its initial part

    U = S - SA+

is prefix, and UA* = S.
PROPOSITION 8.6  Let X be a code such that S = E ≠ ∅ and let U = S - SA+. For all w ∈ A*, we have:
1. The set C_r(w)U is prefix.
2. The product C_r(w)U is unambiguous.
3. If w ∈ S, then C_r(w)U is maximal prefix.
Proof
We first verify the following property:
    if vuz = v'u' for v, v' ∈ C_r(w), u, u' ∈ U, and z ∈ A*, then v = v', u = u', z = 1.   (*)
Indeed, first note that u ∈ E. Thus, there exists t ∈ A* such that uzt ∈ X*. Then

    (wv)(uzt) = (wv')(u't).   (8.2)

Each one of the first three parenthesized words is in X*. Now the fourth word, namely u't, is also in X*, because u' is simplifying. The set X being a code, the word in (8.2) has a unique factorization in words in X, starting with wv, and also with wv'. It follows that v = v'y or v' = vy for some y ∈ X*. This implies that v = v', as follows: assume, for instance, that v = v'y, and set

    wv = x1 x2 ... xn,   wv' = x'1 ... x'm,   y = y1 ... yp,

with x1, ..., xn, x'1, ..., x'm, y1, ..., yp ∈ X. Then |xn| > |v| and, assuming p > 0, we have on the one hand (Fig. 2.30)

    |yp| ≤ |y| ≤ |v| < |xn|,

and on the other hand, by

    x1 x2 ... xn = x'1 ... x'm y1 ... yp,

we have xn = yp. Thus p = 0, y = 1, and v = v'. Going back to (8.2), this gives uz = u'. Now U is prefix. Consequently z = 1 and u = u'. This proves property (*).
It follows immediately from (*) that C_r(w)U is prefix, and also, taking z = 1, that the product C_r(w)U is unambiguous. This proves 1 and 2.

To prove 3, we show that C_r(w)S is right dense. For this, consider a word t ∈ A*. The word
Fig. 2.30  Factorization of wv = wv'y.
wt is right completable, since w ∈ E. Thus, wtt' ∈ X* for some t' ∈ A*. Thus, tt' is in w^{-1}X*. Consequently tt' = vy for some v ∈ C_r(w), y ∈ X*. Now observe that w ∈ E, and consequently also yw ∈ E. Thus, tt'w = vyw ∈ C_r(w)S. This shows that C_r(w)S is right dense. From C_r(w)S = C_r(w)UA* it follows then, by Theorem 3.3, that the prefix set C_r(w)U is maximal prefix. □
To be able to use these propositions in the proof of the theorem, we verify that the hypotheses of the propositions are satisfied: we show that S and E are nonempty.

PROPOSITION 8.7  Let X be a code with finite deciphering delay d. Then X^d ⊂ S(X).

Proof  Let x ∈ X^d, and let further x1, x2, ..., xp ∈ X and u ∈ A* be such that x1 x2 ... xp xu ∈ X*. Thus

    x1 x2 ... xp xu = y1 y2 ... yq
for some y1, ..., yq ∈ X. From the fact that X has delay d, it follows that x1 = y1, x2 = y2, ..., xp = yp, whence q ≥ p and xu = y_{p+1} ... yq. Thus, xu ∈ X*. This shows that x is simplifying. □

PROPOSITION 8.8  Let X be a maximal code with finite deciphering delay d. Then X^d ⊂ E(X).

Proof  The case of a one-letter alphabet is clear. Thus, assume that Card(A) ≥ 2. Let x ∈ X^d and u ∈ A*. By Proposition 0.3.6, there is a word v ∈ A* such that y = xuv is unbordered. This implies that

    X*yA* ∩ X* ≠ ∅.
Indeed, otherwise X ∪ y would be a code, by Proposition 8.2, contradicting the maximality of X. Consequently, there exist z ∈ X*, w ∈ A* such that zyw ∈ X*. By Proposition 8.7, x is simplifying; thus, zyw = z(xuv)w ∈ X* implies xuvw ∈ X*. This shows that x is extendable. □

Proof of Theorem 8.4  Let X be a maximal finite code with finite deciphering delay d. According to Propositions 8.7 and 8.8, both S(X) and E(X) are nonempty. Thus, by Proposition 8.5, they are equal. Set S = S(X) = E(X).
Then X^d ⊂ S; further, S is a right ideal, and the prefix set

    U = S - SA+

satisfies S = UA*.
We claim that U is a finite set. Indeed, set

    δ = d · max_{x∈X} |x|

and let us verify that a word in U has length at most δ. For this, let s ∈ S with |s| > δ. The word s being extendable, there is a word w ∈ A* such that sw ∈ X*. By the choice of δ, the word sw is a product of at least d + 1 words in X, and s has a proper left factor, say s', in X^d. From X^d ⊂ S, we have s ∈ SA+; thus, s ∉ U. This proves the claim.

Now we fix a word x ∈ X^d and consider the set C_r(x) of right contexts of x. The set C_r(x) is finite, since each element of C_r(x) is a right factor of some word in the finite set X. By Proposition 8.6(3), the set Z = C_r(x)U is a maximal prefix set, since x ∈ X^d ⊂ S. Further, Z is the unambiguous product of the finite sets C_r(x) and U. By Proposition 4.10, both C_r(x) and U are maximal prefix sets. Since 1 ∈ C_r(x), we have C_r(x) = {1}.

Thus, we have shown that C_r(x) = {1} for x ∈ X^d. This implies, as follows, that X is prefix. Assume that y, y' ∈ X and yt = y' for some t ∈ A*. Let x = y^d. Then xt = y^d t = y^{d-1} y' and |t| < |y'| show that t ∈ C_r(x). Since x ∈ X^d, we have t = 1. Thus, X is a prefix code. □

The following examples show that Theorem 8.4 is optimal in several directions.

EXAMPLE 8.2  The suffix code X = {aa, ba, b} is a finite maximal code and has infinite deciphering delay.

EXAMPLE 8.3  The code {ab, abb, baab} has deciphering delay 1. It is neither prefix nor maximal: indeed, the word bbab, for instance, can be added to it.

EXAMPLE 8.4  The code X = ba* is maximal and suffix. It has deciphering delay 1. It is not prefix, but it is infinite.
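The behavior of these examples can be explored mechanically. The Python sketch below is our own illustration (the function names and the bounded search are ours, not from the text): it looks, up to a length bound, for a witness that the deciphering delay of X exceeds d, that is, for words s ∈ xX^d and t ∈ x'X* with x ≠ x' and s a left factor of t. Finding a witness proves that the delay exceeds d; finding none within the bound is only evidence, since the search is truncated.

```python
from itertools import product

def xstar_words(X, maxlen):
    """All words of X* of length <= maxlen (X a finite set of strings)."""
    words, frontier = {""}, {""}
    while frontier:
        frontier = {w + x for w in frontier for x in X
                    if len(w + x) <= maxlen} - words
        words |= frontier
    return words

def delay_witness(X, d, maxlen):
    """A pair (s, t) with s in x X^d, t in x' X*, x != x', and s a left
    factor of t, if one exists within the bound.  Such a pair shows that
    the deciphering delay of X exceeds d."""
    star = xstar_words(X, maxlen)
    for x in X:
        for ys in product(sorted(X), repeat=d):
            s = x + "".join(ys)              # a word of x X^d
            for xp in X:
                if xp == x:
                    continue
                for z in star:
                    t = xp + z               # a word of x' X*
                    if t.startswith(s):
                        return s, t
    return None

# Example 8.2: {aa, ba, b} has infinite delay: a witness exists for every d.
print(delay_witness({"aa", "ba", "b"}, 3, 10))      # a witness pair
# Example 8.3: {ab, abb, baab} has delay 1: delay > 0, but no witness for d = 1.
print(delay_witness({"ab", "abb", "baab"}, 0, 10))  # a witness pair
print(delay_witness({"ab", "abb", "baab"}, 1, 10))  # None
```

For {aa, ba, b}, the witnesses are exactly the words b a^{2n} discussed above: after reading b followed by arbitrarily many blocks aa, the first code word is still undecided.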
EXERCISES

SECTION 1

1.1. Let A be a finite alphabet, and let P be a prefix-closed subset of A*. Show that P is infinite iff there exists an infinite sequence (pn)_{n≥1} of elements of P such that p1 < p2 < p3 < ....

1.2. Let A be a finite alphabet and let X ⊂ A+ be a prefix code. Set R = A* - XA*. Show that for all n ≥ 1,

    A^n = (R ∩ A^n) ∪ (XA* ∩ A^n),   (*)
where the union is disjoint. Let k = Card(A) and a_n = Card(X ∩ A^n) for n ≥ 1. Derive from (*) that

    Σ_{n≥1} a_n k^{-n} ≤ 1.
(This gives an elementary proof of Corollary 1.4.3 for prefix codes. See also Proposition 7.1.)

SECTION 2

2.1. Let X ⊂ A+ be a prefix code. Let P = XA⁻ and let

    A = (P, 1, 1)

be the literal automaton of X*. Consider an automaton

    B = (Q, i, i)

which is deterministic, trim, and such that X* = Stab(i). Show that there is a function φ: P → Q with φ(1) = i and such that, for a ∈ A, φ(p · a) = φ(p) · a.
SECTION 3

3.1. Let X ⊂ A+ be a finite maximal prefix code and P = XA⁻. Show that

    Card(X) - 1 = Card(P)(Card(A) - 1).

3.2. Let A be an alphabet, and let M(A) be the monoid of prefix subsets of A* equipped with the induced product. Show that M(A) is a free monoid and that the set of maximal (resp. recognizable) prefix sets is a free submonoid of M(A). (Use Exercise 1.2.2 and set γ(X) = min_{x∈X} |x|.)

SECTION 4

4.1. Let A = {a, b}. Associate to each finite maximal prefix set Z ⊂ A* a word γ(Z) ∈ A* as follows:

    γ({1}) = 1,

and if Z = aX ∪ bY,

    γ(Z) = aγ(X)bγ(Y).

Let D ⊂ A+ be the Dyck code and D_a = D ∩ aA*. Show that γ is a bijection of the set of finite maximal prefix sets onto D_a* and, further, that |γ(Z)| = 2 Card(Z) - 2.
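The recursion defining γ translates directly into a few lines of code. The following Python sketch is our own illustration (it assumes a finite maximal prefix set over {a, b} given as a set of strings, with the empty string standing for the empty word 1):

```python
def gamma(Z):
    """Dyck-word encoding of a finite maximal prefix set Z over {a, b},
    following the recursion gamma({1}) = 1, gamma(aX u bY) = a gamma(X) b gamma(Y)."""
    if Z == {""}:                            # Z = {1}: the empty word
        return ""
    X = {s[1:] for s in Z if s[0] == "a"}    # Z = aX u bY
    Y = {s[1:] for s in Z if s[0] == "b"}
    return "a" + gamma(X) + "b" + gamma(Y)

Z = {"aa", "ab", "b"}                        # a maximal prefix set, 3 elements
print(gamma(Z))                              # aabb
print(len(gamma(Z)) == 2 * len(Z) - 2)       # True
```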
4.2. Derive from this that the number of maximal prefix sets with n elements is

    Card(D_a* ∩ A^{2n-2}) = (1/n)(2n-2 choose n-1)

(cf. Example 1.4.5). Let X and Y be two prefix codes over A, and

    P = A* - XA*,   Q = A* - YA*.

Set R = P ∩ Q. Show that there exists a unique prefix code Z = X ∧ Y such that

    Z = RA - R.

Show that Z = (X ∩ Q) ∪ (X ∩ Y) ∪ (P ∩ Y). Show that if X and Y are maximal prefix sets, then so is Z.

4.3. Let A be a finite alphabet. Show that the family of recognizable maximal prefix codes is the least family F of subsets of A* such that
(i) A ∈ F;
(ii) if X, Y ∈ F and if X = X1 ∪ X2 is a partition in recognizable sets X1, X2, then Z = X1 ∪ X2 Y ∈ F;
(iii) if X ∈ F and if X = X1 ∪ X2 is a partition in recognizable sets, then Z = X1* X2 ∈ F.
SECTION 5

5.1. Let X ⊂ A* be a prefix code. Show that the following conditions are equivalent:
(i) A*X = X+;
(ii) X is a semaphore code, and the minimal set of semaphores S = X - A+X satisfies

    SA* ∩ A*S = SA*S ∪ S.

Note that for a code

    X = A*w - A*wA+,

the conditions are satisfied provided w is unbordered.
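This last remark can be tested mechanically on bounded lengths. The Python sketch below is our own illustration (the truncation bound and the function names are ours): it builds X = A*w - A*wA+, the words whose first occurrence of w is as a suffix, and searches A*X for a word outside X+, which would refute condition (i). Since X+ always lies inside A*X, only this inclusion needs checking.

```python
from itertools import product

def first_occurrence_code(w, alphabet, maxlen):
    """X = A*w - A*wA+, truncated at maxlen: the words whose only
    occurrence of w is as a suffix."""
    X = set()
    for n in range(len(w), maxlen + 1):
        for tup in product(alphabet, repeat=n):
            u = "".join(tup)
            if u.endswith(w) and u.find(w) == n - len(w):
                X.add(u)
    return X

def counterexample(w, alphabet, maxlen):
    """A word of A*X - X+ of length <= maxlen, if any.  Membership of a
    bounded word only involves factors of bounded length, so truncating
    X is harmless here."""
    X = first_occurrence_code(w, alphabet, maxlen)
    xplus = set(X)
    changed = True
    while changed:                       # saturate X+ up to maxlen
        changed = False
        for p in list(xplus):
            for x in X:
                if len(p + x) <= maxlen and p + x not in xplus:
                    xplus.add(p + x)
                    changed = True
    for n in range(1, maxlen + 1):
        for tup in product(alphabet, repeat=n):
            u = "".join(tup)
            if any(u.endswith(x) for x in X) and u not in xplus:
                return u
    return None

print(counterexample("ab", "ab", 6))     # None: ab is unbordered
print(counterexample("aa", "ab", 6))     # aaa:  aa is bordered
```

The counterexample aaa lies in A*X (it ends with aa ∈ X) but cannot be written as a product of words of X, whose lengths are all at least 2.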
5.2. Let J ⊂ A+ be a two-sided ideal. For each x ∈ J, denote by ||x|| the greatest integer n such that x ∈ J^n, and set ||x|| = 0 for x ∉ J. Show that, for all x, y ∈ A*,

    ||x|| + ||y|| ≤ ||xy|| ≤ ||x|| + ||y|| + 1.

SECTION 6

6.1. Let X ⊂ A+ be a finite maximal prefix code. Show that if X contains a letter a ∈ A, then there is an integer n ≥ 1 such that a^n is synchronizing.

6.2. Let X ⊂ A* be a synchronous prefix code. Assume that the automaton A(X*) has n states. Show that there exists a synchronizing word of length at most n^3.
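For small codes, a shortest synchronizing word can be found by breadth-first search on subsets of states of the literal automaton of X* (states are the proper prefixes of the code words, the empty word being the initial state). The Python sketch below is our own illustration; it assumes X is a finite prefix code given as a set of strings.

```python
from collections import deque

def shortest_sync_word(X, alphabet):
    """Shortest word mapping every state of the literal automaton of X*
    to one single state, or None if the code is not synchronous."""
    states = {x[:i] for x in X for i in range(len(x))}  # proper prefixes

    def step(p, a):
        q = p + a
        if q in X:
            return ""                 # a code word is completed: back to 1
        return q if q in states else None

    start = tuple(sorted(states))
    seen, queue = {start}, deque([(start, "")])
    while queue:
        current, w = queue.popleft()
        if len(set(current)) == 1:    # all states collapsed: w synchronizes
            return w
        for a in alphabet:
            images = {step(p, a) for p in current}
            if None in images:        # undefined move: X not maximal prefix
                continue
            nxt = tuple(sorted(images))
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, w + a))
    return None

print(shortest_sync_word({"a", "ba", "bb"}, "ab"))         # a
print(shortest_sync_word({"aa", "ab", "ba", "bb"}, "ab"))  # None
```

The uniform code A^2 is a group code, hence not synchronous, and the search correctly reports None for it.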
SECTION 7

7.1. Let X ⊂ A* be a maximal prefix code, and let α: X → ]0, 1] be a function such that

    Σ_{x∈X} α(x) = 1.

Show that there is a cylindric measure π on A* such that π(x) = α(x) for x ∈ X. Show that, moreover, π may be chosen such that π(xy) = π(x)π(y) for all x, y ∈ X*.

7.2. Let X ⊂ A+ be a code and let α: X → ]0, 1] be a function such that

    Σ_{x∈X} α(x) = 1.

Define the entropy of X (relative to α) to be

    H(X) = - Σ_{x∈X} α(x) log_k α(x),

where k = Card(A). Set Λ(X) = Σ_{x∈X} |x| α(x). Show that

    H(X) ≤ Λ(X)

and that equality holds iff α(x) = k^{-|x|} for x ∈ X. Show that if X is finite and has n elements, then H(X) ≤ log_k n.
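The inequality H(X) ≤ Λ(X) is easy to check numerically. In the Python sketch below (our own illustration, with a code and distributions chosen by us), X = {a, ba, bb} is a maximal prefix code; with α(x) = k^{-|x|} the entropy equals the average length, while another distribution gives a strict inequality.

```python
import math

def entropy(X, alpha, k):
    """H(X) = - sum of alpha(x) log_k alpha(x)."""
    return -sum(alpha[x] * math.log(alpha[x], k) for x in X)

def average_length(X, alpha):
    """Lambda(X) = sum of |x| alpha(x)."""
    return sum(len(x) * alpha[x] for x in X)

X = {"a", "ba", "bb"}
uniform = {"a": 0.5, "ba": 0.25, "bb": 0.25}   # alpha(x) = 2^(-|x|)
skewed  = {"a": 0.6, "ba": 0.2,  "bb": 0.2}

print(entropy(X, uniform, 2), average_length(X, uniform))  # 1.5 1.5
print(entropy(X, skewed, 2) < average_length(X, skewed))   # True
```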
7.3. Let B be a finite set and let λ: B → R+ be a function. Also, let A = {a, b}. For every function α: B → A*, set

    Λ(α) = Σ_{c∈B} λ(c) |α(c)|.

We consider injective functions α such that α(B) is prefix, and we look for such a function with Λ(α) minimal. Such a function is called a Huffman encoding of (B, λ). Show that the following algorithm allows us to obtain a Huffman encoding. Let c1, c2 be two elements of B such that λ(c1), λ(c2) ≤ λ(c) for all c in B - {c1, c2}. Let

    B' = B - {c1, c2} ∪ {d}

with d ∉ B, and define λ': B' → R+ by λ'(c) = λ(c) for all c ∈ B' - {d} and λ'(d) = λ(c1) + λ(c2). Let α' be a Huffman encoding of (B', λ') and let α be the function from B into A* defined by

    α(c) = α'(c)   for c ∈ B - {c1, c2},
    α(c1) = α'(d)a,   α(c2) = α'(d)b.

Show that α is a Huffman encoding of (B, λ).

SECTION 8

8.1. For a set X ⊂ A+, define, as in Section 1.3, a sequence (U_n)_{n≥1} of subsets of A* by setting

    U_1 = X^{-1}X - 1,
    U_{n+1} = X^{-1}U_n ∪ U_n^{-1}X,   n ≥ 1.

Show that X has finite deciphering delay iff all the sets U_n are empty for sufficiently large n.

8.2. Let Y and Z be composable codes with finite deciphering delays d(Y) and d(Z). Show that X = Y ∘ Z has finite delay d(X) ≤ d(Y) + d(Z). (Hint: Show that for y ∈ X^{d(Y)}, z ∈ X^{d(Z)}, we have yz ∈ S(X).)

8.3. Let X = {x, y} be a two-element code. Show that the deciphering delay of X is at most one.

8.4. Let X ⊂ A* be a finite code.
(a) Show that there exists a smallest submonoid M containing X* such that M is generated by a code with finite deciphering delay.
(b) Let Y ⊂ A* be the base of the submonoid whose existence is asserted in (a). Show, by a proof analogous to that of Proposition 1.2.10, that

    Y ⊂ X(Y*)^{-1} ∩ (Y*)^{-1}X*.
Deduce from this that if X does not have finite deciphering delay, then Card(Y) ≤ Card(X) - 1.

NOTES
The results of the first four sections belong to folklore, and they are known to readers familiar with automata theory or with trees.

Semaphore codes were introduced in Schützenberger (1964) under the name of f codes. All the results presented in Section 5 can be found in that paper, which also contains Theorem 6.5 and Proposition 7.5.

The notion of synchronous prefix code has been extensively studied in the context of automata theory. Let us mention Černý's problem, which can be formulated as follows in our context: given a synchronous prefix code X whose minimal automaton A(X*) has n states, what upper bound can be given on the length of the shortest synchronizing word as a function of n? See Exercise 6.2, Moore (1956), Černý (1964), and Pin (1978). Example 6.5 is obtained by a construction of Perrin (1977a) (see Exercise VIII.4.1). The synchronization properties of a special class of prefix codes called inclusion codes have been studied by Bobrow and Hakimi (1969) and Boë (1972).

The results of Section 7 are given in another terminology in Feller (1957). The pair formed by a prefix code X and a cylindric measure π such that π(xy) = π(x)π(y) for all x, y ∈ X* defines what Feller calls a recurrent event.

The notion of deciphering delay appears at the very beginning of the theory of codes (Gilbert and Moore, 1959; Levenshtein, 1964). Theorem 8.4 is due to Schützenberger (1966). It was conjectured in Gilbert and Moore (1959). An incomplete proof appears in Markov (1962). A proof of a result which is more general than Theorem 8.4 has been given in Schützenberger (1966). A characterization of the coding morphism for a code with finite deciphering delay is given in Choffrut (1979).

The monoid of prefix subsets defined in Exercise 3.2 has been further studied by Lassez (1973). The description of the Huffman algorithm (Exercise 7.3) is commented on in Knuth (1973), which also contains other interesting algorithms on trees. Exercise 8.4 is from Berstel et al. (1979).
An analogous result is proved in Salomaa (1981). Let us mention the following result, which has not been reported here: for a three-element code X = {x, y, z}, there exists at most one right-infinite word with two distinct X-factorizations (Karhumäki, 1983).
CHAPTER III

Biprefix Codes

0. INTRODUCTION
The object of this chapter is to describe the structure of maximal biprefix codes. This family of codes has quite remarkable properties and can be described in a rather satisfactory manner. As in the rest of this book, we will work here within the family of thin codes. As we will see, this family contains all the usual examples, and most of the fundamental properties extend to this family when they hold in the simple (i.e., finite or recognizable) case. To each thin maximal biprefix code, two basic parameters will be associated: its degree and its kernel. The degree is a positive integer which is, as we will see in Chapter IV, the degree of a permutation group associated with the code. The kernel is the set of code words which are proper factors of some code word. We shall prove that these two parameters characterize a thin maximal biprefix code. In the first section, we introduce the notion of a parse of a word with respect to a biprefix code. It allows us to define an integer-valued function called the indicator of a biprefix code. This function will be quite useful in the sequel. In the second section, we give a series of equivalent conditions for a thin code to be maximal biprefix. The fact that thin maximal biprefix codes are extremal objects is reflected in the observation that a subset of their properties
suffices to characterize them completely. We also give a transformation (called internal transformation) which preserves the family of maximal biprefix codes.

Section 3 contains the definition of the degree of a thin maximal biprefix code. It is defined as the number of interpretations of a word which is not a factor of a code word. This number is independent of the word chosen. This fact will be used to prove most of the fundamental properties of biprefix codes. We will prove that the degree is invariant under internal transformation.

In the fourth section, a construction of the thin maximal biprefix code having a given degree and kernel is described. We also describe the derived code of a thin maximal biprefix code. It is a code whose degree is one less than the degree of the original code. Both constructions are consequences of a fundamental result (Theorem 4.3) which characterizes those sets of words which can be completed in a finite maximal biprefix code without modification of the kernel.

Section 5 is devoted to the study of finite maximal biprefix codes. It is shown that for a fixed degree and a fixed size of the alphabet, there exists only a finite number of such codes. Further, it is proved that, on this finite set, the internal transformation acts transitively.

1. PARSES
A biprefix code is a subset X of A+ which is both prefix and suffix. In other words, we have

    XA+ ∩ X = ∅,   A+X ∩ X = ∅.   (1.1)
An X-parse (or simply a parse) of a word w ∈ A* is a triple (v, x, u) (see Fig. 3.1) such that w = vxu and

    v ∈ A* - A*X,   x ∈ X*,   u ∈ A* - XA*.

An interpretation of w ∈ A* is a triple (v, x, u) such that w = vxu and

    v ∈ A⁻X,   x ∈ X*,   u ∈ XA⁻.

Since the code X is biprefix, we have A⁻X ⊂ A* - A*X and XA⁻ ⊂ A* - XA*; thus any interpretation of w is also a parse of w.
Fig. 3.1  An X-parse of w.
Fig. 3.2 A parse of w passing through the point (r,s).
A point in a word w ∈ A* is a pair (r, s) ∈ A* × A* such that w = rs. A word w thus has |w| + 1 points. A parse (v, x, u) of w is said to pass through the point (r, s) provided x = yz for some y, z ∈ X* such that r = vy, s = zu (see Fig. 3.2).
PROPOSITION 1.1  Let X ⊂ A+ be a biprefix code. For each point of a word w ∈ A*, there is one and only one parse passing through this point.

Proof  Let (r, s) be a point of w ∈ A*. The code X being prefix, there is a unique z ∈ X* and a unique u ∈ A* - XA* such that s = zu (Theorem II.1.4). Since X is suffix, we have r = vy for a unique v ∈ A* - A*X and a unique y ∈ X*. Clearly (v, yz, u) is a parse of w passing through (r, s). The uniqueness follows from the uniqueness of the factorizations of s and r. □
PROPOSITION 1.2  Let X ⊂ A+ be a biprefix code. For any w ∈ A*, there are bijections between the following sets:
1. the set of parses of w;
2. the set of left factors of w which have no right factor in X;
3. the set of right factors of w which have no left factor in X.

Proof  Set V = A* - A*X, U = A* - XA*. For each parse (v, x, u) of w, the word v is in V and is a left factor of w. Thus v is in the set described in 2. Conversely, if w = vw' and v ∈ V, set w' = xu with x ∈ X* and u ∈ U (this is possible since X is prefix). Then (v, x, u) is a parse. This shows that (v, x, u) ↦ v is a bijection from the set of parses onto the set described in 2. Symmetrically, the uniqueness of the factorization w' = xu shows that the mapping (v, x, u) ↦ u is a bijection from the set of parses onto the set described in 3. □

Let X be a biprefix code over A. The indicator of X is the formal power series L_X (or simply L) which associates to any word w the number (L, w) of X-parses of w. Setting U = A* - XA*, V = A* - A*X, we have
    L = V X* U.   (1.2)

Since X is prefix, the product XA* is unambiguous, and since X is suffix, so is A*X. Thus, as formal power series, U = A* - XA* = (1 - X)A* and V = A* - A*X = A*(1 - X). Substituting this in (1.2) and using (1 - X)X*(1 - X) = 1 - X, we obtain

    L = A*(1 - X)A*.   (1.3)
This can also be written as

    L = VA* = A*U.   (1.4)

Note that this is a compact formulation of Proposition 1.2. From formula (1.3), we obtain a convenient expression for the number of parses of a word w ∈ A*:

    (L, w) = |w| + 1 - (A*XA*, w).   (1.5)

The term (A*XA*, w) equals the number of occurrences of words of X as factors of w. Thus we see from (1.5) that for any biprefix codes X, Y the following implication holds:

    Y ⊂ X  ⇒  L_X ≤ L_Y.   (1.6)

(Recall that the notation L_X ≤ L_Y means that (L_X, w) ≤ (L_Y, w) for all w in A*.)
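For a finite biprefix code, formula (1.5) and the bijection of Proposition 1.2 give two independent ways of counting parses, which makes a convenient numerical cross-check. The following Python sketch is our own illustration (function names and test code are ours):

```python
from itertools import product

def parses(X, w):
    """Number of X-parses of w, via Proposition 1.2: count the left
    factors of w having no right factor (suffix) in X."""
    return sum(1 for i in range(len(w) + 1)
               if not any(w[:i].endswith(x) for x in X))

def parses_by_formula(X, w):
    """Formula (1.5): (L, w) = |w| + 1 - number of occurrences of words
    of X as factors of w (valid for X biprefix)."""
    occ = sum(1 for x in X
              for i in range(len(w) - len(x) + 1) if w[i:i + len(x)] == x)
    return len(w) + 1 - occ

X = {"aa", "ab", "ba", "bb"}          # the uniform code A^2, biprefix
for n in range(6):                    # cross-check on all short words
    for tup in product("ab", repeat=n):
        w = "".join(tup)
        assert parses(X, w) == parses_by_formula(X, w)
print(parses(X, "aab"))               # 2
```

For A^2 every nonempty word has exactly 2 parses, in accordance with Theorem 3.1 below: the indicator of this thin maximal biprefix code is bounded, and its maximal value is the degree.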
PROPOSITION 1.3  Let X ⊂ A+ be a biprefix code, let U = A* - XA*, V = A* - A*X, and let L be the indicator of X. Then

    V = L(1 - A),   U = (1 - A)L,   (1.7)
    1 - X = (1 - A)L(1 - A).   (1.8)

Proof  Formula (1.7) follows from (1.4), and (1.8) is an immediate consequence of (1.3). □
PROPOSITION 1.4  Let X ⊂ A+ be a biprefix code and let L be its indicator. Then for all w ∈ A*,

    0 ≤ (L, w) - 1 ≤ |w|.   (1.9)

In particular, (L, 1) = 1. Further, for all u, v, w ∈ A*,

    (L, v) ≤ (L, uvw).   (1.10)

Proof  For a given word w, there are at most |w| + 1 and at least one (namely, the empty word) left factors of w which have no right factor in X. Thus (1.9) is a consequence of Proposition 1.2. Next, any parse of v can be extended to a parse of uvw. This parse of uvw is uniquely determined by the parse of v (Proposition 1.1). This shows (1.10). □
EXAMPLE 1.1  The indicator L of the biprefix code X = ∅ satisfies (L, w) = |w| + 1 for all w ∈ A*.

EXAMPLE 1.2  For the biprefix code X = A, the indicator has value (L, w) = 1 for all w ∈ A*.

The following proposition gives a characterization of formal power series which are indicators.
PROPOSITION 1.5  A formal power series L ∈ Z⟨⟨A⟩⟩ is the indicator of a biprefix code iff it satisfies the following conditions:
(i) For all a ∈ A, w ∈ A*,

    0 ≤ (L, aw) - (L, w) ≤ 1,   (1.11)
    0 ≤ (L, wa) - (L, w) ≤ 1.   (1.12)

(ii) For all a, b ∈ A and w ∈ A*,

    (L, aw) + (L, wb) ≥ (L, w) + (L, awb).   (1.13)

(iii) (L, 1) = 1.

Proof  Assume that L is the indicator of some biprefix code X. It follows from formula (1.7) that the coefficients of the series L(1 - A) and (1 - A)L are 0 or 1. For a word w ∈ A* and a letter a ∈ A, we have (L(1 - A), wa) = (L, wa) - (L, w). Thus (1.12) holds. The same holds for (1.11). Finally, formula (1.8) gives, for the empty word, the equality (L, 1) = 1, and for a, b ∈ A, w ∈ A*,

    -(X, awb) = (L, awb) - (L, aw) - (L, wb) + (L, w),

showing (1.13).

Conversely, assume that L satisfies the three conditions. Set S = (1 - A)L. Then (S, 1) = (L, 1) = 1. Next, for a ∈ A, w ∈ A*, we have

    (S, aw) = (L, aw) - (L, w).

By (1.11), 0 ≤ (S, aw) ≤ 1, showing that S is the characteristic series of some set U containing the empty word 1. Next, if a, b ∈ A, w ∈ A*, then by (1.13)

    (S, aw) = (L, aw) - (L, w) ≥ (L, awb) - (L, wb) = (S, awb).

Thus, awb ∈ U implies aw ∈ U, showing that U is prefix-closed. According to Theorem II.1.4, the set X = UA - U is a prefix code and 1 - X = U(1 - A). Symmetrically, the series T = L(1 - A) is the characteristic series of some nonempty suffix-closed set V; the set Y = AV - V is a suffix code and 1 - Y = (1 - A)V. Finally,

    1 - X = U(1 - A) = (1 - A)L(1 - A) = (1 - A)V = 1 - Y.

Thus X = Y, and X is biprefix with indicator L. □

2. MAXIMAL BIPREFIX CODES
A code X ⊂ A+ is a maximal biprefix code if, for any biprefix code Y ⊂ A+, the inclusion X ⊂ Y implies that X = Y. As in the preceding chapter, it is convenient to note that the set {1} is a maximal biprefix set without being a code. We start by giving a series of equivalent conditions for a thin code to be maximal biprefix.
PROPOSITION 2.1  Let X be a thin subset of A+. The following conditions are equivalent:
(i) X is a maximal code and biprefix.
(ii) X is a maximal biprefix code.
(iii) X is a maximal prefix code and a maximal suffix code.
(iv) X is a prefix and left complete code.
(iv') X is a suffix and right complete code.
(v) X is a left complete and right complete code.

Proof  (i) ⇒ (ii) is clear.
(ii) ⇒ (iii). If X is maximal prefix, then by Theorem II.3.7, X is a maximal code, therefore X is maximal suffix. Similarly, if X is maximal suffix, it is maximal prefix. Thus, assume that X is neither maximal prefix nor maximal suffix. Let y, z ∉ X be such that X ∪ y is prefix and X ∪ z is suffix. Then X ∪ yz is biprefix. However, yz ∉ X (since otherwise X ∪ y would not be prefix). This contradicts (ii).
(iii) ⇒ (iv') is a consequence of Theorem II.3.3, stating that a maximal prefix code is right complete (similarly for the implication (iii) ⇒ (iv)).
(iv) ⇒ (v). The code X is complete and thin. Thus, it is maximal. This shows that it is maximal prefix, which in turn implies that it is right complete.
(v) ⇒ (i). A complete, thin code is maximal. By Theorem II.3.7, a right-complete thin code is prefix. Similarly, X is suffix. □

A code which is both maximal prefix and maximal suffix is always maximal biprefix, and the converse holds, as we have seen, for thin codes. However, this may become false for codes that are not thin (see Example 2.3).
EXAMPLE 2.1  A group code is a biprefix and a maximal code (Section 1.2).

EXAMPLE 2.2  Let A = {a, b} and

    X = {a^3, a^2ba, a^2b^2, ab, ba^2, baba, bab^2, b^2a, b^3}.
By inspection of the literal representation (Fig. 3.3), X is seen to be a maximal prefix code. The reversal X̃ is also maximal prefix (Fig. 3.4). Thus X is a maximal biprefix code. Observe that X̃ is equal to the set obtained from X by interchanging a and b (reflection with respect to the horizontal axis). This is an exceptional fact, which will be explained later (Example 5.1).
EXAMPLE 2.3  Let A = {a, b} and X = {w a b^{|w|} | w ∈ A*} (see Examples 1.4.6 and II.3.1). It is a maximal, right-dense code which is suffix
Fig. 3.4  The literal representation of X̃.
but not prefix. The set

    Y = X - XA+

is maximal prefix and suffix, but not maximal suffix, since Y ⊊ X. Thus, Y is also maximal biprefix, satisfying condition (ii) in Proposition 2.1 without satisfying condition (iii).

The following result gives a different characterization of maximal biprefix codes within the family of thin codes.

PROPOSITION 2.2  A thin code X is maximal biprefix iff for all w ∈ A*, there exists an integer n ≥ 1 such that w^n ∈ X*.
Proof  Assume that for all w ∈ A*, we have w^n ∈ X* for some n ≥ 1. Then X clearly is right complete and left complete. Thus, X is maximal biprefix by Proposition 2.1.
Conversely, let X be a maximal biprefix code, and let w ∈ A*. Consider a word u which is not a factor of a word in X. The code X being right complete, for all i ≥ 1 there exists a word v_i such that w^i u v_i ∈ X*. Since u is not a factor of a word in X, there exists a left factor s_i of u such that w^i s_i ∈ X*. Let k, m with k < m be two integers such that s_k = s_m. Then, setting n = m - k, we have

    w^k s_k ∈ X*,   w^m s_m = w^n w^k s_k ∈ X*.

Since X* is left unitary, this implies that w^n ∈ X*. □

We now describe an operation which makes it possible to construct maximal biprefix codes by successive transformations.
PROPOSITION 2.3  Let X be a code which is maximal prefix and maximal suffix, and let w ∈ A*. Set

    G = Xw^{-1},        D = w^{-1}X,
    G_0 = (wD)w^{-1},   D_0 = w^{-1}(Gw),   (2.1)
    G_1 = G - G_0,      D_1 = D - D_0.

If G_1 ≠ ∅ and D_1 ≠ ∅, then the set

    Y = (X ∪ w ∪ G_1 w D_0* D_1) - (Gw ∪ wD)   (2.2)

is a maximal prefix and maximal suffix code. Further,

    Y = X + (1 - G)w(1 - D_0* D_1).   (2.3)
Proof  If D_1 ≠ ∅, then D is nonempty. Further, 1 ∉ D, since otherwise w ∈ X and, X being biprefix, this implies G = D = {1}, whence D_0 = {1} and finally D_1 = ∅, a contradiction. Thus, w is a proper left factor of a word in X and, by Proposition II.4.6, the sets D and

    Y_1 = (X ∪ w) - wD

are maximal prefix codes.

Next, Gw = X ∩ A*w; also G_0 w = wD ∩ A*w, and similarly for D and D_0. Thus,

    wA* ∩ A*w ∩ X = Gw ∩ wD = wD_0 = G_0 w.   (2.4)

Now note that G = G_0 ∪ G_1. From this and (2.4), we get

    Gw ∪ wD = G_0 w ∪ G_1 w ∪ wD = wD_0 ∪ G_1 w ∪ wD = G_1 w ∪ wD,
since D_0 ⊂ D. Similarly,

    Gw ∪ wD = Gw ∪ wD_1.

Thus

    Y = (Y_1 ∪ G_1 w D_0* D_1) - G_1 w.

Since D = D_1 ∪ D_0 is a maximal prefix code and D_1 ≠ ∅, the set D_0* D_1 is a maximal prefix code (Proposition II.4.9). This and the fact that Y_1 is maximal prefix imply, according to Proposition II.4.4, that Y is maximal prefix.

Symmetrically, it may be shown successively that Y_2 = (X ∪ w) - Gw and Y' = (Y_2 - wD_1) ∪ G_1 G_0* w D_1 are maximal suffix codes. From (2.4), we obtain by induction that G_0* w = wD_0*. Thus, Y' = Y and consequently Y is also maximal suffix.

To prove (2.3), set

    σ = X + (1 - G)w(1 - D_0* D_1).

Then

    σ = X + w - Gw - w D_0* D_1 + Gw D_0* D_1
      = X + w - Gw - w D_0* D_1 + G_0 w D_0* D_1 + G_1 w D_0* D_1.

Since G_0 w = wD_0, we obtain

    σ = X + w - Gw - w D_0* D_1 + w D_0 D_0* D_1 + G_1 w D_0* D_1
      = X + w - Gw - w D_1 + G_1 w D_0* D_1.
+ wQ.
Thus o=
X + w + G1wD,*D, - GW u WD= 1. 0
The code Y is said to be obtained from X by internal transformation (with respect to w).
EXAMPLE2.4 Let A = {a,b}, and consider the uniform code X = A’. Let w = a. Then G = D = A and Go = Do = { a } . Consequently, the code Y defined by formula (2.2) is
Y = a u bulb. Note that Y is a group code as is X.
* 49
2. MAXIMAL BIPREFIX CODES
From formula (2.2), it is clear that for a finite code X,the code Y is finite iff Do = 0.This case deserves particular attention. PROFQSITION 2.4 Let X be a jinite maximal biprejix code and let w E A*. Set
G = XW-', D = w-'X. If G # 0,D # fa and Gw n W D = 0,then Y = (X u w u GwD) - ( G w u wD)
(2.5) (2.6)
is a finite maximal biprefix code, and
-Y = & + (c; - l)w(D - 1).
(2.7) Conversely, let Y be a finite maximal biprejix code. Let w E Y be a word such that there exists a maximal prefix code D, and a maximal sufix code G with GwD c Y. Then
X = ( Y - w - GwD) u (Gw u wD)
(2.8)
is a finite maximal biprejix code, and further (2.5),(2.6), and (2.7)hold. Proof If Gw n W D = 0,then we have, with the notations of Proposition 2.3, Go = Do = 0 by formula (2.4). Then (2.2) simplifies into (2.6). Formula (2.7) is a direct consequence of formula (2.3). Conversely, let us first show that X is a maximal prefix code. Set 2 = ( Y - W ) u wD.
Since Y is maximal prefix by Proposition 2.1. and since D is maximal prefix and w E Y, Corollary 11.4.5. implies that the set 2 is a maximal prefix code. Next observe that
X
= (2 - GwD) u Gw.
The set Q = Gw is contained in Z A - , since Gw c (X- w ) A - . Next Q is prefix: assume indeed that gw = g'wt for some g , g' E G,t E A*. Let d be a word in D of maximal length. The set D being maximal prefix, either td E D A - , or td E DA' or finally td E D. The two first cases are ruled out by the property of Y to be prefix. Thus, td E D, and since d has maximal length, we get t = 1 . This proves the claim. Further, for all g E G,we haveD = (gw)- '2. Indeed, the inclusion gwD c 2 implies D c ( g w ) - ' Z , and D being a maximal prefix code, the equality follows. In view of Proposition 11.4.7, the set X consequently is a maximal prefix code. Symmetrically, it may be shown that X is maximal suffix. Since X is finite, it is maximal biprefix.
150
111. BIPREFIX CODES
It remains to show that Y is obtained from X by internal transformation. First, the inclusion Gw c X follows from (2.8), implying G c Xw-l,and G being a maximal suffix code, this enforces the equality
G = XW-'. Symmetrically D = w-lX. Moreover, G # 0, D # 0, because they are maximal codes. Let us show that
Gw n wD = 0. If gw = wd for some g E G,d E D, then ggw
= gwd E GwD c Y.Thus w, ggw E Y; this is impossible, since Y is suffix. From w E Ywe get the result that Gw n Y = 0 ;otherwise Y would not be suffix. Similarly wD n Y = 0,because Y is prefix. Then as a result of (2.8), X - (Gw u wD)= Y - (w u GwD),implying (2.6). 0
EXAMPLE 2.5 Let A = {a, b} and X = A³. Consider the word w = ab. Then G = D = A and Gw ∩ wD = ∅. Thus Proposition 2.4 yields a finite code Y. This code is obtained by dropping in Fig. 3.5 the dotted lines and by adjoining the heavy lines. The result is the maximal biprefix code of Example 2.2.

Fig. 3.5 An internal transformation.
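The internal transformation of Proposition 2.4 is easy to check by direct computation. The following Python sketch (the variable names and the enumeration are ours, not the book's) rebuilds Y from X = A³ and w = ab and verifies that it is the biprefix code of Example 2.2:

```python
from itertools import product

A = ['a', 'b']
X = {''.join(p) for p in product(A, repeat=3)}   # uniform code A^3
w = 'ab'

G = {x[:-len(w)] for x in X if x.endswith(w)}    # G = Xw^{-1}
D = {x[len(w):] for x in X if x.startswith(w)}   # D = w^{-1}X
Gw = {g + w for g in G}
wD = {w + d for d in D}
GwD = {g + w + d for g in G for d in D}

assert G and D and not (Gw & wD)                 # hypotheses of Prop. 2.4
Y = (X | {w} | GwD) - (Gw | wD)                  # formula (2.6)

# Y is exactly the maximal biprefix code of Example 2.2.
assert Y == {'aaa', 'ab', 'aaba', 'aabb', 'baa', 'baba', 'babb', 'bba', 'bbb'}

# Y is biprefix: no word is a proper prefix or a proper suffix of another.
for y in Y:
    for z in Y:
        assert y == z or (not z.startswith(y) and not z.endswith(y))
```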
3. DEGREE
In this section, we examine the indicator of thin maximal biprefix codes. For these biprefix codes, some simplifications occur. Let X ⊂ A⁺ be a biprefix code, set U = A* − XA*, V = A* − A*X, and let L = VX*U be the indicator of X. If X is a maximal prefix code, then U = P, where P = XA⁻ is the set of proper left factors of words in X. In the same way, for a maximal suffix code, we have V = S, where S = A⁻X is the set of proper
right factors of words in X. It follows that if X is maximal prefix and maximal suffix, each parse of a word is an interpretation. Then we have
L = SX*P = SA* = A*P.   (3.1)
This basic formula will be used frequently. It means that the number of parses of a word is equal to the number of its right factors which are in P, or equivalently to the number of its left factors which are in S.
Let X be a subset of A⁺. Denote by
H(X) = A⁻XA⁻ = {w ∈ A* | A⁺wA⁺ ∩ X ≠ ∅}
the set of internal factors of words in X. Let
H̄(X) = A* − H(X).
Clearly, each internal factor is a factor of a word in X. The converse may be false. The set H(X) and the set
F(X) = {w ∈ A* | A*wA* ∩ X ≠ ∅}
of factors of words in X are related by
F(X) = H(X) ∪ XA⁻ ∪ A⁻X ∪ X,
and for F̄(X) = A* − F(X),
A⁺H̄(X)A⁺ ⊂ F̄(X) ⊂ H̄(X).
These relations show that H̄(X) is nonempty iff F̄(X) is nonempty; thus X is thin iff H̄(X) ≠ ∅.

THEOREM 3.1 Let X ⊂ A⁺ be a biprefix code. Then X is a thin maximal code iff its indicator L is bounded. In this case,
H̄(X) = {w ∈ A* | (L, w) = d},   (3.2)
where d is defined as d = max{(L, w) | w ∈ A*}.

Proof Let X be a thin maximal biprefix code. Let w ∈ H̄(X) and w′ ∈ A*. According to formula (3.1), (L, ww′) = (SA*, ww′). Thus the number of parses of ww′ is equal to the number of left factors of ww′ which are in S = A⁻X. Since w ∈ H̄(X), it follows that no such left factor in S is strictly longer than w. Thus all these left factors are left factors of w. Again using formula (3.1), this shows that (L, ww′) = (L, w).
Now by Proposition 1.4, we have (L, ww′) ≥ (L, w′). Thus we get (L, w′) ≤ (L, w),
showing that L is bounded on A* by its value for a word in H̄(X). This shows also that L is constant on H̄(X). Thus
H̄(X) ⊂ {w ∈ A* | (L, w) = d}.
To show the converse inclusion, consider an internal factor w ∈ H(X). Then there exist p, s ∈ A⁺ such that w′ = pws ∈ X. This implies that
(L, w′) ≥ (L, w) + 1.
Indeed, each parse of w can be extended to a parse of w′, and w′ has an additional parse, namely (1, w′, 1). This shows that for an internal factor w, the number (L, w) is strictly less than the maximal value d. Thus formula (3.2) is proved.
Assume now conversely that X is a biprefix code with bounded indicator L, let d = max{(L, w) | w ∈ A*} and let v ∈ A* be a word such that (L, v) = d. We use formula (1.3), which can be rewritten as
XA* = A* + (A − 1)L.
Let w ∈ A⁺ be any nonempty word, and set w = au, with a ∈ A, u ∈ A*. Then
(XA*, wv) = (A* + (A − 1)L, auv) = 1 + (L, uv) − (L, auv).
By Proposition 1.4, both (L, uv) and (L, auv) are greater than or equal to (L, v). By the choice of v, we have (L, uv) = (L, auv) = d. Thus (XA*, wv) = 1. Thus we have proved that for all w ∈ A⁺, wv ∈ XA*. This shows that X is right complete. It shows also that X is thin. Indeed, we have v ∈ H̄(X), since for all g, d ∈ A⁺ we have gv ∈ XA* and therefore gvd ∉ X. Thus X is a thin maximal prefix code. Symmetrically, it can be shown that X is maximal suffix. This gives the result by Proposition 2.1. □

Let X be a thin maximal biprefix code, and let L be its indicator. The degree of X, denoted d(X) or simply d, is the number
d(X) = max{(L, w) | w ∈ A*}.
According to Theorem 3.1, the degree d is the number of parses of any word which is not an internal factor of X. Before going on, let us illustrate the notion of degree with several examples.
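Since by formula (3.1) the number of parses of a word equals the number of its left factors lying in S = A⁻X, the degree of a finite maximal biprefix code can be computed by brute force. A minimal sketch (helper names are ours), checked on the uniform codes A² and A³:

```python
def proper_suffixes(X):
    """S = A^-X: proper right factors of the words of a finite code X."""
    return {x[i:] for x in X for i in range(1, len(x) + 1)}

def indicator(S, w):
    """(L, w): number of left factors of w lying in S (formula (3.1))."""
    return sum(1 for i in range(len(w) + 1) if w[:i] in S)

S2 = proper_suffixes({'aa', 'ab', 'ba', 'bb'})                          # A^2
S3 = proper_suffixes({a + b + c for a in 'ab' for b in 'ab' for c in 'ab'})  # A^3

# A long enough word is not an internal factor, so its parse count is d.
assert indicator(S2, 'ababba') == 2      # d(A^2) = 2
assert indicator(S3, 'ababba') == 3      # d(A^3) = 3
```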
EXAMPLE 3.1 Let φ be a morphism from A* onto a group G, and let G′ be a subgroup of G. Let X be the group code for which X* = φ⁻¹(G′). We have seen that X is a maximal biprefix code, and that X is thin iff G′ has finite index in G (Example I.5.11). The degree of X is equal to the index of G′ in G. Indeed, let w ∈ H̄(X) be a word which is not an internal factor of X, and consider the function ψ which
associates, to each word u ∈ A*, the unique word p ∈ P = XA⁻ such that uw ∈ X*p. Each p obtained in such a way is a proper right factor of w. The set ψ(A*) is the set of right factors of w in P. Since w ∈ H̄(X), we have Card ψ(A*) = d(X). Next, we have for u, v ∈ A*,
ψ(u) = ψ(v) ⟺ G′φ(u) = G′φ(v).
Indeed, if ψ(u) = ψ(v) = p, then uw, vw ∈ X*p, and consequently φ(u), φ(v) ∈ G′φ(p)φ(w)⁻¹. Conversely, if G′φ(u) = G′φ(v), let r ∈ A* be a word such that uwr ∈ X*. Then φ(vwr) ∈ G′φ(u)φ(wr) ⊂ G′, whence vwr ∈ X*. Thus
ψ(u) = ψ(v).
This shows that the index of G′ in G is d(X). By Proposition 0.8.1, d(X) is also equal to the degree of the permutation group corresponding to the action of G on the cosets of G′, as defined in Section 0.8.

EXAMPLE 3.2 The only maximal biprefix code with degree 1 over A is X = A.

EXAMPLE 3.3 Any maximal biprefix code of degree 2 over an alphabet A has the form
X = C ∪ BC*B,   (3.3)
where A is the disjoint union of B and C, with B ≠ ∅. Indeed, let C = A ∩ X and B = A − C. Each b ∈ B has two parses, namely (1, 1, b) and (b, 1, 1). Thus, a word which is an internal factor of a word x ∈ X cannot contain a letter in B, since otherwise x would have at least three parses. Thus, the set H of internal factors of X satisfies H ⊂ C*. Next consider a word x in X. Either it is a letter, and then it is in C, or otherwise it has the form x = aub with a, b ∈ A and u ∈ H ⊂ C*. X being biprefix, neither a nor b is in C. Thus X ⊂ C ∪ BC*B. The maximality of X implies the equality.
This shows that any maximal biprefix code of degree 2 is a group code. Indeed, the code given by (3.3) is obtained by considering the morphism φ from A* onto Z/2Z defined by φ(B) = {1}, φ(C) = {0}.

EXAMPLE 3.4 Consider the set
Y = {aⁿbⁿ | n ≥ 1}.
It is a biprefix code which is not maximal, since Y ∪ ba is biprefix. Also, Y is thin, since ba ∈ F̄(Y). The code Y is not contained in a thin maximal biprefix code. Suppose indeed that X is a thin maximal biprefix code of degree d containing Y. For any n ≥ 0, the word aⁿ then has n + 1 parses, since it has n + 1 right factors which all are proper left factors of a word in Y, whence in X. Thus d ≥ n + 1 for all n, which is impossible. In fact, Y is contained in the Dyck code over {a, b} (see Example I.2.5).
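The group-code description of degree-2 codes (Example 3.3) can be checked computationally: with the morphism φ onto Z/2Z sending b to 1 and a to 0, the code X with X* = φ⁻¹(0) comes out as {a} ∪ b a* b, that is, C ∪ BC*B with C = {a} and B = {b}. A sketch (the enumeration bound is ours):

```python
from itertools import product

def phi(word):
    """Morphism onto Z/2Z: phi(b) = 1, phi(a) = 0."""
    return word.count('b') % 2

# X = words of phi^{-1}(0) with no proper nonempty prefix in phi^{-1}(0).
X = set()
for n in range(1, 7):
    for tup in product('ab', repeat=n):
        word = ''.join(tup)
        if phi(word) == 0 and all(phi(word[:i]) != 0 for i in range(1, n)):
            X.add(word)

# Up to length 6, X is exactly {a} ∪ b a* b = C ∪ BC*B.
expected = {'a'} | {'b' + 'a' * k + 'b' for k in range(5)}
assert X == expected
```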
EXAMPLE 3.5 Let X, Y ⊂ A⁺ be two thin maximal biprefix codes. Then XY is maximal biprefix and thin, and
d(XY) = d(X) + d(Y).
The first part of the claim follows indeed from Corollary II.4.2. Next, let w ∈ H̄(XY) be a word which is not an internal factor of XY. Then w ∈ H̄(X) and w ∈ H̄(Y). The left factors of w which are also proper right factors of XY are of two kinds: first, there are d(Y) left factors which are proper right factors of Y; then there are d(X) left factors uy with u ∈ A⁻X, y ∈ Y. These are the only left factors of w which are in A⁻(XY). Since w has d(XY) parses with respect to XY, this gives the formula.

We now define a formal power series associated to a code X which plays a fundamental role in the following. Let X be a thin maximal biprefix code over A. The tower over X is the formal power series T_X (also written T when no confusion is possible) defined by
(T_X, w) = d − (L_X, w).   (3.4)
The following proposition states some useful elementary facts about the series T.
PROPOSITION 3.2 Let X be a thin maximal biprefix code of degree d over A, set P = XA⁻, S = A⁻X, and let T be the tower over X. Then
(T, w) = 0  iff  w ∈ H̄(X),
and for w ∈ H(X),
1 ≤ (T, w) ≤ d − 1.   (3.5)
Further, (T, 1) = d − 1 and
X − 1 = (A − 1)T(A − 1) + d(A − 1),   (3.6)
P = (A − 1)T + d,   (3.7)
S = T(A − 1) + d.   (3.8)

Proof According to Theorem 3.1, (T, w) = 0 iff w ∈ H̄(X). For all other words, 1 ≤ (T, w). Also (T, w) ≤ d − 1, since all words have at least one parse, and (T, 1) = d − 1, since the empty word has exactly one parse. Next, by definition of T, we have T + L = dA*, whence
T(1 − A) + L(1 − A) = d = (1 − A)T + (1 − A)L.
The code X is maximal; consequently P = A* − XA* and S = A* − A*X. Thus we can apply Proposition 1.3 with U = P, V = S. Together with the equation above, this yields formulas (3.7), (3.8), and also (3.6), since
X − 1 = P(A − 1) = ((A − 1)T + d)(A − 1). □
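Formula (3.8) can be verified coefficient by coefficient on a small example. The sketch below encodes series as dictionaries mapping words to coefficients (our encoding, not the book's) and checks S = T(A − 1) + d for X = A³, where d = 3, T = 2 + a + b, and the constant d contributes only to the coefficient of the empty word:

```python
from itertools import product

A = ['a', 'b']
N = 4
words = [''.join(p) for n in range(N + 1) for p in product(A, repeat=n)]

T = {'': 2, 'a': 1, 'b': 1}                      # tower over A^3

def times_A_minus_1(series):
    """Right multiplication of a polynomial series by (A - 1)."""
    out = {}
    for w, c in series.items():
        for a in A:
            out[w + a] = out.get(w + a, 0) + c   # series . A
        out[w] = out.get(w, 0) - c               # minus series
    return out

rhs = times_A_minus_1(T)
rhs[''] = rhs.get('', 0) + 3                     # + d, at the empty word

# S = proper right factors of A^3: all words of length < 3.
S = {w: 1 for w in words if len(w) < 3}
for w in words:
    assert rhs.get(w, 0) == S.get(w, 0)
```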
Proposition 3.2 shows that the support of the series T is contained in the set H(X). Note that two thin maximal biprefix codes X and X′ having the same tower are equal. Indeed, by Proposition 3.2, they have the same degree, since (T, 1) = d(X) − 1 = d(X′) − 1. But then Eq. (3.6) implies that X = X′. Whenever some thin maximal biprefix code X of degree d = d(X) satisfies an equation
X − 1 = (A − 1)T(A − 1) + d(A − 1),
then T must be the tower over X. The next result gives a sufficient condition to obtain the same conclusion without knowing that the integer d is equal to d(X).

PROPOSITION 3.3 Let T, T′ ∈ Z⟨⟨A⟩⟩ and let d, d′ ≥ 1 be integers such that
(A − 1)T(A − 1) + d(A − 1) = (A − 1)T′(A − 1) + d′(A − 1).   (3.9)
If there is a word w ∈ A* such that (T, w) = (T′, w), then T = T′ and d = d′.

Proof After multiplication of both sides, on the left and on the right, by A*, and using A*(A − 1) = (A − 1)A* = −1, Eq. (3.9) becomes
T − dA* = T′ − d′A*.
If (T, w) = (T′, w), then (dA*, w) = (d′A*, w). Thus d = d′, which implies T = T′. □

We now observe the effect of an internal transformation (Proposition 2.3) on the tower of a thin maximal biprefix code X. Recall that, provided w is a word such that G₁, D₁ are both nonempty, where
G = Xw⁻¹,  D = w⁻¹X,
G₀ = (wD)w⁻¹,  D₀ = w⁻¹(Gw),
G₁ = G − G₀,  D₁ = D − D₀,
the code Y defined by
Y = X + (1 − G)w(1 − D₀*D₁)
is maximal biprefix. By Proposition II.4.6, the sets G = Xw⁻¹ and D = w⁻¹X are maximal suffix and maximal prefix. Let U be the set of proper right factors
of G, and let V be the set of proper left factors of D. Then D₀*V is the set of proper left factors of words in D₀*D₁, since D = D₀ ∪ D₁. Consequently
G − 1 = (A − 1)U,  D₀*D₁ − 1 = D₀*V(A − 1).
Going back to Y, we get
Y − 1 = X − 1 + (A − 1)UwD₀*V(A − 1).
Let T be the tower over X. Then using Eq. (3.6), we get
Y − 1 = (A − 1)(T + UwD₀*V)(A − 1) + d(A − 1).
Observe that since X is thin, both G and D are thin. Consequently, U and V also are thin. Since D₁ = D − D₀ ≠ ∅, D₀ is not a maximal code. As a subset of D, the set D₀ is thin. By Theorem I.5.7, D₀ is not complete. Thus D₀* is thin. Thus UwD₀*V, as a product of thin sets, is thin. Next, supp(T) ⊂ H(X) is thin. Thus supp(T) ∪ UwD₀*V is thin. Let u be a word which is not a factor of a word in this set. Then
(T + UwD₀*V, uv) = 0  for all v ∈ A*.
On the other hand, formula (2.2) shows that since G₁(wD₀*)D₁ is thin, the set Y is thin. Thus, the support of the tower T_Y over Y is thin. Let v be such that (T_Y, uv) = 0. Then
(T + UwD₀*V, uv) = (T_Y, uv) = 0,
showing that Proposition 3.3 can be applied. Consequently,
d(X) = d(Y)  and  T_Y = T + UwD₀*V.
Thus, the degree of a thin maximal biprefix code remains invariant under internal transformations.

EXAMPLE 3.6 The finite maximal biprefix code X = {a³, a²ba, a²b², ab, ba², baba, bab², b²a, b³} over A = {a, b} of Example 2.2 has degree 3. This can be seen by observing that no word has more than 3 parses and the word a³ has 3 parses, or also from the fact (Example 2.5) that X is obtained from the uniform code A³ by internal transformation with respect to the word w = ab. Thus d(X) = d(A³) = 3. In this example, D (= w⁻¹A³) = G (= A³w⁻¹) = A, so U = V = {1} and D₀ = ∅. Thus T_X = T_{A³} + w. Clearly T_{A³} = 2 + a + b.
Consequently
T_X = 2 + a + b + ab.
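The tower of Example 3.6 can be recomputed directly from the definition (T, w) = d − (L, w), with (L, w) counted via formula (3.1). The sketch below (our notation) confirms T_X = 2 + a + b + ab on all words of length at most 5:

```python
from itertools import product

X = {'aaa', 'aaba', 'aabb', 'ab', 'baa', 'baba', 'babb', 'bba', 'bbb'}
S = {x[i:] for x in X for i in range(1, len(x) + 1)}   # proper right factors
d = 3

def L(w):
    """Indicator: number of left factors of w in S (formula (3.1))."""
    return sum(1 for i in range(len(w) + 1) if w[:i] in S)

# The tower T_X = d A* - L_X should equal 2 + a + b + ab.
expected = {'': 2, 'a': 1, 'b': 1, 'ab': 1}
for n in range(6):
    for tup in product('ab', repeat=n):
        w = ''.join(tup)
        assert d - L(w) == expected.get(w, 0)
```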
We now give a characterization of the formal power series that are the tower over some thin maximal biprefix code.

PROPOSITION 3.4 A formal power series T ∈ N⟨⟨A⟩⟩ is the tower over some thin maximal biprefix code iff it satisfies the following conditions:
(i) for all a ∈ A, u ∈ A*,
0 ≤ (T, u) − (T, au) ≤ 1,   (3.10)
0 ≤ (T, u) − (T, ua) ≤ 1;   (3.11)
(ii) for all a, b ∈ A, u ∈ A*,
(T, au) + (T, ub) ≤ (T, u) + (T, aub);   (3.12)
(iii) there exists a word u ∈ A* such that (T, u) = 0.

Proof Let X be a thin maximal biprefix code of degree d, let L be its indicator, and let T = dA* − L. Then Eqs. (3.10), (3.11), and (3.12) are direct consequences of Eqs. (1.11), (1.12), and (1.13). Further, (iii) holds for all u ∈ H̄(X), and this set is nonempty.
Conversely, assume that T ∈ N⟨⟨A⟩⟩ satisfies the conditions of the proposition. Define
d = (T, 1) + 1,  L = dA* − T.
Then by construction, L satisfies the conditions of Proposition 1.5, and consequently L is the indicator of some biprefix code X. Next, by assumption, T has nonnegative coefficients. Thus for all w ∈ A*, we have (T, w) = d − (L, w) ≥ 0. Thus L is bounded. In view of Theorem 3.1, the code X is maximal and thin. Since (T, u) = 0 for at least one word u, we have (L, u) = d and d = max{(L, w) | w ∈ A*}. Thus d is the degree of X, and T = dA* − L is the tower over X. □
The preceding result makes it possible to disassemble the tower over a biprefix code.

PROPOSITION 3.5 Let T be the tower over a thin maximal biprefix code X of degree d ≥ 2. The series T′ = T − H(X) is the tower of some thin maximal biprefix code of degree d − 1.
Proof First observe that T′ has nonnegative coefficients. Indeed, by Proposition 3.2, (T, w) ≥ 1 iff w ∈ H(X). Consequently, (T′, w) ≥ 0 for w ∈ H(X), and (T′, w) = (T, w) = 0 otherwise. Next, we verify the three conditions of Proposition 3.4.
(i) Let a ∈ A, u ∈ A*. If au ∈ H(X), then u ∈ H(X). Thus (T′, au) = (T, au) − 1 and (T′, u) = (T, u) − 1. Therefore the inequality results from the corresponding inequality for T. On the other hand, if au ∉ H(X), then (T, au) = (T′, au) = 0. Consequently, (T, u) ≤ 1, whence (T′, u) = 0. Thus inequality (3.10) holds for T′.
(ii) Let a, b ∈ A and u ∈ A*. If aub ∈ H(X), then (T′, w) = (T, w) − 1 for each of the four words w = aub, au, ub, and u. Thus, the inequality
(T′, au) + (T′, ub) ≤ (T′, u) + (T′, aub)
results, in this case, from the corresponding inequality for T. On the other hand, if aub ∉ H(X), then as before (T, au), (T, ub) ≤ 1 and (T′, au) = (T′, ub) = 0. Thus (3.12) holds for T′.
Condition (iii) of Proposition 3.4 is clearly satisfied for T′, since (T′, w) = 0 for w ∈ H̄(X). Thus T′ is the tower over some thin maximal biprefix code. Its degree is (T′, 1) + 1. Since 1 ∈ H(X), we have (T′, 1) = d − 2. This completes the proof. □

Let X be a thin maximal biprefix code of degree d ≥ 2, and let T be the tower over X. Let X′ be the thin maximal biprefix code with tower T′ = T − H(X). Then X′ has degree d − 1. The code X′ is called the code derived from X. Since for the indicators L and L′ of X and X′ we have L = dA* − T and L′ = (d − 1)A* − T′, it follows that
L′ = L − H̄(X).   (3.14)
We let X⁽ⁿ⁾ denote the code derived from X⁽ⁿ⁻¹⁾, for d(X) ≥ n + 1.
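Derivation can be carried out concretely on the code of Example 3.6. In the sketch below (our notation), subtracting the characteristic series of H(X) from the tower leaves T′ = 1; a tower supported on the empty word alone is the tower of a degree-2 code with no internal factor other than 1, which Eq. (3.6) identifies as the uniform code A²:

```python
X = {'aaa', 'aaba', 'aabb', 'ab', 'baa', 'baba', 'babb', 'bba', 'bbb'}

def internal_factors(code):
    """H(X) = {w : uwv in X for some nonempty u, v}."""
    H = set()
    for x in code:
        for i in range(1, len(x)):
            for j in range(i, len(x)):
                H.add(x[i:j])
    return H

H = internal_factors(X)
assert H == {'', 'a', 'b', 'ab'}

T = {'': 2, 'a': 1, 'b': 1, 'ab': 1}           # tower over X (Example 3.6)
T_prime = {w: T.get(w, 0) - 1 for w in H}      # T' = T - H(X)

assert all(c == 0 for w, c in T_prime.items() if w != '')
assert T_prime[''] == 1                        # degree d' = (T', 1) + 1 = 2
# So the derived code X' is A^2; note that the word ab of X lies in A^2.
```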
PROPOSITION 3.6 The tower of a thin maximal biprefix code X of degree d ≥ 2 satisfies
T = H(X) + H(X′) + ··· + H(X⁽ᵈ⁻²⁾).

Proof By induction, we have from Proposition 3.5
T = H(X) + H(X′) + ··· + H(X⁽ᵈ⁻²⁾) + T̂,
where T̂ is the tower over a code of degree 1. This code is the alphabet, and consequently T̂ = 0. This proves the result. □

We now describe the set of proper left factors and the set of proper right factors of words of the derived code of a thin maximal biprefix code.
PROPOSITION 3.7 Let X ⊂ A⁺ be a thin maximal biprefix code of degree d ≥ 2. Let S = A⁻X, P = XA⁻, and
H = A⁻XA⁻,  H̄ = A* − H.
1. The set S ∩ H̄ is a thin maximal prefix code. The set H is the set of its proper left factors, i.e., S ∩ H̄ = HA − H.
2. The set P ∩ H̄ is a thin maximal suffix code. The set H is the set of its proper right factors, i.e., P ∩ H̄ = AH − H.
3. The set S ∩ H is the set of proper right factors of the derived code X′.
4. The set P ∩ H is the set of proper left factors of the derived code X′.

Proof We first prove 1. Let T be the tower over X, and let T′ be the tower over the derived code X′. By Proposition 3.5, T = T′ + H, and by Proposition 3.2,
S = T(A − 1) + d.
Thus S = T′(A − 1) + d − 1 + H(A − 1) + 1. The code X′ has degree d − 1. Thus the series T′(A − 1) + d − 1 is, by formula (3.8), the characteristic series of the set S′ = A⁻X′ of proper right factors of words of X′. Thus
S = H(A − 1) + 1 + S′  and  S′ = T′(A − 1) + d − 1.
The set H is prefix-closed and nonempty. We show that H contains no right ideal. Indeed, the set X is a maximal prefix code by Proposition 2.1. By Proposition II.3.6, the set P contains no right ideal, and consequently H = A⁻P contains no right ideal. Again by Proposition II.3.6, the set Y = HA − H is a maximal prefix code, and H = YA⁻. Thus
Y = H(A − 1) + 1.
Further, H being also suffix-closed, the set Y is in fact a semaphore code by Proposition II.5.5. We now verify that Y = S ∩ H̄. Assume that y ∈ Y. Then, from the equation S = Y + S′, it follows that y ∈ S. Since H = YA⁻, we have y ∉ H. Thus y ∈ S ∩ H̄. Conversely, assume that y ∈ S ∩ H̄. Then y ≠ 1, since d ≥ 2 implies that H ≠ ∅ and consequently 1 ∈ H. Further, each proper left factor of y is in SA⁻ = A⁻XA⁻ = H, thus is an internal factor of X. In particular, considering just the longest proper left factor, we have y ∈ HA. Consequently, y ∈ HA − H = Y.
The second claim is proved in a symmetric way. To show 3, observe that by what we proved before, we have
S = Y + S′.   (3.15)
Next, S = (S ∩ H̄) ∪ (S ∩ H) = Y ∪ (S ∩ H), since Y = S ∩ H̄. Moreover, the union is disjoint; thus S = Y + S ∩ H. Consequently, S′ = S ∩ H. In the same way, we get point 4. □
THEOREM 3.8 Let X be a thin maximal biprefix code of degree d. Then the set S of its proper right factors is a disjoint union of d maximal prefix sets.

Proof If d = 1, then X = A and the set S = {1} is a maximal prefix set. If d ≥ 2, then the set Y = S ∩ H̄, where H = A⁻XA⁻ and H̄ = A* − H, is maximal prefix by Proposition 3.7. Further, the set S′ = S ∩ H is the set of proper right factors of the code derived from X. Arguing by induction, the set S′ is a disjoint union of d − 1 maximal prefix sets. Thus S = Y ∪ S′ is a disjoint union of d maximal prefix sets. □

It must be noted that the decomposition, in Theorem 3.8, of the set S into disjoint maximal prefix sets is not unique (see Exercise 3.2). The following corollary to Theorem 3.8 is rather surprising.

COROLLARY 3.9 Let X ⊂ A⁺ be a thin maximal biprefix code having degree d. For any positive Bernoulli distribution π on A*, the average length of X is equal to d.

Proof Let π be a positive Bernoulli distribution on A*, and let λ(X) be the average length of X. By Corollary II.7.4, the average length λ(X) is finite and λ(X) = π(S), where S = A⁻X is the set of proper right factors of X. In view of Theorem 3.8, we have
S = Y₁ ∪ ··· ∪ Y_d,
where each Yᵢ is a maximal prefix code. As a set of factors of X, each Yᵢ also is thin. Thus π(Yᵢ) = 1 for i = 1, ..., d by Theorem I.5.10. Consequently,
λ(X) = π(Y₁) + ··· + π(Y_d) = d. □
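Corollary 3.9 can be checked numerically on the degree-3 code of Example 3.6 with the uniform Bernoulli distribution π(a) = π(b) = 1/2: both the average length and π(S) come out equal to the degree 3. A sketch in exact arithmetic (our notation):

```python
from fractions import Fraction

X = {'aaa', 'aaba', 'aabb', 'ab', 'baa', 'baba', 'babb', 'bba', 'bbb'}
pi = lambda w: Fraction(1, 2 ** len(w))          # uniform Bernoulli measure

assert sum(pi(x) for x in X) == 1                # X is a maximal code

# The average length of X equals the degree d = 3 ...
assert sum(len(x) * pi(x) for x in X) == 3

# ... and so does pi(S), S being the set of proper right factors.
S = {x[i:] for x in X for i in range(1, len(x) + 1)}
assert sum(pi(s) for s in S) == 3
```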
Note that Corollary 3.9 can also be proved directly by starting with formula (3.7). However, the proof we have given here is the most natural one. We now prove a converse of Theorem 3.8.

PROPOSITION 3.10 Let X be a thin maximal suffix code. If the set of its proper right factors is a disjoint union of d maximal prefix sets, then X is biprefix and has degree d.

Proof Let S = A⁻X. By assumption, S = Y₁ + ··· + Y_d, where Y₁, ..., Y_d are maximal prefix sets. Let Uᵢ be the set of proper left factors of Yᵢ. Then A* = YᵢA* + Uᵢ, and thus (1 − Yᵢ)A* = Uᵢ. Summing up these equalities gives
U₁ + ··· + U_d = dA* − SA*.
Multiply on the left by A − 1. Then, since (A − 1)S = X − 1 and (A − 1)A* = −1,
(A − 1)U₁ + ··· + (A − 1)U_d = −d − (X − 1)A*,
whence
XA* = A* − ((A − 1)U₁ + ··· + (A − 1)U_d) − d.
From this formula, we derive the fact that XA* is right dense. Indeed, let w ∈ A⁺, and set w = au, with a ∈ A. Each of the sets Yᵢ is maximal prefix. Thus each YᵢA* is right dense. There exists a word v such that simultaneously auv ∈ YᵢA* for all i ∈ {1, ..., d} and also uv ∈ YᵢA* for all i ∈ {1, ..., d}. Thus for each i ∈ {1, ..., d},
((A − 1)Uᵢ, wv) = (AUᵢ, wv) − (Uᵢ, wv) = (Uᵢ, uv) − (Uᵢ, wv) = 0,
since uv and wv, having a prefix in Yᵢ, are not proper left factors of words in Yᵢ. Consequently,
(XA*, wv) = (A*, wv) = 1.
Thus wv ∈ XA*. Consequently, XA* is right dense or, equivalently, X is right complete. In view of Proposition 2.1, this means that X is maximal biprefix.
Let w ∈ H̄(X) be a word which is not an internal factor of X. Then w ∉ Uᵢ for 1 ≤ i ≤ d, since a proper left factor of a word of Yᵢ ⊂ S is an internal factor of X. The set Yᵢ being maximal prefix, we have w ∈ YᵢA* for 1 ≤ i ≤ d. Consequently, w has exactly d left factors which are right factors of words in X, one in each Yᵢ. Thus X has degree d. □
Fig. 3.6 A maximal biprefix code of degree 4.
EXAMPLE 3.7 Let X be the finite maximal biprefix code given in Fig. 3.6. The tower T over X is given by Fig. 3.7 (by its values on the set H(X)). The derived code X′ is the maximal biprefix code of degree 3 of Examples 2.2 and 3.6. The set S′ of proper right factors of X′ is indicated in Fig. 3.8. The set S of proper right factors of X is indicated in Fig. 3.9. The maximal prefix code Y = S ∩ H̄ is the set of words indicated in the figure by (○). It may be verified by inspection of Figs. 3.7, 3.8, and 3.9 that S′ = S ∩ H.

Fig. 3.7 The tower T over X.
Fig. 3.8 The set S′ of proper right factors of X′.
Fig. 3.9 The set S of proper right factors of X.
4. KERNEL
Let X ⊂ A⁺, and let H = A⁻XA⁻ be the set of internal factors of X. The kernel of X, denoted K(X), or K if no confusion is possible, is the set
K = X ∩ H.
Thus a word is in the kernel if it is both in the code and an internal factor of the code. As we will see in this section, the kernel is one of the main characteristics of a maximal biprefix code. We start by showing how the kernel is related to the computation of the indicator.

PROPOSITION 4.1 Let X ⊂ A⁺ be a thin maximal biprefix code of degree d, and let K be the kernel of X. Let Y be a set such that
K ⊂ Y ⊂ X.
Then for all w ∈ H(X) ∪ Y,
(L_Y, w) = (L_X, w),   (4.1)
and for all w ∈ H̄(X),
(L_Y, w) ≥ (L_X, w) = d.   (4.2)

Proof We have
L_X = A*(1 − X)A*,  L_Y = A*(1 − Y)A*.
Let w ∈ A*, and let F(w) be the set of its factors. For any word x ∈ A*, the number (A*xA*, w) is the number of occurrences of x as a factor of w. It is nonzero only if x ∈ F(w). Thus
(A*XA*, w) = Σ_{x ∈ F(w) ∩ X} (A*xA*, w),
showing that if F(w) ∩ X = F(w) ∩ Y, then (L_X, w) = (L_Y, w). Thus, it suffices to show that F(w) ∩ X = F(w) ∩ Y for all w ∈ H(X) ∪ Y. From the inclusion Y ⊂ X, we get F(w) ∩ Y ⊂ F(w) ∩ X for all w ∈ A*. If w ∈ H(X), then F(w) ⊂ H(X) and F(w) ∩ X ⊂ K(X). Thus F(w) ∩ X ⊂ F(w) ∩ Y in this case. If w ∈ Y, then no proper left nor right factor of w is in X, since X is biprefix. Thus
F(w) ∩ X = {w} ∪ (A⁻wA⁻ ∩ X) ⊂ {w} ∪ K(X) ⊂ Y.
Consequently, F(w) ∩ X ⊂ F(w) ∩ Y in this case also. This shows (4.1).
Now let w ∈ H(X) be an internal factor of X. Then (L_X, w) < d by Theorem 3.1. Consequently, (L_X, w) = (L_Y, w) by formula (4.1). Next, let w ∈ H̄(X). Then (L_X, w) = d. By formula (1.6), (L_X, w) ≤ (L_Y, w). This proves (4.2). □
Given two power series σ and τ, we denote by min{σ, τ} the series defined by
(min{σ, τ}, w) = min{(σ, w), (τ, w)}.
THEOREM 4.2 Let X be a thin maximal biprefix code with degree d, and let K be its kernel. Then
L_X = min{dA*, L_K}.
In particular, a thin maximal biprefix code is determined by its degree and its kernel.

Proof Take Y = K(X) in the preceding proposition. Then the formula follows from (4.1) and (4.2). Assume that there are two codes X and X′ of the same degree d and with K(X) = K(X′). Then L_{K(X)} = L_{K(X′)}, whence L_X = L_{X′}, which in turn implies X = X′. This completes the proof. □
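The kernel of the code X of Example 3.6 is easy to compute, and the formula L_X = min{dA*, L_K} can be checked word by word. Here L_K(w) = |w| + 1 minus the number of occurrences of ab, since L_K = A*(1 − K)A*. A sketch in our own notation:

```python
from itertools import product

X = {'aaa', 'aaba', 'aabb', 'ab', 'baa', 'baba', 'babb', 'bba', 'bbb'}

# Kernel K = X ∩ H(X): words of X occurring strictly inside words of X.
H = {x[i:j] for x in X for i in range(1, len(x)) for j in range(i, len(x))}
K = X & H
assert K == {'ab'}

S = {x[i:] for x in X for i in range(1, len(x) + 1)}   # proper right factors

def L_X(w):                       # indicator of X via formula (3.1)
    return sum(1 for i in range(len(w) + 1) if w[:i] in S)

def L_K(w):                       # indicator of the code {ab}
    return len(w) + 1 - w.count('ab')

for n in range(6):                # L_X = min(3, L_K) on all short words
    for t in product('ab', repeat=n):
        w = ''.join(t)
        assert L_X(w) == min(3, L_K(w))
```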
Clearly, the kernel of a biprefix code is itself a biprefix code. However, not every biprefix code is the kernel of some biprefix code. We now give a characterization of those biprefix codes which are the kernel of some thin maximal biprefix code. For this, it is convenient to introduce a notation: for a subset Y of A⁺, let
μ(Y) = max{(L_Y, y) | y ∈ Y};
it is a nonnegative integer or infinite. By convention, μ(∅) = 0.

THEOREM 4.3 A biprefix code Y is the kernel of some thin maximal biprefix code of degree d iff
(i) Y is not maximal biprefix,
(ii) μ(Y) ≤ d − 1.

Proof Let X be a thin maximal biprefix code of degree d, and let Y = K(X) be its kernel. Let us verify conditions (i) and (ii). To verify (i), consider a word x ∈ X such that (L_X, x) = μ(X); we claim that x ∉ H(X). Thus x ∉ K(X), showing that Y ≠ X; since Y ⊂ X, the code Y is not maximal biprefix. Assume the claim is wrong. Then uxv ∈ X for some u, v ∈ A⁺. Consequently, (L_X, uxv) ≥ 1 + (L_X, x), since the word uxv has the interpretation (1, uxv, 1) which passes through no point of x. This contradicts the choice of x and proves the claim. Next, for all y ∈ Y, we have (L_Y, y) = (L_X, y) by formula (4.1). Since (L_X, y) ≤ d − 1 because y ∈ H(X), condition (ii) is also satisfied.
Conversely, let Y be a biprefix code satisfying conditions (i) and (ii). Let L ∈ N⟨⟨A⟩⟩ be the formal power series defined for w ∈ A* by
(L, w) = min{d, (L_Y, w)}.
Let us verify that L satisfies the three conditions of Proposition 1.5. First, let a ∈ A and w ∈ A*. Then
0 ≤ (L_Y, aw) − (L_Y, w) ≤ 1.
It follows that if (L_Y, w) < d, then (L, aw) = (L_Y, aw); on the other hand, if
(L_Y, w) ≥ d, then (L, aw) = (L, w) = d. Thus 0 ≤ (L, aw) − (L, w) ≤ 1. The symmetric inequality
0 ≤ (L, wa) − (L, w) ≤ 1
is shown in the same way. Thus the first of the conditions of Proposition 1.5 is satisfied. Next, for a, b ∈ A, w ∈ A*,
(L_Y, aw) + (L_Y, wb) ≥ (L_Y, w) + (L_Y, awb).
Consider first the case where (L_Y, w) ≥ d. Then (L, aw) = (L, wb) = (L, w) = (L, awb) = d, and the inequality
(L, aw) + (L, wb) ≥ (L, w) + (L, awb)
is clear. Assume now that (L_Y, w) < d. Then (L_Y, aw) ≤ d and (L_Y, wb) ≤ d. Consequently,
(L, aw) + (L, wb) = (L_Y, aw) + (L_Y, wb) ≥ (L_Y, w) + (L_Y, awb) ≥ (L, w) + (L, awb),
since L ≤ L_Y. This shows the second condition. Finally, we have (L_Y, 1) = 1, whence (L, 1) = 1.
Thus, according to Proposition 1.5, the series L is the indicator of some biprefix code X. Further, L being bounded, the code X is thin and maximal biprefix by Theorem 3.1. By the same argument, the code Y being nonmaximal, the series L_Y is unbounded. Consequently, max{(L, w) | w ∈ A*} = d, showing that X has degree d.
We now prove that Y = X ∩ H(X), i.e., Y is the kernel of X. First, we have the inclusion Y ⊂ H(X). Indeed, if y ∈ Y, then (L, y) ≤ (L_Y, y) ≤ μ(Y) ≤ d − 1. Thus, by Theorem 3.1, y ∈ H(X). Next, observe that it suffices to show that X ∩ H(X) = Y ∩ H(X); this is equivalent to showing that
(X, w) = (Y, w)
for all w ∈ H(X). Let us prove this by induction on |w|. Clearly, the equality holds for |w| = 0. Next, let w ∈ H(X) − 1. Then (L, w) ≤ d − 1. Thus (L, w) = (L_Y, w). This in turn implies
(A*XA*, w) = (A*YA*, w).
But F(w) ⊂ H(X). Thus, by the induction hypothesis, (X, s) = (Y, s) for all proper factors s of w. Thus the equation reduces to (X, w) = (Y, w). □

We now describe the relation between the kernel and the operation of derivation.
PROPOSITION 4.4 Let X be a thin maximal biprefix code of degree d ≥ 2, and let H = A⁻XA⁻. Set
Y = HA − H,  K = X ∩ H,  Z = AH − H.
Then the code X′ derived from X is
X′ = K ∪ (Y ∩ Z).   (4.3)
Further,
K = X ∩ X′.   (4.4)

Proof Let S = A⁻X and P = XA⁻ be the sets of proper right factors and of proper left factors of words in X. Let S′ = S ∩ H and P′ = P ∩ H. According to Proposition 3.7, S′ is the set of proper right factors of words in X′, and similarly for P′. Thus,
X′ − 1 = (A − 1)S′ = AS′ − S′.
From S′ = S ∩ H, we have AS′ = AS ∩ AH, and
AS′ = AS ⊙ AH,
where ⊙ denotes the Hadamard product (see Section 0.6). Thus,
X′ − 1 = (AS ⊙ AH) − S′.
Now observe that, by Proposition 3.7, the set Z is a maximal suffix code with proper right factors H. Thus Z − 1 = (A − 1)H and
AH = Z − 1 + H.
Similarly, from X − 1 = (A − 1)S we get
AS = X − 1 + S.
Substitution gives
X′ − 1 = (X − 1 + S) ⊙ (Z − 1 + H) − S′
 = X ∩ Z + S ∩ Z + X ∩ H + S ∩ H + 1 − (1 ⊙ H) − (S ⊙ 1) − S′.
Indeed, the other terms have the value 0, since neither X nor Z contains the empty word. Now Z = P ∩ H̄ (Proposition 3.7), whence X ∩ Z = X ∩ P ∩ H̄ = ∅. Also, by definition, S′ = S ∩ H and K = X ∩ H. Thus the equation becomes
X′ − 1 = S ∩ Z + K − 1.
Finally, note that by Proposition 3.7, Y = S ∩ H̄. Thus
S ∩ Z = S ∩ P ∩ H̄ = Y ∩ Z,
and
X′ = K ∪ (Y ∩ Z),
showing (4.3). Next,
X ∩ X′ = (K ∩ X) ∪ (X ∩ Y ∩ Z).
Now X ∩ Y ∩ Z = X ∩ P ∩ S ∩ H̄ = ∅, and K ∩ X = K. Thus
X ∩ X′ = K,
as claimed. □

PROPOSITION 4.5 Let X be a thin maximal biprefix code of degree d ≥ 2, and let X′ be the derived code. Then
K(X′) ⊂ K(X) ⊊ X′.   (4.5)
Proof First, we show that H(X′) ⊂ H(X). Indeed, let w ∈ H(X′). Then we have (T_{X′}, w) ≥ 1, where T_{X′} is the tower over X′. By Proposition 3.5, (T_{X′}, w) = (T_X, w) − (H(X), w). Thus (T_X, w) ≥ 1. This in turn implies that w ∈ H(X) by Proposition 3.2.
By definition, K(X′) = X′ ∩ H(X′). Thus K(X′) ⊂ X′ ∩ H(X). By Proposition 4.4, X′ = K(X) ∪ T, where T = Y ∩ Z is a set disjoint from H(X). Thus X′ ∩ H(X) = K(X). This shows that K(X′) ⊂ K(X).
Next, formula (4.4) also shows that K(X) ⊂ X′. Finally, we cannot have the equality K(X) = X′, since by Theorem 4.3, the set K(X) is not a maximal biprefix code. □

The following theorem is a converse of Proposition 4.5.

THEOREM 4.6 Let X′ be a thin maximal biprefix code. For each set Y such that
K(X′) ⊂ Y ⊊ X′   (4.6)
there exists a unique thin maximal biprefix code X such that (i) K(X) = Y, (ii) d(X) = 1 + d(X′). Moreover, the code X′ is derived from X.

Proof We first show that Y is the kernel of some biprefix code. For this, we verify the conditions of Theorem 4.3. The strict inclusion Y ⊊ X′ shows that Y is not a maximal code. Next, by Proposition 4.1, (L_Y, y) = (L_{X′}, y) for y ∈ Y. Thus, setting d = d(X′) + 1, we have μ(Y) ≤ d(X′) = d − 1. According to Theorem 4.3, there is a thin maximal biprefix code X having degree d such that K(X) = Y. By Theorem 4.2, this code is unique. It remains to show that X′ is the derived code of X. Let Z be the derived code of X. By
Proposition 4.5, K(Z) ⊂ K(X) = Y ⊊ Z. Thus we may apply Proposition 4.1, showing that for all w ∈ A*,
(L_Z, w) = min{d − 1, (L_Y, w)}.
The inclusions of formula (4.6) give, by Proposition 4.1,
(L_{X′}, w) = min{d − 1, (L_Y, w)}
for all w ∈ A*. Thus L_{X′} = L_Z, whence Z = X′. □

Proposition 4.5 shows that the kernel of a code is located in some "interval" determined by the derived code. Theorem 4.6 shows that all of the "points" of this interval can be used effectively. More precisely, Proposition 4.5 and Theorem 4.6 show that there is a bijection between the set of thin maximal biprefix codes of degree d ≥ 2 and the set of pairs composed of a thin maximal biprefix code X′ of degree d − 1 and a set Y satisfying (4.6). The bijection associates to a code X the pair (X′, K(X)), where X′ is the derived code of X.

EXAMPLE 4.1 We have seen in Example 3.3 that any maximal biprefix code of degree 2 has the form
X = C ∪ BC*B,
where the alphabet A is the disjoint union of B and C, and B ≠ ∅. This observation can also be established by using Theorem 4.6. Indeed, the derived code of a maximal biprefix code of degree 2 has degree 1 and therefore is A. Then for each proper subset C of A there is a unique maximal biprefix code of degree 2 whose kernel is C. This code is clearly the code given by the above formula.
EXAMPLE 4.2 The number of maximal biprefix codes of degree 3 over a finite alphabet A having at least two letters is infinite. Indeed, consider an infinite thin maximal biprefix code X' of degree 2. Its kernel K(X') is a subset of A and consequently is finite. In view of Theorem 4.6, each set K containing K(X') and strictly contained in X' is the kernel of some maximal biprefix code of degree 3. Thus, there are infinitely many of them.

5. FINITE MAXIMAL BIPREFIX CODES
Finite maximal biprefix codes have quite remarkable properties which make them fascinating objects.
PROPOSITION 5.1 Let X ⊂ A⁺ be a finite maximal biprefix code of degree d. Then for each letter a ∈ A, a^d ∈ X.
With the terminology introduced in Chapter I, this is equivalent to saying that the order of each letter is the degree of the code.

Proof Let a ∈ A. According to Proposition 2.2, there is an integer n ≥ 1 such that a^n ∈ X. Since X is finite, there is an integer k such that a^k is not an internal factor of X. The number of parses of a^k is equal to d. It is also the number of right factors of a^k which are proper left factors of words in X, that is, n. Thus n = d. ∎

Note as a consequence of this result that it is, in general, impossible to complete a finite biprefix code into a maximal biprefix code which is finite. Consider, for example, A = {a, b} and X = {a², b³}. A finite maximal biprefix code containing X would have simultaneously degree 2 and degree 3. We now show the following result:

THEOREM 5.2 Let A be a finite set, and let d ≥ 1. There are only a finite number of finite maximal biprefix codes over A with degree d.
Proof The only maximal biprefix code over A having degree 1 is the alphabet A. Arguing by induction on d, assume that there are only finitely many finite maximal biprefix codes of degree d. Each finite maximal biprefix code of degree d + 1 is determined by its kernel, which is a subset of its derived code X'. Since X' is a finite maximal biprefix code of degree d, there are only a finite number of kernels and we are finished. ∎

Denote by P_k(d) the number of finite maximal biprefix codes of degree d over a k-letter alphabet A. Clearly P_k(1) = 1. Also P_k(2) = 1; indeed X = A² is, in view of Example 2.3, the only finite maximal biprefix code of degree 2. It is also clear that P₁(d) = 1 for all d ≥ 1.

EXAMPLE 5.1
Let us verify that
    P₂(3) = 3.    (5.1)

Let indeed A = {a, b}, and let X ⊂ A⁺ be a finite maximal biprefix code of degree 3. The derived code X' is necessarily X' = A², since it is the only finite maximal biprefix code of degree 2. Let K = X ∩ X' be the kernel of X. Thus K ⊂ A². According to Proposition 5.1, both a³, b³ ∈ X. Thus K cannot contain a² or b². Consequently, K ⊂ {ab, ba}. We next rule out the case K = {ab, ba}. Suppose indeed that this equality holds. For each k ≥ 1, the word (ab)^k has
exactly two X-parses. But X being finite, there is an integer k such that (ab)^k ∈ H̄(X), and then (ab)^k should have three X-parses. This is the contradiction. Thus there remain three candidates for K: K = ∅, which corresponds to X = A³; K = {ab}, which gives the code X of Example 2.2; and K = {ba}, which gives the reversal X̃ of the code X of Example 2.2. This shows (5.1). Note also that this explains why X̃ is obtained from X by exchanging the letters a and b: this property holds whenever it holds for the kernel.

We now show how to construct all finite maximal biprefix codes by a sequence of internal transformations, starting with a uniform code.
THEOREM 5.3 Let A be a finite alphabet and d ≥ 1. For each finite maximal biprefix code X ⊂ A⁺ of degree d, there is a finite sequence of internal transformations which, starting with the uniform code A^d, yields X.

Proof Let K be the kernel of X. If K = ∅, then X = A^d and there is nothing to prove. This holds also if Card(A) = 1. Thus we assume K ≠ ∅ and Card(A) ≥ 2. Let x ∈ K be a word which is not a factor of another word in K. We show that there exist a maximal suffix code G and a maximal prefix code D such that

    GxD ⊂ X.    (5.2)
Assume the contrary. Let P = XA⁻. Since x ∈ K, x is an internal factor. Thus the set Px⁻¹ is not empty. Then for all words g ∈ Px⁻¹, there exist two words d, d' such that

    gxd, gxd' ∈ X    and    X(xd)⁻¹ ≠ X(xd')⁻¹.

Suppose the contrary. Then for some g ∈ Px⁻¹, all the sets X(xd)⁻¹, with d running over the words such that gxd ∈ X, are equal. Let D = {d | gxd ∈ X} and let G = X(xd)⁻¹, where d is any element in D. Then GxD ⊂ X, contradicting our assumption. This shows the existence of d, d'.

Among all triples (g, d, d') such that gxd, gxd' ∈ X and X(xd)⁻¹ ≠ X(xd')⁻¹, let us choose one with |d| + |d'| minimal. For this fixed triple (g, d, d'), set

    G = X(xd)⁻¹    and    G' = X(xd')⁻¹.
Then G and G' are distinct maximal suffix codes. Take any word h ∈ G − G'. Then either h is a proper right factor of a word in G', or h has a word in G' as a proper right factor. Thus, interchanging if necessary G and G', there exist words u, g' ∈ A⁺ such that

    g' ∈ G,    ug' ∈ G'.
Note that this implies

    g'xd ∈ X,    ug'xd' ∈ X.
Now consider the word ug'xd. Of course, ug'xd ∉ X. Next, ug'xd ∉ P, since otherwise g'xd ∈ K, and x would be a factor of another word in K, contrary to the assumption. Since ug'xd ∉ P ∪ X, it has a proper left factor in X. This left factor cannot be a left factor of ug'x, since ug'xd' ∈ X. Thus it has ug'x as a proper left factor. Thus there is a factorization

    d = d″v
with d″, v ∈ A⁺, and ug'xd″ ∈ X. Now we observe that the triple (ug', d', d″) has the same properties as (g, d, d'). Indeed, both words ug'xd' and ug'xd″ are in X. Also X(xd')⁻¹ ≠ X(xd″)⁻¹, since gxd' ∈ X but gxd″ ∉ X: this results from the fact that gxd″ is a proper left factor of gxd ∈ X (Fig. 3.10). Thus, (ug', d', d″) satisfies the same constraints as (g, d, d'); however, |d'| + |d″| < |d'| + |d|. This yields the contradiction and proves (5.2). Let
    Y = (X ∪ Gx ∪ xD) − ({x} ∪ GxD).    (5.3)
In view of Proposition 2.4, the set Y is a finite maximal biprefix code, and moreover, the internal transformation with respect to x transforms Y into X. Finally, (5.3) shows that

    Card(Y) = Card(X) + Card(G) + Card(D) − 1 − Card(G)Card(D)
            = Card(X) − (Card(G) − 1)(Card(D) − 1).
The code G being maximal suffix and Card(A) ≥ 2, we have Card(G) ≥ 2. For the same reason, Card(D) ≥ 2. Thus

    Card(Y) ≤ Card(X) − 1.    (5.4)
Arguing by induction on the number of elements, we can assume that Y is obtained from A^d by a finite number of internal transformations. This completes the proof. ∎
Fig. 3.10 From the triple (g, d, d') to the triple (ug', d', d″).
Observe that by this theorem [and formula (5.4)] each finite maximal biprefix code X ⊂ A⁺ of degree d satisfies

    Card(X) ≥ Card(A^d),    (5.5)

with equality iff X = A^d. This result can be proved directly as follows (see also Exercise II.7.2). Let X be a finite maximal prefix code, and let

    λ = Σ_{x∈X} k^{−|x|} |x|,

with k = Card(A). The number λ is the average length of X with respect to the uniform Bernoulli distribution on A*. Let us show the inequality

    Card(X) ≥ k^λ.    (5.6)

For a maximal biprefix code X of degree d, we have λ = d (Corollary 3.9), and thus (5.5) is a consequence of (5.6). To show (5.6), let n = Card(X). Then

    n = Σ_{x∈X} k^{−|x|} k^{|x|}.

This rewriting uses 1 = Σ_{x∈X} k^{−|x|}, which holds by the fact that X is a finite maximal prefix code; thus the numbers k^{−|x|} form a probability distribution on X. The function log being concave, we have

    log_k n = log_k ( Σ_{x∈X} k^{−|x|} k^{|x|} ) ≥ Σ_{x∈X} k^{−|x|} log_k k^{|x|} = Σ_{x∈X} k^{−|x|} |x| = λ.

This shows (5.6).
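Both ingredients of this argument, the Kraft equality Σ_{x∈X} k^{−|x|} = 1 for a finite maximal prefix code and the inequality Card(X) ≥ k^λ, can be checked numerically. A sketch (the two sample codes over {a, b} are our choice; for the uniform code A² the bound is attained, illustrating the equality case of (5.5)):

```python
from fractions import Fraction

def kraft_sum(X, k):
    # sum over x in X of k^(-|x|)
    return sum(Fraction(1, k ** len(x)) for x in X)

def avg_length(X, k):
    # average length for the uniform Bernoulli distribution:
    # lambda = sum over x in X of k^(-|x|) * |x|
    return sum(Fraction(len(x), k ** len(x)) for x in X)

for X in [["a", "ba", "bb"],            # maximal prefix code, lambda = 3/2
          ["aa", "ab", "ba", "bb"]]:    # uniform code A^2, lambda = 2
    k = 2
    assert kraft_sum(X, k) == 1         # Kraft equality (maximality)
    lam = avg_length(X, k)
    assert len(X) >= k ** float(lam)    # Card(X) >= k^lambda
```

For {a, ba, bb} the inequality reads 3 ≥ 2^{3/2} ≈ 2.83; for A² it reads 4 ≥ 2² with equality.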
EXAMPLE 5.2 Let A = {a, b} and let X be the finite maximal biprefix code of degree 4 with literal representation given in Fig. 3.11. The kernel of X is K = {ab, a²b²}. There is no pair (G, D) composed of a maximal suffix code G and a maximal prefix code D such that GabD ⊂ X. On the other hand, Aa²b²A ⊂ X. The code X is obtained from the code Y given in Fig. 3.12 by the internal transformation with respect to a²b². The code Y is obtained from A⁴ by the sequence of internal transformations with respect to the words aba, ab², and ab.
Fig. 3.11 The code X.
Fig. 3.12 The code Y.
We now describe the construction of a finite maximal biprefix code from its derived code. Let Y ⊂ A⁺ be a biprefix code. A word w ∈ A* is called full (with respect to Y) if there is an interpretation passing through each point of w. It is equivalent to saying that w is full if any parse of w is an interpretation. The biprefix code Y is insufficient if the set of full words with respect to Y is finite.
PROPOSITION 5.4 A thin maximal biprefix code over a finite alphabet A is finite iff its kernel is insufficient.

Proof Suppose first that X is finite. Let d be its degree, and let K be its kernel. Consider a word w not in H(X). Then w has exactly d X-interpretations. These are not all K-interpretations, because K is a subset of the derived code of X, which has degree d − 1. Thus, there is a point of w through which no K-interpretation passes. Thus, w is not full (for K). This shows that the set of full words (with respect to K) is contained in H(X). Since H(X) is finite, the set K is insufficient.

Conversely, suppose that X is infinite. Since the alphabet A is finite, there is an infinite sequence (a_n)_{n≥0} of letters such that, setting P = XA⁻, we have for all n ≥ 0,

    p_n = a₀a₁⋯a_n ∈ P.
Note that there are at most d(X) integers n for which p_n is a proper right factor of a word in X. Similarly, there exist at most d(X) integers n such that for all m ≥ 1,

    a_{n+1}a_{n+2}⋯a_{n+m} ∉ X.

Indeed, each such integer n defines an interpretation of each word a₀a₁⋯a_r (r > n) which is distinct from the interpretations associated to the other integers. These observations show that there exists an integer k such that for all n ≥ k, the following hold: p_n ∈ A*X, and a_{n+1}⋯a_{n+m} ∈ X for some m ≥ 1. The first property implies by induction that for all n ≥ k, there is an integer i ≤ k such that a_i⋯a_n ∈ X*. Let w_l = a_k a_{k+1}⋯a_{k+l} for l ≥ 1. We show that through each point of w_l passes a K-interpretation. Indeed, let

    u = a_k a_{k+1}⋯a_n,    v = a_{n+1}a_{n+2}⋯a_{k+l}

for some k ≤ n ≤ k + l. There exists an integer i ≤ k such that a_i⋯a_{k−1}u ∈ X*, and there is an integer m ≥ k + l such that v a_{k+l+1}⋯a_m ∈ X*. In fact, these two words are in H(X) ∩ X* and consequently they are in K*. This shows that the words w_l are all full, hence that K is not insufficient, and completes the proof. ∎
The previous proposition yields the following result.

THEOREM 5.5 Let X' be a finite maximal biprefix code of degree d − 1 and with kernel K'. For each insufficient subset K of X' containing K', there exists a unique finite maximal biprefix code X of degree d, having kernel K. The derived code of X is X'.
Proof Since K is insufficient, K is not a maximal biprefix code. Thus K' ⊂ K ⊊ X'. In view of Theorem 4.6, there is a unique thin maximal biprefix code X of degree d and kernel K. The derived code of X is X'. By Proposition 5.4, the code X is finite. ∎
The following corollary gives a method for the construction of all finite maximal biprefix codes by increasing degrees.
COROLLARY 5.6 For any integer d ≥ 2, the function

    X ↦ K(X)

is a bijection of the set of finite maximal biprefix codes of degree d onto the set of all insufficient subsets K of finite maximal biprefix codes X' of degree d − 1 such that

    K(X') ⊂ K ⊊ X'. ∎
EXAMPLE 5.3 Let A = {a, b}. For each integer n ≥ 0, there exists a unique finite maximal biprefix code X_n ⊂ A⁺ of degree n + 2 with kernel

    K_n = {a^i b^i | 1 ≤ i ≤ n}.

For n = 0, we have K₀ = ∅ and X₀ = A². Arguing by induction, assume X_n constructed. Then K_n ⊂ X_n and also a^{n+2}, b^{n+2} ∈ X_n, since d(X_n) = n + 2. We show that a^{n+1}b^{n+1} ∈ X_n. Indeed, no proper left factor of a^{n+1}b^{n+1} is in X_n, since each has a right factor in X_n or is a proper right factor of a^{n+2}. Consider now a word a^{n+1}b^k for a large enough integer k. Since X_n is finite and maximal prefix, this word has a left factor in X_n, which by the preceding remark must be a^{n+1}b^{n+r} for some r ≥ 1. If r ≥ 2, then b^{n+2} is a proper right factor of this word, which is impossible since X_n is suffix. Thus r = 1, and a^{n+1}b^{n+1} ∈ X_n. Clearly K_n ⊂ K_{n+1}. The set K_{n+1} is insufficient: in fact, a has no K_{n+1}-interpretation passing through the point (a, 1), and b has no interpretation passing through the point (1, b). Therefore, the set of full words is {1}. Finally,

    K_n ⊂ K_{n+1} ⊊ X_n.

This proves the existence and unicity of X_{n+1}, by using Theorem 5.5. The code X₁ is the code of degree 3 given in Example 2.2. The code X₂ is the code of degree 4 of Example 5.2.
EXERCISES

SECTION 1

1.1. Let X ⊂ A⁺ be a biprefix code and L = L_X its indicator. Show that if for u, v ∈ A* we have (L, uvu) = (L, u), then for all m ≥ 0, (L, (uv)^m u) = (L, u).
SECTION 2

2.1. Let X ⊂ A⁺ be a thin maximal biprefix code. To each word w ∈ F̄(X), we will associate a function p_w from {1, 2, ..., |w|} into itself. Let w = a₁a₂⋯a_n, with a_i ∈ A.
(a) Show that for each integer i in {1, 2, ..., n}, there exists a unique integer k ∈ {1, 2, ..., n} such that either a_i a_{i+1}⋯a_k or a_i a_{i+1}⋯a_n a₁⋯a_k is in X. Set

    p_w(i) = k.

This defines, for each w ∈ F̄(X), a mapping p_w from {1, 2, ..., |w|} into itself.
(b) Show that X is suffix iff the function p_w is injective for all w ∈ F̄(X).
(c) Show that X is left complete iff the function p_w is a surjection for all w ∈ F̄(X).
(d) Derive from this that a thin maximal prefix code is suffix iff it is left complete (cf. the proof of Proposition 2.1).

2.2. Let P = {w ∈ A* | w = w̃} be the set of palindrome words.
(a) Show that P* is biunitary. Let X be the biprefix code for which X* = P*. Then X is called the set of palindrome primes.
(b) Show that X is left complete and right complete.
2.3. Show that two maximal biprefix codes which are obtained one from the other by internal transformation are either both recognizable or both not recognizable.

SECTION 3
3.1. Let P, S ∈ ℤ⟨A⟩ be polynomials such that P(1 − A) = (1 − A)S. Show that there exist an integer d ∈ ℤ and a polynomial Q ∈ ℤ⟨A⟩ such that

    P = (1 − A)Q + d,    S = Q(1 − A) + d.

(Hint: Show for this that if two homogeneous polynomials P, Q ∈ ℤ⟨A⟩ satisfy PA = AQ, then P = AR, Q = RA for some polynomial R.)
3.2. Let X be a thin maximal biprefix code of degree d. Let w ∈ F̄(X) and let

    1 = p₁, p₂, ..., p_d

be those of the right factors of w which are proper left factors of X. Set

    Y_i = p_i⁻¹X    for 2 ≤ i ≤ d,    Y₁ = 1.

Show that each Y_i is a maximal suffix set, and that the set S of proper right factors of X is the disjoint union of the Y_i (cf. Theorem 3.8).

3.3. Let X be a thin maximal biprefix code of degree d and let S be the set of its proper right factors. Show that there exists a unique partition of S into a disjoint union of d prefix sets Y_i satisfying
    Y_i ⊂ Y_{i−1}A    for 2 ≤ i ≤ d.

(Hint: Set Y₁ = S ∩ F̄(X).)
SECTION 4

4.1. Let X be a finite biprefix code. Show, using Theorem 4.3, that there exists a recognizable maximal biprefix code containing X.

4.2. Show that if X is a recognizable maximal biprefix code of degree d ≥ 2, then the derived code is recognizable.

4.3. Let X be a thin maximal biprefix code of degree d ≥ 2. Let w ∈ F̄(X), and let s be the longest left factor of w which is a proper right factor of X. Further, let x be the left factor of w which is in X. Show that the shorter one of s and x is in the derived code X'.

4.4. Let X₁ and X₂ be two thin maximal biprefix codes having the same kernel: K(X₁) = K(X₂). Set
    X = X₁ ∧ X₂

(cf. Exercise II.4.2). Show that X is thin, maximal, and biprefix. Use this to give a direct proof of the fact that two thin maximal biprefix codes with the same kernel and the same degree are equal.

SECTION 5

5.1. Let X be a finite maximal biprefix code. Show that if a word w ∈ A⁺ satisfies

    pwq = rws ∈ X    (6.1)

for some p, q, r, s ∈ A⁺ with p ≠ r, then w ∈ H(X'), where X' is the derived code of X. (Hint: Start with a longest word satisfying (6.1) and use Proposition 3.7.)

5.2. For a finite code X, let l(X) = max{|x| | x ∈ X}. Show, using Exercise 5.1, that if X is a finite maximal biprefix code over a k-letter alphabet, then

    l(X) ≤ l(X') + k^{l(X')−1},

with X' denoting the derived code of X.
5.3. Denote by Λ(k, d) the maximum of the lengths of the words of all finite maximal biprefix codes of degree d over a k-letter alphabet. Show that for d ≥ 2,

    Λ(k, d) ≤ Λ(k, d − 1) + k^{Λ(k,d−1)−1}.

Compare with the bound given by Theorem 5.2.

5.4. Let X ⊂ A⁺ be a finite maximal biprefix code of degree d. Let a, b ∈ A, and define a function φ from {0, 1, ..., d − 1} into itself by

    a^i b^{d−φ(i)} ∈ X.

Show that φ is a bijection. Show that for each k ≥ 2, the number P_k(d) of finite maximal biprefix codes of degree d over a k-letter alphabet is unbounded as a function of d.

5.5. A quasipower of order n is defined by induction as follows: a quasipower of order 0 is an unbordered word. A quasipower of order n + 1 is a word of the form uvu, where u is a quasipower of order n. Let k be an integer and let (a_n) be the sequence inductively defined by

    a₁ = k + 1,    a_{n+1} = a_n(k^{a_n} + 1)    (n ≥ 1).

Show that any word over a k-letter alphabet with length at least equal to a_n has a factor which is a quasipower of order n.

5.6. Let X be a finite maximal biprefix code of degree d ≥ 2 over a k-letter alphabet. Show that

    max_{x∈X} |x| ≤ a_{d−1} + 2,

where (a_n) is the sequence defined in Exercise 5.5. (Hint: Use Exercise 1.1.) Compare with the bound given by Exercise 5.2.

5.7. Show that the number of finite maximal biprefix codes of degree 4 over a two-letter alphabet is

    P₂(4) = 73.

NOTES
The idea to study biprefix codes goes back to Schützenberger (1956) and Gilbert and Moore (1959). These papers already contain significant results. The first systematic study is in Schützenberger (1961b,c). Propositions 2.1 and 2.2 are from Schützenberger (1961c). The internal transformation appears in Schützenberger (1961b). The fact that all finite
maximal biprefix codes can be obtained from the uniform codes by internal transformation (Theorem 5.3) is from Césari (1972). The fact that the average length of a thin maximal biprefix code is an integer (Corollary 3.9) is already in Gilbert and Moore (1959). It is proved in Schützenberger (1961b) with the methods developed in Chapter VI. Theorem 3.8 and its converse (Proposition 3.1) appear in Perrin (1977a). The notion of derived code is due to Césari (1979). The presentation of the results given here, using the indicator and the tower, appears to be new. The results of Section 4 are a generalization to thin codes of results in Césari (1979). The methods of this section are used in Perrin (1982) to prove that any finite biprefix code is contained in a recognizable maximal biprefix code (Exercise 4.1). It is unknown whether any recognizable biprefix code is contained in a recognizable maximal biprefix code. Theorem 5.2 appears already in Schützenberger (1961b) with a different proof (see Exercise 5.6). The rest of this section is due to Césari (1979). The enumeration of finite maximal biprefix codes over a two-letter alphabet has been pursued by computer. A first program was written in 1975 by C. Precetti using internal transformations. It produced several thousands of them for d = 5. In 1984, a program written by M. Léonard using the method of Corollary 5.6 gave the exact number of finite maximal biprefix codes of degree 5 over a two-letter alphabet. This number is 5,056,783. Exercise 3.1 is a very special case of a result of Cohn called the "weak Euclidean algorithm" in Cohn (1971). Exercises 3.3, 4.4, 5.1, and 5.2 are from Césari (1979).
CHAPTER
IV
Automata
0. INTRODUCTION
In Chapters II and III, we studied special families of codes. We now return to the study of general codes. In the present chapter, we introduce a new technique: unambiguous automata. The main idea is to replace computations on words by computations on paths labeled by words. This is a technique which is well known in formal language theory. It will be used here in a special form related to the characteristic property of codes. Within this frame, the main fact is the equivalence between codes and unambiguous automata. The uniqueness of paths in unambiguous automata corresponds to the uniqueness of factorizations for a code. Unambiguous automata appear to be a generalization of deterministic automata in the same manner as the notion of a code extends the notion of a prefix code. To each unambiguous automaton corresponds a monoid of relations which is also called unambiguous. A relation in this monoid corresponds to each word, and the computations on words are replaced by computations on relations. The principal result of this chapter (Theorem 5.1) shows that very thin codes are exactly the codes for which the associated monoid satisfies a finiteness condition: it contains relations of finite rank. This result explains why thin codes constitute a natural family containing the recognizable codes. It makes it
possible to prove properties of thin codes by reasoning in finite structures. As a consequence, we shall give, for example, an alternative proof of the maximality of thin complete codes which does not use measures. The main result also allows us to define, for each thin code, some important parameters: the degree and the group of the code. The group of a thin code is a finite permutation group. The degree of the code is the number of elements on which this group acts. These parameters reflect properties of words by means of "interpretations." For example, the synchronous codes in the sense of Chapter II are those having degree 1. One of the applications of the notions introduced in this chapter is the proof of the theorem on the synchronization of semaphore codes announced in Chapter II. A direct combinatorial proof of this result would certainly be extremely difficult.

This chapter is organized in the following manner. In the first section, the notion of unambiguous automaton is introduced. The basic result on the correspondence between codes and unambiguous automata is Proposition 1.5. In the second section, we study various types of unambiguous automata which can be associated with a code. In particular, the flower automaton is defined, and it is shown that this is some kind of universal automaton, in the sense that any unambiguous automaton associated with a code can be obtained by a reduction of the flower automaton of this code. We also show how to decompose the flower automaton of the composition of two codes. In Section 3, basic properties of unambiguous monoids of relations are proved. These monoids constantly appear in the sequel, since each unambiguous automaton gives rise to an unambiguous monoid of relations. In particular, we define two representations of unambiguous monoids of relations, called the ℛ- and ℒ-representations.
These representations are relative to a fixed idempotent chosen in the monoid, and they describe the way the elements of the monoid act by right or left multiplication on the ℛ-class and the ℒ-class of the idempotent. The notion of rank of a relation is defined in Section 4. The most important result in this section states that the minimal ideal of an unambiguous monoid of relations is formed of the relations having minimal rank, provided that this rank is finite (Theorem 4.5). Moreover, in this case the minimal ideal has a well-organized structure. In Section 5 we return to codes. We define the notion of a very thin code, which is a refinement of the notion of thin code. The two notions coincide for a complete code. Then we prove the fundamental theorem: A code X is very thin iff the associated unambiguous monoid of relations contains elements of finite positive rank (Theorem 5.1). Several consequences of this result on the structure of codes are given.
Section 6 contains the definition of the group and the degree of a code. The definition is given through the flower automaton, and then it is shown that it is independent of the automaton considered. We also show how the degree may be expressed in terms of interpretations of words. Then synchronous codes are defined, and it is shown that these are the codes with degree 1. We prove then that the composition of codes corresponds to a classical operation on their groups. The case of prefix codes is studied in some more detail. In Section 7, we give the proof of the theorem on the synchronization of semaphore codes (Theorem 7.1) which has been mentioned in Chapter II (Theorem II.6.5).
1. AUTOMATA
Let A be an alphabet. Recall from Section 0.4 that an automaton over A,

    𝒜 = (Q, I, T),

consists of a set Q of states, of two subsets I and T of Q, called the sets of initial and terminal states, respectively, and of a set

    S ⊂ Q × A × Q

whose elements are called edges. An edge

    f = (p, a, q)

is also denoted by f: p →^a q. The letter a ∈ A is called the label of the edge. A path in 𝒜 is a finite sequence

    c = c₁c₂⋯c_n

of consecutive edges c_i: p_i →^{a_i} q_i (i.e., such that q_i = p_{i+1} for 1 ≤ i ≤ n − 1). We shall also write

    c: p₁ →^w q_n,

where

    w = a₁a₂⋯a_n

is the label of the path c. The path is said to start at p₁ and end at q_n. The length of the path is the number n of edges which compose it. For each state q ∈ Q, we introduce also the null path beginning and ending at q, denoted by 1_q: q → q. Its length is 0, and its label is 1 ∈ A*.
With each automaton 𝒜 = (Q, I, T) over A is associated a function, denoted by φ_𝒜 (or φ for short, if the context allows it),

    φ: A → ℕ^{Q×Q},

defined by

    (p, φ(a), q) = 1 if (p, a, q) ∈ S, and (p, φ(a), q) = 0 otherwise.

This function extends into a morphism, still denoted φ, from A* into the monoid of ℕ-relations over Q (see Section 0.6). In particular, we have φ(1) = I_Q, and for u, v ∈ A*,

    φ(uv) = φ(u)φ(v).

The morphism φ = φ_𝒜 is called the representation associated with 𝒜.

PROPOSITION 1.1 Let 𝒜 = (Q, I, T) be an automaton over A. For all p, q ∈ Q and w ∈ A*, (p, φ_𝒜(w), q) is the (possibly infinite) number of paths from p to q with label w. ∎

An automaton 𝒜 = (Q, I, T) over A is unambiguous if for all p, q ∈ Q and w ∈ A*,

    (p, φ_𝒜(w), q) ∈ {0, 1}.
Thus an automaton is unambiguous iff for all p, q ∈ Q and w ∈ A*, there is at most one path from p to q with label w. A path c: i → t is called successful if i ∈ I and t ∈ T. The behavior of the automaton 𝒜 = (Q, I, T) is the formal power series |𝒜| defined by

    (|𝒜|, w) = Σ_{i∈I, t∈T} (i, φ_𝒜(w), t).    (1.1)

The set recognized by 𝒜 is the support of |𝒜|. It is just the set of all labels of successful paths. It is denoted by L(𝒜), as in Section 0.4.
PROPOSITION 1.2 Let 𝒜 = (Q, I, T) be an automaton over A. For all w ∈ A*, (|𝒜|, w) is the (possibly infinite) number of successful paths labeled by w. ∎

A more compact writing of formula (1.1) is

    (|𝒜|, w) = I φ_𝒜(w) T.    (1.2)

Here, the element I ∈ ℕ^Q is considered as a row vector and T ∈ ℕ^Q as a column vector.
EXAMPLE 1.1 Let 𝒜 be the automaton given by Fig. 4.1, with I = T = {1}. Its behavior is the series

    |𝒜| = Σ_{n≥0} f_n a^n,

Fig. 4.1 The Fibonacci automaton.

where f_n is the nth Fibonacci number. These numbers are defined by

    f₀ = f₁ = 1,    f_{n+1} = f_n + f_{n−1},    n ≥ 1.

For n ≥ 1, we have

    φ_𝒜(a^n) = ( f_n      f_{n−1} )
               ( f_{n−1}  f_{n−2} ),

with f_{−1} = 0.

Let 𝒜 = (Q, I, T) be an automaton. A state q ∈ Q is accessible (resp. coaccessible) if there exists a path c: i → q with i ∈ I (resp. a path c: q → t with t ∈ T). An automaton is trim if all states are both accessible and coaccessible. The trim part of 𝒜 = (Q, I, T) is the automaton

    𝒜⁰ = (P, I ∩ P, T ∩ P),
which is the restriction of 𝒜 to the set P of states of 𝒜 which are both accessible and coaccessible. It is easily seen that |𝒜⁰| = |𝒜|.
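The matrix of Example 1.1 can be checked by direct computation. We cannot reproduce Fig. 4.1 here, so the sketch below assumes one automaton realizing it: states {1, 2} with edges 1 →a 1, 1 →a 2, and 2 →a 1 (this reading of the figure is our assumption; it is the one whose letter matrix has the entries f₁, f₀, f₀, f₋₁). The powers of φ(a) then reproduce the Fibonacci entries, and the (1,1) entry is the coefficient (|𝒜|, aⁿ):

```python
def mat_mul(X, Y):
    # product of N-relations, i.e. matrices over the nonnegative integers
    n = len(X)
    return [[sum(X[p][r] * Y[r][q] for r in range(n)) for q in range(n)]
            for p in range(n)]

# phi(a) for our assumed realization of Fig. 4.1 (edges 1->1, 1->2, 2->1).
phi_a = [[1, 1],
         [1, 0]]

fib = [1, 1]                          # f_0 = f_1 = 1
for i in range(2, 10):
    fib.append(fib[i - 1] + fib[i - 2])

m = phi_a                             # m = phi(a^n), starting with n = 1
for n in range(1, 10):
    # (1, phi(a^n), 1) = number of paths 1 -> 1 labeled a^n = (|A|, a^n)
    assert m == [[fib[n], fib[n - 1]],
                 [fib[n - 1], fib[n] - fib[n - 1]]]   # last entry = f_{n-2}
    m = mat_mul(m, phi_a)
```

The assertion checks all four entries of φ(aⁿ) against Proposition 1.1's path-counting interpretation.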
PROPOSITION 1.3 Let 𝒜 = (Q, i, t) be a trim automaton with a unique initial and a unique final state. Then 𝒜 is unambiguous iff |𝒜| is a characteristic series.

Proof If 𝒜 is unambiguous, then clearly |𝒜| is a characteristic series. Conversely, if

    (p, φ_𝒜(w), q) ≥ 2

for some p, q ∈ Q and w ∈ A*, then choosing paths i →^u p and q →^v t, we have

    (|𝒜|, uwv) ≥ 2. ∎
If 𝒜 is an unambiguous automaton, we usually identify the series |𝒜| with the set L(𝒜) recognized by 𝒜.

To each automaton 𝒜 = (Q, I, T), we associate an automaton 𝒜* by a canonical construction consisting of the two following steps. Let ω ∉ Q be a new state, and let

    ℬ = (Q ∪ {ω}, ω, ω)    (1.3)

be the automaton with edges

    U = S ∪ S₁ ∪ S₂ ∪ S₃,

where S is the set of edges of 𝒜, and

    S₁ = {(ω, a, q) | ∃i ∈ I : (i, a, q) ∈ S},    (1.4)
    S₂ = {(q, a, ω) | ∃t ∈ T : (q, a, t) ∈ S},    (1.5)
    S₃ = {(ω, a, ω) | ∃i ∈ I, t ∈ T : (i, a, t) ∈ S}.    (1.6)

By definition, the automaton 𝒜* is the trim part of ℬ.

PROPOSITION 1.4 Let X ⊂ A⁺, and let 𝒜 be an automaton such that |𝒜| = X̲. Then

    |𝒜*| = (X̲)*,    (1.7)

where X̲ denotes the characteristic series of X.
Proof Since 𝒜* is the trim part of the automaton ℬ defined by formula (1.3), it suffices to show that |ℬ| = |𝒜|*. A path c: ω → ω in ℬ is called simple if it is not the null path, and if no interior node is equal to ω. Each path c: ω → ω is a product, in a unique manner, of simple paths from ω to ω. Let s be the power series defined as follows: for all w ∈ A*, (s, w) is the number of simple paths from ω to ω labeled with w. By the preceding remarks, we have

    |ℬ| = s*.

Thus it remains to prove that

    s = X̲.

Let w ∈ A*. If w = 1, then (s, 1) = (X̲, 1) = 0, since a simple path is not null. If w = a ∈ A, then (s, a) = 1 iff a ∈ X, according to formula (1.6). Assume now |w| ≥ 2. Set w = aub with a, b ∈ A and u ∈ A*. Each simple path c: ω →^w ω factorizes uniquely into

    c: ω →^a p →^u q →^b ω
for some p, q ∈ Q. There exists at least one successful path

    i →^a p →^u q →^b t

in 𝒜. This path is unique because the behavior of 𝒜 is a characteristic series. If there were another simple path c': ω →^w ω in ℬ, then there would also be another successful path labeled w in 𝒜; this is impossible. Thus there is at most one simple path c: ω →^w ω in ℬ, and such a path exists iff w ∈ X. Consequently, s = X̲, which was to be proved. ∎
EXAMPLE 1.2 Let X = {a, a²}. Then X̲ = |𝒜| for the automaton given in Fig. 4.2, with I = {1}, T = {3}. The automaton 𝒜* is the automaton of Fig. 4.1, up to a renaming of ω. Consequently, for n ≥ 0,

    ((X̲)*, a^n) = f_n.
Fig. 4.2
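The identity of Example 1.2 states that aⁿ has f_n factorizations into words of X = {a, a²}. This can be confirmed by a direct dynamic-programming count of factorizations over an arbitrary finite X (a sketch; the function name is ours):

```python
def count_factorizations(w, X):
    # c[i] = number of ways to factorize the prefix w[:i] into words of X
    c = [0] * (len(w) + 1)
    c[0] = 1                      # the empty word has the empty factorization
    for i in range(1, len(w) + 1):
        for x in X:
            if i >= len(x) and w[i - len(x):i] == x:
                c[i] += c[i - len(x)]
    return c[len(w)]

fib = [1, 1]
for n in range(2, 12):
    fib.append(fib[n - 1] + fib[n - 2])

for n in range(12):
    assert count_factorizations("a" * n, ["a", "aa"]) == fib[n]
```

For a code, `count_factorizations` never exceeds 1 on any word of X*; a value ≥ 2 on some word is exactly a violation of unique decipherability.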
EXAMPLE 1.3 Let X = {aa, ba, baa, bb, bba}. We have X̲ = |𝒜| for the automaton 𝒜 of Fig. 4.3, with I = {1}, T = {4}. The corresponding automaton 𝒜* is given in Fig. 4.4.
Fig. 4.3
Fig. 4.4
PROPOSITION 1.5 Let X ⊂ A⁺, and let 𝒜 be an automaton such that |𝒜| = X̲. Then X is a code iff 𝒜* is unambiguous.

Proof According to Proposition 1.4, we have |𝒜*| = (X̲)*. Since 𝒜* is trim, Proposition 1.3 shows that 𝒜* is unambiguous iff |𝒜*| is a characteristic series. Since L(𝒜*) = X*, this means that 𝒜* is unambiguous iff X̲* = (X̲)*. Thus we get the proposition from Proposition 1.6.1. ∎
In view of Proposition 1.5, we can determine whether a set X given by an unambiguous automaton 𝒜 is a code, by computing 𝒜* and testing whether 𝒜* is unambiguous. For doing this, we may use the following method. Let 𝒜 = (Q, I, T) be an automaton over A. Define the square of 𝒜,

    𝒮(𝒜) = (Q × Q, I × I, T × T),

by defining

    (p₁, p₂) →^a (q₁, q₂)

to be an edge of 𝒮(𝒜) iff both

    p₁ →^a q₁    and    p₂ →^a q₂

are edges of 𝒜.

PROPOSITION 1.6 An automaton 𝒜 = (Q, I, T) is unambiguous iff there is no path in 𝒮(𝒜) of the form

    (p, p) →^u (r, s) →^v (q, q)    (1.8)

with r ≠ s.

Proof The existence of a path of the form (1.8) in 𝒮(𝒜) is equivalent to the existence of the pair of paths

    p →^u r →^v q    and    p →^u s →^v q

with the same label uv in 𝒜. ∎

To decide whether a recognizable set X (given by an unambiguous finite automaton) is a code, it suffices to compute 𝒜* and to test whether 𝒜* is unambiguous by inspecting the finite automaton 𝒮(𝒜*), looking for paths of the form (1.8).

EXAMPLE 1.4 Consider again the automaton 𝒜* of Example 1.3 (Fig. 4.4). The automaton 𝒮(𝒜*) is given in Fig. 4.5, where only the part accessible from the states (q, q) is drawn. It shows that 𝒜* is unambiguous.
Fig. 4.5 Part of the square of the automaton of Fig. 4.4.
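Propositions 1.4–1.6 combine into an effective code test: build a trim automaton with behavior X (a prefix-tree acceptor suffices), form the star automaton by the construction (1.3)–(1.6), and search its square for a path (p, p) →ᵘ (r, s) →ᵛ (q, q) with r ≠ s. The sketch below (plain Python; all names are ours) implements exactly these steps:

```python
def is_code(X):
    # Prefix-tree acceptor of X: states are prefixes, I = {""}, T = X.
    omega = ("w",)                 # the new state of the construction (1.3)
    prefixes = {x[:i] for x in X for i in range(len(x))} | set(X)
    S = {(q[:-1], q[-1], q) for q in prefixes if q}
    T = set(X)
    # Edges of the star automaton, per (1.4)-(1.6).
    E = set(S)
    E |= {(omega, a, q) for (p, a, q) in S if p == ""}
    E |= {(p, a, omega) for (p, a, t) in S if t in T}
    E |= {(omega, a, omega) for (p, a, t) in S if p == "" and t in T}
    # Trim with respect to the unique initial/terminal state omega.
    succ, pred = {}, {}
    for (p, a, q) in E:
        succ.setdefault(p, []).append(q)
        pred.setdefault(q, []).append(p)
    def closure(step, starts):
        seen, todo = set(starts), list(starts)
        while todo:
            for v in step.get(todo.pop(), []):
                if v not in seen:
                    seen.add(v); todo.append(v)
        return seen
    states = closure(succ, [omega]) & closure(pred, [omega])
    E = {(p, a, q) for (p, a, q) in E if p in states and q in states}
    # Square automaton: synchronized pairs of equally labeled edges.
    sq, rsq = {}, {}
    for (p1, a, q1) in E:
        for (p2, b, q2) in E:
            if a == b:
                sq.setdefault((p1, p2), []).append((q1, q2))
                rsq.setdefault((q1, q2), []).append((p1, p2))
    diag = [(s, s) for s in states]
    fwd = closure(sq, diag) - set(diag)    # reached from a diagonal pair
    bwd = closure(rsq, diag) - set(diag)   # reaches a diagonal pair
    # Ambiguous iff some off-diagonal (r, s) lies on a path (1.8).
    return not any(r != s for (r, s) in fwd & bwd)

assert is_code(["aa", "ba", "baa", "bb", "bba"])   # the code of Example 1.3
assert not is_code(["a", "ab", "ba"])              # aba = a.ba = ab.a
```

By Proposition 1.5 the answer is exact: the prefix-tree acceptor is trim with behavior X, so X is a code iff its star automaton passes the square test. (The closures computed start from the diagonal pairs themselves, so any off-diagonal pair in both closures witnesses a path (1.8).)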
The following proposition is the converse of Proposition 1.5.
PROPOSITION 1.7 Let 𝒜 = (Q, 1, 1) be an unambiguous automaton over A with a single initial and final state. Then its behavior |𝒜| is the characteristic series of some free submonoid of A*.

Proof Let M ⊂ A* be such that |𝒜| = M̲. Clearly the set M is a submonoid of A*. We shall prove that M is a stable submonoid. For this, suppose that

    u, wv, uw, v ∈ M.

Then there exist in 𝒜 paths

    1 →^u 1,    1 →^{wv} 1,    1 →^{uw} 1,    1 →^v 1.

The two middle paths factorize as

    1 →^w p →^v 1,    1 →^u q →^w 1

for some p, q ∈ Q. Thus there exist two paths

    1 →^u 1 →^w p →^v 1,
    1 →^u q →^w 1 →^v 1.

Since 𝒜 is unambiguous, these paths coincide, whence 1 = p = q. Consequently w ∈ M. Thus M is stable, and by Proposition 1.2.4, M is free. ∎

The following terminology is convenient for automata of the form 𝒜 = (Q, 1, 1) having just one initial state which is also the unique final state. A path

    c: p →^w q
is called simple if it is not the null path (i.e., w ∈ A⁺) and if, for any factorization

c: p --u--> r --v--> q

of the path c into two nonnull paths, we have r ≠ 1. We have already used, in the proof of Proposition 1.4, the fact that each path from 1 to 1 decomposes in a unique manner into a product of simple paths from 1 to 1. More generally, any path c from p to q either is the null path, is simple, or decomposes in a unique manner as

c: p --u₀--> 1 --u₁--> 1 --⋯--> 1 --uₙ₊₁--> q,

where each of these n + 2 paths is simple.

We end this section by showing how to test (by means of an automaton) whether a code is complete.

PROPOSITION 1.8 Let X ⊂ A⁺ be a code, and let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*. Then X is complete iff the monoid φ_𝒜(A*) does not contain the null relation.

Proof If X is complete, then there exist, for each w ∈ A*, two words u, v ∈ A* such that uwv ∈ X*. Then there exists a path

1 --u--> p --w--> q --v--> 1,

and consequently φ_𝒜(w) is not null. Conversely, if φ_𝒜(A*) does not contain the null relation, then for each w ∈ A*, there exists at least one path p --w--> q. Since 𝒜 is trim, there exist two paths 1 --u--> p and q --v--> 1. Then uwv ∈ X*. Thus X is complete. □
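Proposition 1.8 likewise yields an effective test when 𝒜 is finite: generate the finite monoid φ_𝒜(A*) from the letter relations and look for the null relation. A sketch (our own code and names, not the book's):

```python
# Sketch of the completeness test of Proposition 1.8 (our own code):
# X is complete iff the transition monoid of a trim unambiguous automaton
# recognizing X* does not contain the null relation.

def letter_relation(edges, a):
    """The relation phi(a), represented as a frozenset of pairs (p, q)."""
    return frozenset((p, q) for (p, b, q) in edges if b == a)

def relprod(m, n):
    """Composition of relations (Boolean matrix product)."""
    return frozenset((p, q) for (p, r) in m for (r2, q) in n if r == r2)

def transition_monoid(states, edges, alphabet):
    """The monoid phi(A*), obtained by closing the letter relations under product."""
    identity = frozenset((q, q) for q in states)
    gens = [letter_relation(edges, a) for a in alphabet]
    monoid, todo = {identity}, [identity]
    while todo:
        m = todo.pop()
        for g in gens:
            mg = relprod(m, g)
            if mg not in monoid:
                monoid.add(mg)
                todo.append(mg)
    return monoid

def is_complete(states, edges, alphabet):
    """True iff the null relation (the empty set of pairs) never appears."""
    return frozenset() not in transition_monoid(states, edges, alphabet)
```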
2. FLOWER AUTOMATON
We describe in this section the construction of a "universal" automaton recognizing a submonoid of A*. Let X be an arbitrary subset of A⁺. We define an automaton 𝒜_D(X) = (Q, I, T) by

Q = {(u, v) ∈ A* × A* | uv ∈ X},  I = 1 × X,  T = X × 1,

with edges

(u, v) --a--> (u', v')
Fig. 4.6 The edges of 𝒜_D(X) for x = a₁a₂⋯aₙ.
iff ua = u' and v = av'. In other words, the edges of 𝒜_D(X) are

(u, av) --a--> (ua, v),  uav ∈ X.

It is equivalent to say that the set of edges of the automaton 𝒜_D(X) is the disjoint union of the sets of edges given by Fig. 4.6 for each x = a₁a₂⋯aₙ in X. The automaton 𝒜_D(X) is unambiguous and recognizes X, i.e., |𝒜_D(X)| = X̲.
The flower automaton of X is by definition the automaton (𝒜_D(X))* obtained from 𝒜_D(X) by applying the construction described in Section 1 [formulas (1.4)–(1.6)]. It is denoted by 𝒜*_D(X) rather than (𝒜_D(X))*. We denote by φ_D the associated representation. Thus, following the construction of Section 1, the automaton 𝒜*_D(X) is obtained in two steps as follows. Starting with 𝒜_D(X), we add a new state ω and the edges

ω --a--> (a, v)   for av ∈ X,
(u, a) --a--> ω   for ua ∈ X,
ω --a--> ω        for a ∈ X.

This automaton is now trimmed. The states in 1 × X and X × 1 are no longer accessible or coaccessible and consequently disappear. Usually, the state ω is denoted by (1, 1). Then 𝒜*_D(X) takes the form

𝒜*_D(X) = (P, (1, 1), (1, 1))

with

P = {(u, v) ∈ A⁺ × A⁺ | uv ∈ X} ∪ {(1, 1)},

and there are four types of edges:

(u, av) --a--> (ua, v)   for uav ∈ X, u ≠ 1, v ≠ 1,
(1, 1) --a--> (a, v)     for av ∈ X, v ≠ 1,
(u, a) --a--> (1, 1)     for ua ∈ X, u ≠ 1,
(1, 1) --a--> (1, 1)     for a ∈ X.
The terminology is inspired by the graphical representation of this automaton. Indeed, each word x ∈ X defines a simple path

(1, 1) --x--> (1, 1)

in 𝒜*_D(X). If x = a ∈ A, it is the edge

(1, 1) --a--> (1, 1).

If x = a₁a₂⋯aₙ with n ≥ 2, it is the path

(1, 1) --a₁--> (a₁, a₂⋯aₙ) --a₂--> (a₁a₂, a₃⋯aₙ) --a₃--> ⋯ --aₙ--> (1, 1).

EXAMPLE 2.1 Let X = {aa, ba, bb, baa, bba}. The flower automaton is given in Fig. 4.7.
Fig. 4.7 The flower automaton of X.
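For a finite set X the flower automaton can be built directly from the definition. Here is a sketch (our own Python, with the empty string standing for the empty word 1; the function name is ours):

```python
# Sketch: build the flower automaton A*_D(X) of a finite set X (our code).
# States are the proper factorizations (u, v) of the words of X, plus (1, 1);
# edges are the four types listed in the text.

def flower_automaton(X):
    one = ""                                   # empty word stands for 1
    states = {(one, one)}
    states |= {(x[:i], x[i:]) for x in X for i in range(1, len(x))}
    edges = set()
    for x in X:
        n = len(x)
        if n == 1:
            edges.add(((one, one), x, (one, one)))             # (1,1) -a-> (1,1)
            continue
        edges.add(((one, one), x[0], (x[:1], x[1:])))          # enter the petal
        for i in range(1, n - 1):                              # internal edges
            edges.add(((x[:i], x[i:]), x[i], (x[:i + 1], x[i + 1:])))
        edges.add(((x[:n - 1], x[n - 1:]), x[-1], (one, one))) # leave the petal
    return states, edges
```

For X = {aa, ba, bb, baa, bba} this yields the 8 states of Fig. 4.7.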
THEOREM 2.1 Let X be a subset of A⁺. The following conditions are equivalent:

(i) X is a code.
(ii) For any unambiguous automaton 𝒜 recognizing X, the automaton 𝒜* is unambiguous.
(iii) The flower automaton 𝒜*_D(X) is unambiguous.
(iv) There exists an unambiguous automaton 𝒜 = (Q, 1, 1) recognizing X*, and X is the minimal set of generators of X*.

Proof (i) ⇒ (ii) is Proposition 1.5. The implication (ii) ⇒ (iii) is clear. To prove (iii) ⇒ (iv) it suffices to show that X is the minimal generating set of X*. Assume the contrary, and let x ∈ X, y, z ∈ X⁺ be words such that x = yz. Then
there exist in 𝒜*_D(X) a simple path (1, 1) --x--> (1, 1) and a path (1, 1) --y--> (1, 1) --z--> (1, 1) which is also labeled by x. These paths are distinct, so 𝒜*_D(X) is ambiguous.

(iv) ⇒ (i) By Proposition 1.7, X* is free; thus X is a code. □

We shall now describe explicitly the paths in the flower automaton of a code X.

PROPOSITION 2.2 Let X ⊂ A⁺ be a code. The following conditions are equivalent for all words w ∈ A* and all states (u, v), (u', v') in the automaton 𝒜*_D(X):

(i) There exists in 𝒜*_D(X) a path c: (u, v) --w--> (u', v').
(ii) w ∈ vX*u' or (uw = u' and v = wv').
(iii) uw ∈ X*u' and wv' ∈ vX*.

Proof (i) ⇒ (ii) If c is a simple path, then it is a path in 𝒜_D(X). Consequently, uw = u' and v = wv' (Fig. 4.8(a)). Otherwise c decomposes into

c: (u, v) --v--> (1, 1) --x--> (1, 1) --u'--> (u', v')

with w = vxu' and x ∈ X* (Fig. 4.8(b)).

(ii) ⇒ (iii) If w ∈ vX*u', then uw ∈ uvX*u' ⊂ X*u' and wv' ∈ vX*u'v' ⊂ vX*, since uv, u'v' ∈ X ∪ 1. If uw = u' and v = wv', then the formulas are clear.

(iii) ⇒ (i) By hypothesis, there exist x, y ∈ X* such that

uw = xu',  wv' = vy.

Let z = uwv'. Then

z = uwv' = xu'v' = uvy ∈ X*.
Fig. 4.8 Paths in the flower automaton.
Each of these three factorizations determines a path in 𝒜*_D(X):

c: (1, 1) --u--> (ū, v̄) --w--> (ū', v̄') --v'--> (1, 1),
c': (1, 1) --x--> (1, 1) --u'--> (u', v') --v'--> (1, 1),
c'': (1, 1) --u--> (u, v) --v--> (1, 1) --y--> (1, 1).

(The paths (1, 1) --u--> (u, v) --v--> (1, 1) and (1, 1) --u'--> (u', v') --v'--> (1, 1) may have length 0.) Since X is a code, the automaton 𝒜*_D(X) is unambiguous, and consequently c = c' = c''. We obtain that (u, v) = (ū, v̄) and (ū', v̄') = (u', v'). Thus (u, v) --w--> (u', v'). □
The flower automaton of a code has "many" states. In particular, the flower automaton of an infinite code is infinite, even though there exist finite unambiguous automata recognizing X* when the code X is recognizable. We shall show that, in a sense which will be made precise now, 𝒜*_D(X) is universal among the automata recognizing X*.

Consider two unambiguous automata

𝒜 = (P, 1, 1)  and  ℬ = (Q, 1, 1)

and their associated representations φ_𝒜 and φ_ℬ. A function

ρ: P → Q

is a reduction of 𝒜 onto ℬ if it is surjective, ρ(1) = 1, and if, for all w ∈ A*,

(q, φ_ℬ(w), q') = 1

iff there exist p, p' ∈ P with

(p, φ_𝒜(w), p') = 1,  ρ(p) = q,  ρ(p') = q'.

The definition means that if p --w--> p' is a path in 𝒜, then ρ(p) --w--> ρ(p') is a path in ℬ. Conversely, a path q --w--> q' can be "lifted" to some path p --w--> p' with p ∈ ρ⁻¹(q), p' ∈ ρ⁻¹(q'). Another way to see the definition is the following. The matrix φ_ℬ(w) can be obtained from φ_𝒜(w) by partitioning the latter into blocks indexed by pairs of classes of the equivalence defined by ρ, and then by replacing null blocks by 0 and nonnull blocks by 1.

Observe that if ρ is a reduction of 𝒜 onto ℬ, then for all w, w' ∈ A*, the following implication holds:

φ_𝒜(w) = φ_𝒜(w')  ⇒  φ_ℬ(w) = φ_ℬ(w').
Thus there exists a unique morphism

ρ̂: φ_𝒜(A*) → φ_ℬ(A*)

such that

φ_ℬ = ρ̂ ∘ φ_𝒜.

The morphism ρ̂ is called the morphism associated with the reduction ρ.
PROPOSITION 2.3 Let 𝒜 = (P, 1, 1) and ℬ = (Q, 1, 1) be two unambiguous trim automata. Then there exists at most one reduction of 𝒜 onto ℬ. If ρ: P → Q is a reduction, then

1. |𝒜| ≤ |ℬ|;
2. |𝒜| = |ℬ| iff ρ⁻¹(1) = 1.

Proof Let ρ, ρ': P → Q be two reductions of 𝒜 onto ℬ. Let p ∈ P, and let q = ρ(p), q' = ρ'(p). Let u, v ∈ A* be words such that 1 --u--> p --v--> 1 in the automaton 𝒜. Then we have, in the automaton ℬ, the paths

1 --u--> q --v--> 1,  1 --u--> q' --v--> 1.

Since ℬ is unambiguous, q = q'. Thus ρ = ρ'.

1. If w ∈ |𝒜|, there exists a path 1 --w--> 1 in 𝒜; thus there is a path 1 --w--> 1 in ℬ. Consequently w ∈ |ℬ|.

2. Let w ∈ |ℬ|. Then there is a path p --w--> p' in 𝒜 with ρ(p) = ρ(p') = 1. If 1 = ρ⁻¹(1), then this is a successful path in 𝒜 and w ∈ |𝒜|. Conversely, suppose there is some p ≠ 1 with ρ(p) = 1. Let 1 --u--> p --v--> 1 be a simple path in 𝒜. Then uv ∈ X, where X is the base of |𝒜|. Now in ℬ, we have 1 --u--> ρ(p) --v--> 1. Since |ℬ| = |𝒜|, we have ρ(p) ≠ 1, a contradiction. Thus ρ⁻¹(1) = 1. □
PROPOSITION 2.4 Let X ⊂ A⁺ be a code, and let 𝒜*_D(X) be its flower automaton. For each unambiguous trim automaton 𝒜 recognizing X*, there exists a reduction of 𝒜*_D(X) onto 𝒜.

Proof Let 𝒜*_D(X) = (P, (1, 1), (1, 1)) and 𝒜 = (Q, 1, 1). Define a function

ρ: P → Q

as follows. Let p = (u, v) ∈ P. If p = (1, 1), then set ρ(p) = 1. Otherwise uv ∈ X, and there exists a unique path c: 1 --u--> q --v--> 1 in 𝒜. Then set ρ(p) = q.

The function ρ is surjective. Let indeed q ∈ Q, q ≠ 1. Let

c₁: 1 --u--> q,  c₂: q --v--> 1

be two simple paths in 𝒜. Then uv ∈ X, and p = (u, v) ∈ P satisfies ρ(p) = q.
We now verify that ρ is a reduction. For this, assume first that for a word w ∈ A* and q, q' ∈ Q,

(q, φ_𝒜(w), q') = 1.

Consider two simple paths in 𝒜, e: 1 --u--> q and e': q' --v'--> 1. Then in 𝒜, there is a path

1 --u--> q --w--> q' --v'--> 1.

Consequently uwv' ∈ X*. Thus for some xᵢ ∈ X, uwv' = x₁x₂⋯xₙ. Since e is simple, u is a left factor of x₁, and similarly v' is a right factor of xₙ. Setting

x₁ = uv,  xₙ = u'v',

we have

uwv' = uvx₂⋯xₙ = x₁⋯xₙ₋₁u'v',

whence

uw ∈ X*u',  wv' ∈ vX*.

In view of Proposition 2.2, ((u, v), φ_D(w), (u', v')) = 1.

Suppose now conversely that

(p, φ_D(w), p') = 1   (2.1)

for some p = (u, v), p' = (u', v'), and w ∈ A*. Let q = ρ(p), q' = ρ(p'). By construction, there are in 𝒜 paths

1 --u--> q --v--> 1  and  1 --u'--> q' --v'--> 1.   (2.2)

In view of Proposition 2.2, formula (2.1) is equivalent to

(uw = u' and v = wv')  or  (w = vxu' for some x ∈ X*).

In the first case, uv = uwv' = u'v'. Thus the two paths (2.2) coincide, giving the path in 𝒜

1 --u--> q --w--> q' --v'--> 1.

In the second case, there is in 𝒜 a path

q --v--> 1 --x--> 1 --u'--> q'.

Thus (q, φ_𝒜(w), q') = 1 in both cases. □
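The reduction of Proposition 2.4 is computable for finite automata: ρ((u, v)) is the unique intermediate state q of the path 1 --u--> q --v--> 1. A sketch (our own code; it assumes the given automaton is trim and unambiguous, and asserts the uniqueness):

```python
# Sketch of the reduction rho of Proposition 2.4 (our own code): in a trim
# unambiguous automaton A = (Q, 1, 1) recognizing X*, the petal state (u, v),
# with uv in X, is sent to the unique q with paths 1 --u--> q --v--> 1.

def targets(edges, sources, word):
    """All states reachable from `sources` by a path labelled `word`."""
    cur = set(sources)
    for a in word:
        cur = {q for (p, b, q) in edges if b == a and p in cur}
    return cur

def reduction(edges, initial, final, state):
    """Image of the flower-automaton state (u, v) under the reduction rho."""
    u, v = state
    if (u, v) == ("", ""):
        return initial
    mids = {q for q in targets(edges, {initial}, u)
            if final in targets(edges, {q}, v)}
    assert len(mids) == 1, "automaton not unambiguous, or uv not in X"
    return next(iter(mids))
```

For instance, with the (hypothetical) two-state automaton recognizing {aa, ab}* below, both petal states of 𝒜*_D({aa, ab}) are sent to state 2.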
EXAMPLE 2.2 For the code X = {aa, ba, bb, baa, bba}, the flower automaton is given in Fig. 4.9.
Fig. 4.9 The flower automaton of X.
The matrices of the associated representations (with the states numbered as indicated in Figs. 4.9 and 4.10) are given below.
Fig. 4.10 Another automaton recognizing X*.
(The 8 × 8 matrices of the representation of the flower automaton for the letters a and b are displayed here, partitioned into blocks by the classes of the reduction onto the automaton of Fig. 4.10.)
The concept of a reduction makes it possible to indicate a relation between the flower automaton of a composed code and those of its components.

PROPOSITION 2.5 Let Y ⊂ B⁺ and Z ⊂ A⁺ be two composable codes, and let X = Y ∘ Z. If Y is complete, then there exists a reduction ρ of 𝒜*_D(X) onto 𝒜*_D(Z). Moreover, 𝒜*_D(Y) can be identified, through ρ, with the restriction of 𝒜*_D(X) to the states in Z* × Z*.

Proof Let P and S be the sets of states of 𝒜*_D(X) and 𝒜*_D(Z), respectively, and let φ_X and φ_Z be the representations associated with 𝒜*_D(X) and 𝒜*_D(Z). We define the function

ρ: P → S

as follows. First, let ρ((1, 1)) = (1, 1). Next, consider (u, v) ∈ P − (1, 1). Then uv ∈ Z⁺. Consequently, there exist unique z, z̄ ∈ Z* and (r, s) ∈ S such that

u = zr,  v = sz̄

(see Fig. 4.11). Then let ρ(u, v) = (r, s). The function ρ is surjective. Indeed, each word in Z appears in at least one word in X; thus each state in S is reached in a refinement of a state in P. To show that ρ is a reduction, suppose that

((u, v), φ_X(w), (u', v')) = 1.
Fig. 4.11 Decomposing a petal.
Let (r, s) = ρ((u, v)), (r', s') = ρ((u', v')), and let z, z̄, z', z̄' ∈ Z* be such that

u = zr,  v = sz̄,  u' = z'r',  v' = s'z̄'.

By Proposition 2.2, uw ∈ X*u' and wv' ∈ vX*. Thus

zrw ∈ Z*r',  ws'z̄' ∈ sZ*,

implying that zrws' ∈ Z* and rws'z̄' ∈ Z*. This in turn shows, in view of the stability of Z*, that rws' ∈ Z*. Set zrw = ẑr', with ẑ ∈ Z*. Then

ẑ(r's') = z(rws')

and each of the four factors in this equation is in Z*. Thus, Z being a code, either ẑ = zt or z = ẑt for some t ∈ Z*. In the first case, we get tr's' = rws', whence rw = tr' ∈ Z*r'. The second case implies r's' = trws'. Since r's' ∈ 1 ∪ Z, this forces t = 1 or rws' = 1; in both cases, rw ∈ Z*r'. Thus rw ∈ Z*r', and similarly ws' ∈ sZ*. By Proposition 2.2,

((r, s), φ_Z(w), (r', s')) = 1.

Assume conversely that

((r, s), φ_Z(w), (r', s')) = 1.

Then by Proposition 2.2,

rw = zr',  ws' = sz'

for some z, z' ∈ Z*. Then rws' ∈ Z*, and Y being complete, there exist t, t' ∈ Z* such that m = trws't' ∈ X*. Let

m = trws't' = trsz't' = tzr's't' = x₁⋯xₙ

with n ≥ 1 and x₁, …, xₙ ∈ X. We may assume that t and t' have been chosen of minimal length, so that t is a proper left factor of x₁ and t' is a proper right factor of xₙ. But then, since m ∈ Z* and also trs ∈ Z*, trs is a left factor of x₁ and r's't' is a right factor of xₙ (Fig. 4.12). Define

x₁ = uv  with u = tr, v ∈ sZ*,
xₙ = u'v'  with v' = s't', u' ∈ Z*r'.
Fig. 4.12 The cases of (a) n > 1 and (b) n = 1.
Then (u, v) and (u', v') are states of 𝒜*_D(X), and moreover

ρ((u, v)) = (r, s),  ρ((u', v')) = (r', s'),

and

m = uwv' = uvx₂⋯xₙ = x₁⋯xₙ₋₁u'v'.

Thus

uw ∈ X*u'  and  wv' ∈ vX*,

and in view of Proposition 2.2, ((u, v), φ_X(w), (u', v')) = 1.

Finally, consider the set R of states of 𝒜*_D(Y). Then R can be identified with

R' = {(u, v) ∈ P | u, v ∈ Z*}.

The edges of 𝒜*_D(Y) correspond to those paths (u, v) --z--> (u', v') of 𝒜*_D(X) with endpoints in R' and with label z ∈ Z. □
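The map ρ of the proof can be sketched for finite codes by factorizing uv over Z and locating the factor of the Z-factorization that contains the cut between u and v (our own code; `z_factorize` is our helper and returns None on words outside Z*):

```python
# Sketch of the reduction rho of Proposition 2.5 (our own code): a state
# (u, v) with uv in X ⊂ Z* is written u = z·r, v = s·z̄ with z, z̄ in Z*
# and rs in Z ∪ 1, and is sent to (r, s).

def z_factorize(w, Z):
    """Unique factorization of w over the code Z (None if w is not in Z*)."""
    if w == "":
        return []
    for z in Z:
        if w.startswith(z):
            rest = z_factorize(w[len(z):], Z)
            if rest is not None:
                return [z] + rest
    return None

def rho(u, v, Z):
    """Image (r, s) in A*_D(Z) of the flower-automaton state (u, v) of A*_D(X)."""
    factors = z_factorize(u + v, Z)      # uv belongs to X, hence to Z*
    pos = 0
    for z in factors:
        if pos < len(u) < pos + len(z):  # the cut u|v falls strictly inside z
            return (u[pos:], z[len(u) - pos:])
        pos += len(z)
    return ("", "")                      # cut at a factor boundary: state (1, 1)
```

With Z = {a, ba, bb} of Example 2.3 below, the state (b, aa) of 𝒜*_D(X) is sent to (b, a), while (a, a), whose components both lie in Z*, is sent to (1, 1).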
EXAMPLE 2.3 Recall from Chapter I that the code X = {aa, ba, bb, baa, bba} is a composition of Y = {cc, d, e, dc, ec} and Z = {a, ba, bb}. The flower automaton 𝒜*_D(X) is given in Fig. 4.13. The flower automaton 𝒜*_D(Z) is given in Fig. 4.14. It is obtained from 𝒜*_D(X) by the reduction

ρ(1) = ρ(2) = ρ(3) = ρ(4) = 1,  ρ(6) = ρ(8) = 6,  ρ(5) = ρ(7) = 7.

The flower automaton 𝒜*_D(Y) is given in Fig. 4.15.
Fig. 4.13 The flower automaton of X.
Let X ⊂ A⁺ be a code, and let β: B* → A* be a coding morphism for X. Since β is injective, there exists a partial function

γ: A* → B*

with domain X* and such that γ(β(u)) = u for all u ∈ B*. We say that γ is a decoding function for X. Such a decoding function can be computed by an automaton with output, called a decoding automaton. The construction goes as follows. Let
Fig. 4.15 The flower automaton of Y.
𝒜*_D(X) = (P, (1, 1), (1, 1)) be the flower automaton of X. We define for each edge e of this automaton an output label σ(e) ∈ B ∪ 1. Let e = ((u, v), a, (u', v')) be an edge of 𝒜*_D(X). Then we set

σ(e) = β⁻¹(ua)  if (u', v') = (1, 1),
σ(e) = 1        otherwise.

The output label σ(c) of a path c is the product of the output labels of the edges that c is formed of. Then, for all x ∈ X*,

γ(x) = σ(c),
Fig. 4.16 The decoding automaton.
202
IV. AUTOMATA
where c is the unique successful path with (input) label x. Thus γ can be computed using the automaton 𝒜*_D(X). This is especially interesting when X is finite since, in this case, the corresponding flower automaton is also finite.
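As a sketch (our own Python, not the book's construction verbatim), the decoding automaton can be simulated by following all paths of the flower automaton and emitting β⁻¹(x) on each edge that enters (1, 1); on a word of X* exactly one successful path, hence one output, survives. Here β is given as a dictionary from B to X:

```python
# Sketch of decoding with the flower automaton (our own code): an edge into
# (1,1) outputs the letter of B coding the petal word just read; every other
# edge outputs the empty word.

def decode(w, beta):
    """Decode w in X* back to B*, where beta maps each letter of B to a word of X."""
    inv = {x: b for b, x in beta.items()}       # beta^{-1}, defined on X
    one = ("", "")
    current = {(one, "")}                       # pairs (state, output so far)
    for a in w:
        nxt = set()
        for ((u, v), out) in current:
            if (u, v) == one:
                starts = [("", x) for x in inv]  # enter a petal from (1,1)
            else:
                starts = [(u, v)]
            for (u0, v0) in starts:
                if v0.startswith(a):
                    if len(v0) == 1:             # edge into (1,1): emit a letter of B
                        nxt.add((one, out + inv[u0 + a]))
                    else:                        # internal edge: empty output
                        nxt.add(((u0 + a, v0[1:]), out))
        current = nxt
    outs = {out for (state, out) in current if state == one}
    assert len(outs) == 1, "input not in X*, or X not a code"
    return outs.pop()
```

With the coding morphism of Example 2.4 below, decode("baa", β) yields "f" and decode("bbaa", β) yields "ec".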
EXAMPLE 2.4 Let X = {aa, ba, bb, baa, bba} be the code of Example 2.3. Let β be the coding morphism for X defined by

β(c) = aa,  β(d) = ba,  β(e) = bb,  β(f) = baa,  β(g) = bba.

The output labels of the automaton 𝒜*_D(X) are represented in Fig. 4.16, with p --a|σ(e)--> q denoting that the edge e = (p, a, q) has output label σ(e).

3. MONOIDS OF UNAMBIGUOUS RELATIONS
Let m be an N-relation between P and Q. It is called unambiguous if

m ∈ {0, 1}^(P×Q),

that is to say, if the coefficients of m are zero or one. An unambiguous N-relation will simply be called an unambiguous relation. A monoid of unambiguous relations over Q is a submonoid of N^(Q×Q) such that all its elements are unambiguous. As a submonoid of N^(Q×Q), it contains the identity I_Q.

Let M be a monoid of unambiguous relations over Q. Any element m ∈ M can be considered as a Q × Q-matrix with coefficients zero or one. It is important to observe that these coefficients 0, 1 belong to N and not to the Boolean semiring. Let m, n ∈ M and p, q ∈ Q be such that (p, mn, q) = 1. Then there exists a unique r ∈ Q such that (p, m, r) = (r, n, q) = 1. This shows in particular that two Q × Q-matrices m, n with coefficients 0, 1 which belong to a monoid of unambiguous relations are related by a very strong condition: if l is a row of m and c is a column of n, there is at most one r ∈ Q such that l_r = c_r = 1.

We shall sometimes say that the monoid M itself, as a monoid of N-relations, is unambiguous. Therefore an unambiguous monoid of N-relations and a monoid of unambiguous relations are, fortunately for the reader, the same thing. For m ∈ M and p, q ∈ Q we shall use the notation pmq to express that (p, m, q) = 1. This compact and useful notation is introduced in analogy with (Boolean) binary relations. However, we should keep in mind that, by unambiguity, pmrnq and pmsnq imply r = s.
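The condition can be checked mechanically: multiply over N and verify that no coefficient exceeds 1. A minimal sketch (our own code):

```python
# Sketch (ours): the product of two relations in a monoid of unambiguous
# relations is computed over N, and unambiguity means every coefficient of
# the product stays 0 or 1.

def mat_mult(m, n):
    """Product of two square 0/1 matrices over N (not the Boolean semiring)."""
    size = len(m)
    return [[sum(m[p][r] * n[r][q] for r in range(size)) for q in range(size)]
            for p in range(size)]

def product_is_unambiguous(m, n):
    """True iff for every row l of m and column c of n there is at most one r
    with l[r] = c[r] = 1, i.e. all coefficients of mn are 0 or 1."""
    return all(x <= 1 for row in mat_mult(m, n) for x in row)
```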
A monoid M of unambiguous relations over Q is said to be transitive if, for all p, q ∈ Q, there exists m ∈ M such that pmq. The first result indicates the connection between monoids of unambiguous relations and unambiguous automata.

PROPOSITION 3.1 Let 𝒜 be an automaton over A. Then 𝒜 is unambiguous iff φ_𝒜(A*) is unambiguous. Moreover, if 𝒜 = (Q, 1, 1), then 𝒜 is trim iff φ_𝒜(A*) is transitive.

Proof The first equivalence results from Proposition 1.1. Next let 𝒜 = (Q, 1, 1) be a trim automaton. Let p, q ∈ Q, and let u, v ∈ A* be such that p --u--> 1 and 1 --v--> q are paths. Then p --uv--> q is a path, and consequently p φ_𝒜(uv) q. The converse is clear. □
Let m be an unambiguous relation between P and Q. Then m is a function if there is a function f from P into Q such that pmq iff f(p) = q. If f is a bijection, then m will be called a bijection; further, if P = Q, then m is a permutation. If m is an N-relation over Q, then m is idempotent if mm = m, and m is invertible if there is a relation n over Q such that mn = nm = I_Q.

PROPOSITION 3.2 Let m be an unambiguous relation over Q. Then m is invertible iff m is a permutation.

Proof Let m be an invertible relation, and let n be a relation such that mn = nm = I_Q. For all p ∈ Q, there exists q ∈ Q such that pmq, since from pmnp we get pmqnp for some q ∈ Q. This element q is unique: if pmq', then from qnp and pmq' we get q(nm)q', that is, q I_Q q', whence q = q'. This shows that m is a function. Now if pmq and p'mq, then pmqnp and p'mqnp, implying p' = p. Thus m is injective. Since nm = I_Q, m is also surjective. Thus m is a permutation. The converse is clear. □

Observe that we did not make use of the unambiguity in this proof.

Let m be an unambiguous relation over a set Q. A fixed point of m is an element q ∈ Q such that qmq. In matrix form, the fixed points are the indices q such that m_{q,q} = 1, thus such that there is a 1 on the diagonal. We denote by Fix(m) the set of fixed points of m.

PROPOSITION 3.3 Let M be a monoid of unambiguous relations over Q. Let m ∈ M and let S = Fix(m). The following conditions are equivalent:

(i) m is idempotent.
(ii) For all p, q ∈ Q, we have pmq iff there exists an s ∈ S such that pms and smq.
(iii) We have

m = cl  and  lc = I_S,   (3.1)

where c ∈ {0, 1}^(Q×S) and l ∈ {0, 1}^(S×Q) are the restrictions of m to Q × S and S × Q, respectively.
If m is idempotent, then moreover we have, in matrix form (ordering the elements of Q so that those of S come first),

c = [ I_S ]      l = [ I_S  l' ],
    [ c'  ]

with c' ∈ {0, 1}^((Q−S)×S), l' ∈ {0, 1}^(S×(Q−S)), and l'c' = 0.

The decomposition (3.1) of an idempotent relation is called the column-row decomposition of the relation.

Proof (i) ⇒ (ii) Let p, q ∈ Q be such that pmq. Then pm³q. Consequently, there are s, t ∈ Q such that pmsmtmq. It follows that pmsmq and pmtmq. Since M is unambiguous, we have s = t, whence sms and s ∈ S. The converse is clear.

(ii) ⇒ (iii) Let c and l be the restrictions of m to Q × S and S × Q, respectively. If pmq, then there exists s ∈ S such that pms and smq. Then pcs and slq. Conversely, if pcs and slq, then we have pmsmq, thus pmq. Since this fixed point s is unique, we have m = cl. Now let r, s ∈ S with r(lc)s. Then rlqcs for some q ∈ Q. Thus rmq and qms. Moreover, rmr and sms, whence

rmrmqms,  rmqmsms.

The unambiguity implies that r = q = s. Conversely, we have s(lc)s for all s ∈ S. Thus lc = I_S.

(iii) ⇒ (i) We have m² = clcl = c(lc)l = cl = m. Thus m is idempotent.

For the last claim, assume that m is idempotent. The restriction of m to S × S is the identity. Indeed, sms holds for all s ∈ S, and if smr with s, r ∈ S, then smsmr and smrmr, implying s = r by unambiguity. This shows that c and l have the indicated form. Finally, the product lc is

lc = I_S + l'c'.

Since lc = I_S, this implies that l'c' = 0, which concludes the proof. □

Let M be a monoid of unambiguous relations over Q and let e ∈ M be an idempotent. Then M(e) = eMe is a monoid, and e is the neutral element of M(e), since for all m ∈ M(e), em = me = eme = m. It is the greatest monoid contained in M and having neutral element e. It is called the monoid localized at e (cf. Section 0.2). We denote by G(e) the group of units of the monoid M(e), which is also the ℋ-class of e (Proposition 0.5.4). Note that M(e) is not a monoid of unambiguous relations in our sense, since the relation I_Q is not in M(e), except when e = I_Q. However, the following proposition shows that M(e) is naturally isomorphic to a monoid of unambiguous relations M_e over the set of fixed points of e.
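The column-row decomposition (3.1) of an idempotent is immediate to compute from its matrix; a sketch (our own Python, with matrices as lists of lists indexed from 0):

```python
# Sketch (ours) of the column-row decomposition (3.1) of an idempotent
# unambiguous relation m: c and l are the restrictions of m to Q x S and
# S x Q, where S = Fix(m), and they satisfy m = cl and lc = I_S.

def fix(m):
    """Fixed points: indices carrying a 1 on the diagonal."""
    return [q for q in range(len(m)) if m[q][q] == 1]

def column_row(m):
    """Return (S, c, l) with m = c l and l c = I_S (m assumed idempotent)."""
    S = fix(m)
    c = [[m[q][s] for s in S] for q in range(len(m))]   # restriction to Q x S
    l = [[m[s][q] for q in range(len(m))] for s in S]   # restriction to S x Q
    return S, c, l
```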
PROPOSITION 3.4 Let M be a monoid of unambiguous relations over Q, let e be an idempotent in M, and let S = Fix(e) be the set of fixed points of e. The restriction γ of the elements of M(e) to S × S is an isomorphism of M(e) onto a monoid of unambiguous relations M_e over S. If e = cl is the column-row decomposition of e, this isomorphism is given by

γ: m ↦ lmc.   (3.2)

In particular, G_e = lG(e)c is a permutation group over S. Further, if M is transitive, then M_e is transitive.

Proof Let γ be the function defined by (3.2). If m ∈ eMe, then for s, t ∈ S,

(s, γ(m), t) = (s, lmc, t) = (s, m, t),

because we have sls and tct. Thus γ(m) is the restriction of the elements in eMe to S × S. Further, γ is a morphism, since γ(e) = lec = I_S and, for m, n ∈ eMe,

γ(mn) = γ(men) = l(men)c = lmclnc = γ(m)γ(n).

Finally, γ is injective, since if γ(m) = γ(n) for some m, n ∈ eMe, then also cγ(m)l = cγ(n)l. But cγ(m)l = clmcl = eme = m. Thus m = n.

The monoid M_e = γ(eMe) is a monoid of N-relations over S since it contains the relation I_S. Its elements, as restrictions of unambiguous relations, are themselves unambiguous. Finally, G_e = γ(G(e)) is composed of invertible relations. By Proposition 3.2, it is a permutation group over S.

If M is transitive, consider s, t ∈ S. There exists m ∈ M such that smt. Then also s(eme)t. Taking the restriction to S, we have s γ(eme) t. Since γ(eme) ∈ M_e, this shows that M_e is transitive. □
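Proposition 3.4 suggests a brute-force computation of G_e for a finite monoid of unambiguous relations: close the generators under product, map each m to γ(eme) = l(eme)c, and keep the permutation matrices, which by Proposition 3.2 are exactly the invertible elements. A sketch, with all names and the matrix encoding (tuples of tuples) ours:

```python
# Sketch (ours): compute M_e and G_e as in Proposition 3.4.

def mat(m, n):
    """Product of 0/1 matrices over N."""
    return tuple(tuple(sum(m[p][r] * n[r][q] for r in range(len(n)))
                       for q in range(len(n[0]))) for p in range(len(m)))

def monoid(gens):
    """Close a set of square 0/1 matrices (tuples of tuples) under product."""
    n = len(gens[0])
    ident = tuple(tuple(int(p == q) for q in range(n)) for p in range(n))
    M, todo = {ident}, [ident]
    while todo:
        m = todo.pop()
        for g in gens:
            mg = mat(m, g)
            if mg not in M:
                M.add(mg)
                todo.append(mg)
    return M

def localized_group(M, e):
    """G_e: the permutation matrices among { l(eme)c : m in M }."""
    S = [q for q in range(len(e)) if e[q][q] == 1]
    c = tuple(tuple(e[q][s] for s in S) for q in range(len(e)))   # Q x S
    l = tuple(tuple(e[s][q] for q in range(len(e))) for s in S)   # S x Q
    Me = {mat(mat(l, mat(mat(e, m), e)), c) for m in M}
    k = len(S)
    return {g for g in Me
            if all(sum(row) == 1 for row in g)
            and all(sum(g[i][j] for i in range(k)) == 1 for j in range(k))}
```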
EXAMPLE 3.1 Consider the N-relation m given in matrix form by the matrix displayed here, together with its square m². We have m³ = m; thus m² is an idempotent relation. The monoid M = {1, m, m²} is a monoid of unambiguous relations. The fixed points of the relation e = m² are 1 and 2, and its column-row decomposition e = cl is displayed here. The restriction of m to the set {1, 2} is the transposition (12). The monoid M_e is equal to the group G_e, which is isomorphic to Z/2Z.

Let M be an arbitrary monoid. We compare now the localized monoids of two idempotents of a 𝒟-class. Let e, e' be two 𝒟-equivalent idempotents of M. Then there exists an element d ∈ M such that e ℛ d ℒ e'. By definition of these relations, there exists a quadruple

(a, a', b, b')   (3.3)

of elements of M such that

ea = d,  da' = e,  bd = e',  b'e' = d   (3.4)

(see Fig. 4.17). The quadruple (3.3) is a passing system from e to e'.

Fig. 4.17 The passing system.
The following formulas are easily derived from (3.4):

eaa' = e,   bb'e' = e',
bea = e',   b'e'a' = e,
ea = b'e',  be = e'a'

(the last formula is obtained by be = bb'e'a' = e'a'). If e and e' are idempotents, then the following hold also:

eabe = e,  e'a'b'e' = e'.

(Note that most of these identities appear in Section 0.5.)

Two monoids of N-relations, M over Q and M' over Q', are equivalent if there exists a relation θ ∈ {0, 1}^(Q×Q') which is a bijection from Q onto Q', such that the function

m ↦ θᵗmθ

is an isomorphism from M onto M' (θᵗ is the transpose of θ). Since θ is a bijection, we have θᵗ = θ⁻¹. Therefore, in the case where M and M' are permutation groups, this definition coincides with that given in Section 0.8.
PROPOSITION 3.5 Let M be a monoid of unambiguous relations over Q, and let e, e' ∈ M be two 𝒟-equivalent idempotents. Then the monoids M(e) and M(e') are isomorphic, the monoids M_e and M_{e'} are equivalent, and the groups G_e and G_{e'} are equivalent permutation groups.

More precisely, let S = Fix(e), S' = Fix(e'), let e = cl, e' = c'l' be their column-row decompositions, let γ and γ' be the restrictions to S × S and S' × S', and let (a, a', b, b') be a passing system from e to e'. Then

1. The function τ: m ↦ bma is an isomorphism from M(e) onto M(e').
2. The relation θ = lac' = lb'c' ∈ {0, 1}^(S×S') is a bijection from S onto S'.
3. The function τ': n ↦ θᵗnθ is an isomorphism from M_e onto M_{e'}.
4. The following diagram is commutative:

          τ
   M(e) -----> M(e')
    |            |
   γ|            |γ'
    v     τ'     v
   M_e  -----> M_{e'}
Proof 1. Let m ∈ M(e). Then

τ(m) = bma = bemea = e'a'mb'e',

showing that τ(m) is in M(e'). Next, τ(e) = bea = e'. For m, m' ∈ M(e),

τ(m)τ(m') = bmabm'a = bm(eabe)m'a = bmem'a = bmm'a = τ(mm').

Thus τ is a morphism. Finally, it is easily seen that

m' ↦ b'm'a'

is the inverse function of τ; thus τ is an isomorphism from M(e) onto M(e').

2. We have eae' = eb'e'. Consequently, leae'c' = leb'e'c'. Since le = l and e'c' = c', we get that

θ = lac' = lb'c'.

The relation θ is left-invertible, since

(l'bc)θ = l'b(cl)ac' = l'(bea)c' = l'e'c' = I_{S'},

and it is right-invertible, since

θ(l'a'c) = lb'(c'l')a'c = l(b'e'a')c = lec = I_S.

Thus θ is invertible and consequently is a bijection, and

θᵗ = l'a'c = l'bc.

4. For m ∈ M(e), we have

τ'γ(m) = (l'bc)(lmc)(lac') = l'b(cl)m(cl)ac' = l'(bma)c' = γ'τ(m),

showing that the diagram is commutative.

3. This results from the commutativity of the diagram and from the fact that γ, τ, γ' are isomorphisms. □
EXAMPLE 3.2 Consider the matrices u and v displayed here. They generate a monoid of unambiguous relations (as we may verify by using, for instance, the method of Proposition 2.4). The matrix uv is the matrix m of Example 3.1. The element e = (uv)² is an idempotent, Fix(e) = {1, 2}, and its column-row decomposition e = cl is that of Example 3.1. The matrix e' = (vu)² is also an idempotent. We have Fix(e') = {3, 4}, and e' has the column-row decomposition e' = c'l' displayed here. The idempotents e and e' lie in the same 𝒟-class. Indeed, we may take as a passing system from e to e' the elements

a = b' = u,  a' = b = vuv.

The bijection θ = lac' from the set Fix(e) = {1, 2} onto the set Fix(e') = {3, 4} is

θ: 1 ↦ 4,  2 ↦ 3.

We now describe a useful method for computing the group G_e of an idempotent e in a monoid of unambiguous relations. This method requires us to make a choice between "left" and "right"; we first present the right-hand case. Let M be a monoid of unambiguous relations, and let e be an idempotent element in M. Let R be the ℛ-class of e, let Λ be the set of ℋ-classes of R, and let G = G(e) be the ℋ-class of e.
For each H ∈ Λ, choose two elements a_H, a'_H ∈ M such that

ea_H ∈ H,  ea_H a'_H = e,

with the convention that a_G = a'_G = e (see Fig. 4.18). Such a set of pairs (a_H, a'_H)_{H∈Λ} is called a system of coordinates of R relative to the idempotent e. Then Ga_H = H and Ha'_H = G. The elements a_H, a'_H realize two reciprocal bijections between G and H. Let e = cl be the column-row decomposition of e, and set

l_H = la_H  and  c_H = a'_H c  for H ∈ Λ.   (3.5)

Note that l_H = lea_H follows from l = le. Each m ∈ M defines a partial right action on the set Λ by setting, for all H ∈ Λ,

H · m = Hm  if Hm ∈ Λ,  H · m = ∅  otherwise.   (3.6)

Now we define a partial function from Λ × M into G_e by setting

H ∗ m = l_H m c_{H·m}  if H · m ≠ ∅,  H ∗ m = ∅  otherwise.   (3.7)

First, observe that H · m ≠ ∅ implies H ∗ m ∈ G_e. Indeed, set H' = Hm. From ea_H ∈ H we get ea_H m ∈ H', showing that

ea_H m a'_{H'} ∈ G.

It follows that

H ∗ m = l_H m c_{H'} = (lea_H)m(a'_{H'}c) = l(ea_H m a'_{H'})c ∈ G_e.

Note that for g ∈ G, we have G ∗ g = lgc; this is the element of G_e canonically associated with g, since indeed a_G = a'_G = e.
Fig. 4.18 Two coordinates.
Fig. 4.19 Composition of outputs.
Before stating the result, let us show the following relation: for all m, n ∈ M,

(H ∗ m)(H · m ∗ n) = H ∗ mn.   (3.8)

To verify formula (3.8), let H' = Hm and H'' = Hmn (the cases where Hm = ∅ or Hmn = ∅ are straightforward); see Fig. 4.19. We have

(H ∗ m)(H' ∗ n) = l_H m c_{H'} l_{H'} n c_{H''} = la_H m a'_{H'} e a_{H'} n a'_{H''} c = l((ea_H m a'_{H'})e) a_{H'} n a'_{H''} c.

Since ea_H m a'_{H'} ∈ G, we have (ea_H m a'_{H'})e = ea_H m a'_{H'}. Thus

(H ∗ m)(H' ∗ n) = l((ea_H m) a'_{H'} a_{H'}) n a'_{H''} c.

Since ea_H m ∈ H' and the multiplication on the right by a'_{H'} a_{H'} is the identity on H', we get

(H ∗ m)(H' ∗ n) = lea_H mn a'_{H''} c = l_H mn c_{H''} = H ∗ mn.
This proves formula (3.8). As a consequence, we have the following result:

PROPOSITION 3.6 Let M be a monoid of unambiguous relations generated by a set T. Let e be an idempotent of M, let R be its ℛ-class, let Λ be the set of ℋ-classes of R, and let (a_H, a'_H)_{H∈Λ} be a system of coordinates of R relative to e. Then the group G_e is generated by the elements of the form H ∗ t, for H ∈ Λ, t ∈ T, and H ∗ t ≠ ∅.

Proof The elements H ∗ t, for H ∈ Λ and t ∈ T, either are ∅ or are in G_e. Now let g be an element of G(e). Then

g = t₁t₂⋯tₙ,  tᵢ ∈ T,

because T generates M. Let G = G(e) and let

Hᵢ = Gt₁t₂⋯tᵢ,  1 ≤ i ≤ n.

From Gg = G it follows that Hᵢtᵢ₊₁⋯tₙ = G. Thus Hᵢ ∈ Λ. By (3.8),

G ∗ g = (G ∗ t₁)(H₁ ∗ t₂)⋯(Hₙ₋₁ ∗ tₙ).

But G ∗ g = lgc. This shows the result. □
The pair of partial functions

Λ × M → Λ,  Λ × M → G_e

defined by (3.6) and (3.7) is called the ℛ-representation of M relative to e and to the coordinate system (a_H, a'_H)_{H∈Λ}. Observe that the function

μ: M → (G_e ∪ 0)^(Λ×Λ),

where 0 is a new element, which associates to each m ∈ M the Λ × Λ-matrix defined by

(μ(m))_{H,H'} = H ∗ m  if Hm = H',  0 otherwise,

is a morphism from M into the monoid of row-monomial Λ × Λ-matrices with elements in G_e ∪ 0. This is indeed an equivalent formulation of formula (3.8).

Symmetrically, we define the ℒ-representation of M relative to e as follows. Let L be the ℒ-class of e, and let Γ be the set of its ℋ-classes. For each H ∈ Γ, choose two elements b_H, b'_H ∈ M such that

b_H e ∈ H,  b'_H b_H e = e,

with b_G = b'_G = e. Such a set of pairs (b_H, b'_H)_{H∈Γ} is called a system of coordinates of L with respect to e. As in (3.5), we set

c_H = b_H c,  l_H = lb'_H  for H ∈ Γ.

Each m ∈ M defines a partial left action on Γ by setting, for H ∈ Γ,

m · H = mH  if mH ∈ Γ,  m · H = ∅  otherwise,

and a partial function from M × Γ into G_e by setting

m ∗ H = l_{m·H} m c_H  if m · H ≠ ∅,  m ∗ H = ∅  otherwise.

Then formula (3.8) becomes

(n ∗ m·H)(m ∗ H) = nm ∗ H   (3.9)

and Proposition 3.6 holds mutatis mutandis.
Remark For the computation of the ℒ-classes and the ℛ-classes of a monoid of unambiguous relations, we can use the following observation, whose verification is straightforward: if m ℒ n (resp. m ℛ n), then each row (resp. column) of m is a sum of rows (resp. columns) of n, and vice versa. This yields an easy test to conclude that two elements are in distinct ℒ-classes (resp. ℛ-classes).
EXAMPLE 3.3 Let us consider again the unambiguous monoid of Example 3.2, generated by the matrices u and v. We consider the idempotent

e = (uv)².

Its ℛ-class R is formed of three ℋ-classes, numbered 0, 1, 2. In Fig. 4.20 a representative is given for each of these ℋ-classes.

Fig. 4.20 The ℛ-class of the idempotent e.

The fact that the ℒ-classes are distinct is verified by inspecting the rows of e, eu, eu². Next, we compute that eu³ = eu²u = e, showing that these elements are ℛ-equivalent. Further, euv = (uv)³ ℋ e. Finally, the matrix displayed here has only one nonnull row (column) and consequently cannot be in the 𝒟-class of e. We have reported in Fig. 4.20 the effect of the right multiplication by u and v. We choose a system of coordinates of R by setting

a₀ = a'₀ = e,  a₁ = u,  a'₁ = u²,  a₂ = u²,  a'₂ = u.
IV. AUTOMATA
Then the matrices c_0, c_1, c_2 and l_0, l_1, l_2 are the corresponding column and row matrices (matrices not reproduced here).
Let us denote by H --t/g--> H' the fact that H · t = H' and H * t = g. Then the ℛ-representation of M relatively to e and to this system of coordinates is obtained by completing Fig. 4.20 and is given in Fig. 4.21. We set

    i = [1 0; 0 1],    j = [0 1; 1 0].

The group G_e is of course the two-element group {i, j}. □

The concepts introduced in this paragraph are greatly simplified when we consider the case of a monoid of (total) functions from Q into itself, instead of a monoid of unambiguous relations.
Fig. 4.21  The ℛ-representation of M.
For a ∈ M, write pa = q instead of (p, a, q) = 1. The image of a, denoted Im(a), is the set of q ∈ Q such that pa = q for some p ∈ Q. The nuclear equivalence of a, denoted Ker(a), is the equivalence relation on Q defined by p ≡ q mod Ker(a) iff pa = qa.

If b ∈ Ma, then Im(b) ⊂ Im(a). If b ∈ aM, then Ker(a) ⊂ Ker(b) (note the inversion of inclusions). A function e ∈ M is idempotent iff its restriction to its image is the identity. Thus its image is in this case equal to its set of fixed points: Im(e) = Fix(e). As a result of what precedes, if a ℒ b, then Im(a) = Im(b), and if a ℛ b, then Ker(a) = Ker(b). This allows us to "separate" functions.

To compute the ℛ-class of an idempotent function e over a finite set, we may use the following observation, where S = Fix(e). If the restriction of m to S is a permutation of S, then e ℛ em. Indeed, we have in this case eme = em, since Im(em) = S and e is the identity on S; the same argument gives em^k e = em^k for k ≥ 1, whence (em)^k = em^k for k ≥ 1. Thus (em)^p = e, with p = Card(S)!, and this shows the claim.
EXAMPLE 3.4  Let M be the monoid of functions from the set Q = {1, 2, ..., 8} into itself generated by the two functions u and v, given in an array (not reproduced here) in which each column contains the images by u and v of the element of Q placed at the top of the column. The function e = v^4 is idempotent and has the set of fixed points S = {1, 4, 5, 8}:

    q   : 1 2 3 4 5 6 7 8
    v^4 : 1 4 1 4 5 8 5 8
We get the pattern of Fig. 4.22 for the ℛ-class R of e. These four ℋ-classes are distinct because the images of e, ev, ev^2, ev^3 are distinct. For the edges going back to the ℋ-class of e, we use the observation stated above; it suffices to verify that the restrictions to S of the functions u, vu, v^2 u, v^3 u are permutations of S.
Fig. 4.22  The ℛ-class of e.
Choose a system of coordinates of R by taking

    a_0 = a'_0 = e,    a_1 = v,  a'_1 = v^7,    a_2 = v^2,  a'_2 = v^6,    a_3 = v^3,  a'_3 = v^5.
For the computation of the ℛ-representation of M relatively to e, we proceed as follows: if H · m = H', then the permutation H * m on S is computed not by the matrix product H * m = l_H m c_{H·m} of formula (3.7), but, noting that H * m is the restriction to S of e a_H m a'_{H·m} e, by evaluating this function on S. Thus we avoid unnecessary matrix computations when dealing with functions. Figure 4.23 shows the ℛ-representation obtained.
Fig. 4.23  The ℛ-representation.
According to Proposition 3.6, the group G_e is generated by the permutations

    (1458),    (15)(48),    (14)(58).

It is the dihedral group D_4, the group of all symmetries of the square with vertices 1, 4, 5, 8. It contains 8 elements.
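That these three permutations generate a group of order 8 can be confirmed by a brute-force closure. A sketch (the vertices 1, 4, 5, 8 are renumbered 0–3, and the helper names are our own):

```python
# Closure of {(1458), (15)(48), (14)(58)} under composition, with the
# square's vertices 1, 4, 5, 8 renumbered 0, 1, 2, 3.
def compose(f, g):              # first f, then g
    return tuple(g[x] for x in f)

def generated_group(gens):
    elems = set(gens)
    frontier = set(gens)
    while frontier:
        new = {compose(f, g) for f in elems for g in frontier} | \
              {compose(f, g) for f in frontier for g in elems}
        frontier = new - elems
        elems |= frontier
    return elems

r = (1, 2, 3, 0)      # (1 4 5 8), the rotation
s = (2, 3, 0, 1)      # (1 5)(4 8); note s = r^2
t = (1, 0, 3, 2)      # (1 4)(5 8), a reflection
G = generated_group({r, s, t})
print(len(G))          # 8: the dihedral group D4
assert compose(r, r) == s
```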
4. RANK AND MINIMAL IDEAL
Let m be an N-relation between two sets P and Q. The rank of m is the minimum of the cardinalities of the sets R such that there exist two N-relations c ∈ N^{P×R} and l ∈ N^{R×Q} with

    m = cl.    (4.1)

It is denoted by rank(m). It is a nonnegative integer or +∞. A pair (c, l) satisfying (4.1) is a minimal decomposition if there exists no factorization m = c'l' with c' ∈ N^{P×R'}, l' ∈ N^{R'×Q} and R' strictly contained in R. If rank(m) is finite, this is equivalent to saying that Card(R) is minimal. The following properties are straightforward. First,

    rank(nmn') ≤ rank(m),

since each decomposition (c, l) of m induces a decomposition (nc, ln') of nmn'. Second,

    rank(m) ≤ min{Card(P), Card(Q)}.
If (c, l) is a minimal decomposition of m, then rank(m) = rank(c) = rank(l). Further,

    rank(m) = 0  ⟺  m = 0.

If P' ⊂ P, Q' ⊂ Q, and if m' is the restriction of m to P' × Q', then rank(m') ≤ rank(m).

We get from the first inequality that two 𝒥-equivalent elements of a monoid of N-relations have the same rank. Thus the rank is constant on a 𝒟-class.

Consider two N-relations m ∈ N^{P×S} and n ∈ N^{S×Q}. The pair (m, n) is called trim if no column of m is null and no row of n is null. This is equivalent to saying that for all s ∈ S, there exists at least one pair (p, q) ∈ P × Q such that m_{ps} ≠ 0 and n_{sq} ≠ 0.
PROPOSITION 4.1  Let n ∈ N^{P×R}, m ∈ N^{R×R}, n' ∈ N^{R×Q} be three N-relations.
1. If nn' is unambiguous and n has no null column, then n' is unambiguous.
2. If nn' is unambiguous and n' has no null row, then n is unambiguous.
3. If nn' is unambiguous and (n, n') is trim, then n and n' are unambiguous.
4. If nmn' is unambiguous and (n, n') is trim, then m is unambiguous.
Proof  1. Let r ∈ R, q ∈ Q. There exists p ∈ P such that (p, n, r) ≠ 0. Then

    1 ≥ (p, nn', q) ≥ (p, n, r)(r, n', q) ≥ (r, n', q).

Consequently (r, n', q) ∈ {0, 1} and n' is unambiguous. Claim 2 is shown symmetrically, and claim 3 is a result of claims 1 and 2.
4. Since n' has no null row, nm is unambiguous by 2. Since n has no null column, m is unambiguous by 1. □

PROPOSITION 4.2  Let m be an unambiguous relation and let (c, l) be a minimal decomposition of m. Then the pair (c, l) is trim and both c and l are unambiguous.
Proof  Assume that c contains a null column. Then we can delete this column and the row of l with the same index without changing the value of the product. But this implies that (c, l) is not a minimal decomposition. Thus no column of c is null, and symmetrically no row of l is null. Consequently (c, l) is trim. By Proposition 4.1, c and l are unambiguous. □

EXAMPLE 4.1  Let m ∈ {0, 1}^{P×Q} be a bijection. Then rank(m) = Card(P) = Card(Q). For the proof, we first show that rank(I_Q) = Card(Q). Indeed, let I_Q = cl be a minimal decomposition of I_Q, with c ∈ {0, 1}^{Q×R} and l ∈ {0, 1}^{R×Q}. Note first that if c_{qr} = 1 then, since no row of l is null, l_{rq'} = 1 for some q'. This implies q = q', i.e., l_{rq} = 1. Next, for each q ∈ Q, there exists a unique r ∈ R such that c_{qr} = 1. The existence results from the equality I_Q = cl. Thus c is a function from Q into R. It is injective, since c_{qr} = 1 implies l_{rq} = 1; thus c_{q'r} = 1 would imply (cl)_{q'q} = 1, whence q' = q. This shows that Card(R) ≥ Card(Q). Consequently Card(R) = Card(Q), and rank(I_Q) = Card(Q). Now let n be the reciprocal bijection of m. Then Card(P) = rank(I_P) = rank(mn) ≤ rank(m). Thus rank(m) = Card(P).
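On very small matrices, Example 4.1 can be checked by exhaustive search over all factorizations m = cl through an intermediate set of size k. A brute-force sketch (usable only for tiny sizes; the names are our own):

```python
from itertools import product

def bool_mult(c, l):
    """Boolean product of a P x R and an R x Q 0/1 matrix."""
    R, Q = len(l), len(l[0])
    return tuple(tuple(int(any(cp[r] and l[r][q] for r in range(R)))
                       for q in range(Q)) for cp in c)

def bool_rank(m):
    """Smallest k with m = c.l, c in {0,1}^(P x k), l in {0,1}^(k x Q)."""
    P, Q = len(m), len(m[0])
    if all(x == 0 for row in m for x in row):
        return 0
    for k in range(1, min(P, Q) + 1):
        for c in product(product((0, 1), repeat=k), repeat=P):
            for l in product(product((0, 1), repeat=Q), repeat=k):
                if bool_mult(c, l) == m:
                    return k
    return None   # unreachable: k = min(P, Q) always succeeds

I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
assert bool_rank(I3) == 3            # Example 4.1: rank(I_Q) = Card(Q)
assert bool_rank(((1, 1), (1, 1))) == 1
```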
EXAMPLE 4.2  The notion of rank that we defined in Section II.6 is a special case of the rank defined here. Let indeed 𝒜 = (Q, i, T) be a deterministic automaton, and let φ = φ_𝒜. We shall see that for w ∈ A*,

    rank(φ(w)) = Card(Q · w) = rank_𝒜(w).

Let R = Q · w. Then φ(w) = cl, where c is the restriction of φ(w) to Q × R and where l ∈ {0, 1}^{R×Q} is defined,
for r ∈ R and q ∈ Q, by

    l_{rq} = 1  if q = r,    and l_{rq} = 0 otherwise.

This shows that rank(φ(w)) ≤ Card(R). Next, let T ⊂ Q be a cross section of φ(w). Then the restriction m of φ(w) to T × R is a bijection, whence Card(R) = rank(m) ≤ rank(φ(w)).

EXAMPLE 4.3  The above example shows that if an unambiguous relation m ∈ {0, 1}^{Q×Q} is a function, then rank(m) = Card(Im(m)).

EXAMPLE 4.4  Let Q be a finite set. The rank of an N-relation m over Q has strong connections with the usual notion of rank as defined in linear algebra. Let K be a field containing N. The rank over K (or K-rank) is, as is well known, the maximal number of rows (or columns) which are linearly independent over K. We can observe (Exercise 4.1) that this number may be defined in a manner analogous to the definition of the rank over N. In particular,

    rank_K(m) ≤ rank(m).

It is easy to see (Exercise 4.2) that the inequality usually is strict. However, in the case of relations which are functions, the two notions coincide.

The following proposition gives an easy method for computing the rank of an idempotent relation.
PROPOSITION 4.3  Let e be an unambiguous idempotent relation. Then rank(e) = Card(Fix(e)).

Proof  Set S = Fix(e). The column-row decomposition of e shows that rank(e) ≤ Card(S). Moreover, in view of Proposition 3.3, the matrix e contains the identity matrix I_S. Thus Card(S) = rank(I_S) ≤ rank(e). □
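Proposition 4.3 is easy to observe on a concrete unambiguous idempotent: the column-row decomposition through S = Fix(e) recovers e, giving rank(e) ≤ Card(Fix(e)). A sketch with an assumed 3-state example of our own:

```python
# e is an unambiguous idempotent relation over Q = {0, 1, 2};
# S = Fix(e) = {0, 1}, and the column-row decomposition c.l of e
# through S recovers e (Proposition 4.3 then gives rank(e) = Card(S)).
def mult(a, b):
    """Integer matrix product; all entries <= 1 certifies unambiguity."""
    return tuple(tuple(sum(a[p][r] * b[r][q] for r in range(len(b)))
                       for q in range(len(b[0]))) for p in range(len(a)))

e = ((1, 0, 1),
     (0, 1, 0),
     (0, 0, 0))
assert mult(e, e) == e                       # idempotent, unambiguously
S = [q for q in range(3) if e[q][q] == 1]    # fixed points: [0, 1]
c = tuple(tuple(row[s] for s in S) for row in e)   # restriction to Q x S
l = tuple(e[s] for s in S)                         # restriction to S x Q
assert mult(c, l) == e                       # column-row decomposition
print(len(S))                                # rank(e) = Card(Fix(e)) = 2
```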
The notion of rank is of interest only when it is finite. In the case of unambiguous relations, there exists an interesting characterization of relations of finite rank.

PROPOSITION 4.4  For any unambiguous relation m, the following conditions are equivalent: (i) m has finite rank, (ii) the set of rows of m is finite, (iii) the set of columns of m is finite.
Proof  (i) ⇒ (ii) Let m = cl, with c ∈ N^{P×R} and l ∈ N^{R×Q}, be a minimal decomposition of m. By Proposition 4.2, the relations c and l are unambiguous. Now, if two rows of c, say with indices p and q, are equal, then the corresponding rows of m also are equal. Since R is finite, the matrix c has at most 2^Card(R) distinct rows. Thus the set of rows of m is finite.
(ii) ⇒ (i) Let (m_r)_{r∈R} be a set of representatives of the rows of m. Then m = cl, where l is the restriction of m to R × Q, and c ∈ {0, 1}^{P×R} is defined by

    c_{pr} = 1  if row p of m equals m_r,    and c_{pr} = 0 otherwise.

This shows (i) ⟺ (ii). The proof of (i) ⟺ (iii) is identical. □

Let M be a monoid of unambiguous relations over Q. The minimal rank of M is the minimum of the ranks of the elements of M other than the null relation. It is denoted by r(M):

    r(M) = min{rank(m) | m ∈ M − {0}}.

If M does not contain the null relation over Q, this is of course the minimum of the ranks of the elements of M. We have r(M) > 0, and r(M) < ∞ iff M contains a relation of finite positive rank.

We now study the monoids having finite minimal rank, and we shall see that they have a rather precise structure. We must distinguish two cases: the case where the monoid contains the null relation, and the easier case where it does not. Note that the null relation plays the role of a zero, in view of the following, more precise statement. If M is a transitive monoid of unambiguous relations over Q, then M contains a zero iff M contains the null relation. Indeed, the null relation always is a zero. Conversely, if M has a zero z, let us prove that z is the null relation. If Card(Q) = 1, then z = 0. Thus we assume Card(Q) ≥ 2 and z ≠ 0. Let p, q ∈ Q be such that z_{pq} = 1. Let r, s ∈ Q. By transitivity of M, there exist m, n ∈ M such that

    m_{rp} = n_{qs} = 1.

From mzn = z, it follows that z_{rs} = 1. Thus z_{rs} = 1 for all r, s ∈ Q, which contradicts the unambiguity of M.

If M is an unambiguous monoid of relations over Q, then for q ∈ Q, let

    Stab(q) = {m ∈ M | qmq}.

THEOREM 4.5  Let M be a transitive monoid of unambiguous relations over Q, containing the relation 0, and having finite minimal rank.
1. M contains a unique 0-minimal ideal J, which is the union of 0 and of the set K of elements of rank r(M).
2. The set K is a regular 𝒟-class whose ℋ-classes are finite.
3. Each q ∈ Q is a fixed point of at least one idempotent e in K, i.e., e ∈ K ∩ Stab(q).
4. For each idempotent e ∈ K, the group G_e is a transitive group of degree r(M).
5. The groups G_e, for e idempotent in K, are equivalent.

Before we proceed to the proof, we establish several preliminary results.
PROPOSITION 4.6  Let M be a monoid of unambiguous relations over Q. If m ∈ M has finite rank, then the semigroup generated by m is finite.

Proof  Let m = cl be a minimal decomposition of m, with c ∈ {0, 1}^{Q×R} and l ∈ {0, 1}^{R×Q}. Let u be the N-relation over R defined by u = lc. Then for all n ≥ 0,

    m^(n+1) = c(lc)^n l = c u^n l.

Since the pair (c, l) is trim, each relation u^n is unambiguous by Proposition 4.1. Since R is finite, the set of relations u^n is finite, and the semigroup {m^n | n ≥ 1} is finite. □

In particular, it follows from this proposition that for any unambiguous relation of finite rank, a convenient power is an idempotent relation.
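The finiteness argument of Proposition 4.6 can be watched in action: with m = cl and u = lc, the powers of m are c u^n l and must eventually cycle. A small sketch (the matrices are our own example, with R of size 2):

```python
# m = c.l with Card(R) = 2, so u = l.c lives over a 2-element set and
# the powers of m cycle quickly (here m^3 = m).
def bool_mult(a, b):
    return tuple(tuple(int(any(a[p][r] and b[r][q] for r in range(len(b))))
                       for q in range(len(b[0]))) for p in range(len(a)))

c = ((1, 0), (0, 1), (1, 0))          # Q x R, no null column
l = ((0, 1, 0), (1, 0, 1))            # R x Q, no null row: (c, l) is trim
m = bool_mult(c, l)
u = bool_mult(l, c)

powers = []
cur = m
while cur not in powers:              # collect m, m^2, ... until they repeat
    powers.append(cur)
    cur = bool_mult(cur, m)
print(len(powers))                    # 2: the semigroup of powers is {m, m^2}
assert bool_mult(bool_mult(m, m), m) == m     # m^3 = m
assert bool_mult(u, u) == ((1, 0), (0, 1))    # u^2 = I_R
```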
PROPOSITION 4.7  Let M be a monoid of unambiguous relations over Q, and let e ∈ M be an idempotent. If e has finite rank, then the localized monoid eMe is finite.

Proof  Let S be the set of fixed points of e. By Proposition 4.3, the set S is finite. Thus the monoid M_e, which is a monoid of unambiguous relations over S, is finite. Since the monoid eMe is isomorphic to M_e, it is finite. □

We now verify a technical lemma which is useful to "avoid" the null relation.
LEMMA 4.8  Let M be a transitive monoid of unambiguous relations over Q.
1. For all m ∈ M − 0, there exist n ∈ M and q ∈ Q such that

    mn ∈ Stab(q)    (resp., nm ∈ Stab(q)).

Thus in particular mn ≠ 0 (resp., nm ≠ 0).
2. For all m ∈ M − 0 and q ∈ Q, there exist n, n' ∈ M such that nmn' ∈ Stab(q).
3. For all m, n ∈ M − 0, there exists u ∈ M such that mun ≠ 0. In other terms, the monoid M is prime.
Proof  1. Let q, r ∈ Q be such that (q, m, r) = 1. Since M is transitive, there exists n ∈ M such that (r, n, q) = 1. Thus (q, mn, q) = 1.
2. There exist p, r ∈ Q such that (p, m, r) = 1. Let n, n' ∈ M be such that (q, n, p) = 1 and (r, n', q) = 1. Then (q, nmn', q) = 1.
3. There exist p, r, s, q ∈ Q such that (p, m, r) = (s, n, q) = 1. Take u ∈ M with (r, u, s) = 1. Then (p, mun, q) = 1. □
PROPOSITION 4.9  Let M be a transitive monoid of unambiguous relations over Q, having finite minimal rank. Each right ideal R ≠ 0 (resp., each left ideal L ≠ 0) of M contains a nonnull idempotent.

Proof  Let r ∈ R − 0. By Lemma 4.8, there exist n ∈ M and q ∈ Q such that rn ∈ Stab(q). Let m ∈ M be an element such that rank(m) = r(M). Again by Lemma 4.8, there exist u, v ∈ M such that umv ∈ Stab(q). Consider the element m' = rnumv. Then m' ∈ R and m' ∈ Stab(q). Since rank(m') ≤ rank(m), the rank of m' is finite. According to Proposition 4.6, the semigroup generated by m' is finite. Thus there exists k ≥ 1 such that e = (m')^k is idempotent. Then e ∈ R, and e ≠ 0 since e ∈ Stab(q). □
PROPOSITION 4.10  Let M be a transitive monoid of unambiguous relations over Q, having finite minimal rank and containing the null relation. For all m ∈ M, the following conditions are equivalent: (i) rank(m) = r(M), (ii) the right ideal mM is 0-minimal, (iii) the left ideal Mm is 0-minimal.

Proof  (i) ⇒ (ii) Let R ≠ {0} be a right ideal contained in mM. We show that R = mM. According to Proposition 4.9, R contains an idempotent e ≠ 0. Since e ∈ R ⊂ mM, there exists n ∈ M such that e = mn. Since rank(e) ≤ rank(m) and rank(m) is minimal, we have rank(e) = rank(m). Let m = cl be a minimal decomposition of m, with c ∈ {0, 1}^{Q×S}, l ∈ {0, 1}^{S×Q}. Then

    e = (cl)n = c(ln).

Since rank(e) = r(M) = Card(S), the pair (c, ln) is a minimal decomposition of e. In particular, it is trim. For all k ≥ 0,

    e = e^(k+1) = c(lnc)^k ln.

By Proposition 4.1(4), the relation (lnc)^k is unambiguous for each k. Thus (lnc)^k ∈ {0, 1}^{S×S} and, S being finite, there exists an integer i ≥ 1 such that (lnc)^i is idempotent. Since rank((lnc)^i) = Card(S), each element of S is a fixed point of (lnc)^i. Consequently (lnc)^i = I_S. Thus

    em = e^i m = (cln)^i m = (cln)^i cl = c(lnc)^i l = cl = m.
The equality em = m shows that m ∈ R, whence R = mM. Thus mM is a 0-minimal right ideal.
(ii) ⇒ (i) Let n ∈ M be such that rank(n) = r(M). By Lemma 4.8, there exists u ∈ M such that mun ≠ 0. From munM ⊂ mM, we get munM = mM, whence m ∈ munM. Thus rank(m) ≤ rank(n), showing that rank(m) = rank(n).
(i) ⟺ (iii) is shown in the same way. □

Proof of Theorem 4.5  1. By Lemma 4.8, the monoid M is prime. According to Proposition 4.10, the monoid M contains 0-minimal left and right ideals. In view of Corollary 0.5.10, the monoid M contains a unique 0-minimal ideal J, which is the union of the 0-minimal right ideals (resp., left ideals). Once more by Proposition 4.10, J is the union of 0 and of the set K of elements of minimal positive rank. This proves claim 1.
2. In view of Corollary 0.5.10, the set K is a regular 𝒟-class. All the ℋ-classes of K have the same cardinality by Proposition 0.5.3. The finiteness of these classes will result from claim 4.
3. Let q ∈ Q and k ∈ K. By Lemma 4.8, nkn' ∈ Stab(q) for some n, n' ∈ M. Since the semigroup generated by m = nkn' is finite (Proposition 4.6), it contains an idempotent e. Then e ∈ K ∩ Stab(q).
4. Let e be idempotent in K, and let H be its ℋ-class. Then

    H ∪ 0 = eM ∩ Me = eMe = G(e) ∪ 0.

The first equality is a result of the fact that the ℛ-class of e is eM − 0 (and, symmetrically, its ℒ-class is Me − 0). Next, eMe ⊂ eM ∩ Me; conversely, if n ∈ eM ∩ Me, then en = ne = n, whence n = ene ∈ eMe. This shows the second equality. Finally, G(e) = H since H is a group. According to Proposition 3.4, we have M_e = G_e ∪ 0 and M_e is transitive. Thus G_e is a transitive permutation group. Its degree is r(M).
5. This is a direct consequence of Proposition 3.5. □

Now let M be a monoid of unambiguous relations that does not contain the null relation. Theorem 4.5 admits a completely analogous formulation, which goes as follows.

THEOREM 4.11  Let M be a transitive monoid of unambiguous relations over Q which does not contain the null relation and which has finite minimal rank.
1. The set K of elements of rank r(M) is the minimal ideal of M.
2. The set K is a regular 𝒟-class and is a union of finite groups.
3. Each q ∈ Q is the fixed point of at least one idempotent e in K, i.e., e ∈ K ∩ Stab(q).
4. For each idempotent e ∈ K, the group G_e is a transitive group of degree r(M), and these groups are equivalent.

Proof  Let M_0 be the monoid of unambiguous relations M_0 = M ∪ 0.
We have r(M) = r(M_0). Thus Theorem 4.5 applies to M_0. For all m in M, we have mM_0 = mM ∪ 0. It follows easily that mM is a minimal right ideal of M iff mM_0 is a 0-minimal right ideal of M_0. The same holds for left ideals and for two-sided ideals. In particular, the 0-minimal ideal J of M_0 is the union of 0 and of the minimal ideal K of M. This proves 1. Next, K is a 𝒟-class of M_0, thus also of M. Since the product of two elements of M is never 0, each ℋ-class of K is a group. This proves 2. The other claims require no proof. □

Let M be a transitive monoid of unambiguous relations over Q, of finite minimal rank, and let

    K = {m ∈ M | rank(m) = r(M)}.

The groups G_e, for e idempotent in K, are equivalent transitive permutation groups. The Suschkewitch group of M is, by definition, any one of them.

5. VERY THIN CODES
A code X ⊂ A+ is called very thin if there exists a word x in X* which is not a factor of a word in X. Recall that F(X) is the set of factors of words in X, and that F̄(X) = A* − F(X). With this notation, X is very thin iff

    X* ∩ F̄(X) ≠ ∅.
X* n F(x) z 0. Any very thin code is thin (i.e., satisfies F(X)# a). Conversely, a thin code is not always very thin (see Example 5.4). However, a thin complete code X is very thin. Consider indeed a word w E F ( X ) .Since X is complete, there exist u, v c A* such that U W D E X*.Then uwv E X* n F(X). The aim of this section is to prove the following result. It shows, in particular, that a recognizable code is very thin. This is more precise than Proposition 1.5.12, which only asserts that a recognizable code is thin. 5.1 Let X c A + be a code and let S = (Q, 1,l) be an unamTHEOREM biguous trim automaton recognizing X * . The following conditions are equivalent.
(i) X is very thin. (ii) The monoid qd(A*) has finite minimal rank. The proof of this result is in several steps. We start with the following property used to prove the implication (i) =$ (ii).
PROPOSITION 5.2  Let X ⊂ A+ be a code and let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*. For all w ∈ F̄(X), the rank of φ_𝒜(w) is finite.
Proof  Let us write φ instead of φ_𝒜. For each p ∈ Q, let Φ(p) be the set of left factors of w which are labels of paths from p to 1:

    Φ(p) = {u ∈ A* | u is a left factor of w and p φ(u) 1}.

We now show that if Φ(p) = Φ(p') for some p, p' ∈ Q, then the rows of index p and p' in φ(w) are equal. Consider a q ∈ Q such that p φ(w) q. Since the automaton is trim, there exist u, u' ∈ A* such that 1 φ(u) p and q φ(u') 1. Thus 1 φ(uwu') 1, and consequently uwu' ∈ X*. Since w ∈ F̄(X), the path from p to q labeled w is not simple; therefore there exist v, v' ∈ A* such that w = vv' and uv, v'u' ∈ X*. Thus there is, in 𝒜, the path

    1 --u--> p --v--> 1 --v'--> q --u'--> 1.

By definition, v ∈ Φ(p). Thus v ∈ Φ(p'). It follows that p' φ(v) 1 and 1 φ(v') q, and consequently p' φ(w) q. This proves the claim.

The number of sets Φ(p), for p ∈ Q, is finite. According to the claim just proved, the set of rows of φ(w) also is finite. By Proposition 4.4, this implies that φ(w) has finite rank. □

EXAMPLE 5.1  Let X be the code X = {a^n b a^n | n ≥ 0}. This is a very thin code, since b^2 ∈ X* ∩ F̄(X). An automaton recognizing X* is given in Fig. 4.24. The image e of b^2 in the associated monoid of relations M is idempotent of rank 1. The finiteness of the rank also results from Proposition 5.2, since b^2 is not a factor of a word in X. The localized monoid eMe is reduced to e and 0 (which is the image of b^2 a b^2, for example). The monoid M has elements of infinite rank, for instance the image of a: indeed, clearly no power of this element can be idempotent; hence, by Proposition 4.6, it has infinite rank. Moreover, M has elements of finite rank n + 1 for each integer n ≥ 0: the image of b a^n b a^n b has rank n + 1.
Fig. 4.24  An automaton for X*.

PROPOSITION 5.3  Let X be a code over A, let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*, let φ be the associated representation and M = φ(A*).
Let e be an idempotent in φ(X*) with finite rank, and such that the group G_e is transitive.
1. There exist u_1, u_2, ..., u_{n+1} ∈ φ^(-1)(G(e)) with the following property: for all y, z ∈ A* such that y u_1 u_2 ··· u_{n+1} z ∈ X*, there exists an integer i (1 ≤ i ≤ n) such that

    y u_1 ··· u_i ∈ X*    and    u_{i+1} ··· u_{n+1} z ∈ X*.

2. The set φ^(-1)(e) ∩ F̄(X) is nonempty.

Proof  Let e = cl be the column-row decomposition of e, let S be the set of its fixed points, and let G = G(e) be the group of invertible elements of the monoid M(e) = eMe. The restriction γ: M(e) → M_e is the isomorphism m ↦ lmc, and its inverse is the function n ↦ cnl. The set S contains the element 1, since e ∈ φ(X*). Set S = {1, 2, ..., n}.

We first rule out the case where φ^(-1)(e) = {1}. Then e is the neutral element of M. Thus S = Q. Further, for p, q ∈ Q there exists g ∈ G(e) (= G_e) such that pgq and qg^(-1)p. If u, v ∈ A* are such that φ(u) = g and φ(v) = g^(-1), then φ(uv) = e and consequently u = v = 1. This shows that Q has just one element; but φ^(-1)(e) = A*, whence A = X = ∅. Thus the result holds trivially.

We now assume that φ^(-1)(e) ≠ {1}. Choose elements g_2, g_3, ..., g_n ∈ G_e such that

    2 g_2 1,    3 g_2 g_3 1,    ...,    n g_2 g_3 ··· g_n 1.

These elements exist because G_e is a transitive permutation group. The permutations g_2, g_3, ..., g_n are the restrictions to S of elements h_2, h_3, ..., h_n of G(e). Each h_i is equal to h_i = c g_i l. Let u_1, u_2, ..., u_{n+1} ∈ A+ be such that

    φ(u_1) = φ(u_{n+1}) = e,    φ(u_2) = h_2, ..., φ(u_n) = h_n.

Set w = u_1 u_2 ··· u_{n+1}. Note that

    φ(w) = e h_2 ··· h_n e = c(g_2 ··· g_n)l.

Consider words y, z ∈ A* such that ywz ∈ X*. Then there exist p, q ∈ Q such that

    1 --y--> p --w--> q --z--> 1.
Since p φ(w) q, there exist r, s ∈ S such that

    p c r,    r (g_2 ··· g_n) s,    s l q.

Then r g_2 ··· g_r 1 (with g_2 ··· g_r = 1 when r = 1). Since the g_i's are functions, this implies

    1 g_{r+1} ··· g_n s.

Consequently

    r h_2 ··· h_r 1,    1 h_{r+1} ··· h_n s,

and since p c r = p e r and s l q = s e q, we have

    p (e h_2 ··· h_r) 1,    1 (h_{r+1} ··· h_n e) q.

This implies that

    y u_1 u_2 ··· u_r ∈ X*    and    u_{r+1} ··· u_{n+1} z ∈ X*.

Thus the words u_1, ..., u_{n+1} satisfy the first statement.

To show the second part, we verify that the word w = u_1 u_2 ··· u_{n+1} is in F̄(X). Assume indeed that ywz ∈ X for some y, z ∈ A*. Then there exists an integer i (1 ≤ i ≤ n) such that y u_1 ··· u_i, u_{i+1} ··· u_{n+1} z ∈ X*. Since u_1, ..., u_{n+1} ∈ A+, these two words are in fact in X+, contradicting the fact that X is a code. Thus w ∈ F̄(X). Let h' be the inverse of h = φ(w) in G(e), and let w' be such that φ(w') = h'. Then ww' ∈ φ^(-1)(e), and also ww' ∈ F̄(X). This concludes the proof. □

Proof of Theorem 5.1  (i) ⇒ (ii) Let x ∈ X* ∩ F̄(X). According to Proposition 5.2, the rank of φ_𝒜(x) is finite. Since x ∈ X*, we have (1, φ_𝒜(x), 1) = 1 and thus φ_𝒜(x) ≠ 0. This shows that φ_𝒜(A*) has finite minimal rank.
(ii) ⇒ (i) The monoid M = φ_𝒜(A*) is a transitive monoid of unambiguous relations having finite minimal rank r(M). Let

    K = {m ∈ M | rank(m) = r(M)}.
By Theorems 4.5 and 4.11, there exists an idempotent e in K ∩ Stab(1), and the group G_e is transitive of degree r(M). By Proposition 5.3, the set φ_𝒜^(-1)(e) ∩ F̄(X) is not empty. Since φ_𝒜^(-1)(e) ⊂ X*, the code X is very thin. □

We now examine a series of consequences of Theorem 5.1.
COROLLARY 5.4  Let X be a complete code, and let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*. The following conditions are equivalent:
(i) X is thin;
(ii) the monoid φ_𝒜(A*) contains elements of finite rank.
Proof  Since X is complete, the monoid φ_𝒜(A*) does not contain the null relation (Proposition 1.8). Thus the result follows directly from Theorem 5.1. □
Another consequence of Theorem 5.1 is an algebraic proof, independent of measures, of Theorem 1.5.7.

COROLLARY 5.5  If X is a thin complete code, then X is a maximal code.

Proof  Let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X* and let φ be the associated representation. Let x ∈ X* be such that e = φ(x) is an idempotent of the minimal ideal J of the monoid φ(A*). (Such an idempotent exists by Theorem 4.11, claim 3.) Let y ∉ X. Then eφ(y)e = φ(xyx) is in the ℋ-class of e. This ℋ-class is a finite group. Thus there exists an integer n ≥ 1 such that (φ(xyx))^n = e. Consequently (xyx)^n ∈ X*. This shows that X ∪ y is not a code. □

Let X ⊂ A+ be a code and let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*. We have shown that X is very thin iff the monoid M = φ_𝒜(A*) has elements of finite, positive rank. Let r be the minimum of these nonzero ranks, and let K be the set of elements of M of rank r. Set φ = φ_𝒜. It is useful to keep in mind the following facts.

1. φ(X*) meets K. Indeed, φ(X*) = Stab(1) and, according to Theorems 4.5 and 4.11, K meets Stab(1).
2. Every ℋ-class H contained in K that meets φ(X*) is a group. Moreover, φ(X*) ∩ H is a subgroup of H. These ℋ-classes are those which contain an idempotent having 1 as a fixed point.

Indeed, let H be an ℋ-class meeting φ(X*), and let h ∈ H ∩ φ(X*). Then h^2 is not the null relation, since h^2 ∈ Stab(1). Thus h^2 ∈ H and consequently H is a group (Proposition 0.5.8). Let N = H ∩ φ(X*). Since φ(X*) is a stable submonoid of M, N is a stable submonoid of H, hence a subgroup (Example 1.2.2).

Figure 4.25 represents, with slashed triangles, the intersection K ∩ φ(X*). It expresses that the ℋ-classes of K meeting φ(X*) "form a rectangle" in K (see Exercise 4.3). Collecting together these facts, we have proved:

THEOREM 5.6  Let X ⊂ A+ be a very thin code, let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*, and let K be the set of elements of minimal nonzero rank in the monoid M = φ_𝒜(A*).
1. φ_𝒜(X*) meets K.
2. Any ℋ-class H in K that meets φ_𝒜(X*) is a group. Moreover, H ∩ φ_𝒜(X*) is a subgroup of H.
3. The ℋ-classes of K meeting φ_𝒜(X*) are those whose idempotent has the state 1 as a fixed point. □
Fig. 4.25  The minimal ideal.
Another consequence of the results of this section is the proof of the following lemma, which was stated without proof in Chapter I (Lemma 1.6.4).

LEMMA 5.7  Let X be a complete thin code. For any word u ∈ X* there exists a word w ∈ X*uX* satisfying the following property: if ywz ∈ X*, then there exists a factorization w = fug such that yf, gz ∈ X*.
Proof  Let φ be the representation associated with some unambiguous trim automaton recognizing X*. Since X is thin, the monoid M = φ(A*) has a minimal ideal J, and since X is complete, M has no zero. Thus φ(X+) meets J. Let e be an idempotent in φ(X*) ∩ J. The group G_e is transitive and, according to Proposition 5.3, there exist words u_1, u_2, ..., u_{n+1} ∈ φ^(-1)(G(e)) such that the word v = u_1 u_2 ··· u_{n+1} has the following property: if yvz ∈ X* for some y, z ∈ A*, then there exists an integer i such that y u_1 ··· u_i, u_{i+1} ··· u_{n+1} z ∈ X*.

We have eφ(u)e ∈ eMe = G(e), and eφ(u)e ∈ φ(X*). Since G(e) ∩ φ(X*) is a subgroup of G(e), there exists h ∈ G(e) ∩ φ(X*) such that eφ(u)eh = e. Since h = eh, we have eφ(u)h = e. Consider words r ∈ φ^(-1)(e), s ∈ φ^(-1)(h), set u' = rus, and consider the word

    w = u' u_1 u' u_2 ··· u' u_{n+1} u'.

Let y, z ∈ A* be words such that ywz ∈ X*. Since φ(u') = e, we have

    φ(w) = φ(v).

Consequently yvz also is in X*; thus, for some integer i,

    y u_1 u_2 ··· u_i,  u_{i+1} ··· u_{n+1} z ∈ X*.

Observe that φ(u_1 u_2 ··· u_i) = φ(u' u_1 u' u_2 ··· u' u_i) and φ(u_{i+1} ··· u_{n+1}) = φ(u_{i+1} u' ··· u' u_{n+1} u'). Thus y u' u_1 u' u_2 ··· u' u_i and u_{i+1} u' ··· u' u_{n+1} u' z also are in X*. Let

    f = u' u_1 u' u_2 ··· u' u_i r,    g = s u_{i+1} u' ··· u' u_{n+1} u'.

Then w = fug. Since r, s ∈ X*, we have yf, gz ∈ X*, and this shows that the word w = fug satisfies the property of the statement. □

Finally, we note that for complete thin codes, some of the information concerning the minimal ideal is characteristic of prefix, suffix, or biprefix codes.
PROPOSITION 5.8  Let X be a thin complete code over A, let φ be the representation associated with an unambiguous trim automaton 𝒜 = (Q, 1, 1) recognizing X*, let M = φ(A*) and let J be its minimal ideal. Let H_0, R_0, L_0 be an ℋ-, ℛ-, ℒ-class of J, respectively, such that H_0 = R_0 ∩ L_0 and φ(X*) ∩ H_0 ≠ ∅.
1. X is prefix iff φ(X*) meets every ℋ-class in L_0.
2. X is suffix iff φ(X*) meets every ℋ-class in R_0.
3. X is biprefix iff φ(X*) meets all ℋ-classes in J.

Proof  (1) Let H be an ℋ-class in L_0, let e_0 be the idempotent of H_0 and let e be the idempotent of H (each ℋ-class in J is a group). We have e_0 e = e_0, since e ∈ L_0 (for some m, we have me = e_0; consequently e_0 e = mee = me = e_0). If X is prefix, then φ(X*) is right unitary. Since e_0 ∈ φ(X*) and e_0 = e_0 e, it follows that e ∈ φ(X*). Thus H ∩ φ(X*) ≠ ∅.

Conversely, let us show that φ(X*) is right complete. Let m ∈ M. Then me_0 ∈ L_0; thus me_0 ∈ H for some ℋ-class H ⊂ L_0. If n is the inverse of me_0 in the group H, then me_0 n ∈ φ(X*). Thus φ(X*) is right complete and X is prefix.

The proof of (2) is symmetric, and (3) results from the preceding points. □

Let X ⊂ A* be a thin maximal prefix code, and let 𝒜 = (Q, 1, 1) be a complete deterministic automaton recognizing X*. The monoid M = φ_𝒜(A*) then is a monoid of (total) functions. We will write, for m ∈ M, qm = q' instead of (q, m, q') = 1. Let m ∈ M and w ∈ A* with m = φ(w). The image of m is Im(m) = Qm = Q · w, and the nuclear equivalence of m, denoted by Ker(m), is defined by
    q ≡ q' (Ker(m))  ⟺  qm = q'm.
The number of classes of the equivalence relation Ker(m) is equal to Card(Im(m)); both are equal to rank(m), in view of Example 4.3. A maximal nuclear equivalence is an equivalence which is maximal among the nuclear equivalences of elements in M. It is an equivalence relation with a number of classes equal to r(M). A minimal image is, similarly, an image of cardinality r(M), i.e., an image which does not strictly contain any other image.
PROPOSITION 5.9  Let X ⊂ A+ be a thin maximal prefix code, let 𝒜 = (Q, 1, 1) be a deterministic, complete automaton recognizing X*, let
M = φ_𝒜(A*), and let K be the 𝒟-class of the elements of M of rank r(M). Then
1. there is a bijection between the minimal images and the ℒ-classes of K;
2. there is a bijection between the maximal nuclear equivalences and the ℛ-classes of K.

Proof  1. Let n, m ∈ M be two ℒ-equivalent elements. We prove that Im(m) = Im(n). There exist u, v ∈ M such that m = un, n = vm. Thus Qm = Qun ⊂ Qn, and also Qn ⊂ Qm. This shows that Im(m) = Im(n).

Conversely, let m, n ∈ K be such that Im(m) = Im(n). K being a regular 𝒟-class, the ℒ-class of m contains an idempotent, say e, and the ℒ-class of n contains an idempotent f (Proposition 0.5.7). Then Im(e) = Im(m) and Im(f) = Im(n), in view of the first part. Thus Im(e) = Im(f). We shall see that ef = e and fe = f. Let indeed q ∈ Q, and q' = qe. Then q' = qe^2 = q'e. Further, q' ∈ Im(e) = Im(f). Thus q' = q'f, since f is idempotent. Consequently qe = qef. Thus e = ef. The equality fe = f is shown by interchanging e and f. These relations imply e ℒ f. Thus m ℒ n.
2. The proof is entirely analogous. □
Note also that in the situation described above, every state appears in some minimal image. This is indeed the translation of Theorem 4.11(4). This description of the minimal ideal of a monoid of functions, by means of minimal images and maximal equivalences, appears to be particularly convenient.
EXAMPLE 5.2 Let X = {aa, ba, baa}. We consider the automaton given in Fig. 4.26.

Fig. 4.26 An automaton for X*.
The 0-minimal ideal of the corresponding monoid is formed of the elements of rank 1.
IV. AUTOMATA
For each element of this ideal one indicates, on top, its unique nonnull row and, on its left, its unique nonnull column. The existence of an idempotent is indicated by an asterisk in its ℋ-class. The column-row decomposition of an idempotent is then given directly by the vectors appearing in the rows and columns of the array. A companion array gives the fixed point of each idempotent.
EXAMPLE 5.3 Let X = {aa, ba, baa, bb, bba}. We consider the automaton given in Fig. 4.27. The corresponding monoid has no zero (the code is complete).
Fig. 4.27 An automaton for X*.
5.
VERY THIN CODES
The minimal ideal, formed of the elements of rank 1, is represented by an array of the same kind, and the fixed points of the idempotents are tabulated likewise.
EXAMPLE 5.4 Let A = {a, ā, b, b̄}. Denote by δ the congruence on A* generated by the relations aā ∼ 1, bb̄ ∼ 1. The class of 1 for the congruence δ is a biunitary submonoid. We denote by D₂′ the code generating this submonoid. The set D₂′* can be considered as the set of "systems of parentheses" with two types of brackets: a, b represent left brackets, and ā, b̄ the corresponding right brackets. The code D₂′ is thin, since D₂′ is not complete: indeed, for instance,
ab̄ ∉ F(D₂′*). However, D₂′ is not very thin: indeed, for all w ∈ D₂′*, we have awā ∈ D₂′, so that every word of D₂′* is a factor of a word of D₂′. The code D₂′ is biprefix. Let 𝒜(D₂′*) = (Q, 1, 1), let φ = φ_𝒜 and let M = φ(A*). The monoid M is isomorphic with the syntactic monoid of D₂′*. We have D₂′* = φ⁻¹(1). The monoid M contains a zero, and φ⁻¹(0) = F̄(D₂′*).
The only two-sided ideals of M are M and 0. Indeed, if m ∈ M − 0 and w ∈ φ⁻¹(m), then w ∈ F(D₂′*). Therefore, there exist u, v ∈ A* such that uwv ∈ D₂′*. Hence φ(u)mφ(v) = 1, whence 1 ∈ MmM and MmM = M. This shows that M itself is a 0-minimal ideal. Nonetheless, M does not contain any 0-minimal right ideal. Suppose the contrary. By Proposition 0.5.9, M would be the union of all 0-minimal right ideals. Thus any element of M − 0 would generate a 0-minimal right ideal. This is false, as we shall now see.
For all n ≥ 1, φ(āⁿ)M ⊃ φ(āⁿ⁺¹)M. This inclusion is strict: if φ(āⁿ) = φ(āⁿ⁺¹w) for some w ∈ A*, then aⁿāⁿ ∈ D₂′* would imply aⁿāⁿ⁺¹w ∈ D₂′*, whence āw ∈ D₂′*, which is clearly impossible. This example illustrates the fact that, for a code X which is not very thin, no automaton recognizing X* has a transition monoid with elements of finite, positive rank (Theorem 5.1).
6. GROUP AND DEGREE OF A CODE
Let X ⊂ A⁺ be a very thin code, let 𝒜*_D(X) be the flower automaton of X and let φ_D be the associated representation. By Theorem 5.1, the monoid φ_D(A*) has elements of finite, positive rank. The group of the code X is, by definition, the Suschkewitch group of the monoid φ_D(A*), defined at the end of Section 4. It is a transitive permutation group of finite degree. Its degree is equal to the minimal rank r(φ_D(A*)) of the monoid φ_D(A*).

We denote by G(X) the group of X. Its degree is, by definition, the degree of the code X and is denoted by d(X). Thus one has d(X) = r(φ_D(A*)). We already met a notion of degree in the case of thin maximal biprefix codes. We shall see below that the present and previous notions of degree coincide.

The definitions of G(X) and d(X) rely on the flower automaton of X. In fact, these concepts are independent of the automaton considered. In order to show this, we first establish a result which is interesting in its own right.
PROPOSITION 6.1 Let X ⊂ A⁺ be a thin code. Let 𝒜 = (P, 1, 1) and ℬ = (Q, 1, 1) be two unambiguous trim automata recognizing X*, and let φ and ψ be the associated representations. Let M = φ(A*), N = ψ(A*), Φ = φ(F̄(X)), Ψ = ψ(F̄(X)), let E be the set of idempotents in Φ, and E′ the set of idempotents in Ψ. Let ρ: P → Q be a reduction of 𝒜 onto ℬ and let ρ̂: M → N be the morphism associated with ρ. Then
1. ρ̂(E) = E′;
2. let e ∈ E and e′ = ρ̂(e); the restriction of ρ to Fix(e) is a bijection from Fix(e) onto Fix(e′), and the monoids M_e and N_e′ are equivalent.
Proof Since 𝒜 and ℬ recognize the same set, we have ρ⁻¹(1) = 1 (Proposition 2.3). The morphism ρ̂: M → N defined by ρ satisfies ψ = ρ̂ ∘ φ.
1. Let e ∈ E. Then ρ̂(e) = ρ̂(e²) = (ρ̂(e))². Thus ρ̂(e) is an idempotent. If e = φ(w) for some w ∈ F̄(X), then ρ̂(e) = ψ(w), whence ρ̂(e) ∈ Ψ. This shows that ρ̂(E) ⊂ E′.
Conversely, let e′ ∈ E′, and let w ∈ F̄(X) with e′ = ψ(w). Then φ(w) has finite rank by Proposition 5.2, and by Proposition 4.6 there is an integer n ≥ 1 such that (φ(w))ⁿ is an idempotent. Set e = (φ(w))ⁿ; then e = φ(wⁿ) and wⁿ ∈ F̄(X). Thus e ∈ E. Next, ρ̂(e) = ψ(wⁿ) = e′ⁿ = e′. This shows that ρ̂(E) = E′.
2. Let S = Fix(e), S′ = Fix(e′). Consider s ∈ S and let s′ = ρ(s). From (s, e, s) = 1, we get (s′, e′, s′) = 1. Thus ρ(S) ⊂ S′. Conversely, if (s′, e′, s′) = 1, then (p, e, q) = 1 for some p, q ∈ ρ⁻¹(s′). By Proposition 3.5(2), there exists s ∈ S such that (p, e, s) = (s, e, q) = 1. This implies (s′, e′, ρ(s)) = (ρ(s), e′, s′) = 1 and, by unambiguity, ρ(s) = s′. Thus ρ(S) = S′.

Now let s, t ∈ S be such that ρ(s) = ρ(t) = s′. If s = 1, then t = 1, since ρ⁻¹(1) = 1. Thus we may assume s, t ≠ 1. Since e ∈ Φ, there exist u, v, u′, v′ ∈ A⁺ such that φ(uv) = φ(u′v′) = e
and paths
s --u--> 1 --v--> s,    t --u′--> 1 --v′--> t.
This implies
s′ --u--> 1 --v--> s′,    s′ --u′--> 1 --v′--> s′,
whence, in particular, a path 1 --v--> s′ --u′--> 1 in ℬ. Since ρ⁻¹(1) = 1, this implies that there is also a path 1 --vu′--> 1 in 𝒜; this in turn implies that s --u--> 1 --vu′--> 1 --v′--> t or, equivalently, (s, e², t) = 1, that is, (s, e, t) = 1.

Since e is an idempotent and s, t ∈ S, this implies that s = t. Thus the restriction of ρ to S is a bijection from S onto S′. Since ρ̂(eMe) = e′Ne′, the restriction of ρ to S induces an isomorphism from M_e onto N_e′. ∎
PROPOSITION 6.2 Let X be a very thin code over A. Let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*, and let φ be the associated representation. Then the Suschkewitch group of φ(A*) is equivalent to G(X).

Proof According to Proposition 2.5, there exists a reduction from 𝒜*_D(X) onto 𝒜. Let e be a nonnull idempotent in the 0-minimal ideal of M = φ_D(A*). The image of e by the reduction is a nonnull idempotent e′ in the 0-minimal ideal of N = φ(A*). Since φ_D(F̄(X)) and φ(F̄(X)) are ideals (which are nonnull because they meet φ_D(X*) and φ(X*)), both e ∈ φ_D(F̄(X)) and e′ ∈ φ(F̄(X)). By the preceding proposition, M_e ≃ N_e′. Thus G(X) ≃ N_e′ − {0}, which is the Suschkewitch group of φ(A*). ∎
EXAMPLE 6.1 Let G be a transitive permutation group on a finite set Q, and let H be the subgroup of G stabilizing an element q of Q. Let φ be a morphism from A* onto G, and let X be the (group) code generating X* = φ⁻¹(H). The group G(X) then is equivalent to G, and d(X) is the number of elements of Q. In particular, we have for all n ≥ 1, G(Aⁿ) = ℤ/nℤ and d(Aⁿ) = n.

The preceding proposition shows that the group of a very thin code, and consequently also its degree, are independent of the automaton chosen. Thus we may expect that the degree reflects some combinatorial property of the code. This is indeed the case, as we shall now see.

Let X be a very thin code over A. An interpretation of a word w ∈ A* (with respect to X) is a triple
(d, x, g)  with  d ∈ A⁻X,  x ∈ X*,  g ∈ XA⁻,  and  w = dxg.
We denote by I(w) the set of interpretations of w. Two interpretations (d, x, g) and (d′, x′, g′) of w are adjacent, or meet, if there exist y, z, y′, z′ ∈ X* such that x = yz, x′ = y′z′, dy = d′y′, zg = z′g′ (see Fig. 4.28). Two interpretations which do not meet are called disjoint. A set Δ ⊂ I(w) is disjoint if its elements are pairwise disjoint. Let w ∈ A*. The degree of w with respect to X is the nonnegative number
δ_X(w) = max{Card(Δ) | Δ ⊂ I(w), Δ disjoint}.
Thus δ_X(w) is the maximal number of pairwise disjoint interpretations of w. Note that for w ∈ F̄(X),
δ_X(uwv) ≤ δ_X(w).
Fig. 4.28 Two adjacent interpretations.
Indeed, every interpretation of uwv gives rise to an interpretation of w, and disjoint interpretations of uwv have their restrictions to w also disjoint. Observe also that this inequality does not hold in general if w ∈ F(X). In particular, a word in F(X) may have no interpretation at all, whereas δ_X(w) is always at least equal to 1 for w ∈ F̄(X) ∩ X*.
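The definitions above can be tried out on a small finite code. The sketch below is ours, not the book's: it takes d to range over the (possibly empty) proper suffixes of codewords, g over the (possibly empty) proper prefixes, and tests membership in X* by dynamic programming; δ_X(w) would then be the maximal number of pairwise disjoint triples among those returned.

```python
def in_star(w, X):
    """Decide w ∈ X* by dynamic programming over the prefixes of w."""
    ok = [True] + [False] * len(w)
    for i in range(1, len(w) + 1):
        ok[i] = any(ok[i - len(x)] for x in X
                    if len(x) <= i and w[i - len(x):i] == x)
    return ok[len(w)]

def interpretations(w, X):
    """All triples (d, y, g) with w = d·y·g, y ∈ X*, d a (possibly empty)
    proper suffix of a codeword, g a (possibly empty) proper prefix."""
    D = {x[i:] for x in X for i in range(1, len(x) + 1)}   # A⁻X, with 1
    G = {x[:i] for x in X for i in range(len(x))}          # XA⁻, with 1
    out = []
    for i in range(len(w) + 1):
        for j in range(i, len(w) + 1):
            d, y, g = w[:i], w[i:j], w[j:]
            if d in D and g in G and in_star(y, X):
                out.append((d, y, g))
    return out

X = {"aa", "ba", "baa", "bb", "bba"}
assert ("", "aaba", "") in interpretations("aaba", X)
```

Every word of X* has at least the trivial interpretation (1, w, 1), which is what the final assertion checks.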
PROPOSITION 6.3 Let X be a very thin code. Then
d(X) = min{δ_X(w) | w ∈ X* ∩ F̄(X)}.

Proof Let 𝒜*_D(X) = (P, 1, 1) be the flower automaton of X, with the shorthand notation 1 instead of (1, 1) for the initial and final state. Let M = φ_D(A*), let J be the 0-minimal ideal of M, let e be an idempotent in φ_D(X*) ∩ J and let S = Fix(e). Then by definition d(X) = Card(S). According to Proposition 5.3, we have φ_D⁻¹(e) ∩ F̄(X) ≠ ∅. Take a fixed word x ∈ φ_D⁻¹(e) ∩ F̄(X). Then x ∈ X* ∩ F̄(X), since e ∈ φ_D(X*).

Let w ∈ X* ∩ F̄(X), and let us verify that d(X) ≤ δ_X(w). For this, it suffices to show that d(X) ≤ δ_X(xwx), because of the inequality δ_X(xwx) ≤ δ_X(w). Now φ_D(xwx) ∈ H(e), and consequently its restriction to S is a permutation of S. Thus for each s ∈ S there exists one and only one s′ ∈ S such that (s, φ_D(xwx), s′) = 1, or equivalently such that s --xwx--> s′.
Since w ∈ F̄(X), this path is not simple. Setting s = (u, d), s′ = (g, v), it factorizes into
s --d--> 1 --y--> 1 --g--> s′,
and (d, y, g) is an interpretation of xwx. Thus each path from a state in S to another state in S, labeled by xwx, gives an interpretation of xwx.

Two such interpretations are disjoint. Assume indeed the contrary. Then there are two interpretations (d₁, y₁, g₁) and (d₂, y₂, g₂), derived from paths s₁ --xwx--> s₁′ and s₂ --xwx--> s₂′, that are adjacent. This means that the paths factorize into
s₁ --d₁--> 1 --z₁--> 1 --z̄₁--> 1 --g₁--> s₁′,    s₂ --d₂--> 1 --z₂--> 1 --z̄₂--> 1 --g₂--> s₂′,
with d₁z₁ = d₂z₂, whence also z̄₁g₁ = z̄₂g₂. Then there is also, in 𝒜*_D(X), a path
s₁ --d₁--> 1 --z₁--> 1 --z̄₂--> 1 --g₂--> s₂′
labeled xwx. This implies (s₁, φ_D(xwx), s₂′) = 1; since s₂′ ∈ S, one has s₂′ = s₁′, whence s₂ = s₁. Thus the mapping associating an interpretation to each fixed point produces a set of pairwise disjoint interpretations. Consequently Card(S) ≤ δ_X(xwx).
We now show that δ_X(x³) ≤ d(X),
where x is the word in φ_D⁻¹(e) ∩ F̄(X) fixed above. This will imply the proposition. Let (d, y, g) be an interpretation of x³. Let p = (u, d), q = (g, v) ∈ P. Then there is a unique path
p --d--> 1 --y--> 1 --g--> q,    (6.1)
and moreover the paths p --d--> 1 and 1 --g--> q are simple or null. Since φ_D(x) = e, there exists a unique s ∈ S such that (p, φ_D(x), s) = (s, φ_D(x), s) = (s, φ_D(x), q) = 1 or, equivalently, such that the path (6.1) also factorizes into
p --x--> s --x--> s --x--> q.
The word d is a left factor of x, since x ∉ F(X). Thus there exist words z, z̄ ∈ A* such that
y = z x z̄,    dz = x = z̄g.
Observe that the fixed point s ∈ S associated to the interpretation is independent of the endpoints of the path (6.1). Consider indeed another path
p′ --d--> 1 --y--> 1 --g--> q′
associated to the interpretation (d, y, g), and a fixed point s′ ∈ S such that p′ --x--> s′ --x--> s′ --x--> q′. Since d is strictly shorter than x, the inner part of this path coincides with the corresponding segment of path (6.1), and consequently s = s′.
Thus we have associated, to each interpretation (d, y, g), a fixed point s ∈ S, which in turn determines two words z, z̄ such that y = z x z̄ and 1 --z--> s --x--> s --z̄--> 1.

We now show that the fixed points associated to distinct interpretations are distinct. This will imply that δ_X(x³) ≤ Card(S) = d(X) and will complete the proof. Let (d′, y′, g′) be another interpretation of x³, let p′ = (u′, d′), q′ = (g′, v′) ∈ P, and assume that the path
p′ --d′--> 1 --y′--> 1 --g′--> q′    (6.2)
decomposes into
p′ --x--> s --x--> s --x--> q′,
with the same fixed point s ∈ S.
Since x ∈ F̄(X), the path s --x--> s is not simple. Thus there exist h, h̄ ∈ A* such that x = hh̄ and
s --h--> 1 --h̄--> s.
The paths (6.1) and (6.2) become
p --d--> 1 --z--> s --h--> 1 --h̄--> s --z̄--> 1 --g--> q,
p′ --d′--> 1 --z′--> s --h--> 1 --h̄--> s --z̄′--> 1 --g′--> q′.
This shows that zh, h̄z̄, z′h, h̄z̄′ ∈ X*. Next, dz = d′z′ = x. Thus d(zh) = d′(z′h) and likewise (h̄z̄)g = (h̄z̄′)g′, showing that the interpretations (d, y, g) and (d′, y′, g′) are adjacent. The proof is complete. ∎

Now we are able to make the connection with the concept of degree of biprefix codes introduced in the previous chapter. If X ⊂ A⁺ is a thin maximal biprefix code, then two adjacent interpretations of a word w ∈ A* are equal. Thus δ_X(w) is the number of interpretations of w. As we have seen in Chapter III, this number is constant on F̄(X). Thus Proposition 6.3 shows that the two notions of degree we have defined are identical.

A classification of very thin codes according to their degree starts with the study of codes of degree 1. A very thin code X is called synchronous iff d(X) = 1. We will see below that this terminology is compatible with that introduced in Chapter II. The following is a characterization of synchronous codes.
PROPOSITION 6.4 Let X be a very thin code over A. Then X is synchronous iff there exist x, y ∈ X* such that, for all u, v ∈ A*,
uxyv ∈ X*  ⇒  ux, yv ∈ X*.
Proof Let X be a synchronous code, and let x ∈ X* ∩ F̄(X) be a word such that δ_X(x) = 1. Such a word exists by Proposition 6.3. We show that the pair (x, x) satisfies the condition stated above. Let indeed u, v ∈ A* be such that uxxv ∈ X*. The interpretation (1, uxxv, 1) induces two interpretations of x, the first on the first occurrence, the second on the second occurrence of x in uxxv. Denote them by
(d, y, g),    (d′, y′, g′),
with ud ∈ X*, g′v ∈ X*, and gd′ ∈ X. Since (d, y, g) and (1, x, 1) are adjacent interpretations (the degree is one), there exist x₁, x₂, y₁, y₂ ∈ X* such that
x = x₁x₂,    y = y₁y₂,    x₁ = dy₁,    x₂ = y₂g.
Consequently, ux = ux₁x₂ = (ud)y₁x₂ ∈ X*. Similarly xv ∈ X*.
Conversely, let x, y ∈ X* be words satisfying the implication. Then xy ∈ F̄(X). Otherwise there would exist u, v ∈ A* such that uxyv ∈ X; but then ux, yv ∈ X*, contradicting the fact that X is a code. Next, consider an interpretation (d, z, g) of xy. Then ud, gv ∈ X for suitable u, v ∈ A*. Thus uxyv ∈ X*, hence ux ∈ X* and yv ∈ X*. This shows that the interpretations (d, z, g) and (1, xy, 1) are adjacent. Thus δ_X(xy) = 1. ∎

A pair (x, y) of words in X⁺ satisfying the condition of Proposition 6.4 is called a synchronizing pair. The existence of a synchronizing pair (x, y) has the following meaning: when we try to decode a word w ∈ A*, the occurrence of a factor xy in w implies that the factorization of w into words of X, whenever it exists, must pass between x and y. If w = uxyv, it suffices to decode separately ux and yv.
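This splitting behavior can be observed experimentally. The sketch below (ours) uses the code X = {aa, ba, baa, bb, bba} and the pair (aa, ba), which Example 6.2 below shows to be synchronizing; the context words u, v are an arbitrary choice for which uxyv ∈ X*.

```python
def factorizations(w, X):
    """All factorizations of w into codewords of X, as tuples."""
    if w == "":
        return [()]
    return [(x,) + rest
            for x in X if w.startswith(x)
            for rest in factorizations(w[len(x):], X)]

X = {"aa", "ba", "baa", "bb", "bba"}
x, y = "aa", "ba"            # a synchronizing pair (cf. Example 6.2)
u, v = "bb", "abb"           # assumed context with u·x·y·v ∈ X*
w = u + x + y + v
for f in factorizations(w, X):
    cuts, pos = {0}, 0
    for c in f:
        pos += len(c)
        cuts.add(pos)
    assert len(u + x) in cuts   # the factorization passes between x and y
```

Here w = bbaabaabb has the single factorization bb·aa·baa·bb, and indeed one of its cut points falls exactly between the occurrences of x and y.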
PROPOSITION 6.5 Let X ⊂ A⁺ be a very thin code. Then X is complete and synchronous iff there exist x, y ∈ X* such that
xA*y ⊂ X*.
Proof Assume first that X is complete and synchronous. Let (x, y) be a synchronizing pair. We show that yA*x ⊂ X*. Let w ∈ A*. The word xywxy is completable in X*. Thus there exist u, v ∈ A* such that uxywxyv ∈ X*. This implies that ywxyv ∈ X*, and then also ywx ∈ X*.

Conversely, assume that xA*y ⊂ X*. Then the code X is clearly complete. Set z = xy. Then (z, z) is a synchronizing pair. Suppose indeed that uzzv ∈ X*. Then zuz = x(yux)y ∈ X*, and also zvz ∈ X*. Thus z, z(uz), (uz)(zvz), zvz ∈ X*. By stability, this implies that uz ∈ X*. Symmetrically, we show that zv ∈ X*. ∎

Let us now discuss prefix codes. Let X be a thin maximal prefix code over A. Then X is very thin, since any word in F̄(X) can be completed (on the right) into a word in X*. Of course, X is complete. The condition
xA*y ⊂ X*
for x, y ∈ X* of the preceding proposition is equivalent, since X* is right unitary, to the condition
A*y ⊂ X*
stated in Chapter II.
EXAMPLE 6.2 The code X = {aa, ba, baa, bb, bba} is synchronous. Indeed, the pair (aa, ba) is an example of a synchronizing pair: assume that u·aa·ba·v ∈ X* for some u, v ∈ A*. Since ab ∉ F(X), we have uaa, bav ∈ X*. Since
X is also a complete code, it follows by (the proof of) Proposition 6.5 that baA*aa ⊂ X*.

We now examine the behavior of the group of a code under composition. Let G be a transitive permutation group over a set Q. Recall (cf. Sect. 0.8) that an imprimitivity equivalence of G is an equivalence relation θ on Q stable with respect to the action of G, i.e., such that for all g ∈ G,
p ≡ q mod θ  ⇒  pg ≡ qg mod θ.
The action of G on the classes of θ defines a transitive permutation group, denoted by G^θ and called the imprimitivity quotient of G for θ. For any q ∈ Q, consider the subgroup K = {k ∈ G | qk ≡ q mod θ} formed of the elements globally stabilizing the class of q mod θ; the restriction of K to the class of q mod θ is a transitive permutation group. The groups induced in this way by G on the equivalence classes mod θ are all equivalent. Any one of these groups is called the group induced by G; it is denoted by G_θ. Let d = Card(Q) be the degree of G, let e be the cardinality of a class of θ (thus e is the degree of G_θ), and let f be the number of classes of θ (i.e., the degree of G^θ). Then we have the formula
d = ef.
EXAMPLE 6.3 The permutation group over the set {1, 2, 3, 4, 5, 6} generated by the two permutations
α = (123456),    β = (26)(35)
is the group of symmetries of the hexagon. It is known under the name of the dihedral group D₆, and has of course degree 6. It admits the imprimitivity partition
{1, 4}, {2, 5}, {3, 6}.
The groups G^θ and G_θ are, respectively, equivalent to 𝔖₃ and ℤ/2ℤ.
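The blocks of Example 6.3 and the formula d = ef can be checked mechanically; in the sketch below (ours), the two generators are encoded as dictionaries on {1, ..., 6}:

```python
alpha = {1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 1}   # the rotation (123456)
beta  = {1: 1, 2: 6, 3: 5, 4: 4, 5: 3, 6: 2}   # the reflection (26)(35)
blocks = [{1, 4}, {2, 5}, {3, 6}]               # imprimitivity partition

for g in (alpha, beta):
    for b in blocks:
        # each generator maps every block onto a block
        assert {g[q] for q in b} in blocks

d = 6                   # degree of the group
e = len(blocks[0])      # size of a block  = degree of the induced group
f = len(blocks)         # number of blocks = degree of the quotient group
assert d == e * f
```

Since the generators permute the blocks, so does every element of the group they generate, which is exactly the stability condition defining an imprimitivity equivalence.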
PROPOSITION 6.6 Let X be a very thin code which decomposes into X = Y ∘ Z, with Y a complete code. Then there exists an imprimitivity equivalence θ of G = G(X) such that
G_θ = G(Y),    G^θ = G(Z).
In particular, we have d(X) = d(Y)d(Z).

Proof Let P and S be the sets of states of the flower automata 𝒜*_D(X) and 𝒜*_D(Z), respectively. Let φ (resp. ψ) be the morphism associated to 𝒜*_D(X) (resp. 𝒜*_D(Z)). In view of Proposition 2.8, and since Y is complete, there exists a reduction ρ: P → S. Moreover, 𝒜*_D(Y) can be identified with the restriction of 𝒜*_D(X) to the states which are in Z* × Z*. As usual, we denote by ρ̂ the morphism from M = φ(A*) into M′ = ψ(A*) induced by ρ. Thus ψ = ρ̂ ∘ φ.

Let J (resp. K) be the 0-minimal ideal of M (resp. of M′). Then J ⊂ ρ̂⁻¹(K), since ρ̂⁻¹(K) is a nonnull ideal. Thus ρ̂(J) ⊂ K. Since ρ̂(J) ≠ 0, we have
ρ̂(J) = K.
Let e be an idempotent in J ∩ φ(X*), let R = Fix(e) ⊂ P and let G = G_e. The nuclear equivalence of the restriction of ρ to R defines an equivalence relation θ on R which is an imprimitivity equivalence of G. It can be verified that ρ is a surjective function from R onto Fix(ρ̂(e)). The group G_ρ̂(e) is the corresponding imprimitivity quotient G^θ. This shows that G(Z) is equivalent to G^θ.

Let T = {(u, v) ∈ P | u, v ∈ Z*}. Then T can be identified with the set of states of the flower automaton of Y. Let L be the restriction to T of the submonoid N = φ(Z*) of M. Then
eNe = G(e) ∩ N.
Since G(e) ∩ N is a group, this shows that e is in the minimal ideal of the monoid N, and that the restriction to R ∩ T of G(e) ∩ N is the Suschkewitch group of L. Thus the restriction to R ∩ T of the group G(e) ∩ N is equivalent to the group G(Y). On the other hand, since T = ρ⁻¹(1, 1), this group is also the group G_θ induced by G on the classes of θ. ∎
EXAMPLE 6.4 Let X = Zⁿ, where Z is a very thin code and n ≥ 1. Then
d(X) = n·d(Z).
TABLE 4.1 The Next-State Function of 𝒜(X*)

        1   2   3   4   5   6   7   8
    a   4   5   4   5   8   1   8   1
    b   2   3   4   5   6   7   8   1
EXAMPLE 6.5 Consider the maximal prefix code Z = (A² − b²) ∪ b²A² over A = {a, b}. Set
X = Z².
The automaton 𝒜(X*) is given in Table 4.1. Let φ be the corresponding representation. The monoid φ(A*) is the monoid of functions of Example 3.4, when setting φ(a) = u, φ(b) = v. The idempotent e = φ(a⁴) has minimal rank, since the action of A on the ℛ-class of e given in Fig. 3.5 is complete. Consequently, the group G(X) is the dihedral group D₄. This group admits an imprimitivity partition with a quotient and an induced group both equivalent to ℤ/2ℤ. This corresponds to the fact that G(Z) = ℤ/2ℤ, since
Z = T ∘ A²,
where T is a synchronous code.

In the case of prefix codes, we can continue the study of the influence of the decompositions of the code on the structure of its group. We first define a canonical decomposition of a prefix code, called its maximal decomposition. This is used to show that only maximal prefix codes may produce nontrivial groups by composition. Let X ⊂ A⁺ be a prefix code. Let
D = X*(A*)⁻¹ = {w ∈ A* | wA* ∩ X* ≠ ∅}
be the set of left factors of words of X*, i.e., the set of words which are right completable in X*. Consider the set
U = {u ∈ A* | u⁻¹D = D}.
Note first that U ⊂ D: let u ∈ U. Since 1 ∈ D, we have 1 ∈ u⁻¹D, whence u ∈ D.
The set U is a right unitary submonoid of A*. Let indeed u, v ∈ U. Then (uv)⁻¹D = v⁻¹u⁻¹D = v⁻¹D = D, showing that uv ∈ U. Assume next that u, uv ∈ U. Then u⁻¹D = D and v⁻¹D = v⁻¹u⁻¹D = (uv)⁻¹D = D. Thus U is right unitary. Let Z be the prefix code generating U: U = Z*.

We have X* ⊂ Z* = U. Indeed, X* is right unitary; thus for all x ∈ X*, x⁻¹X* = X*. It follows that
x⁻¹D = x⁻¹(X*(A*)⁻¹) = (x⁻¹X*)(A*)⁻¹ = X*(A*)⁻¹ = D.
We now show that
X = Y ∘ Z    (6.3)
for some maximal prefix code Y. For this, we verify that for every u ∈ U there exists v ∈ U such that uv ∈ X*. Indeed, let u ∈ U. Then u ∈ D, and therefore uv ∈ X* for some v ∈ A*. Since X* ⊂ U, we have u, uv ∈ U, and consequently v ∈ U (U is right unitary).

The claim shows that X decomposes over Z. Let Y be such that X = Y ∘ Z. Then the claim also shows that Y is right complete, hence Y is a maximal prefix code. It can be shown (Exercise 6.2) that for any other decomposition X = Y′ ∘ Z′ with Z′ prefix and Y′ maximal prefix, we have Z′* ⊂ Z*. This justifies the name of maximal decomposition of the prefix code X given to the decomposition (6.3). In the case where X is a maximal prefix code, the set D defined above is A*. Thus U = A* and Z = A in (6.3): the maximal decomposition is, in this case, trivial.
PROPOSITION 6.7 Let X be a very thin prefix code, and let X = Y ∘ Z be its maximal decomposition. Then Z is synchronous, and thus G(X) = G(Y).
Proof Let D = X*(A*)⁻¹ and U = {u ∈ A* | u⁻¹D = D}. Then Z* = U. Let φ be the morphism associated with the automaton 𝒜(X*). Let J be the 0-minimal ideal of the monoid φ(A*). Consider x ∈ X* such that φ(x) ∈ J. First we show that
D = {w ∈ A* | φ(xw) ≠ 0}.    (6.4)
Indeed, if w ∈ D, then xw ∈ D and thus φ(xw) ≠ 0. Conversely, if φ(xw) ≠ 0 for some w ∈ A*, then the fact that the right ideal generated by φ(x) is 0-minimal implies that there exists a word w′ ∈ A* such that φ(x) = φ(xww′). Thus xww′ ∈ X*. By right unitarity, we have ww′ ∈ X*, whence w ∈ D. This proves (6.4).

Next, Dx⁻¹ = Ux⁻¹. Indeed, D ⊃ U implies Dx⁻¹ ⊃ Ux⁻¹. Conversely, consider w ∈ Dx⁻¹. Then wx ∈ D. By (6.4), φ(xwx) ≠ 0. Using now the 0-minimality of the left ideal generated by φ(x), there exists a word w′ ∈ A* such that φ(w′xwx) = φ(x). Using again (6.4), we have, for all w″ ∈ D, φ(xw″) = φ(w′xwxw″) ≠ 0. Then also φ(xwxw″) ≠ 0 and, again by (6.4), wxw″ ∈ D. This shows that D ⊂ (wx)⁻¹D. For the converse inclusion, let w″ ∈ (wx)⁻¹D. Then wxw″ ∈ D. Thus φ(xwxw″) ≠ 0. This implies that φ(xw″) ≠ 0, whence w″ ∈ D. Thus D = (wx)⁻¹D, showing that w ∈ Ux⁻¹.

Now we prove that (x, x) is a synchronizing pair for Z. Let w, w′ ∈ A* be such that wxxw′ ∈ Z* = U. Since U ⊂ D, we have wx ∈ D. By the equality Dx⁻¹ = Ux⁻¹, this implies wx ∈ U. Since U is right unitary, xw′ also is in U. Thus Z is synchronous. In view of Proposition 6.6, this concludes the proof. ∎
We now prove a converse of Proposition 6.6 in the case of prefix codes.
PROPOSITION 6.8 Let X be a thin maximal prefix code. If the group G = G(X) admits an imprimitivity equivalence θ, then there exists a decomposition of X into X = Y ∘ Z such that G(Y) = G_θ and G(Z) = G^θ.

Proof Let φ be the representation associated to the minimal automaton 𝒜(X*) = (Q, 1, 1), and set M = φ(A*). Let J be the minimal ideal of M, let e ∈ J ∩ φ(X*) be an idempotent, let L be the ℒ-class of e and let Γ be the set of ℋ-classes of L. Let S = Fix(e). We have G(X) = G_e. Since X is complete, each H ∈ Γ is a group and therefore has an idempotent e_H, with Fix(e_H) = Fix(e). The code X being prefix, e_H is in φ(X*) for all H ∈ Γ, by Proposition 5.8.

By assumption, there exists an equivalence relation θ on S that is an imprimitivity equivalence of the group G_e. Consider the equivalence relation θ̂ on the set Q of states of 𝒜(X*) defined by
p ≡ q mod θ̂   iff   pe_H ≡ qe_H mod θ for all H ∈ Γ.
Let us verify that θ̂ is stable, i.e., that
p ≡ q mod θ̂  ⇒  p·w ≡ q·w mod θ̂
for w ∈ A*. Indeed, let m = φ(w). Note that for H ∈ Γ,
me_H = e_{mH}me_H = e_{mH}eme_H,    (6.5)
since e_{mH}e = e_{mH}. Observe also that eme_H ∈ G(e), since en ∈ G(e) for all n ∈ L. Assume now that p ≡ q mod θ̂. Then by definition pe_{mH} ≡ qe_{mH} mod θ and, θ being an imprimitivity equivalence, this implies
pe_{mH}eme_H ≡ qe_{mH}eme_H mod θ.
By (6.5), it follows that pme_H ≡ qme_H mod θ for all H ∈ Γ. Thus p·w ≡ q·w mod θ̂.
Moreover, the restriction of θ̂ to the set S = Fix(e) is equal to θ. Assume indeed that p ≡ q mod θ̂ for some p, q ∈ S. Then pe ≡ qe mod θ. Since p = pe and q = qe, it follows that p ≡ q mod θ. Conversely, if p ≡ q mod θ, then for all H ∈ Γ, pe_H = p and qe_H = q, because of the equality Fix(e_H) = S. Consequently p ≡ q mod θ̂.

Consider the prefix code Z defined by the right-unitary submonoid
Z* = {z ∈ A* | 1·z ≡ 1 mod θ̂}.
Then clearly X ⊂ Z* and, the automaton 𝒜(X*) being trim, X decomposes over Z: X = Y ∘ Z. The automaton defined by the action of A* on the classes of θ̂ is easily seen to recognize Z*. The group G(Z) is the group G^θ. The automaton obtained by considering the action of Z* on the class of 1 mod θ̂ can be identified with an automaton recognizing Y*, and its group is G_θ. ∎

COROLLARY 6.9 Let X be a thin maximal prefix code. If X is indecomposable, then the group G(X) is primitive. ∎

EXAMPLE 6.6 We consider once more the finite maximal biprefix code X = ((A² − b²) ∪ b²A²)² of Example 6.5, with the minimal automaton of X* given in Fig. 4.29. Let φ be the associated representation. We have seen that e = φ(a⁴) is an idempotent of minimal rank. The group G_e = G(X) is the

Fig. 4.29 The minimal automaton of X*.
Fig. 4.30 The ℒ-class of e = φ(a⁴).
dihedral group D4 generated by the permutations
(1458),
(15)(48).
The partition θ = {{1, 5}, {4, 8}} is an imprimitivity partition of G_e. The ℒ-class of e is composed of four ℋ-classes. They are represented in Fig. 4.30, together with the associated nuclear equivalences. The equivalence θ̂ is
θ̂ = {{1, 3, 5, 7}, {2, 4, 6, 8}}.
The stabilizer of the class of 1 mod θ̂ is the uniform code Z = A², with group ℤ/2ℤ. We have already seen that
X = (T ∘ A²)²
for some synchronous code T. The decomposition of X into X = T² ∘ A² is the one obtained by applying to X the method used in the proof of Proposition 6.8.
EXAMPLE 6.7 Let Z be the finite complete prefix code over A = {a, b},
Z = a ∪ bA²,
and consider
X = Z⁴.
The automaton 𝒜(X*) is given in Table 4.2. Let φ be the representation associated with 𝒜(X*). The element e = φ(a⁴) is easily seen to be an idempotent of minimal rank 4, with Fix(e) = {1, 4, 7, 10}. It is given in Table 4.3. The minimal ideal of φ(A*) reduces to the 𝒟-class of e, and we have G(X) = ℤ/4ℤ, as a result of computing the ℒ-representation with respect to e given in Fig. 4.31. The partition θ = {{1, 7}, {4, 10}} is an imprimitivity
TABLE 4.2 The Automaton of ((a ∪ bA²)⁴)*
        1   2   3   4   5   6   7   8   9   10  11  12
    a   4   3   4   7   6   7   10  9   10  1   12  1
    b   2   3   4   5   6   7   8   9   10  11  12  1
TABLE 4.3 The Idempotent e = φ(a⁴)
    q    1   2   3   4   5   6   7   8   9   10  11  12
    qe   1   10  1   4   1   4   7   4   7   10  7   10
Fig. 4.31 The ℒ-representation with respect to e: a/(1, 4, 7, 10), b/(1, 10, 7, 4).
partition of G_e. The corresponding equivalence θ̂ is
θ̂ = {{1, 3, 5, 7, 9, 11}, {2, 4, 6, 8, 10, 12}}.
The stabilizer of the class of 1 mod θ̂ is the uniform code A², and we have X ⊂ (A²)*.
Observe that we started with X = Z⁴. In fact, the words of Z all have odd length, and consequently Z² = Y ∘ A² for some Y. Thus X has the two decompositions
X = Z⁴ = Y² ∘ A².
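The parity argument can be verified directly. In the sketch below (ours), Z is listed explicitly as {a, baa, bab, bba, bbb}:

```python
Z = {"a", "baa", "bab", "bba", "bbb"}       # Z = a ∪ bA² over A = {a, b}
assert all(len(z) % 2 == 1 for z in Z)      # every word of Z has odd length

Z2 = {z1 + z2 for z1 in Z for z2 in Z}      # the words of Z²
assert all(len(w) % 2 == 0 for w in Z2)     # hence Z² ⊆ (A²)*
```

Since every word of Z² lies in (A²)*, the submonoid Z²* is contained in (A²)*, which is exactly what the decomposition Z² = Y ∘ A² expresses.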
7. SYNCHRONIZATION OF SEMAPHORES
In this section, we prove the result announced in Chapter II, namely
THEOREM 7.1 Let X be a semaphore code. There exist a synchronous semaphore code Z and an integer d ≥ 1 such that X = Zᵈ.
In view of Proposition 6.6, the integer d is of course the degree d(X) of the code X. The proof of the theorem is in several parts. We first examine the group of a semaphore code. However, the following proposition is just an intermediate step, since the theorem implies a stronger property. We recall that a transitive permutation group over a set is called regular if its elements, with the exception of the identity, fix no point (see Sect. 0.8).

LEMMA 7.2 The group of a semaphore code is regular.

Proof Let X ⊂ A⁺ be a semaphore code, let P = XA⁻ be the set of proper left factors of words in X, and let 𝒜 = (P, 1, 1) be the literal automaton of X*. Let φ be the representation associated with 𝒜, and set M = φ(A*). A semaphore code is thin and complete; thus 0 ∉ M, and the ideal φ(F̄(X)) of images of words which are not factors of words in X contains the minimal ideal, say K, of M.

Let e be an idempotent in φ(X*) ∩ K, and let R = Fix(e). These fixed points are words in P. They are totally ordered by their length. Indeed, let w be in φ⁻¹(e) ∩ F̄(X). Then we have r·w = r for all r ∈ R. Since w is not a factor of a word in X, no rw is in P. Thus each word r ∈ R is a right factor of w. Thus, of two fixed points of e, one is a right factor of the other. Next, we recall that, by Corollary II.5.4,
PX ⊂ X(P ∪ X). By induction, this implies that for n ≥ 1,
PXⁿ ⊂ Xⁿ(P ∪ X).    (7.1)
To show that G_e is regular, we verify that each g ∈ H(e) ∩ φ(X*) is an "increasing" permutation on R, i.e., for r, s ∈ R,
|r| < |s|  ⇒  |rg| < |sg|.    (7.2)
This property clearly implies that g is the identity on R. Since H(e) ∩ φ(X*) is composed of the elements of H(e) fixing 1, this means that only the identity of G_e fixes 1. Since G_e is transitive, this implies that G_e is regular.

For the proof of (7.2), let g ∈ H(e) ∩ φ(X*), and let x ∈ φ⁻¹(g). Then x ∈ Xⁿ for some n ≥ 0. Let r, s ∈ R with |r| < |s|. Then by (7.1),
rx = yu,
sx = zv,
with y, z ∈ Xⁿ and u, v ∈ P ∪ X.
The word u is a right factor of v (see Fig. 4.32), since otherwise z ∈ A*yA⁺, contradicting the fact that Xⁿ is a semaphore code. Further, we have in M
rg = u or 1  according as  u ∈ P or u ∈ X,
sg = v or 1  according as  v ∈ P or v ∈ X.
Fig. 4.32 Comparison of rx and sx.
Since g is a permutation on R with 1·g = 1, and s ≠ 1, we have sg ≠ 1. Thus sg = v. Since r ≠ s, we have rg ≠ sg. Since u is a right factor of v, we have |rg| < |sg| in both cases rg = u and rg = 1. ∎
Now let X ⊂ A⁺ be a group code. Then, by definition,
X* = α⁻¹(H),
where α: A* → G is a surjective morphism onto a group G and H is a subgroup of G. The code X is called a regular group code if
H = {1}.
eHeE H ,
eeHe= e.
7. SYNCHRONIZATION OF SEMAPHORES
Let us consider the corresponding ℒ-representation of M. For this choice of coordinates we have, for m ∈ M and H ∈ Γ,

m * H = l m e_H c,  (7.3)

where e = cl is the column-row decomposition of e. Set

N = {n ∈ M | ∀H ∈ Γ, n * H = n * G}.

The set N is composed of those elements n ∈ M for which the mapping H ∈ Γ ↦ n * H ∈ G is constant. It is a right-unitary submonoid of M. Indeed, first 1 ∈ N, and if n, n' ∈ N, then

nn' * H = (n * n'·H)(n' * H) = (n * G)(n' * G),  (7.4)

which is independent of H. Thus nn' ∈ N. Assume now that n, nn' ∈ N. Then by (7.4),

n' * H = (n * n'·H)⁻¹(nn' * H) = (n * G)⁻¹(nn' * G),

which is independent of H, showing that n' ∈ N. Thus φ⁻¹(N) = W* for some prefix code W.
The hypothesis that G(X) is regular implies that X* ⊂ W*. Indeed, let m ∈ φ(X*). Then by (7.3) we have, for H ∈ Γ,

m * H = l m e_H c.
Since X is prefix, e_H ∈ φ(X*) by Proposition 5.8. Consequently m * H fixes the state 1 ∈ Q (since l, m, e_H, and c do). Since G(X) is regular, m * H is the identity for all H ∈ Γ. This shows that m ∈ N.
We now consider the function

θ: W* → G,

which associates with each w ∈ W* the permutation φ(w) * G. By (7.4), θ is a morphism. Moreover, θ is surjective: if g ∈ G, then

g * G = l g e c = l g c,

which is the element of G associated with g. From g * H = l g e_H c = l(ge)e_H(ec) = l g e c = l g c, it follows that g ∈ N. The preceding argument shows that θ(x) = 1 for all x ∈ X*.
Since X* ⊂ W* and X is a maximal code, we have

X = Y ∘ W,

where β: B* → A* is an injective morphism with β(B) = W and β(Y) = X. Set α = θ ∘ β. Then α: B* → G is a morphism and Y* ⊂ α⁻¹(1), since θ(x) = 1 for all x ∈ X*. Let V be the regular group code defined by V* = α⁻¹(1). Then Y = U ∘ V, and consequently

X = U ∘ V ∘ W.

By construction, G(V) = G. Thus G(X) = G(V). The codes U and W are synchronous since d(X) = d(V) and d(X) = d(U)d(V)d(W), whence d(U) = d(W) = 1. This concludes the proof. □

The following result is the final lemma needed for the proof of Theorem 7.1.
LEMMA 7.4 Let Y ⊂ B⁺ be a semaphore code, and let V ≠ B be a regular group code. If Y* ⊂ V*, then Y = (C*D)ᵈ for some integer d, where C = B ∩ V and D = B − C. Moreover, C*D is synchronous.

Proof Let α: B* → G be a morphism onto a group G such that V* = α⁻¹(1). Since V ≠ B, we have G ≠ {1}. We have

C = {b ∈ B | α(b) = 1}  and  D = {b ∈ B | α(b) ≠ 1}.

The set D is nonempty. We claim that |y|_D > 0 for each y ∈ Y. Assume the contrary, and let y ∈ Y be such that |y|_D = 0. Let b ∈ D. Then α(bu) ≠ 1 for each left factor u of y, since α(u) = 1. Thus no left factor of by is in V*, whence none is in Y. On the other hand, B*Y ⊂ YB* because Y is semaphore (Proposition II.5.2). This gives the contradiction and proves the claim.
Set T = C*D. Consider a word y ∈ Y such that |y|_D is minimal, and let d = |y|_D. Next consider a word

t = t₁t₂⋯t_d,  tᵢ ∈ T.
Since Y is a semaphore code, t_d y ∈ YB*. Thus

t_d y = y₁w₁

for some y₁ ∈ Y, w₁ ∈ B*. Then |y₁|_D = d. Indeed, otherwise |y₁|_D = d + 1 by the minimality of d = |y|_D, and then w₁ ∈ C*, so that

α(y₁) = α(y₁w₁) = α(t_d y) = α(t_d) ≠ 1,

a contradiction. Thus |y₁|_D = d and |w₁|_D = 1. In the same way, we get

t_{d−1} y₁ = y₂ w₂, …, t₁ y_{d−1} = y_d w_d,

where each of y₂, …, y_d satisfies |yᵢ|_D = d, and each of w₂, …, w_d is in C*DC*. Composing these equalities, we obtain (see Fig. 4.33)

t y = t₁t₂⋯t_d y = y_d w_d w_{d−1} ⋯ w₁.  (7.5)

Since y_d ∈ (C*D)ᵈC* and t ∈ (C*D)ᵈ, we have

y_d = t₁t₂⋯t_d u  (7.6)

for some u ∈ C* which is also a left factor of y. This construction applies in particular if t₁ ∈ D, showing that Y contains a word x (= y_d) with d letters in D and starting with a letter in D, i.e., x ∈ (DC*)ᵈ. Thus x is one of the words in Y for which |x|_D is minimal. Substitute x for y in (7.5). Then, starting with any word t = t₁t₂⋯t_d ∈ Tᵈ, we obtain (7.6) with u = 1, since u ∈ C* and is a left factor of x. This shows that t ∈ Y. Thus Tᵈ ⊂ Y. Since Tᵈ is a maximal code, we have Tᵈ = Y. Since B*b ⊂ T* for b ∈ D, the code T is synchronous. □
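The lemma can be sanity-checked on a small instance (a finite sample of words, not a proof). Over B = {a, b}, take the regular group code V = a ∪ ba*b, so that C = B ∩ V = {a}, D = {b}, and T = C*D = a*b; the code Y = T² then consists of the words containing exactly two b's and ending in b. The names V, C, D, T, Y below follow the lemma; the length bounds are arbitrary.

```python
# Finite sanity check of Lemma 7.4 with B = {a,b}, C = {a}, D = {b}, T = a*b, Y = T².
from itertools import product

def in_T_star(w):
    # T* = (a*b)*: a word is in T* iff it is empty or its last letter is b
    return w == "" or w.endswith("b")

def in_Y(w):
    # Y = T² = words with exactly two b's, ending in b
    return w.count("b") == 2 and w.endswith("b")

words = ["".join(t) for n in range(0, 9) for t in product("ab", repeat=n)]

# (1) T is synchronous: B*b ⊆ T*
assert all(in_T_star(w + "b") for w in words)

# (2) Y is semaphore: B*Y ⊆ YB*, i.e., every wy has a left factor in Y
ys = [w for w in words if in_Y(w)]
for w in words[:200]:
    for y in ys[:50]:
        wy = w + y
        assert any(in_Y(wy[:i]) for i in range(1, len(wy) + 1))
```

The check only samples bounded lengths; it illustrates the statement, it does not replace the proof.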
Fig. 4.33
Proof of Theorem 7.1 Let X be a semaphore code. By Lemma 7.2, the group G(X) is regular. In view of Proposition 7.3, we have

X = U ∘ V ∘ W,

where V is a regular group code and U and W are synchronous. Set Y = U ∘ V. If d(V) = 1, then X is synchronous and there is nothing to prove. Otherwise, according to Lemma 7.4, there exists a synchronous code T such that Y = Tᵈ. Thus

X = Tᵈ ∘ W = (T ∘ W)ᵈ.

The code Z = T ∘ W is synchronous because T and W are. Finally, since X = Zᵈ is semaphore, Z is semaphore by Corollary II.5.9. This proves the theorem. □
EXAMPLE 7.1 Let Z be the semaphore code Z = {a, ba, bb} over A = {a, b}. This code is synchronous since A*a ⊂ Z*. Set X = Z². The minimal automaton 𝒜(X*) is given by Fig. 4.34. Let φ be the associated representation and M = φ(A*). The element e = φ(a²) is an idempotent of minimal rank 2 = d(X). Its ℒ-class is composed of two groups G₁ = G(e) and G₂. The ℒ-representation of M with respect to e is given in Fig. 4.35, with the notation a instead of φ(a).

Fig. 4.34 The automaton 𝒜(X*) and its table.

Fig. 4.35 The ℒ-representation of M.

The prefix code W of Proposition 7.3 is W = Z. Indeed, we have

a * 1 = a * 2 = (13),  ba * 1 = ba * 2 = (13),  bb * 1 = bb * 2 = (13).

In this case, the code U is trivial.
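The synchronization claim A*a ⊂ Z* of Example 7.1 can be verified exhaustively up to a length bound. The sketch below (a finite check, not a proof) uses a simple dynamic-programming membership test for Z*:

```python
# Exhaustive check, up to length 9, that A*a ⊆ Z* for Z = {a, ba, bb}.
from itertools import product

Z = {"a", "ba", "bb"}

def in_star(w, code):
    """Dynamic-programming test for membership of w in code*."""
    n = len(w)
    ok = [True] + [False] * n
    for i in range(1, n + 1):
        ok[i] = any(ok[i - len(c)] for c in code
                    if i >= len(c) and w[i - len(c):i] == c)
    return ok[n]

for n in range(1, 10):
    for tup in product("ab", repeat=n):
        w = "".join(tup)
        if w.endswith("a"):
            assert in_star(w, Z), w
print("A*a ⊆ Z* verified up to length 9")
```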
EXAMPLE 7.2 Consider, over A = {a, b}, the synchronous semaphore code Z = a*b. Let X = Z². The automaton 𝒜(X*) is given in Fig. 4.36.

Fig. 4.36 The automaton of X* = [(a*b)²]*.

Let φ be the associated representation. The element e = φ(b²) is an idempotent. Its set of fixed points is {1, 3}. The ℒ-class of e is reduced to the group G(e), and the monoid N of the proof of Proposition 7.3 therefore is the whole monoid φ(A*). Thus W = A. The morphism α from A* into the group G(e) is given by

α(a) = 1_{{1,3}},  α(b) = (13).

We have

X = U ∘ V  with  V = a ∪ ba*b.

This example shows that the decomposition given in the proof of Proposition 7.3 is not always the one looked for. Lemma 7.4 must be used in this case to find the desired decomposition X = Z².
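In Example 7.2 the regular group code V = a ∪ ba*b is exactly the kernel of the morphism counting b's modulo 2, so V* is the set of words with an even number of b's. The sketch below (the parser `in_V_star` is an illustrative helper, not from the text) checks this identity on all short words:

```python
# Check that V* = {w : |w|_b is even} for V = a ∪ ba*b, up to length 11.
from itertools import product

def in_V_star(w):
    # V is prefix, so a greedy left-to-right parse suffices:
    # read either a single 'a', or a block b a^k b.
    i, n = 0, len(w)
    while i < n:
        if w[i] == "a":
            i += 1
        else:
            j = i + 1
            while j < n and w[j] == "a":
                j += 1
            if j >= n or w[j] != "b":
                return False
            i = j + 1
    return True

for n in range(0, 12):
    for tup in product("ab", repeat=n):
        w = "".join(tup)
        assert in_V_star(w) == (w.count("b") % 2 == 0), w
print("V* = {w : |w|_b even} verified up to length 11")
```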
EXERCISES

SECTION 1

1.1. Show that a submonoid M of A* is recognizable and free iff there exists an unambiguous trim finite automaton 𝒜 = (Q, 1, 1) that recognizes M.

SECTION 2

2.1. Let X be a subset of A⁺ and let 𝒜_D(X) = (P, (1,1), (1,1)) be the flower
automaton of X. Let φ be the associated representation. Show that for all (p, q), (r, s) ∈ P and w ∈ A* we have

((p,q), φ(w), (r,s)) = (qX*r, w) + (pw, r)(q, ws).

2.2. Let 𝒜 = (P, i, T) and ℬ = (Q, j, S) be two automata, and let ρ: P → Q be a reduction from 𝒜 onto ℬ such that ρ⁻¹(j) = {i}. Show that if 𝒜 is deterministic, then so is ℬ.

SECTION 3
*3.1. The aim of this problem is to prove that for any stable submonoid N of a monoid M, there exists a morphism φ from M onto a monoid of unambiguous relations over some set Q and an element 1 ∈ Q such that N = {m ∈ M | 1φ(m)1}. For this, let

D = {(u, v) ∈ M × M | uv ∈ N}.

Let ρ be the relation over D defined by

(u, v) ρ (u', v')  iff  Nu ∩ Nu' ≠ ∅ and vN ∩ v'N ≠ ∅.

Show that the equivalence classes of the transitive closure ρ* of ρ are Cartesian products of subsets of M. Show that N × N is a class of ρ*. Let Q be the set of classes of ρ* and let 1 denote the class N × N. Let φ be the function from M into {0,1}^{Q×Q} defined by

(U × V) φ(m) (U' × V')  ⟺  Um ⊂ U' and mV' ⊂ V.

Show that φ is a morphism and that N = {m ∈ M | 1φ(m)1}. Show that in the case where M = A*, the construction above coincides with the construction of the flower automaton.

3.2. Let K be a field and let m be an n × n matrix with elements in K. Show that m = m² iff there exist c ∈ K^{n×p} and l ∈ K^{p×n} such that

m = cl  and  lc = I_p,

where I_p denotes the identity matrix.

3.3. Let M be a monoid of unambiguous relations over a set Q. Let D be a 𝒟-class of M containing an idempotent e. Let R (resp. L) be the ℛ-class
(resp. the ℒ-class) of e, and let Λ (resp. Γ) be the set of its ℋ-classes. Let (a_H, ā_H)_{H∈Λ} be a system of coordinates of R, and let (b_K, b̄_K)_{K∈Γ} be a system of coordinates of L. Let e = cl be the column-row decomposition of e, and set l_H = l a_H, c_K = b_K c. The sandwich matrix of D (with respect to these systems of coordinates) is defined as the Λ × Γ matrix with elements in G(e) ∪ 0 given by

S_{HK} = e ā_H b_K e  if e ā_H b_K e ∈ G(e),  and  S_{HK} = 0  otherwise.

Show that for all m ∈ M, H ∈ Λ, K ∈ Γ,

(H * m) S_{H'K} = S_{HK'} (m * K),

with H' = H·m and K' = m·K.

SECTION 4

4.1.
Let K be a semiring and let m be a K-relation between P and Q. The rank over K of m is the minimum of the cardinalities of the sets R such that m = cl for some K-relations c ∈ K^{P×R}, l ∈ K^{R×Q}. Denote it by rank_K(m). For K = 𝒩, this gives the notion introduced in Section 4. Show that if K is a field and Q is finite, the rank over K coincides with the usual notion of rank in linear algebra.

4.2. Let

m = ( 1 1 0 0 )
    ( 1 0 0 1 )
    ( 0 1 1 0 )
    ( 0 0 1 1 ).

Show that rank_𝒩(m) = 4, but that rank_K(m) = 3.

4.3. Let M be a monoid of unambiguous relations over Q that is transitive and has finite minimal rank. Let 1 ∈ Q and N = Stab(1). Let Λ (resp. Γ) be the set of 0-minimal or minimal left (resp. right) ideals of M, according as M contains or does not contain a zero. Let R, R' ∈ Γ and L, L' ∈ Λ. Show that if
R ∩ L ∩ N ≠ ∅  and  R' ∩ L' ∩ N ≠ ∅,

then also

R ∩ L' ∩ N ≠ ∅  and  R' ∩ L ∩ N ≠ ∅.
In other words, the set of pairs (R, L) ∈ Γ × Λ such that R ∩ L ∩ N ≠ ∅ is a Cartesian product.

SECTION 5
5.1. Let X ⊂ A⁺ be a very thin code. Let M be the syntactic monoid of X* and let φ be the canonical morphism from A* onto M. Show that M has a unique 0-minimal or minimal ideal J, according as M contains a zero or not. Show that φ(X*) meets J, that J is a 𝒟-class, and that each ℋ-class contained in J which meets φ(X*) is a finite group.

5.2. Let X ⊂ A⁺ be a very thin code, and let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*. Let φ be the associated morphism and M = φ(A*). Let J be the minimal or 0-minimal ideal of M and K = J − 0. Let e ∈ M be an idempotent of minimal rank, let R be its ℛ-class and L its ℒ-class. Let Λ (resp. Γ) be the set of ℋ-classes contained in R (resp. L), and choose two systems of coordinates

(a_H, ā_H)_{H∈Λ},  (b_K, b̄_K)_{K∈Γ}

of R and L, respectively. Let

μ: M → (G(e) ∪ 0)^{Λ×Λ}

be the morphism of M into the monoid of row-monomial Λ × Λ matrices with elements in G(e) ∪ 0 defined by the ℛ-representation with respect to e. Similarly, let

ν: M → (G(e) ∪ 0)^{Γ×Γ}

be the morphism associated with the ℒ-representation with respect to e. Let S be the sandwich matrix of J relative to the systems of coordinates introduced (see Exercise 3.3). Show that for all m ∈ M,

μ(m) S = S ν(m).

Show that for all m, n ∈ M,

μ(m) = μ(n)  ⟺  (∀H ∈ Λ, l_H m = l_H n),
ν(m) = ν(n)  ⟺  (∀K ∈ Γ, m c_K = n c_K),

where l_H = l a_H, c_K = b_K c, and e = cl is the column-row decomposition of e. Show, using these relations, that the function m ↦ (μ(m), ν(m)) is injective.
SECTION 6

6.1. Let X be a very thin code. Let M be the syntactic monoid of X*, and let J be the 0-minimal or minimal ideal of M (see Exercise 5.1). Let G be an ℋ-class in J that meets φ(X*), and let H = G ∩ φ(X*). Show that the representation of G over the right cosets of H is injective, and that the permutation group obtained is equivalent to G(X).

6.2. Let X ⊂ A⁺ be a prefix code and let X = Y ∘ Z be its maximal decomposition. Show that if X = Y' ∘ Z' with Z' prefix and Y' maximal prefix, then Z'* ⊂ Z*.

6.3. Let X ⊂ A⁺ be a maximal prefix code. Let

R = {r ∈ A* | ∀x ∈ X*, ∃y ∈ X*: rxy ∈ X*}.

(a) Show that R is a right unitary submonoid containing X*.
(b) Let Z be the maximal prefix code such that R = Z*, and set X = Y ∘ Z. Show that if X is thin, then Y is synchronous.
(c) Show that if X = Y' ∘ Z' with Y' synchronous, then Z'* ⊂ Z*.
(d) Suppose that X is thin. Let 𝒜 = (Q, 1, 1) be a deterministic trim automaton recognizing X*, and let φ be the associated representation. Show that a word r ∈ A* is in R iff for all m ∈ φ(A*) with minimal rank, 1·r ≡ 1 mod Ker(m).

6.4. Let X and Y be complete thin codes. Let Z be the code for which

Z* = X* ∩ Y*.

(a) Show that Z is thin. (Hint: use the direct product of automata.)
(b) Show that if X is biprefix, then Z is complete. (Hint: Proposition III.2.2.)

SECTION 7

7.1. Let X ⊂ A⁺ be a semaphore code, let 𝒜 = (Q, 1, 1) be a deterministic trim automaton recognizing X*, and let φ be the associated representation. Show that any group G ⊂ φ(A*) which meets φ(H̄(X)) is cyclic (use Proposition 7.1).
NOTES
Unambiguous automata and their relation to codes appear in Schützenberger (1965b). Unambiguous automata are a special case of automata with multiplicities. The latter are the K-automata which are extensively studied in Eilenberg (1974) and Berstel and Reutenauer (1984). Kleene's theory for these automata goes back to Schützenberger (1961a). A significant step in the study of monoids of unambiguous relations, using such tools as the column-row decomposition, appears in Césari (1974).
An automaton with output labels, such as the decoding automaton of Section 2, is called a transducer (see Eilenberg, 1974). For a finite code, this transducer is finite and unambiguous. This is a particular case of a more general result from the theory of transducers: any function realizable by a finite transducer can also be realized by a finite unambiguous transducer (Elgot and Mezei, 1965). For another proof of this result, see Arnold and Latteux (1979). On this subject, see Eilenberg (1974), Berstel (1979), and Pin and Sakarovitch (1983).
The study of the structure of the 𝒟-classes in monoids of unambiguous relations is very close to the classical development for abstract semigroups presented in the usual textbooks. In particular, the ℛ- and ℒ-representations are the right and left Schützenberger representations of Clifford and Preston (1961) and Lallement (1979). The generalization of the results of Section 3 to monoids of (Boolean) relations is partly possible. See, for instance, Le Rest and Le Rest (1980). The notion of rank and the corresponding results appear in Lallement (1979) for the particular case of monoids of functions.
Theorem 5.1 is due to Schützenberger. An extension to sets which are not codes appears in Schützenberger (1979). The maximal decomposition of prefix codes and Propositions 6.7 and 6.8 are due to Perrot (1972). The theorem on the synchronization of semaphore codes (Theorem 7.1) is in Schützenberger (1964). That paper also contains a difficult combinatorial proof of this result.
Problem 3.1 is a theorem due to Boë et al. (1979). Extensions may be found in Boë (1976). The notion of sandwich matrix (Exercise 3.3) is classical. Exercise 7.1 is from Schützenberger (1964).
CHAPTER V

Groups of Biprefix Codes
0. INTRODUCTION
We have seen in Chapter IV that with every thin maximal code X there is associated a permutation group G(X) of degree d(X), which we called the group and the degree of the code. We have already established several relations between a code and its group. Thus the fact that a code X is synchronous, i.e., has a trivial group, has been shown to be equivalent to the existence of synchronizing pairs. Another example is the fact that an indecomposable prefix code X has a permutation group G(X) that is primitive (Proposition IV.6.8). We have also seen that a thin maximal prefix code X has a regular group iff X = U ∘ V ∘ W with U, W synchronous and V a regular group code (Proposition IV.7.3).
In this chapter we study the groups of biprefix codes. We start with the simplest class, namely the group codes, in Section 1. We show in particular (Theorem 1.1) that a finite group code is uniform. In the second and third sections, we again examine the techniques introduced in Chapter IV and particularize them to biprefix codes. Specifically, we shall see that biprefix codes are characterized by the algebraic property of their syntactic monoids being nil-simple (Theorem 3.3). The proof makes use of Theorem II.8.4 concerning codes with bounded deciphering delay.
Section 4 is devoted to groups of finite biprefix codes. The main result is
Theorem 4.6, stating that the group of a finite, indecomposable, nonuniform maximal biprefix code is doubly transitive. For the proof of this theorem, we use, without proof, difficult results from the theory of permutation groups. This is the only place in the book where nonelementary knowledge is required. The fifth section contains a series of examples of finite biprefix codes with special groups.
1. GROUP CODES
Let us first recall the definition of a group code. Let G be a group and H a subgroup of G. Let φ: A* → G be a surjective morphism. Then the submonoid φ⁻¹(H) is biunitary. It is generated by a biprefix code called a group code. A group code is a maximal code (see Section I.2). It is thin iff it is recognizable (Example I.5.11), or equivalently, iff the index of H in G is finite.
Rather than define a group code by an "abstract" group, it is frequently convenient to use a permutation group. This is always possible for a group code X by considering the minimal automaton of X*. We give here a detailed description of the relation between the initial pair (G, H) and the minimal automaton of X* (see also Section 0.8).
Let G be a group and H a subgroup of G. Let Q be the set of right cosets of H in G, i.e., the set of subsets of the form Hg for g ∈ G. To each element g in G, we associate a permutation π(g) of Q as follows: for p = Hk, we define

p π(g) = Hkg.

It is easily verified that π is well defined and that it is a morphism from the group G into the symmetric group over Q. The subgroup H is composed of the elements of G whose image by π fixes the coset H. The index of H in G is equal to Card(Q). In particular, H has finite index in G iff π(G) is a finite group.
Now let φ: A* → G be a surjective morphism. Let X be the code generating X* = φ⁻¹(H). For all u, v ∈ A*,

Hφ(u) = Hφ(v)  ⟺  u⁻¹X* = v⁻¹X*.

Indeed, set g = φ(u), k = φ(v). Then Hg = Hk iff g⁻¹H = k⁻¹H (since (Hg)⁻¹ = g⁻¹H). Further, u⁻¹X* = φ⁻¹(g⁻¹H) and v⁻¹X* = φ⁻¹(k⁻¹H). This proves the formula. According to Example III.3.1, we have the equality

Card(Q) = d(X).  (1.1)
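The passage from the pair (G, H) to the permutation group π(G) can be made concrete. In the sketch below, the group S₃, the stabilizer subgroup H, and the morphism φ are illustrative choices (not taken from the text): φ sends a to the transposition (0 1) and b to the 3-cycle (0 1 2), and H is the stabilizer of the point 0. The code verifies on short words that X* = φ⁻¹(H) is biunitary and that the number of right cosets of H equals 3.

```python
# A hypothetical group code: G = S₃ on {0,1,2}, H = Stab(0), φ: a ↦ (0 1), b ↦ (0 1 2).
from itertools import product, permutations

def compose(p, q):                 # permutation p followed by q (tuples on {0,1,2})
    return tuple(q[p[i]] for i in range(len(p)))

gen = {"a": (1, 0, 2), "b": (1, 2, 0)}

def phi(w):
    g = (0, 1, 2)
    for c in w:
        g = compose(g, gen[c])
    return g

H = {p for p in permutations(range(3)) if p[0] == 0}   # stabilizer of 0

def in_X_star(w):
    return phi(w) in H                                  # X* = φ⁻¹(H)

words = ["".join(t) for n in range(0, 7) for t in product("ab", repeat=n)]

# X* is biunitary: if uv ∈ X*, then u ∈ X* iff v ∈ X*
for u in words:
    for v in words:
        if in_X_star(u + v):
            assert in_X_star(u) == in_X_star(v)

# the index of H (= number of right cosets Hg) equals d(X) = 3
cosets = {frozenset(compose(h, g) for h in H) for g in map(phi, words)}
assert len(cosets) == 3
```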
THEOREM 1.1 Let X ⊂ A⁺ be a group code. If X is finite, then X = Aᵈ for some integer d.
Proof Let 𝒜 = (Q, 1, 1) be the minimal automaton of X*, and let φ be the associated representation. Let d be the degree of X. Then d = Card(Q) by (1.1).
Consider the relation on Q defined as follows: for p, q ∈ Q, we have p ≤ q iff p = q, or q ≠ 1 and there exists a simple path from p to q in 𝒜. Thus p ≤ q iff p = q, or there exists a word w ∈ A* such that both p·w = q and p·u ≠ 1 for each left factor u ≠ 1 of w. This relation is reflexive and transitive.
If X is finite, then the relation ≤ is an order on Q. Assume indeed that p ≤ q and q ≤ p. Then either p = q = 1, or both p ≠ 1 and q ≠ 1. In the second case, there exist simple paths p →w q and q →w' p. There are also simple paths

1 →u p,  p →v 1.

Thus for all i ≥ 0, the paths 1 →u p →(ww')^i p →v 1 are simple, showing that u(ww')*v ⊂ X. Since X is finite, this implies ww' = 1, whence p = q. Thus ≤ is an order.
Now let a, b ∈ A be two letters. According to Proposition III.5.1, we have

aᵈ, bᵈ ∈ X.

Thus none of the states 1·aⁱ, 1·bⁱ for 1 ≤ i < d is equal to 1. Consequently,

1 < 1·a < 1·a² < ⋯ < 1·aⁱ < ⋯ < 1·aᵈ⁻¹

and

1 < 1·b < 1·b² < ⋯ < 1·bⁱ < ⋯ < 1·bᵈ⁻¹.

Since Q has d states, this implies that 1·aⁱ = 1·bⁱ for all i ≥ 0. Thus φ(a) = φ(b) for all a, b ∈ A. We get that for all w ∈ A* of length n, we have w ∈ X* iff aⁿ ∈ X*, i.e., iff n is a multiple of d. This shows that X = Aᵈ. □

The following theorem gives a sufficient condition, concerning the group
G(X), for a biprefix code to be a group code. It will be useful later, in Section 4.

THEOREM 1.2 Let X be a thin maximal biprefix code. If the group G(X) is regular, then X is a group code.

Proof According to Proposition IV.7.3, there exist two synchronous codes U, W and a group code V such that

X = U ∘ V ∘ W.

Since X is thin maximal biprefix, so are U and W (Proposition I.6.9). Since U and W are synchronous, they are reduced to their alphabets (Example II.6.2). Thus X = V, and this gives the result. □
THEOREM 1.3 Let X ⊂ A⁺ be a code with A = alph(X). The two following conditions are equivalent:
(i) X is a regular group code;
(ii) X* is closed under conjugacy, i.e., uv ∈ X* ⇒ vu ∈ X*.

Proof If X is a regular group code, the syntactic monoid of X* is a group G = φ(A*) and X* = φ⁻¹(1). If uv ∈ X*, then φ(u)φ(v) = 1, hence also φ(v)φ(u) = 1, showing that vu is in X*.
To show the other implication, let us first show that X is biprefix. Let u, v ∈ A* be such that u, uv ∈ X*. Then also vu ∈ X*. Since X* is stable, it follows that v ∈ X*. Thus X* is right unitary. The proof for left unitarity is analogous.
Now let M = φ(A*) be the syntactic monoid of X*. We verify that φ(X*) = 1. For x ∈ X*, we have the equivalences

uxv ∈ X*  ⟺  xvu ∈ X*  ⟺  vu ∈ X*  ⟺  uv ∈ X*.

Thus φ(x) = φ(1). Since φ(1) = 1, it follows that φ(X*) = 1.
Finally, we show that M is a group. Since A = alph(X), for each letter a ∈ A there exists x ∈ X of the form x = uav. Then avu ∈ X*, whence φ(a)φ(vu) = 1. This shows that all the elements φ(a), for a ∈ A, are invertible. This implies that M is a group. □
COROLLARY 1.4 Let X ⊂ A⁺ be a finite code with A = alph(X). If X* is closed under conjugacy, then X = Aᵈ for some d ≥ 1. □
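Corollary 1.4 can be illustrated with a finite check (sampling short words only, so it is evidence rather than proof): the uniform code A² has X* closed under conjugacy, while the maximal prefix code {a, ba, bb}, which is not uniform, already fails on the conjugate pair ba, ab.

```python
# Finite check of conjugacy closure for two codes over {a, b}.
from itertools import product

def in_star(w, code):
    ok = [True] + [False] * len(w)
    for i in range(1, len(w) + 1):
        ok[i] = any(ok[i - len(c)] for c in code
                    if i >= len(c) and w[i - len(c):i] == c)
    return ok[-1]

def closed_under_conjugacy(code, maxlen=8):
    for n in range(1, maxlen + 1):
        for t in product("ab", repeat=n):
            w = "".join(t)
            if in_star(w, code):
                # every conjugate w[i:] + w[:i] of w must stay in code*
                if not all(in_star(w[i:] + w[:i], code) for i in range(n)):
                    return False
    return True

assert closed_under_conjugacy({"aa", "ab", "ba", "bb"})   # X = A², a regular group code
assert not closed_under_conjugacy({"a", "ba", "bb"})      # ba ∈ X* but its conjugate ab is not
```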
2. AUTOMATA OF BIPREFIX CODES
The general theory of unambiguous monoids of relations takes a nice form in the case of biprefix codes, since the automata satisfy some additional properties. Thus, the property of being biprefix can be "read" off the automaton.

PROPOSITION 2.1 Let X be a thin maximal prefix code over A, and let 𝒜 = (Q, 1, 1) be a deterministic trim automaton recognizing X*. The following conditions are equivalent:
(i) X is maximal biprefix;
(ii) for all w ∈ A*, we have 1 ∈ Q·w;
(iii) for all w ∈ A*, q·w = 1·w implies q = 1.
Proof In a first step, we show that

(ii)  ⟺  X is left complete.  (2.1)

If (ii) is satisfied, consider a word w, and let q ∈ Q be a state such that q·w = 1. Choose u ∈ A* satisfying 1·u = q. Then 1·uw = 1, whence uw ∈ X*. This shows that X is left complete. Conversely, assume X left complete. Let w ∈ A*. Then there exists u ∈ A* such that uw ∈ X*. Thus 1 = 1·uw = (1·u)·w shows that 1 ∈ Q·w.
Next, we prove that

(iii)  ⟺  X* is left unitary.  (2.2)

Assume that (iii) holds, and let u, v ∈ A⁺ be such that v, uv ∈ X*. Set q = 1·u. Then 1·v = 1 and q·v = (1·u)·v = 1·uv = 1. Thus q·v = 1·v, whence q = 1. This shows that 1·u = 1 and consequently u ∈ X*. Assume conversely that X* is left unitary, and let w be such that q·w = 1·w = p for some p, q ∈ Q. Let u, v ∈ A* be such that 1·u = q and p·v = 1. Then

1·uwv = 1 = 1·wv.

Thus uwv, wv ∈ X* and, by left unitarity, u ∈ X*. Consequently 1·u = 1 = q, proving the equivalence (2.2).
In view of (2.1) and (2.2), the proposition is a direct consequence of Proposition III.2.1. □

Let X be a thin maximal biprefix code, and let 𝒜 = (Q, 1, 1) be a trim deterministic automaton recognizing X*. Then the automaton 𝒜 is complete, and the monoid M = φ_𝒜(A*) is a monoid of (total) functions. The minimal ideal J is composed of the functions m such that Card(Im(m)) = rank(m) equals the minimal rank r(M) of M. The ℋ-classes of J are indexed by the minimal images and by the maximal nuclear equivalences (Proposition IV.5.9). Each state appears in at least one minimal image, and the state 1 is in all minimal images. Each ℋ-class H meets φ(X*), and the intersection is a subgroup of H.
Note the following important fact: if S is a minimal image and w is any word, then T = S·w is again a minimal image. Thus Card(S) = Card(T), and consequently w realizes a bijection from S onto T.
In the sequel, we will be interested in the minimal automaton 𝒜(X*) of X*. According to Proposition II.3.9, this automaton is complete and has a unique final state coinciding with the initial state. Thus 𝒜(X*) is of the form considered above.
Let φ be the representation associated with the minimal automaton 𝒜(X*) = (Q, 1, 1), and let M = φ(A*). Let J be the minimal ideal of M. We define

E(X) = φ⁻¹(J).

This is an ideal in A*. Moreover, we have

w ∈ E(X)  ⟺  S·w = T·w for all minimal images S and T of 𝒜.  (2.3)
Indeed, let w ∈ E(X). Then U = Q·w is a minimal image. For any minimal image T, we have T·w ⊂ Q·w = U, hence T·w = U since T·w is minimal. Thus T·w = S·w = Q·w. Conversely, assume that for w ∈ A* we have S·w = T·w for all minimal images S, T. Set U equal to this common image. Since every state in Q appears in at least one minimal image, we have

Q·w = (⋃_S S)·w = ⋃_S S·w = U,

where the union is over the minimal images. This shows that φ(w) has minimal rank, and consequently w ∈ E(X). Thus the equivalence (2.3) is proved.

PROPOSITION 2.2 Let X be a thin maximal biprefix code and let 𝒜(X*) = (Q, 1, 1) be the minimal automaton of X*. Let p, q ∈ Q be two states. If p·h = q·h for all h ∈ E(X), then p = q.

Proof It suffices to prove that for all w ∈ A*, p·w = 1 iff q·w = 1. The conclusion, namely that p = q, then follows by the definition of 𝒜(X*). Let h ∈ E(X) ∩ X*. Let w ∈ A* be such that p·w = 1. We must show that q·w = 1. We have p·wh = (p·w)·h = 1·h = 1, since h ∈ X*. Now wh ∈ E(X), hence by assumption q·wh = p·wh = 1. Thus (q·w)·h = 1 = 1·h. By Proposition 2.1(iii), it follows that q·w = 1. This proves the proposition. □
For a transitive permutation group G of degree d, it is customary to consider the number k(G), the maximum number of fixed points of an element of G distinct from the identity. We call minimal degree of G the number δ = d − k(G). The group is regular iff k(G) = 0; it is a Frobenius group if k(G) = 1. If X is a code of degree d and with group G(X), we denote by k(X) the integer k(G(X)). We will prove

THEOREM 2.3 Let X ⊂ A⁺ be a thin maximal biprefix code of degree d, and let k = k(X). Then Aᵏ − A*XA* ⊂ E(X).

LEMMA 2.4 With the above notation, let 𝒜 = (Q, 1, 1) be the minimal automaton recognizing X*. For any two distinct minimal images S and T of 𝒜, we have Card(S ∩ T) ≤ k.

Proof Let M = φ_𝒜(A*), and consider an idempotent e ∈ M having image S, i.e., such that Qe = S. Consider an element t ∈ T − S, and set s = te. Then
s ∈ S, and therefore s ≠ t. We will prove that there is an idempotent f separating s and t, i.e., such that sf ≠ tf. According to Proposition 2.2, there exists h ∈ E(X) such that s·h ≠ t·h. Let m = φ(h) ∈ J, where J is the minimal ideal of M. Multiplying on the right by a convenient element n ∈ M, the element mn ∈ J will be in the ℒ-class characterized by the minimal image T. Since n realizes a bijection from Im(m) onto Im(mn) = T, we have smn ≠ tmn. Let f be the idempotent of the ℋ-class of mn. Then f and mn have the same nuclear equivalence. Consequently sf ≠ tf. Since t ∈ T = Im(mn) = Im(f) = Fix(f), we have tf = t.
Consider now the restriction to T of the mapping ef. For all p ∈ S ∩ T, we obtain p·ef = p·f = p. Thus ef fixes the states in S ∩ T. Further, t·ef = sf ≠ t, showing that ef is not the identity on T. Thus, by definition of k, we have Card(S ∩ T) ≤ k. □
Proof of Theorem 2.3 Let 𝒜 = (Q, 1, 1) be the minimal automaton of X*. Let w ∈ Aᵏ − A*XA* and set w = a₁a₂⋯a_k with aᵢ ∈ A. Let S be a minimal image. For each i ∈ {1, …, k}, the word a₁a₂⋯aᵢ defines a bijection from S onto Sᵢ = S·a₁a₂⋯aᵢ. Since Sᵢ is a minimal image, it contains the state 1. Thus S_k contains all the k + 1 states

1·a₁a₂⋯a_k,  1·a₂⋯a_k,  …,  1·a_k,  1.

These states are distinct. Indeed, assume that

1·aᵢa_{i+1}⋯a_k = 1·a_j⋯a_k

for some i < j. Then setting q = 1·aᵢa_{i+1}⋯a_{j−1}, we get q·a_j⋯a_k = 1·a_j⋯a_k. By Proposition 2.1, this implies q = 1. But then w ∈ A*XA*, contrary to the assumption. Thus S·w contains k + 1 states which are determined in a way independent of S. In other words, if T is another minimal image, then T·w contains these same k + 1 states. This means that Card(T·w ∩ S·w) ≥ k + 1, and by Lemma 2.4, we have S·w = T·w. Thus two arbitrary minimal images have the same image under w. This shows by (2.3) that w is in E(X). □
Remark Consider, in Theorem 2.3, the special case where k = 0, i.e., where the group G(X) is regular. Then 1 ∈ E(X). Now

1 ∈ E(X)  ⟺  X is a group code.  (2.4)

Indeed, if 1 ∈ E(X), then the syntactic monoid M = φ_{𝒜(X*)}(A*) coincides with its minimal ideal. This minimal ideal is a single group, since it contains the neutral element of M. The converse is clear. Thus we obtain, in another way, Theorem 1.2.
TABLE 5.1 The Automaton 𝒜(X*)

       1  2  3  4  5
   a   1  4  5  2  3
   b   2  3  1  1  3
EXAMPLE 2.1 If X is a thin maximal biprefix code over A with degree d(X) = 3, then k = 0 (if G(X) = ℤ/3ℤ) or k = 1 (if G(X) = 𝔖₃). In the second case, by Theorem 2.3, we have

A − X ⊂ E(X).

The following example shows that the inclusion A ⊂ E(X) does not always hold. Let X be the maximal prefix code over A = {a, b} defined by the automaton 𝒜(X*) = (Q, 1, 1) with Q = {1, 2, 3, 4, 5} and transition function given in Table 5.1. The set of images, together with the actions of a and b, is given in Fig. 5.1. Each of the images contains the state 1. Consequently, X is a biprefix code. We have d(X) = 3 (the number of elements of the minimal images). We have Q·b = {1, 2, 3}. Thus φ_𝒜(b) has minimal rank; consequently b ∈ E(X). However, a ∉ E(X) since Q·a = Q. In fact a ∈ X, in agreement with Theorem 2.3.
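The computations of Example 2.1 are easy to reproduce mechanically. The sketch below hard-codes the transitions of Table 5.1 as read from the text (a: 1→1, 2→4, 3→5, 4→2, 5→3; b: 1→2, 2→3, 3→1, 4→1, 5→3) and explores all images Q·w; each assertion restates a fact from the example.

```python
# Images Q·w for the automaton of Example 2.1 (transitions of Table 5.1).
delta = {"a": {1: 1, 2: 4, 3: 5, 4: 2, 5: 3},
         "b": {1: 2, 2: 3, 3: 1, 4: 1, 5: 3}}

def act(S, w):
    for c in w:
        S = frozenset(delta[c][q] for q in S)
    return S

Q = frozenset({1, 2, 3, 4, 5})
images, todo = {Q}, [Q]
while todo:                         # breadth-first exploration of all images Q·w
    S = todo.pop()
    for c in "ab":
        T = act(S, c)
        if T not in images:
            images.add(T)
            todo.append(T)

minimal = [S for S in images if len(S) == min(map(len, images))]
print(sorted(sorted(S) for S in minimal))   # → [[1, 2, 3], [1, 4, 5]]
assert all(1 in S for S in minimal)          # each minimal image contains 1: X is biprefix
assert min(map(len, images)) == 3            # d(X) = 3
assert act(Q, "b") == frozenset({1, 2, 3})   # b has minimal rank, so b ∈ E(X)
assert act(Q, "a") == Q                      # a does not, and indeed a ∈ X
```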
Fig. 5.1 The diagram of images.

THEOREM 2.5 Let X be a thin maximal biprefix code. Then the code X is indecomposable iff G(X) is a primitive group.

Proof If X = Y ∘ Z, then Y and Z are thin maximal biprefix codes by Proposition I.6.9. According to Proposition IV.6.6, there exists an imprimitivity partition θ of G(X) such that G^θ = G(Y) and G_θ = G(Z). If G(X) is primitive, then G^θ = 1 or G_θ = 1. In the first case, d(Y) = 1, implying X = Z. In the second case, d(Z) = 1, whence Z = A. Thus the code X is indecomposable. The converse implication follows directly from Corollary IV.6.9. □
3. DEPTH
Let S be a finite semigroup, and let J be its minimal (two-sided) ideal. We say that S is nil-simple if there exists an integer n ≥ 1 such that

Sⁿ ⊂ J.  (3.1)

The smallest integer n ≥ 1 satisfying (3.1) is called the depth of S. Since Sⁿ is, for all n, a two-sided ideal, (3.1) is equivalent to Sⁿ = J, which in turn implies Sⁿ = Sⁿ⁺¹.

PROPOSITION 3.1 Let S be a finite semigroup. The following conditions are equivalent:
(i) S is nil-simple;
(ii) all idempotents of S are in the minimal ideal J of S.

Proof (i) ⇒ (ii) Let n be the depth of S. For any idempotent e in S, we have e = eⁿ ∈ J.
(ii) ⇒ (i) Set n = 1 + Card(S). We show the inclusion Sⁿ ⊂ J. Indeed, let s ∈ Sⁿ. Then s = s₁s₂⋯sₙ with sᵢ ∈ S. Let tᵢ = s₁s₂⋯sᵢ for 1 ≤ i ≤ n. Then there exist indices i, j with 1 ≤ i < j ≤ n and tᵢ = tⱼ. Setting r = sᵢ₊₁⋯sⱼ, we have tᵢr = tᵢ, hence also tᵢrᵏ = tᵢ for all k ≥ 1. Since S is finite, there exists an integer k such that e = rᵏ is an idempotent. Then e ∈ J, and consequently

s = tⱼsⱼ₊₁⋯sₙ = tᵢ e sⱼ₊₁⋯sₙ ∈ J.
This proves that S is nil-simple. □

We shall use nil-simple semigroups for a characterization of biprefix codes. Before stating this result, we have to establish a property which is interesting in itself.

PROPOSITION 3.2 Let X ⊂ A⁺ be a thin maximal biprefix code, and let φ be the representation associated with an unambiguous trim automaton recognizing X*. Let J be the minimal ideal of φ(A*). Then φ(H̄(X)) = J.

Recall that H(X) = A⁻XA⁻ is the set of internal factors of X, and H̄(X) = A* − H(X).

Proof Let φ_D be the representation associated with the flower automaton of X, set M_D = φ_D(A*), and let J_D be the minimal ideal of M_D. It suffices to prove the result for φ_D. Indeed, there exists by Proposition IV.2.4 a surjective morphism ρ̂: M_D → φ(A*) such that φ = ρ̂ ∘ φ_D. Moreover, we have ρ̂(J_D) = J.
V. GROUPS OF BIPREFIX CODES
Thus the inclusion φ_D(H̄(X)) ⊂ J_D implies φ(H̄(X)) ⊂ ρ̂(J_D) = J. Thus it remains to prove the inclusion φ_D(H̄(X)) ⊂ J_D.

Let 𝒜_D = (Q, (1,1), (1,1)) be the flower automaton of X. Let w ∈ H̄(X). Then w has d = d(X) interpretations. We prove that rank(φ_D(w)) = d. Since this is the minimal rank, it implies that φ_D(w) is in J_D. Clearly rank(φ_D(w)) ≥ d. To prove the converse inequality, let I be the set of the d interpretations of w. We define two relations

α ∈ {0,1}^{Q×I},   β ∈ {0,1}^{I×Q}

as follows: if (u,v) ∈ Q and (s,x,p) ∈ I, with s ∈ A⁻X, x ∈ X*, p ∈ XA⁻, then

((u,v), α, (s,x,p)) = δ_{v,s},   ((s,x,p), β, (u,v)) = δ_{p,u},

where δ is the Kronecker symbol. We claim that

φ_D(w) = αβ. (3.2)

Assume first that (u,v) αβ (u′,v′). Then there exists an interpretation i = (v, x, u′) ∈ I such that (u,v) α i and i β (u′,v′). Note that i is uniquely determined by v or by u′, because X is biprefix. Next, w ∈ vX*u′, showing that

((u,v), φ_D(w), (u′,v′)) = 1.
Conversely, assume that ((u,v), φ_D(w), (u′,v′)) = 1. Then either u′ = uw and v = wv′, or w ∈ vX*u′. The first possibility implies the second one: indeed, if u′ = uw and v = wv′, then uwv′ ∈ X. Since w ∈ H̄(X), this implies u = v′ = 1, whence also v = u′ = 1. It follows that w ∈ vX*u′. Thus, w = vxu′ for some x ∈ X*, showing that i = (v, x, u′) is an interpretation of w. Thus, (u,v) α i and i β (u′,v′). This proves (3.2). By (3.2), we have rank φ_D(w) ≤ Card(I) = d(X). □

The following result gives an algebraic characterization of finite maximal biprefix codes. The proof uses Theorem II.8.4 on codes with finite deciphering delay.

THEOREM 3.3 Let X ⊂ A⁺ be a finite maximal code, and let 𝒜 be an unambiguous trim automaton recognizing X*. The two following conditions are equivalent:

(i) X is biprefix,
(ii) the semigroup φ_𝒜(A⁺) is nil-simple.

Proof Set φ = φ_𝒜, and set S = φ(A⁺). Let J be the minimal ideal of S.
(i) ⇒ (ii) Let n be the maximum of the lengths of the words in X. A word of length n cannot be an internal factor of X. Thus A^nA* ⊂ H̄(X). Observe that A^nA* = (A⁺)^n. Thus,

S^n = φ((A⁺)^n) ⊂ φ(H̄(X)).

By Proposition 3.2, we obtain S^n ⊂ J, showing that S is nil-simple.
Fig. 5.2 The minimal automaton of (a*b)*.
(ii) ⇒ (i) Let n be the depth of S. Then for all y ∈ X^n ⊂ A^nA* = (A⁺)^n, we have φ(y) ∈ J. We prove that for any y ∈ X^n, for all x ∈ X and u ∈ A*,

xyu ∈ X*  ⇒  yu ∈ X*. (3.3)

The semigroup S contains no zero. Further, the elements φ(y) and φ(yxy) of φ(X*) are in the same group, say G, of the minimal ideal, because φ(yxy) = φ(yx)φ(y) and φ(y) = [φ(yx)]⁻¹φ(yxy), showing that φ(y) ℋ φ(yxy). The same argument holds for the other side. In fact, both φ(yx) and φ(yx)⁻¹ are in the subgroup G ∩ φ(X*). Thus there exists some r ∈ X* such that [φ(yx)]⁻¹ = φ(r), or also φ(y) = φ(r)φ(yxy). This gives

φ(yu) = φ(r)φ(y)φ(xyu) ∈ φ(X*),

showing that yu ∈ X*. This proves (3.3). Formula (3.3) shows that the code X has finite deciphering delay (from left to right) at most n. According to Theorem II.8.4, X is a prefix code. Symmetrically, X is suffix. Thus X is a biprefix code. □
EXAMPLE 3.1 Consider again the maximal biprefix code X of Example 2.1. The semigroup φ_{𝒜(X*)}(A⁺) is not nil-simple. Indeed, φ(a) is a permutation of Q, and thus φ(a^n) ∉ J for all n ≥ 1. This shows that the implication (i) ⇒ (ii) of Theorem 3.3 is, in general, false without the assumption of finiteness on the code.

EXAMPLE 3.2 Let A = {a,b} and X = a*b. The code X is maximal prefix, but is not suffix. The automaton 𝒜(X*) is given in Fig. 5.2. The semigroup φ(A⁺) is nil-simple: it is composed of the two constant functions φ(a) and φ(b). This example shows in addition that the implication (ii) ⇒ (i) of Theorem 3.3 may become false if the code is infinite.

4. GROUPS OF FINITE BIPREFIX CODES
In the case of a thin maximal biprefix code X, the representation of the minimal automaton of X*, introduced in Chapter IV (Section IV.3), takes a particular form which makes it easy to manipulate.
Consider a thin maximal biprefix code X ⊂ A⁺ of degree d, let 𝒜(X*) = (Q,1,1) be the minimal (deterministic) automaton of X*, and let φ = φ_{𝒜(X*)} be the associated representation. Finally, let M = φ(A*) and let J be the minimal ideal of M. Each ℋ-class of J is a group. Fix an idempotent e ∈ J, let S = Im(e) = Fix(e), and let Γ be the set of ℋ-classes of the ℒ-class of e. Denote by e_H the idempotent of the ℋ-class H ∈ Γ. The family (e_H)_{H∈Γ} constitutes a system of coordinates. Indeed, for H ∈ Γ,

e_H e = e_H,   e e_H = e.

If e = cl is the column-row decomposition of e, then for H ∈ Γ,

e_H = c_H l,  with c_H = e_H c,

is the column-row decomposition of e_H. The notations of Section IV.3 then simplify considerably. In particular, for m ∈ M and H ∈ Γ,

m * H = l m c_H = l(e m e_H)c.

Of course, m * H ∈ G_e. This can be used to define a function

A* × E(X) → G_e,

where E(X) = φ⁻¹(J) as in the previous section. Let u ∈ A* and let k ∈ E(X). Then φ(k) ∈ J, and corresponding to this element there is an ℋ-class, denoted H(k), in Γ, which by definition is the intersection of the ℛ-class of φ(k) and of the ℒ-class of e. In other words, H(k) = Me ∩ φ(k)M. The function from A* × E(X) into G_e is then defined by u * k = φ(u) * H(k). Thus

u * k = l φ(u) c_{H(k)} = l(e φ(u) e_{H(k)})c.

Thus u * k ∈ G_e. It is a permutation on the set S = Fix(e), obtained by restriction to S of the relation e φ(u) e_{H(k)}. The following explicit characterization of u * k is the basic formula for the computations. For u ∈ A*, k ∈ E(X), and s, t ∈ S,

s(u * k) = t  ⟺  s·uk = t·k. (4.1)
In this formula, the computation of s·uk and t·k is of course done in the automaton 𝒜(X*). Let us verify (4.1). If s(u * k) = t, then s e φ(u) e_{H(k)} = t. From se = s, it follows that s φ(u) e_{H(k)} = t. Taking the image by φ(k), we obtain

s φ(u) e_{H(k)} φ(k) = t φ(k).

Since e_{H(k)} φ(k) = φ(k), we get s φ(uk) = t φ(k), or in other words,

s·uk = t·k.

Conversely, assume that s φ(uk) = t φ(k). Let m ∈ M be such that φ(k)m = e_{H(k)}. Then s φ(u) φ(k) m = t φ(k) m implies s φ(u) e_{H(k)} = t e_{H(k)}. Since se = s and te = t,
we get

s e φ(u) e_{H(k)} = t e_{H(k)} = t,

showing that s(u * k) = t. This proves (4.1).

The function from A* × E(X) into G_e defined above is called the ergodic representation of X (relative to e). We will manipulate it via the relation (4.1). Note the following formulas, which are the translation of the corresponding relations given in Section IV.3, and which can also be proved directly using formula (4.1). For u, v ∈ A* and k ∈ E(X),

u * kv = u * k, (4.2)
uv * k = (u * vk)(v * k). (4.3)

PROPOSITION 4.1 Let X ⊂ A⁺ be a thin maximal biprefix code, and let R = E(X) − E(X)A be the basis of the right ideal E(X). Let e be an idempotent in the minimal ideal of φ(A*), and let S = Fix(e). The group G(X) is equivalent to the permutation group over S generated by the permutations a * r, with a ∈ A and r ∈ R.
Proof It suffices to show that the permutations a * r generate G_e, since G_e is equivalent to G(X). Set φ = φ_{𝒜(X*)}. Every permutation u * k, for u ∈ A* and k ∈ E(X), clearly is in G_e. Conversely, consider a permutation σ ∈ G_e. Let g ∈ G(e) be the element giving σ by restriction to S, and let u ∈ φ⁻¹(g), k ∈ φ⁻¹(e). Then u * k is the restriction to S of e φ(u) e_{H(k)} = e φ(u) e = g. Thus u * k = σ. Thus

G_e = {u * k | u ∈ A*, k ∈ E(X)}.

For u = a₁a₂⋯aₙ with aᵢ ∈ A, and k ∈ E(X), we get, by (4.3),

u * k = (a₁ * a₂a₃⋯aₙk)(a₂ * a₃⋯aₙk)⋯(aₙ * k).

Thus G_e is generated by the permutations a * k, for a in A and k in E(X). Now for each k in E(X), there exists r ∈ R such that k ∈ rA*. By (4.2), we have a * k = a * r. This completes the proof. □

Note that Proposition 4.1 can also be derived from Proposition IV.3.6.
PROPOSITION 4.2 Let X be a finite maximal biprefix code over A of degree d, and let φ = φ_{𝒜(X*)}. For each letter a ∈ A, we have a^d ∈ E(X) ∩ X, and φ(a^d) is an idempotent.

Proof Let 𝒜(X*) = (Q,1,1). By Proposition III.5.1, we have a^d ∈ X for a ∈ A. The states

1, 1·a, ..., 1·a^{d−1}

are distinct. Indeed, if 1·a^i = 1·a^j for some 0 ≤ i < j ≤ d − 1, then, setting p = j − i, we would have 1·a^i = (1·a^p)·a^i, and in view of Proposition 2.1, this would imply 1·a^p = 1, whence a^p ∈ X* with 0 < p < d, contradicting the fact that a^d ∈ X and X is prefix. Moreover, we have

Im(a^d) = Q·a^d = {1, 1·a, ..., 1·a^{d−1}}.

Indeed, let q ∈ Q, q ≠ 1. Let w ∈ XA⁻ be a word such that 1·w = q. Since X is right complete and finite, there exists a power of a, say a^j, such that wa^j ∈ X. Then j < d since X is suffix, and j > 0 since w ∉ X. Thus q·a^j = 1 and

q·a^d = 1·a^{d−j} ∈ {1, 1·a, ..., 1·a^{d−1}}.

This proves that Im(a^d) ⊂ {1, 1·a, ..., 1·a^{d−1}}. The converse inclusion is a consequence of

(1·a^i)·a^d = 1·a^{d+i} = 1·a^i,  for i = 0, ..., d − 1.

Thus φ(a^d) has rank d, showing that φ(a^d) is in the minimal ideal of φ(A*), which in turn implies that a^d ∈ E(X). Next, (1·a^j)·a^d = 1·a^j for j = 0, ..., d − 1; thus φ(a^d) is the identity on its image. This proves that φ(a^d) is an idempotent. □
Proposition 4.2 shows that, in the case of a finite maximal biprefix code X, a particular ergodic representation can be chosen by taking, as basic idempotent for the system of coordinates, the d(X)th power of any of the letters a of the alphabet. More precisely, let 𝒜(X*) = (Q,1,1), let φ be the associated morphism, and set e = φ(a^d), with the convention that for 1 ≤ i ≤ d,

i = 1·a^{i−1}.

The ergodic representation relative to the idempotent φ(a^d) is denoted by *_a. It is defined, for u ∈ A*, k ∈ E(X), and for 1 ≤ i, j ≤ d, by

i(u *_a k) = j  ⟺  i·uk = j·k. (4.4)

Observe that for u = a and for any k ∈ E(X),

a *_a k = α (4.5)

with α = (1 2 ⋯ d). Indeed, by (4.4), i(a *_a k) = j iff i·ak = j·k, thus iff (i + 1)·k = j·k. Since k induces a bijection from S onto S·k, this implies j = i + 1, which is the claim.
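Formulas (4.4) and (4.5) can be illustrated on the simplest case, the uniform code X = A^d over A = {a, b} (an illustration under that assumption, not an example from the text; `act` and `star` are hypothetical helper names). In the minimal automaton of (A^d)*, with states 0, ..., d−1, every letter adds 1 mod d, and (4.4) then forces u *_a k to be a power of the d-cycle.

```python
d = 5

def act(q, w):
    # transition function of the minimal automaton of (A^d)*: every letter
    # moves q to q + 1 mod d, so reading w adds len(w) mod d
    return (q + len(w)) % d

def star(u, k):
    """The permutation u *_a k of formula (4.4): i(u *_a k) = j iff i.uk = j.k."""
    perm = {}
    for i in range(d):
        for j in range(d):
            if act(i, u + k) == act(j, k):
                perm[i] = j
    return tuple(perm[i] for i in range(d))

alpha = tuple((i + 1) % d for i in range(d))   # the d-cycle of formula (4.5)
assert star("a", "a" * d) == alpha             # (4.5): a *_a k is the d-cycle
assert star("b", "a" * d) == alpha             # here b acts like a
```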
EXAMPLE 4.1 Let A = {a,b}, and consider the finite maximal biprefix code X ⊂ A⁺ of degree 3 with kernel K(X) = {ab}. The transitions of the minimal automaton of X*, with states {1,2,3,4,5}, are given in Fig. 5.3.
Fig. 5.3 Transitions for a biprefix code.
The letters a and b define mappings φ(a) and φ(b) of rank 3. Thus a, b ∈ E(X). We consider the ergodic representation *_a, i.e., relative to the idempotent e = φ(a³). To compute it, it is sufficient (according to Proposition 4.1) to compute the four permutations a *_a a, a *_a b, b *_a a, b *_a b by using (4.4). For instance, we have

i(a *_a a) = j  ⟺  i·a² = j·a  ⟺  i + 1 ≡ j mod 3.

The permutations are easily seen to be

a *_a a = a *_a b = (1 2 3),   b *_a a = (1 2),   b *_a b = (1 3 2).

The group G(X) therefore is the symmetric group over S.
PROPOSITION 4.3 Let X ⊂ A⁺ be a finite maximal biprefix code of degree d, and let a ∈ A. Then a *_a a^d is a cycle of length d.

Proof By (4.4),

i(a *_a a^d) = j  ⟺  i·a^{d+1} = j·a^d.

This is equivalent to i·a = j, or i + 1 ≡ j mod d. Thus i(a *_a a^d) = i + 1 mod d, proving the statement. □
We are now ready to study the groups of finite biprefix codes. We recall that a transitive permutation group G of degree d ≥ 2 is called a Frobenius group if k(G) = 1.
THEOREM 4.4 Let X be a finite maximal biprefix code of degree d ≥ 4. Then G(X) is not a Frobenius group.

Proof Let 𝒜 = (Q,1,1) be the minimal automaton of X*. Since d ≥ 4, no letter is in X. Arguing by contradiction, we suppose that G(X) is a Frobenius group. Thus k(G(X)) = 1. By Theorem 2.3, we have A ⊂ E(X). This means that for all a ∈ A, Im(a) has d elements.
Let a ∈ A be a letter, and set S = Im(a^d) = {1,2,...,d}, where, for 1 ≤ i ≤ d, i = 1·a^{i−1}. Consider the ergodic representation *_a, and set

α = a *_a a,   β = b *_a a,

where b ∈ A is an arbitrary letter. We want to prove that

β = α. (4.6)

Note that, by (4.4) and (4.5), we have for i ∈ S,

i·ba = (iβ)·a,   and   i·a = i + 1 if i ≠ d,   d·a = 1.

Since S·b is a minimal image, it contains the state 1. Thus there exists a (unique) state q′ ∈ S such that q′·b = 1. For the same reason, there exists a unique state q″ ∈ S such that q″·ba = 1. We claim that

q′β = 1,   q″β = d.

Indeed, we have 1·a = q′·ba = (q′β)·a. Next, q″·ba = (q″β)·a = 1 = d·a. Since a defines a bijection from S onto itself, it follows that q′β = 1 and q″β = d. This proves the claim. Now we verify that

qβ ≥ q  for q ∈ S, q ≠ q′. (4.7)

First, we observe that the inequality holds for q″, since q″β = d. Thus, arguing by contradiction, suppose that qβ = p < q for some q ∈ S, q ≠ q′, q″. Then

q·ba = (qβ)·a = p·a = p + 1 ≤ q.

Setting n = q − (p + 1), it follows that q·ba^{n+1} = q. Consider the path

q --ba^{n+1}--> q.

Since q ≠ q′, q″, we have q·b ≠ 1 and q·ba ≠ 1. Also q·ba^i = p + i ≠ 1 for i = 1, ..., n + 1. Thus this path is simple. Consequently,

a^{q−1}(ba^{n+1})*a^{d−q+1} ⊂ X,

contradicting the finiteness of X. This proves (4.7). It follows from (4.7) that there exists at most one state q ∈ S such that qβ < q, namely the state q′. This implies that the permutation β is composed of at most one cycle (of length > 1), the rest being fixed points. Further, β cannot be the identity on S, since otherwise the relation q′β = 1 would imply q′ = 1, hence 1·b = 1 and b ∈ X, which is not true. Now, by assumption, G(X) is a Frobenius group. Thus β has at most one fixed point. If β has no fixed point, then the
inequalities in (4.7) are strict, and this implies that

β = (1 2 3 ⋯ d) = α.

Assume now that β has just one fixed point i. Then

β = (1 2 3 ⋯ i−1 i+1 ⋯ d)(i).

This implies that

β⁻¹α = (i, i+1)  if i ≠ d,
β⁻¹α = (d, 1)    if i = d.

Since β⁻¹α ∈ G(X) and β⁻¹α has d − 2 fixed points, G(X) can be a Frobenius group only if d ≤ 3. This yields a contradiction and proves formula (4.6).

It follows from (4.4) and (4.6) that i·ba = i·a² for i ∈ S. This shows that for m ≥ 0,

1·a^m ba = 1·a^{m+2}. (4.8)

Observe that this formula holds for arbitrary letters a, b ∈ A. It leads to another formula, namely, for i ≥ 0 and a, b ∈ A,

1·a^i b = 1·b^{i+1}. (4.9)

This formula holds indeed for a, b ∈ A and i = 0. Arguing by induction, we suppose that (4.9) holds for some i ≥ 0 and for all a, b ∈ A. Then we have, for a, b ∈ A, also 1·b^i a = 1·a^{i+1}, whence 1·b^i ab = 1·a^{i+1}b. Applying (4.8), we get

1·a^{i+1}b = 1·b^i ab = 1·b^{i+2}.

This proves (4.9). Finally, we show, by a descending induction on i ∈ {0,1,...,d}, that for all a ∈ A,

1·a^i A^{d−i} = {1}.

This holds for i = d, and for i < d we have

1·a^i A^{d−i} = ⋃_{b∈A} 1·a^i b A^{d−i−1} = ⋃_{b∈A} 1·b^{i+1} A^{d−i−1} = {1}

by using (4.9). This proves the formula. For i = 0, it becomes 1·A^d = {1}, showing that A^d ⊂ X. This implies that A^d = X. Since G(A^d) is a cyclic group, it is not a Frobenius group. This yields the contradiction and concludes the proof. □
by using (4.9). This proves the formula. For i = 0, it becomes 1 Ad = { l}, showingthat Ad.c X . This implies that Ad = X. Since G(Ad)is a cyclic group, it is not a Frobenius group. This yields the contradiction and concludes the proof. 0 Remark Consider a finite maximal biprefix code X of degree at most 3. If the degree is 1 or 2, then the code is uniform, and the group is a cyclic group. If d ( X ) = 3, then G ( X )is either the symmetric group 6, or the cyclic group over 3 elements. The latter group is regular, and according to Theorems 1.2 and 1.1, the code X is uniform. Thus except for the uniform code, all finite maximal biprefix codes of degree 3 have as a group G3 which is a Frobenius group.
We now establish an interesting property of the groups of biprefix codes. For this, we use a result from the theory of permutation groups, which we formulate for convenience as Theorem 4.5 (proofs can be found in the classical literature; references are given in the Notes). Recall that a permutation group G over a set Q is k-transitive if for all (p₁, ..., p_k) ∈ Q^k and (q₁, ..., q_k) ∈ Q^k composed of distinct elements, there exists g ∈ G such that p₁g = q₁, ..., p_k g = q_k. Thus 1-transitive groups are precisely the transitive groups. A 2-transitive group is usually called doubly transitive.

THEOREM 4.5 Let G be a primitive permutation group of degree d containing a d-cycle. Then G is a regular group, or a Frobenius group, or is doubly transitive.

THEOREM 4.6 Let X be a finite maximal biprefix code over A. If X is indecomposable and not uniform, then G(X) is doubly transitive.

Proof According to Theorem 2.5, the group G(X) is primitive. Let d be its degree. In view of Proposition 4.3, G(X) contains a d-cycle. By Theorem 4.5, three cases may arise. Either G(X) is regular; then, by Theorem 1.2, X is a group code, and by Theorem 1.1, the code X is uniform. Or G(X) is a Frobenius group; by Theorem 4.4, we then have d ≤ 3, and the only group of a nonuniform code is 𝔖₃, as shown in the remark. This group is both a Frobenius group and doubly transitive. Thus, in any case, the group is doubly transitive. □
In Theorem 4.6, the condition that X be indecomposable is necessary. Indeed, otherwise, by Theorem 2.5, the group G(X) would be imprimitive. But it is known that a doubly transitive group is primitive (Proposition 0.8.6).

There is an interesting combinatorial interpretation of the fact that the group of a biprefix code is doubly transitive.

PROPOSITION 4.7 Let X be a thin maximal biprefix code over A, and let P = XA⁻. The group G(X) is doubly transitive iff for all p, q ∈ P − {1}, there exist x, y ∈ X* such that

px = yq.
Proof Let φ be the representation associated with the literal automaton 𝒜 = (P,1,1) of X*. Let d = d(X), and let e be an idempotent of rank d in φ(X*). Let S = Fix(e). We have 1 ∈ S, since S = Im(e). Let p, q ∈ S − {1}, and assume that there exist x, y ∈ X* such that px = yq. We have 1·p = p and 1·q = q, whence

p·x = 1·px = 1·yq = 1·q = q.

This shows that, for the element eφ(x)e ∈ G(e), we have p·eφ(x)e = q. Since 1·eφ(x)e = 1, this shows that the restriction to S of eφ(x)e, which is in the
stabilizer of 1, maps p to q. Thus this stabilizer is transitive, and consequently the group G_e = G(X) is doubly transitive.

Assume now, conversely, that G(X) is doubly transitive, and let p, q ∈ P − {1}. Let i, j ∈ S be such that pe = i, qe = j. Then i, j ≠ 1. Consider indeed a word w ∈ φ⁻¹(e). Then 1·w = 1; thus the assumption i = 1 would imply p·w = pe = i = 1, and since 1·w = 1, Proposition 2.1 gives p = 1, a contradiction. Since G(X) is doubly transitive, and G(X) is equivalent to G_e, there exists g ∈ G(e) such that

ig = j   and   1g = 1.

Let m ∈ φ(A*) be such that jm = q, and let f be the idempotent of the group G(em). Since e and f are in the same ℛ-class, they have the same nuclear equivalence. Thus the equalities qe = j = je imply qf = jf. Further, Im(f) = Im(em). Since qem = jm = q, we have q ∈ Im(f). Thus q is a fixed point of f, and jf = qf = q. Consider the function egf. Then

1·egf = 1·gf = 1·f = 1,   p·egf = igf = jf = q.

Let x be in φ⁻¹(egf). Then x ∈ X* and p·x = q. This holds in the literal automaton. Thus there exists y ∈ X* such that px = yq. □
5. EXAMPLES
The results of Section 4 show that the groups of finite maximal biprefix codes are particular ones. This of course holds only for finite codes, since every transitive group appears as the group of some group code. We describe, in this section, examples of finite maximal biprefix codes with particular groups. Call a permutation group G realizable if there exists a finite maximal biprefix code X such that G(X) = G. We start with an elementary property of permutation groups.
LEMMA 5.1 For any integer d ≥ 1, the group generated by α = (1 2 ⋯ d) and one transposition of adjacent elements modulo d is the whole symmetric group 𝔖_d.

Proof Let β = (1 d). Then, for j ∈ {1, 2, ..., d − 1},

α^{−j}βα^j = (j, j + 1). (5.1)

Next, for 1 ≤ i < j ≤ d,

(i, j) = τ(j − 1, j)τ⁻¹,

where

τ = (i, i+1)(i+1, i+2)⋯(j−2, j−1).

This shows that the group generated by α and β contains all transpositions. Thus it is the symmetric group 𝔖_d. Formula (5.1) shows that the same conclusion holds if β is replaced by any transposition of adjacent elements. □
PROPOSITION 5.2 For all d ≥ 1, the symmetric group 𝔖_d is realizable by a finite maximal biprefix code.

Proof Let A = {a,b}. For d = 1 or 2, the code X = A^d can be used. Assume d ≥ 3. By Theorems III.4.2 and III.4.3, there exists a unique maximal biprefix code X of degree d with kernel

K = {ba}.

No word has more than one K-interpretation, and (L_K, ba) = 2. Consequently K is insufficient and, by Proposition III.5.4, the code X is finite. Let us verify that

X ∩ a*ba* = ba ∪ {a^i ba^{d−i} | 1 ≤ i ≤ d − 2} ∪ a^{d−1}b. (5.2)

For each integer j ∈ {0, 1, ..., d − 1}, there is a unique integer i ∈ {0, 1, ..., d − 1} such that a^i ba^j ∈ X. It suffices to verify that the integer i is determined by formula (5.2). Let i, j ∈ {0, 1, ..., d − 1} be such that a^i ba^j ∈ X. By formula (1.5) in Chapter III, the number of X-interpretations of a^i ba^j is

(L_X, a^i ba^j) = 1 + |a^i ba^j| − (A*XA*, a^i ba^j) = i + j + 2 − (A*XA*, a^i ba^j).

The number (A*XA*, a^i ba^j) of occurrences of words of X in a^i ba^j is equal to 1 plus the number of occurrences of words of K in a^i ba^j, except when j = 1, which implies i = 0 since ba ∈ X. Thus

(L_X, a^i ba^j) = i + j  if j ∈ {2, ..., d − 1},   (L_X, a^i ba^j) = i + 1  if j = 0.

On the other hand, the word a^i ba^j must have d interpretations, since it is not in K = K(X). This proves formula (5.2). Now we consider the automaton 𝒜(X*) = (Q,1,1) and the ergodic representation *_a associated with the idempotent φ(a^d) defined in Section 4. Setting i = 1·a^{i−1} for i ∈ {1, 2, ..., d}, we have

a *_a a^d = (1 2 ⋯ d).
Set β = b *_a a^d and observe that β = (1 d). Indeed, by formula (4.4),

iβ = j  ⟺  1·a^{i−1}ba^d = 1·a^{j−1}a^d = 1·a^{j−1}.

Thus iβ = 1·a^{i−1}ba^d. For i = 1, this gives 1β = 1·ba·a^{d−1}, whence 1β = 1·a^{d−1} = d. Next, for i = d, we have dβ = 1·a^{d−1}ba^d = 1·a^d = 1, since a^{d−1}b ∈ X. Finally, if 1 < i < d, then

iβ = 1·a^{i−1}ba^{d−(i−1)}·a^{i−1} = 1·a^{i−1} = i.

This shows that the group G(X) contains the cycle

α = (1 2 ⋯ d)

and the transposition β = (1 d). In view of Lemma 5.1, G(X) = 𝔖_d. □

For the next result, we prove again an elementary property of permutations.
LEMMA 5.3 Let d be an odd integer. The group generated by the two permutations

α = (1 2 ⋯ d)   and   γ = δαδ,

where δ is a transposition of adjacent elements modulo d, is the whole alternating group 𝔄_d.

Proof The group 𝔄_d consists of all permutations σ ∈ 𝔖_d which are a product of an even number of transpositions. A cycle of length k is in 𝔄_d iff k is odd. Since d is odd, α, γ ∈ 𝔄_d. By Lemma 5.1, the symmetric group is generated by α and δ. Each permutation σ ∈ 𝔖_d can be written as

σ = α^{k₁}δα^{k₂}δ⋯α^{k_{n−1}}δα^{k_n},

and σ ∈ 𝔄_d iff n is odd. In this case, setting n = 2m + 1,

σ = α^{k₁}β₂α^{k₃}β₄⋯β_{2m}α^{k_{2m+1}}

with β_{2i} = δα^{k_{2i}}δ. Since β_{2i} = γ^{k_{2i}}, this formula shows that 𝔄_d is generated by α and δαδ = γ. □
PROPOSITION 5.4 For each odd integer d, the alternating group 𝔄_d is realizable by a finite maximal biprefix code.
Proof Let A = {a,b}. For d = 1 or 3, the code X = A^d can be used. Assume d ≥ 5. Let

I = {1, 2, ..., d},   J = {1, 2̄, 3̄, ..., d̄},

where 2̄, ..., d̄ are new states, and let Q = I ∪ J. Consider the deterministic automaton 𝒜 = (Q,1,1) with transitions given by

i·a = i + 1 (1 ≤ i ≤ d − 1),   d·a = 1,
ī·a = i + 1 (2 ≤ i ≤ d − 2),   (d−1)̄·a = 1,   d̄·a = d,

and

i·b = (i+1)̄ (1 ≤ i ≤ d − 3),   (d−2)·b = d̄,   (d−1)·b = (d−1)̄,   d·b = 1,
ī·b = (i+1)̄ (2 ≤ i ≤ d − 1),   d̄·b = 1.

It is easily verified that this automaton is minimal. Let X be the prefix code such that 𝒜 recognizes X*. Since

I·a = I,   J·a = I,   I·b = J,   J·b = J,

the functions φ(a) and φ(b), of rank d, have minimal rank. Since I and J are the only minimal images, and since both contain the state 1, Proposition 2.1(ii) shows that X is a maximal biprefix code. It has degree d. Let us show that X is finite. For this, consider the following order on Q:

1 < 2 < 2̄ < 3 < 3̄ < ⋯ < d − 1 < (d−1)̄ < d̄ < d.
For all c ∈ {a, b} and q ∈ Q, either q·c = 1 or q·c > q. Thus, there are only finitely many simple paths in 𝒜. Consequently, X is finite.

Now let us compute G(X). Since φ(a) and φ(b) have minimal rank, both a, b ∈ E(X). According to Proposition 4.1, the group G(X) is equivalent to the group generated by the four permutations

a *_a a,   a *_a b,   b *_a a,   b *_a b.

By formula (4.5), we have a *_a a = a *_a b = α, with α = (1, 2, ..., d). Next, by formula (4.4),

b *_a a = α,   b *_a b = γ

with γ = (1, 2, ..., d−3, d−1, d−2, d). In view of Lemma 5.3, G(X) = 𝔄_d. □

Observe that for an even d, the group 𝔄_d is not realizable. More generally, no subgroup of 𝔄_d is realizable when d is even. Indeed, by Proposition 4.3, the group G(X) of a finite maximal biprefix code X contains a cycle of length d, which is not in 𝔄_d since d is even.
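The parity obstruction invoked here — a d-cycle is an odd permutation exactly when d is even — can be checked directly (an illustrative sketch; the `sign` helper is hypothetical, permutations 0-indexed).

```python
def sign(p):
    """Sign of a permutation tuple: each cycle of length l contributes (-1)^(l-1)."""
    n = len(p)
    seen = [False] * n
    s = 1
    for i in range(n):
        if not seen[i]:
            l, j = 0, i
            while not seen[j]:
                seen[j] = True
                j = p[j]
                l += 1
            if l % 2 == 0:       # even-length cycle is an odd permutation
                s = -s
    return s

for d in range(2, 10):
    cycle = tuple((i + 1) % d for i in range(d))
    assert sign(cycle) == (-1) ** (d - 1)   # odd permutation iff d is even
```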
EXAMPLE 5.1 We give, for d = 5, the figures of the automaton and of the code of the previous proof, Figs. 5.4 and 5.5.
EXAMPLE 5.2 For degree 5, the only realizable groups are Z/5Z, 𝔖₅, and 𝔄₅. It is known indeed that, with the exception of these three groups, all transitive permutation groups of degree 5 are Frobenius groups. By Theorem 4.4, they are not realizable.
EXAMPLE 5.3 For degree 6, we already know, by the preceding propositions, that Z/6Z and 𝔖₆ are realizable. We also know that no subgroup of 𝔄₆ is realizable. There exists, in addition to these two groups, another primitive group which is realizable. This group is denoted by PGL₂(5) and is defined as follows. Let P = Z/5Z ∪ {∞}. The group PGL₂(5) is the group of all homographies from P into P,

p ↦ (xp + y)/(zp + t),

for x, y, z, t ∈ Z/5Z satisfying xt − yz ≠ 0. Consider, for later use, the permutations

α = (∞ 0 1 4 2 3),   β = (∞ 1 0 2 4 3).

We have α, β ∈ PGL₂(5). Indeed, α and β are the homographies

α: p ↦ 2/(p + 2),   β: p ↦ (p − 1)/(p + 2),

respectively. We verify now that α and β generate all of PGL₂(5). A straightforward computation gives

β²αβ = (∞ 0 4 2 1)(3),   β²αβα⁻¹ = (∞)(4)(0 1 3 2).

The permutation α, together with these two permutations, shows that the group G generated by α and β is triply transitive. Now each element σ of PGL₂(5) is characterized, as any homography, by its values on three points. Since G is 3-transitive, there exists an element g ∈ G which takes the same three values on the points considered. Thus σ = g, whence σ ∈ G. This proves that G = PGL₂(5).

To show that PGL₂(5) is realizable, we consider the automaton 𝒜 = (Q,1,1) given in Table 5.2. This automaton is minimal. Let X be the maximal prefix code such that 𝒜 = 𝒜(X*). Then X is a finite maximal biprefix code. Indeed, the images
Im(a) = {1, 2, 3, 4, 5, 6},   Im(b) = {1, 2̄, 3̄, 4̄, 5̄, 6̄}

are minimal images, both containing the state 1. By Proposition 2.1(ii), X is maximal biprefix with degree 6. The code X is finite because, if Q is ordered by

1 < 2 < 2̄ < 3 < 3̄ < 4̄ < 4 < 5 < 5̄ < 6 < 6̄,
TABLE 5.2 The Transitions of the Automaton 𝒜
a b
2 Z
3 a
4 j
4 6
6 3
1 1
3 3
5 4
4 3
1 6
6 1
then the vertices on simple paths from 1 to 1 are met in strictly increasing order, with the exception of the last one. Next, a, b ∈ E(X), because of the minimality of the images Im(a), Im(b). Thus the group G(X) is generated by the permutations a *_a a, a *_a b, b *_a a, b *_a b. By formula (4.5),

a *_a a = a *_a b = (1 2 3 4 5 6),

and formula (4.4) shows that

b *_a a = (1 2 3 4 5 6),   b *_a b = (1 3 2 5 4 6).
Thus G(X) is generated by (1 2 3 4 5 6) and (1 3 2 5 4 6). Let μ be the bijection from {1, 2, 3, 4, 5, 6} onto P = Z/5Z ∪ {∞} given by

1 ↦ ∞,  2 ↦ 0,  3 ↦ 1,  4 ↦ 4,  5 ↦ 2,  6 ↦ 3.

Then α = μ⁻¹(1 2 3 4 5 6)μ and β = μ⁻¹(1 3 2 5 4 6)μ. Consequently, the groups G(X) and PGL₂(5) are equivalent.
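The order of PGL₂(5) and the membership of α and β can be verified by enumerating all homographies (an illustrative computation, not part of the text; it uses z⁻¹ = z³ mod 5 by Fermat's little theorem, and `'inf'` stands for ∞).

```python
INF = 'inf'
POINTS = [0, 1, 2, 3, 4, INF]

def homography(x, y, z, t):
    """The map p -> (x p + y)/(z p + t) on Z/5Z u {inf}, as a tuple over POINTS."""
    def h(p):
        if p == INF:
            return INF if z == 0 else (x * pow(z, 3, 5)) % 5   # z^-1 = z^3 mod 5
        num, den = (x * p + y) % 5, (z * p + t) % 5
        return INF if den == 0 else (num * pow(den, 3, 5)) % 5
    return tuple(h(p) for p in POINTS)

pgl = {homography(x, y, z, t)
       for x in range(5) for y in range(5)
       for z in range(5) for t in range(5)
       if (x * t - y * z) % 5}

assert len(pgl) == 120                       # |PGL(2,5)| = 120
alpha = homography(0, 2, 1, 2)               # p -> 2/(p + 2)
assert alpha == (1, 4, 3, INF, 2, 0)         # the cycle (inf 0 1 4 2 3)
beta = homography(1, 4, 1, 2)                # p -> (p - 1)/(p + 2); -1 = 4 mod 5
assert beta == (2, 0, 4, INF, 3, 1)          # the cycle (inf 1 0 2 4 3)
```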
EXERCISES
SECTION 1

1.1. Let X ⊂ A⁺ be a finite code and let 𝒜 = (Q,1,1) be an unambiguous trim automaton recognizing X*. Show that the group of invertible elements of the monoid φ_𝒜(A*) is a cyclic group.

**1.2. Let G be a transitive permutation group over a finite set Q. Let 1 ∈ Q, and H = {g ∈ G | 1g = 1}. Let ψ: A* → G be a surjective morphism, and let Z be the group code defined by Z* = ψ⁻¹(H). Let X be a finite subset of Z. Let φ be the representation associated with the minimal automaton 𝒜(X*), and set M = φ(A*).
Show that M contains an idempotent e such that the group G_e is equivalent to G iff there exists a finite subset Y of A⁺ such that

(i) Z ∩ A*YA* is finite,
(ii) Z ∩ F(Y*) ⊂ X,
(iii) ψ(Y*) = G.

Use this to show that for every finite transitive permutation group G, there exists a finite biprefix code X such that G is equivalent to G_e for some idempotent e in the transition monoid of the minimal automaton of X*.
SECTION 2

2.1. Let X ⊂ A⁺ be a biprefix code and let 𝒜(X*) = (Q,1,1) be the minimal automaton of X*. Let φ = φ_𝒜 be the associated representation and M = φ(A*). Show that for any idempotent e ∈ φ(F(X)), the monoid eMe is composed of injective partial functions.
SECTION 3

3.1. Let X ⊂ A⁺ be a thin maximal biprefix code, and let J_D be the minimal ideal of φ_D(A*). Show that for Card(A) ≥ 2,

H̄(X) = φ_D⁻¹(J_D).

3.2. Let X ⊂ A⁺ be a finite maximal prefix code, let 𝒜(X*) = (Q,1,1) be the minimal automaton of X*, and set φ = φ_{𝒜(X*)}. Let a ∈ A and let n be the order of a in X (a^n ∈ X). (a) Show that the idempotent in φ(a⁺) has rank n. (b) Show, without using Theorem 3.3, that φ(A⁺) is not nil-simple when n ≥ 1 + d(X).

3.3. Let X ⊂ A⁺ be a thin complete code. Then X is called elementary if there exists an unambiguous trim automaton 𝒜 = (Q,1,1) recognizing X* such that the semigroup S = φ_𝒜(A⁺) has depth 1. Show that if X is elementary, then

X = Y ∘ Z,

where Y is an elementary biprefix code and G(X) = G(Y).

3.4. Let 𝒜 = (Q,1,1) be a complete, deterministic trim automaton and let φ be the associated representation. Suppose that φ(A⁺) has finite depth, and that φ(A⁺) has minimal rank 1. Show that the depth of φ(A⁺) is at most Card(Q) − 1. (Hint: Consider the sequence θₙ of equivalence relations over Q defined by

p ≡ q mod θₙ  iff  for all w ∈ Aⁿ, p·w = q·w.)
3.5. Let X ⊂ A⁺ be a finite biprefix code. Let φ be the representation associated with 𝒜(X*) and M = φ(A*). Let J be the minimal ideal of M and let Λ be the set of its ℒ-classes. Let L₀ be a distinguished ℒ-class in Λ. Define a deterministic automaton

ℬ = (Λ, L₀, L₀)

by setting L·w = Lφ(w). Let ψ be the representation associated with ℬ, and let I be the minimal ideal of ψ(A*). (a) Show that ψ(A*) has minimal rank 1, and that ψ⁻¹(I) = φ⁻¹(J). (b) Use Exercise 3.4 to show that φ(A⁺) has depth at most Card(Λ) − 1.

SECTION 4
4.1. Let X be a finite maximal biprefix code of degree d. Let a ∈ A and k ≥ 0 be such that a^k ∈ E(X). Show that for each integer n ≤ d − k and each word u ∈ Aⁿ, there exist at least d − k − n integers i, 1 ≤ i ≤ d, such that

i(u *_a a^k) ≥ i − k + 1.

4.2. Derive directly Theorem 1.1 from Exercise 4.1 (take k = 0).

4.3. Derive the inequalities (4.7) from Exercise 4.1 (take k = 1, u = b).

**4.4. Let G be a permutation group of degree d, let k = k(G), and suppose that G contains the cycle α = (1 2 ⋯ d). Show that if d ≥ 4k² + 8k + 2, then for every λ ∈ G which is not a power of α, there exist at least 2k + 2 integers i, 1 ≤ i ≤ d, such that iλ ≤ i.

**4.5. Derive from Exercises 4.1 and 4.4 that a finite maximal nonuniform biprefix code X of degree d satisfies

k(G(X)) ≥ (√d / 2) − 1.
SECTION 5

**5.1. Show that for A = {a,b}, there are exactly six elementary finite maximal biprefix codes over A (see Exercise 3.3) with group equivalent to PGL₂(5).

*5.2. Show that the automaton in Table 5.3 defines a finite maximal biprefix code X of degree 7. Show that G(X) is equivalent to the group GL₃(2) of invertible 3 × 3 matrices with elements in Z/2Z, considered as a permutation group acting on (Z/2Z)³ − {0}.

5.3. Show that the automaton in Table 5.4 defines a finite maximal biprefix code X of degree 11. Show that G(X) is equivalent to the Mathieu group M₁₁.
TABLE 5.3 A Finite Code with Group GL₃(2)
a b
1
2
3
4
5
6
7
8
2 8
3 9
4 12
5 11
6
7
1
9
10
1
13
9
9 1 0 1 1 1 2 1 3 1 4 1 5 4 1 4 1 5 1 3 10 11 12 13
1 1
6 12
7 13
TABLE 5.4 A Finite Code with Group M₁₁

a b
1
2
3
4
5
6
7
8
2 12
3 13
4 16
5
6 14
7
17
15
8 20
9 1 0 1 1 19 18 1
9
1 0 1 1
12
13
14
1 2 2 2 3 2 4 21 13 14 15
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
25 16
26 17
27 18
28 19
29 20
21 21
1 1
4 14
5
15
6 16
7 17
8 18
9 19
10 20
11 21
NOTES
Theorem 1.1 already appears in Schutzenberger (1956). Theorem 1.3 and Corollary 1.4 are from Reis and Thierrin (1979). The ergodic representation of Section 4 is described in Perrin (1979). It is used in Lallement and Perrin (1981) to describe a construction of finite maximal biprefix codes. Theorem 4.5 is a combination of a theorem of Schur and of a theorem of Burnside. Schur’s theorem is the following “Let G be a primitive permutation group of degree d. If G contains a d-cycleand if d is not a prime number, then G is doubly transitive.” This result is proved in Wielandt (1964, pp. 52-66). It is the final development of what H. Wielandt calls the “Method of Schur.” Burnside’s theorem is the following: “A transitive permutation group of prime degree is either doubly transitive or a Frobenius group.” Burnside’s proof uses the theory of characters. It is reproduced in Huppert (1967, p. 609). An elementary proof (i.e., without characters) is in Huppert and Blackburn (1982, Vol. 111, pp. 425-434).
NOTES
289
The other results of this chapter are from Perrin (1975,1977b,1978).Perrin (1975)gives a more exhaustivecatalogue of exampleg than the list of Section 5. Exercise 1.2is from Rindone (1983).Exercise 2.1 is due to Margolis (1982). Exercise 3.4 is a well-known property of “definite” automata (Perles et al., 1969). The exercises of Section 4 are from Perrin (1978)and those of Section 5 are from Perrin (1975).
CHAPTER
VI
Densities
0. INTRODUCTION
We have used measures for studying codes as early as in the first chapter. In particular we have seen that for any code X over an alphabet A and for any Bernoulli distribution A over A*, we have n ( X ) 5 1. We also have seen that a thin code X is a maximal code iff A ( X )= 1. We thus have obtained an astonishingly simple method for testing the maximality of a code. This is a mark of the importance of the role played by probability measures in the theory of codes. In this chapter we again use measures. We will study asymptotic properties of certain sets related to a code. For this the notion of density of a subset L of A* is introduced. It is the limit in mean, provided it exists, of the probability that a word of length n is in L. We shall prove a fundamental formula (Theorem 3.1) that relates the density of the submonoid generated by a thin complete code to that of its sets of left factors and right factors. For this, and as a preparation, we shall see how one can compute the density of a set of words by transfering the problem to a study of probabilities in abstract monoids. 1. DENSITIES
Let A be a positive Bernoulli distribution on A*, It is a morphism A: A* +lo, I] 290
I . DENSITIES
29'
such that
c n(a)
= 1.
as.4
In the sequel, we use the notation A'") = (1) u A v
. * *
v A"-'.
In particular A'') = fa, A") = {I}. Let L be a subset of A*. The set L is said to have a density with respect to n if the sequence of the a(L n A") converges in mean, i.e., if 1"-1
lim n-tm
C n(L n Ak)
nk=O
exists:If this is the case, the density of L (relative to n) denoted by 6(L),is this limit, which can also be written as
6(L)= lim (l/n) a(L n A'")). n+cn
An elementaryresult from analysis shows that if the sequence n(L n A") has a limit, then its limit in mean also exists, and both are equal. This remark may sometimes simplify computations. Observe that 6(A*)= 1 and 0 I6 ( L )I1 for any subset L of A* having a density. If L and M are subsets of A* having a density, then so has L v M, and 6(L v M) I 6 ( L ) + 6(M). If L n M # fa, and if two of the three sets L, M and L v M have a density, then the third one also has a density and
+
6(L v M ) = 6(L) 6 ( M ) .
The function 6 is a partial function from ?(A*) into [0,1]. Of course, 6({w}) = 0 for all w E A*. This shows that in general
Observe that if a(L)< 00, then 6(L) = 0 since n(L n A'"))I x(L),
VI. DENSITIES
292
whence 1
-x(L n A'"))+ 0. n
EXAMPLE 1.1 Let L = (Az)* be the set of words of even length. Then n (n ~ A(Zk))= x(L n A(zk-l)) = k.
Thus 6(L)= 4.
1.2 Let D* = ( w E A* I Iwl. = (wIb} over A = {a,b}, and set EXAMPLE p = n(a), q = I@). Then (see Example 1.4.5) x(D* n
= 0.
Using Stirling's formula, we get 1
x(D* n Azn) N
showing that for all values of p and q, lim n ( ~ n * A'") = 0. n-r w
Thus 6(D*) = 0. The definition of density clearly depends only on the values of the numbers x(L n A"). It appears to be useful to consider an analogous definition for formal power series: we restrict ourselves to formal power series in one variable with coefficients in R,. Let
be a formal series. The density of f, denoted by S ( f ) is the limit in mean, provided it exists, of the sequence f,, In-1
S ( f ) = lim- C n-m n i = o
fi.
To each set L c A* we associate a series fL E R+ [ [ t ] ]by m
sL= C x(L n A")t" n=O
Clearly fL has a density iff L has a density, and
W )= S(fd.
293
I . DENSITIES
Let fL be given by (1.1). We denote by p L the radius of Convergence of the series fL. It is the unique real number p E R, such that
converges for IzI c p and diverges for 1zI > p. For any set L, we have p L 2 1 since n(L n A") I 1 for all n 2 0. The following proposition is a more precise formulation of Proposition 1.5.6. PROPOSITION 1.1 Let L be a subset of A* and let distribution. If L is thin, then p L > 1 and 6 ( L ) = 0.
R
be a positive Bernoulli
Proof Let w be a word in F(L),and set n = 1 wI. Then we have, for 0 Ii I n and k 2 0, L n Ai(A")'
t
- {w})'.
Hence
n(L n Ai(A")k)I(1 - R(W))". Thus for any p E R, satisfying (1 - n(w))p"< 1, we have n-1
m
i=O k=O
1
This proves that
For later use, we need an elementary result concerning the convergence of certain series. For the sake of completeness we include the proof. PROPOSITION 1.2 Let f = C n L O f n t ng , = Cnrogat"E R + [ [ t ] ] be two power series satisfying (i) 0 < 0 = C , z o g n < a; (ii) for all n 2 0,O 5 f, I 1. Then 6( f)exists zff 6(fg) exists and in this case,
S(fd
= 6(f)a.
VI. DENSITIES
294
Proof
Set m
h =f g =
C h,t'
n=O
Then for n 2 1,
where ri = c;=,gj.Let s, = C ',:;
Then for n 2 1,
Furthermore
c
n-1
s, =
c rn-I =
n-1
5
i=o
I=O
Since C g , converges, we have limi,
ri. i=1
ri = 0, showing that also
In-1
lim - C ri = 0, n i=o
n-m
and in view of (1 ,4), 1
lim -s, n-m n
= 0.
Since # 0 Eq. (1.3) shows that S( f ) exists iff S(h)exists and that 6(f )a = 6(h). This proves (1.2) and the proposition. 0
PROPOSITION 1.3 Let ~tbe a positive Bernoulli distribution on A*. Let L, M be subsets of A* such that
-=
(i) 0 < n(M) 00. (ii) The product LM is unambiguous. Then LM has a density iff L has a density, and if this is the case, Proof
S(LM) = S(L)x(M). Since the product LM is unambiguous, we have fLM
=fLfM
*
(1.5)
295
1. DENSITIES
In view of the preceding proposition
W M )= W L M ) =
w&,
M A") = n(M). 0 where o = C n ; r o ~ ( n This proposition will be useful in the sequel. As a first illustration of its use, note COROLLARY 1.4 Each right (left) ideal I of A* has a nonnull density. Each prejx-closed set has a density. More precisely where X = I - IA'. &I) = a(X),
Proof Let I be a right ideal and let X = I - IA'. By Proposition 11.1.2, the set X is prefix and I = XA*. The product XA* is unambiguous because X is prefix. Further n ( X ) I1 since X is a code, and n ( X )> 0 since I # fa and consequently also X # 0.Thus, applying the (symmetrical version of the) preceding proposition, we obtain 6(Z) = 6(XA*) = x(X)G(A*)= n ( X ) # 0.
Finally, the complement of a prefix-closed set is a right ideal. 0 Let X be a code over A. Then n(X)5 1 and n ( X ) = 1 if X is thin and complete. For a code X such that n(X) = 1 we define the aoerage length of X (relatively to n) as the finite or infinite number
n(x)= C Ixlz(x) = nCL O na(X n A"). xox
The followingfundamental theorem gives a link between the density and the average length.
THEOREM1.5 Let X c 'A be a code and let II be a positioe Bernoulli distribution. I f (i) n ( X ) = 1, (ii) A(X) < 00, then X* has a density and d(X*)= l/A(X). Proof Set f, = n(X n A"). Then f x = C,"= f,t". Since X is a code, we have as a consequence of Proposition 1.1.6, fx* = (1 - f x ) - l .
Let g =
c;=,grit" be defined by
VI. DENSITIES
296
Identifying terms, we have for n 2 0,
From n(X) = C,"=,h= 1 we get that
Note that by (1.9), gn 2 0 for n 2 0. Moreover, by (1.6) and(1.9), (1.10)
We see that (1.8) and (1.7) give t* = f x x g .
(1.11)
Since A(X) is finite and not zero (indeed A(X)= 1 shows that X # fa), we can apply Proposition 1.2 to (1.11). Since d(t*) = 1, S(fx*) exists and
Note the following important special case of Theorem 1.5. THEOREM 1.6 Let X be a thin complete code over A, and let n be a positive Bernoulli distribution. Then X* has a density. Further S(X*) > 0, A(X) < CQ, and d(X*) = l/A(X).
Proof Since X is a thin and complete code, n(X) = 1. Next, since X is thin, > 1 by Proposition 1.1. Thus the derivative of f x , which is the series
px
also has a radius of convergence strictly greater than 1. Hence f;(l) is finite. Now fk(1) =
nn(X n A") = A(X). n21
Thus A(X) < m and the hypotheses of Theorem 1.5 are satisfied. 0
2. PROBABILITIES OVER A MONOID
A detailed study of the density of a code, in relation to some of the fundamental parameters, will be presented in the next section. The aim of the present section is to prepare this investigation by the proof of some rather
2. PROBABILITIES OVER A MONOID
297
delicate results. We will show how certain monoids can be equipped with idempotent measures. This in turn allows us to determine the sets having a density, and to compute it. We need the following lemma which constitutes a generalization of Proposition 1.2.
LEMMA2.1 Let I be a set, and for each i E I , let
be formal power series satisfying
(i) 0 = C i o I C n r o d ' < (ii) 0 I fp) I 1 for all i E I , n 2 0, (iii) S(f(')) exists for all i E 1. Set di)=
Proof
c,"=,g!,"for i
E I.
Then
Xiel f(i)g(i)admits a density and
Set
Set also
We have, for n 2 1 and i E I,
Dividing by n and summing over I, one obtains
Set d = C d(f(")di). iol
Since S(f(')) I 1 for all i E I,we have d ICiE,di)= 0.This shows that d < co.
VI. DENSITIES
298
Let now E > 0 be a real number. Then there exists a finite subset F of I such that a2
c a"' 2 a
- E.
isF
This is equivalent to
7 di'
I E.
i s -F
Observing that S(f(") I 1 for i E I and (l/n)
c;:
fli)
I1, we get that both
This leads to the evaluation
I
1n - 1 d--chk nk=o
I
Let us estimate the s,. Since f I i ) I1, we have
Now
Thus s,/n I ( ~ i , F ( l / n ) ~ ; = rli)) + E. Next for each i E I, we have liml+m rii) = 0, whence also limn+m(l/n)c;=, ri') = 0. Consequently, by choosing a sufficiently large integer n, we have for each index i in the finite set F the
299
2. PROBABILITIES OVER A MONOID
inequality 1
2 m. E
- rji) I nl=1
Thus for sufficiently large n, sJn I 2 ~ Substituting . in (2.2), we get for sufficiently large n,
Now, if n is large enough, we also have
and this holds for each index i in F. Thus
This proves that d = l i m n + m ( l / n hk. ) ~ ~I]~ ~ Lemma 2.1 leads to the following proposition which extends Proposition 1.3. PROPOSITION 2.2 Let I be a set and for each i E I , let Li and Mi be subsets of A*. Let IC be a Bernoulli distribution on A* and suppose that (i) C i e I n(Mi) < 00, (ii) The products LiMi are unambiguous and the sets LiMi are pairwise disjoint, (iii) Each Li has a density S(Li). Then
uielLiMi has a density, and
Proof
Set in Lemma 2.1,
flf)
gf) = n(Mi n A").
= n(Li n A"),
Then f(')= fL,, g(" = fMi;furthermore S(f'") = 6(Li),a'" = a(Mi), and in particular cr = n(Mi) < co. According to the lemma, we have
c
6
c f(i)g'i))
(is,
=
1S(Li)a(Mi).
is1
VI. DENSITIES
300
Since.condition (ii) of the statement implies that
the proposition follows. 0 Let be a morphism from A* onto a monoid M, and let K be a positive Bernoulli distribution on A*. Provided M possesses certain properties which will be described below, each subset of A* of the form cp- ' ( P ) ,where P c M, has a density. The study of this phenomenon will lead us to give an explicit expression of the value of the densities of the sets cp-'(m) for m E M, as a function of parameters related to M. A monoid M is called well founded if it has a unique minimal ideal, if moreover this ideal is the union of the minimal left ideals of M, and also of the minimal right ideals, and if the intersection of a minimal right ideal and of a minimal left ideal is a finite group. Any unambiguous monoid of relations of finite minimal rank is well founded by Proposition IV.4.10 and Theorem IV.4.11. It appears that the development given now does not depend on the fact that the elements of the monoid under concern are relations; therefore we present it in the more abstract frame of well-founded monoids. Let cp: A* 3 M be a morphism onto an arbitrary monoid, and let m, n E M . We define C,,,,,= { w E A*Imcp(w) = n} = cp-'(m-'n). The set C,,,,,is a right-unitary submonoid of A*: for u, uv E C,,,, we have ncp(u) = n = ncp(uv) = ncp(u)cp(v)= ncp(u).Thus C,,nis free. Let X,, be its base; it is a prefix code. Let be the initial part of C,,,,,.Then Cm,n
= Zm,nX,*
and this product is unambiguous. Note also that C1.n = c~-'(n)
PROPOSITION 2.3 Let cp: A* -,M be a morphism onto a well-founded monoid M , and let R be a positive Bernoulli distribution on A*. Let K be the minimal ideal of M . 1. For all m,n E M , the set C,,,,,= cp-'(m-'n) has a density. 2. We have
a(Z,,,,)S(X:)
if n E K and m-'n # @, otherwise.
301
2. PROBABILITIES OVER A MONOID
3. For m, n E K such that nM = mM, n(Z,,,) = 1 and consequently d(Cm,n)= S(Cn,n) = a(X,*)*
Proof Let n E M, with n # K. Then m-’n n K = 0 . Indeed, assume that p E m- ‘n n K.Then mp = n and since K is an ideal, p E K implies n E K. Thus for an element n $ K, the set C,,” does not meet the ideal cp-’(K). Consequently C,,n is thin, and by Proposition 1.1, d(C,,,) = 0. Consider now the case where n E K. Let R = nM be the minimal right ideal containing n. Consider the deterministicautomaton over A, d = (R, n, n) with transition function defined by r a = rcp(a) for r E R, a E A. We have Id1 = X,*. Since R is a minimal right ideal, the automaton is complete and trim, and every state is recurrent. In particular, X, is a complete code (Proposition 11.3.9). Let us verify that the monoid cp,(A*) has finite minimal rank. For this, let u E A* be a word such that q(u) = n. Since d is deterministic, it suffices to compute rank,(u). Now rank(cp,(u)) = rank,(u) = Card(R.u) = Card(Rn) = Card(nMn). By assumption, nMn = nM n Mn is a finite group. Thus rank(q,(u)) is finite and the monoid qd(A*) has finite minimal rank. By Corollary IV.5.4,the code X,is complete and thin and according to Theorem 1.6, X,*has a positive density. Since Z,,,, is a prefix set, we have n(Zm,,) I1. In view of Proposition 1.3, the set C,,, has a density and
-
6(Cm,n)= 4zm,n)d(X,*)*
Clearly C,,,=@
o m-’n=$3
e
Z,,,=$3;
moreover, n being positive, A(Z,,,) > 0 iff Z,,, # $3; this shows that d(C,,,,) # 0 if m-’n # $3. This proves (2) and (1). 3. Let u E A* be a word such that ncp(u) = m and ncp(u’) # n for each proper nonempty left factor u’ of u. Then G n , n
c
Xn.
Thus Z,,” is formed of right factors of words in X,,and in particular Z,,, is thin. To show that Z,, is right complete, let w E A* and let n’ = rncp(w).Then n’ E R = nM,and since R is right-minimal, there exists n” E M such that n’n” = n. Let U E A* be such that cp(u) = n”. Then mcp(wu) = n, and consequently wu E C,,,”. This shows that Z,,, = {I} or Z,,,, is a thin right complete prefix code, thus a maximal code. Thus n(Z,,J = 1. Consequently d(Cm,n)= d(X,*). 0 2.4 Let cp: A* + M be a morphism onto a well-founded monoid, THJ~OREM and let II be a positive Bernoulli distribution on A*. Let K be the minimal ideal of M.
VI. DENSITIES
302
1. For all n E M , the set q - ' ( n ) has a density. 2. W e haoe S(q-'(n)) = 0
sft"
n$K;
for all 9-equivalent m, n E K ,
'
6 ( q - ' ( n ) ) = S ( q - '(m - n))S(
3. For all n E K ,
Note that according to Corollary 1.4, every one-sided ideal has a positive density; thus G ( q - ' ( n M ) ) and d ( q - ' ( M n ) ) exist. Proof 1. Since q - ' ( n ) = Cl,nthe claim results from Proposition 2.3. 2. By Proposition 2.3, 6(Cln)# 0 iff n E K, since CI," is never empty. Let us prove the second formula. For this, let Y = cp-'(K) - cp-'(K)A+ be the initial part of the ideal q - ' ( K ) . We have q - ' ( K ) = YA*. Since the set A* - q - ' ( K ) is thin, we have 6 ( q - ' ( K ) ) = 1. Consequently n ( Y ) = 1. For each W-class R of K , consider YR= Y n q-'(R). Then q- '(R) = YRA*, and hence, 6 ( q - ' ( R ) ) = n(YR). Let now R = n M . Then (2.3) CP- ' ( n ) = (YR n 9- '(r))Cr,n*
u
IPR
Indeed, each word w E qO-'(n) factorizes uniquely into w = UD, where u is the shortest left factor of w such that q(u) E R. Then u E YR n q- '(r) for some r E R, and u E Cr,n.The converse inclusion is clear. The union in (2.3)is disjoint, and the products are unambiguous because the sets YR n q - I ( r ) are prefix. Each Cr,nhas a density, and moreover
C n(YRn q - I ( r ) ) = n(YJ I; I.
rsR
Thus we can apply Proposition 2.2, which yields d(q-'(n)) =
C n(YR n q-'(r))NCrJ
roR
According to Proposition 2.3, all the 6(C,,J for r E R are equal. Thus, for any m E R,
6((P- '(n)) = 6(Cn1,n)n(~R) = 6(q-l(m= 6(q-'(m-'n))S(q-'(R)).
'n))X(yR)
1. PROBABILITIES
303
OVER A MONOID
3. Set R = nM, L = Mn, and H = R n L. Then we claim that L=
u m-'nnK
meH
and furthermore that the union is disjoint. First consider an element k E m-'n n K for some m E H. Then mk = n. Thus n E Mk, and since n is in the minimal ideal, M n = Mk. Therefore, k E Mn = L. This proves the first inclusion. For the converse, let k E L = Mn.The right multiplication by k,
mwmk is a bijection which exchanges the 2'-classes in K and preserves W-classes (Proposition 0.5.2). In particular, this function maps the 9-class L onto Lk = L,thus onto itself. It follows that there exists m E L such that mk = n. This element is W-equivalent with n. Consequently m E H and k E m-'n for some rn E H. Since the function m w m k is a bijection, the sets m-'n are pairwise disjoint. This proves the formula. For all m, n E K ,
d(rp-'(m-'n
n K)) = d((p-'(m-'n))
since the set cp-'(m- 'n n ( M - K)) is thin and therefore has density 0. The set H being finite, we have
Using the expression for d((p-'(n)) proved above, we obtain
This proves the last claim of the theorem. 0 Let E be a denumerable set. A probability measure over E is a function P: W E ) + LO, 11
that satisfies the following two conditions: for any subset F of E,
and p ( E ) = 1.
The support of a probability measure p is the set S = {eE Elp(e) # O}.
304
VI. DENSITIES
A probability measure over E is completely determined by a function p: E+[O,l]
such that
Indeed, in this case, any family ( p ( e ) ) e e F , where F is a subset of E, is summable, and p is extended to '$I by @ (2.4). ) Note once more that the density 6 defined by a Benoulli distribution on A* is not a probability measure over A* since it does not satisfy (2.4). We consider, for probability measures over E, the following notion of convergence: if (P,,),,~~and p are probability measures over E, then we set p = lirn pn n-r m
iff for all elements e E E, we have p(e) = lim p,,(e). n-rm
The following proposition is useful.
PROPOSITION 2.5. Let
(pn)n20and p be probability measures over E, such
that p = limn+ p,,. Then for all subsets F of E, p ( F ) = lirn p,,(F). n-r m
Proof
The conclusion clearly holds when F is finite. In the general case, set 6
= lirn inf p,,(F),
z = lirn sup pn(F),
and let F = E - F. Of course, o I z and 1 - z = lim inf pn(F).
Let F' be a finite subset of F. Then pn(F')I; p,,(F) for all n, and taking the inferior limit, p(F') I 6.It follows that p ( F ) = sup p(F') I
6.
F'cF F'finite
Similarly, p ( F ) I 1 - z. Since p ( F ) + p ( F ) = p ( E ) = 1, we obtain 1 I 6 + (1 - z), whence 6 2 z. Thus 6 = z. Since p ( F ) I 6 and p ( F ) I; 1 - 6,one has both p ( F ) I; CT and p ( F ) 2 6,showing that p ( F ) = 6. 0 THEOREM2.6 Let cp: A* + M be a morphism onto a well-founded monoid, and let II be a positive Bernoulli distribution on A*. For any subset F of M , the set
305
2. PROBABILITIES OVER A MONOID
cp-'(F) c A* has a density and S(cp -
'(m= c S(cp - '(N. meF
Proof It is convenient to introduce the shorthand v(F) = d(cp-'(F)).Let K be the minimal ideal of M, let r be the set of its W-classes and A the set of its 2'-classes. We have verified, during the proof of Theorem 2.4., that V(K)= ?C( Y) = 1,
V(R)= K( YR) where Y (resp., YR)is the initial part of cp - ' ( K ) ,(resp., of cp- '(R),with R E r). Since K is the disjoint union of its @-classes,we have
hence v ( K )=
c
1
v(R) = v ( L )= 1, Rer LEA where the intermediate assertion follows by symmetry. Now consider a fixed 9-class R E r. Then by Theorem 2.4,
= v(R)
c v(L)
= v(R)
LEA
and also
Since v(n) = 0 for n $ K , it follows that
c d((p-'(n))
= 1.
neM
Consider the family 9of subsets of A*, The preceding formula gives
c S(T)
= 1.
T€.%
This shows that there exists a probability measure 8 over 9, defined for a subset X c 9by B(X)=
c S(T).
TeX
VI. DENSITIES
For any positive integer n, define pn:9 + [0,1] by 1
p,,( T ) = - n( T n A'"'),
n
where A(")= { w E A* I Iw) < n}. The morphism cp being surjective, the sets T E 9forma partition of A*. Thus
showing that p,, (n 2 1) is a probability measure over 9.Moreover, we have &TI =
TI = lim ~ J T ) n-tm
by the definition of a density. Thus s^ = pn. Proposition 2.5 shows that for any subset X of 9, &x)= lim p , , ( . ~ ) . n- m
Now X = {cp-'(m) 1 m E P} for some subset P of M . Thus
but also
Consequently p , ( X ) = (l/n)n(cp-' ( P ) n A(")),and thus
The left side of the formula is d(cp-'(P)) and the right side is $(A!). Thus 6(cp-'(P)) exists and has the announced value. 0
PROPOSITION 2.7 Let cp: A* -+ M be a morphism onto a well-founded monoid and let 7c be a positive Bernoulli distribution on A*. The function v: M + [0,1] defined by v(m) = 6(cp-'(m)) is a probability measure ouer M. Let K be the minimal ideal of M. Then the following formulas hold: v(m) # 0
iff
v(m) = v(n-'m)v(mM)
if m,n E K and n g m ,
v(mM)v(Mm) v(m) = Card(mM n M m )
if
v(M') = v(M' n K )
mEK,
mEK,
for M' c M .
(2.6)
30.7
2. PROBABILITIES OVER A MONOID
For each &'-class H c K, and h E H,
v(h) = v ( H ) Card(H)' Proof All these formulas with the exception of (2.7), are immediate consequences of the relations given in Theorem 2.4. For (2.7)observe that the value of v is the same for all h E H by formula (2.6). Next v ( H ) = z h s H v ( h ) . This proves (2.7). 0
EXAMPLE 2.1 Let cp: A* + G be a morphism onto a finite group. Let R be a positive Bernoulli distribution. For g E G,
in view of formula (2.7) and observing that H = K = G. This gives another method for computing the density in Example 1.1. To that example corresponds a morphism cp: A* + Z/2Z onto the additive group 2/22 with q(a) = 1 for any letter a in A. EXAMPLE 2.2 Let cp: A* --t M be the morphism from A* onto the unambiguous monoid of relations M over Q = {1,2,3} defined by a = cp(a), /? = cp(b),with
This monoid has already been considered in Example IV.5.3.Its minimal ideal J is composed of elements of rank 1 and is represented in Fig. 6.1. Let x be the Bernoulli distribution defined by x(a) = p, n(b)= q, with p + q = 1, p , q > 0. Let us compute the probability measure v = Jcp-' over M. With the notations 00 1
110
L1
L2
Fig.6.1 The minimal ideal of the monoid M.
308
VI. DENSITIES
of Fig. 6.1, we have the equalities L,u = L,,
L,/3 = L,,
L,p= L , . Set X l = cp-'(L,), X , = cp-'(L,). By (2.9), L,a = L,,
X,a-' n cp-'(J) = 0,
X,b-' n cp-'(J) = X , ,
X,a-'
X,b-' n cp-'(J) = X , .
(2.10) r\
cp-'(J) = X , u X,,
Indeed consider, for instance, the last equation: if w E X , , then cp(w) E L,, hence cp(wb)E L2 by the fact that Ll/3 = L , . Thus w b E X , , and w E X,b-' n cp-'(J). Conversely, let w E X,b-' n cp-'(J). Since w E c p - l ( J ) , w E X l u X,. But if w E X,, then cp(wb)E ' L ~showing , that wb E X , , whence w # X,b-'. Thus w E X I . In view of (2.10), X,a-' = T l ,
Xlb-' = X2 u T i ,
X ~ U - '= XI u X, u T,,
X,b-' = Xl u T i ,
where T,, T;T,, T ; are disjoint from cp- ' ( J ) .Multiplication by a and b on the right gives Xi = Xzb u (Ti0 u T i b ) ,
X, = X i a u X ~ u U X i b u (TZUu Tib). Since T, is thin, d(T,a) = G(Tl)n(a)= 0, and similarly for the other T's. Thus
W,)= Wl)+ V , ) P
W , )= W , ) 4 , which together with 6(X,)
+ ax,) = 1
gives V l ) = -3
4
1 S(X2) = 1+q'
Thus 4 v(L,) = 1+q'
1
v(L,) = 1+q'
An analogous computation yields
P v(R,) = 1+p'
1
V(R2)= 1+p'
3.
309
CONTEXTS
In particular, since R2 n L, = {pa}, we obtain 1 v(pa) = (1 + p)(l + 4)3. CONTEXTS
Let X c A + be a thin completecode. We have seen that the degree d ( X )of X is the integer which is the minimal rank of the monoid of relations associated with any unambiguous trim automaton recognizing X*. It is also the degree of the permutation group G(X), and it is also the minimum of the number of disjoint interpretations in X (see Section IV.6). In this section,we shall see that d ( X )is related in a quite remarkable manner to the density 6(X*).The set Gx of left completable words is G, = { w E A* IA*w n X* # @} = (A*)-'X* = A - X * ; symmetrically, the set D, of right-completable words is D, = ( w ~ A * l w A *n X * # @} = X*(A*)-' = X * A - . THEOREM 3.1 Let X t A* be a thin complete code, and let x be a positioe Bernoulli distribution on A*. Then 1 J(X*)= d(X)W,)6(D,). (3.1) Proof Let d = (Q,1, I) be an unambiguous trim automaton recognizing X*, let cp be the associated morphism and M = cp(A*). In view of Corollary IV.5.4., the monoid M is well founded. Set v = 6cp-'. By Proposition 2.7, v is a probability measure over M, and the values of v may be computed by the formulas of this proposition. Let K be the minimal ideal of M. Since v vanishes outside of K, we have 6(X*) = v(cp(X*)n K ) . Let R be the union of the W-classes in K meeting cp(X*), and similarly let 2 be the union of those 9'-classes in K that meet cp(X*).Then v(cp(x*) n K ) = v ( c p ( ~ *n) R n 2 ) = CV(cp(X*)n H), H
where the sum is over all &'-classes H contained in k? n 2. For such an &'class H,we have
3'0
Vl. DENSITIES
where R and L are the 9-class and 9-class containing H. Thus v ( q ( X * )n H) =
Now observe that for any &-c'lass
Card(q(X*) n H) v(R)v(L)* Card(H)
H
c I? n
i,
Card(q(X*) n H) =- 1 Card(H) d ( X )* Thus the formula becomes
1 1 6(X*) = C -v(R)v(L)= -v(R)v(L). H d(X) d(X) A
-
Now q - @ ) = D, n ~ - I ( K ) .
(3.2) Indeed, let w E Dx n q - ' ( K ) . Then wu E X* for some word u. Consequently, ~ ( w u= ) q(w)cp(u)E q(X*) n K,showing that the W-class of p(w), which is the same as the 9-class of ~ ( w u )meets , cp(X*). Thus q ( w ) E fi. Conversely, let w E q - l ( I ? ) . Then q ( w ) E fi and there is some rn E M such that q(w)m E q(X*) n K.Thus wqO-'(m) n X* # 521 and we derive that w E D,. It follows from (3.2) that
v(B) = 6(q-I(I?))= 6 ( ~n, c p - ' ( ~ ) ) . Since A* - q - ' ( K ) is thin, we have 6(Dx)= 6(D, n q - ' ( K ) ) . Thus a(&) = v(I?)and similarly v ( z ) = 6(G,). This concludes the proof. 0 The following corollary is a consequence of Theorem 1.6. COROLLARY 3.2 Let X c A + be a thin complete code, and let A be a positive Bernoulli distribution on A*. Then
We observe that for a thin maximal biprefix code X c A', we have G, = D, = A*. Thus in this case, (3.3) becomes A(X) = d(X). This gives another proof of Corollary 111.3.9.
EXAMPLE3.1 Let A = {a,b ) and consider our old friend X = {aa, ba, baa, bb, bba} which is a finite complete code. In Fig. 6.2 an automaton d = (Q,1,l) recognizing X* is represented.
3.
CONTEXTS
a. b
Fig. 6.2 An unambiguous trim automaton recognizing X*.
We have qJa) = a, 4 4 6 ) = p, where a, p are the relations considered in Example 2.2. To obtain the set D,, we can compute the deterministic automaton having as a set of states, the set of row vectors {O,l}Q. The transition function is determined by multiplication on the right, with the elements in cpJA*) considered as matrices. The part of this automaton that is accessiblefrom the vector 100 is drawn in Fig. 6.3. (Note that this automaton can also be considered as the deterministic automaton associated with d by the subset construction, as given in Proposition 0.4.1 with sets of states represented by vectors.) The elements of D, are the words which are labels of paths from 100 to a state distinct from 000. We obtain Dx = a* u (a2)*6A*.
A similar computation on column vectors gives Gx = 6* u A*4bz)*.
Let K be the positive Bernoulli distribution given by .(a) = P,
4 6 ) = 4,
Fig. 6.3 A deterministic automaton for D,.
3'2
VI. DENSITIES
with p, 4 > 0, p + q = 1. Then 6(Dx)= 6(a*) + d((a2)*bA*)= d((a2)*bA*)
since d(a*) = 0. Since (a2)*bis a prefix code, we have
d(D,)
= n((a2)*b).
Thus 4 1 6(D,) = 7 =l-p l+p'
In a similar fashion, we obtain 1
6(Gx) = 1 +q'
On the other hand, d ( X ) = 1 since the monoid pd(A*) has minimal rank 1. Thus by formula (3.3),
4x1 = (1 + P)(l + 4). This can also be verified by a direct computation of the average length of X. The computations made in this example are of course similar to those of Example 2.2. Let X c A be a code. A context of w E A* is a pair (u,u) of words such that the following two conditions hold +
uwu = X 1 X 2 " ' X n
(Xi E
X)
and IuI < 1x1I,
101
< IxnI*
The set of contexts of a word w E A* (with respect to X)is denoted by C(w). The set C(l) is C(1) = {(u,u) E A + x A + I uu E X} u ((1, l)}. The context of a word can be interpreted in terms of paths in the flower automaton d t ( X ) = (P, (1, l), (1,l)):there exists a bijection between the set C(w)and the set P(w) of paths labeled w in the flower automaton. Indeed let c:
(u, u') A (u', u)
be a path labeled w in "QIg(X).Then uwu E X*.Thus either uwu = 1, or uwu = X l X 2 * * * x,
with x i E X and n > 0. In that case, IuI < lxll and 101 < Ix,I. Thus in both cases, (u,u) is a context. Consider another path c: (u, ii') (3, u).
3.
3’3
CONTEXTS
Then both paths
(1,l) A (u,u’) JL (u’,u) L(1, l), (1,l)
u-. (u,E’) -%
(F’,u) L(1,l)
are labeled uwu. By unambiguity, c = C: Conversely, if (u, u) is a context of w and uwu = x1***x,,,define two words u’, u’ by u’ = u-1 x1
if u # 1, u ’ = 1 otherwise,
u’ = x,,u-l
if
u#
1, v’
=
1 otherwise.
Then (u, u’) and (u’, u) are states in d g ( X ) ,and there is a path (u, u’) A (u’, u). The following result shows a strong relationship between all sets of contexts. THEOREM 3.3 Let X c A + be a thin complete code, and let a be a positiue Bernoulli distribution on A*. For all w E A*,
1
A(X) =
n(uu).
(u.0) E C ( W )
Proof Let d g ( X ) = (P, (1, l), (l., 1)) be the flower automaton of X, let M = cpD(A*) and set v = 6cp.’; Let w E A*, m = cpD(w),and let
T(m)= {(r,I ) E M x M I rml E cp,(X*)}
We compute t(m) in two ways. First define, for each state p
E P,
L , = ( 1 E it411p,l = l}. R, = { r E M Irl,, = l}, Then rml E cpD(X*)iff there exist p, 4 E P such that rl,, = 1,mp,q= 1, lq,l = 1. Consequently,
Thus
Set p = (u, u’), q = (u’, u). Then mpq = 1 iff there is a path c: p + q labeled w. According to the bijection defined above, this hold iff (u, u) E C(w).Next, vD’(R,) = X*U,
cp,’(L,) = uX*,
hence v(R,) = d(X*u) = B(X*)a(u),
v(L,) = S(uA*) = n(u)G(X*).
VI. DENSITIES
3'4
Consequently t(m) =
c
S(X*)n(u)n(u)S(~*) =[S(X*)]~
c
~(uu).
(U,U)E C(w)
(U,U)EC(W)
This is the first expression for t(m). Now we compute t(m) in the monoid M. Let K be the minimal ideal of M. Since v vanishes for elements not in K, we have t(m) =
C
v(r)v(l).
(r.1) E K x K rmlecpD,(X*)
Let N = qD(X*)n K. Then
Let r E K . Since (rm)- 'n # 0 iff M n , and since rgrrn, we have (rm)- 'n # fZI iff r 9 n and
by Proposition 2.6. Further
Comparing both expressions for t(m), we get
There is an interesting interpretation of the preceding result. With the notations of the theorem, set for any word w E A*,
Call y(w) the contextual probability of w. Then the theorem claims that if n is a Bernoulli distribution we have identically Y(W)
= n(w).
The fact that y and II coincide is particular to Bernoulli distributions (see Exercise 3.3.) We now study one-sided contexts. Let X c A + be a code, and let w E A*. The set of right contexts of w is C,(W) = { u E A*l(l, 0) E C(w)}.
Thus u E Cr(w)iff wu = x 1 x 2 ~ ~(x,~ Ex X) n with IuI < 1x.l.
3.
CONTEXTS
315
Symmetrically,the set of lejit contexts of w is
I
C,(W) = {u E A* (u, 1) E C(w)}. We observe that C,(W)X* = w-'x*.
(3.4)
The product C,(w)X* is unambiguous, because X is a code.
PROPOSITION 3.4 Let X c A' be a thin complete code and let d = (Q, 1,l) be an unambiguous trim automaton recognizing X*. Let K be the minimal ideal of the monoid M = qd(A*). Let n be a positive Bernoulli distribution. For all w E cp,'(K) n D,, we have
(3.6) n(Cdw))G(G,) = 1. Proof Set cp = cpd, v = &-',and let k (resp., 2) be the union of the Yeclasses (resp., 9-classes) in K that meet cp(X*). We have seen, in the proof of Theorem 3.1, that G(D,) = v(R) and 6(Gx)= ~ ( 2 According ). to formula (3.4),
6(w-'x*) = n(C,(w)>d(X*). Set n = ~ ( w and ) T = { k E K 1 nk E cp(X*)}. Then T c 2since for k E T, we have nk E Mk n cp(X*),showing that the left ideal Mk meets cp(X*). Let H be an &-'class contained in 2. The function h I+ nh is a bijection from H onto the X-class nH. Since n E R, we have nH c R; since H E 2 we have nH c Thus nH t k n 2. This implies that nH n cp(X*) # 0.Indeed let R and L denote the W-class and 9'-class containing nH, and take rn E R n cp(X*), m' E L n cp(X*). Then mm' E R n L n cp(X*) = nH n cp(X*). Setting d = d(X), it follows that
z.
Card(nH n cp(X*)) = -1 Card(nH) d'
Since H n T = {k E Hlnk E cp(X*)} is in bijection with nH n cp(X*), we have 1
Card(H n T) = Card(nH n cp(X*)) = - Card(H). d Therefore
316
VI. DENSITIES
We observe that cp-'(T) = w-'X* n q - ' ( K ) . Consequently 6 ( w - ' X * ) = v(T) = (l/d)v(i). Since also v ( z ) = S(G,), we obtain (3.7)
By Theorem 3.1, the last expression is equal to 1. 0 PROPOSITION 3.5 Let X c A + be a thin complete code. Let A be a positive Bernoulli distribution on A*. For all w E A*. the following conditions are equivalent. (i) The set C,(w) is maximal among the sets C,(u),for u E A*. (ii) n(C,(w))G(D,)= 1. Proof With the notations of Proposition 3.4, consider a word X E cp-'(K) n X*. Then C,(w) c C,(xw), hence also n(C,(w))I n(C,(xw)). On the other hand xw E rp-'(K) n D,. Indeed the right ideal generated by x is minimal, and therefore there exists u E A* such that ~ ( X W U= ) cp(x). Thus xwu E X*. By Proposition 3.4, we have n(C,(xw))G(D,)= 1 showing that .(CAW)) I 1/&Dd
(3.8)
Now assume C,(w)maximal. Then C,(w) = C,(xw),implying the equality sign in the formula. This proves (i) (ii). Conversely formula (3.8) shows the implication (ii) => (i). [I In fact, the set of words w E A* such that the set of right contexts is maximal is an old friend; in Chapter 11, Section 8, we defined the sets of extendable and simplifying words by E = {U E A* IVV E A*, 3~
E
A*:UUW EX*},
S = {U E A* IVx E X * , VV E A*:xuu E X * i.Jv E X * } . We have seen that these sets are equal provided they are both nonempty (Proposition 11.8.5). It can be shown (Exercise 3.1) that, for a thin complete code X,the following three conditions are equivalent for all words w E A*: (i) w E E, (ii) w E S, (iii) C,(w) is maximal. This leads to a natural interpretation of formula (3.5) (see Exercise 3.1). We now establish, as a corollary of formula (3.5) a property of finite maximal codes which generalizes the property for prefix codes shown in Chapter I1 (Theorem 11.6.4).
3.
3'7
CONTEXTS
THEOREM 3.6 Let X c A + be a $finite maximal code. For any letter a E A, the order of a is a multiple of d(X). Recall that the order of a is the integer n such that a" E X . Proof Let II be a positive Bernoulli distribution on A*. Let d = (Q, 1 , l ) be a trim unambiguous automaton recognizing X*. Let K be the minimal ideal of the monoid M = qd(A). Let x E X * n q,'(K). According to Proposition 3.4, n(Cdx))d(GX)= 1.
n(Cr(X))d(Dx)= 1,
By formula (3.3) the average length of X is
Thus
4x1 = d(X)~(Cr(X))~(C,(X)). Let a be a fixed letter and let n be its order. Consider a sequence ( n k ) k > O of positive Bernoulli distributions such that lim nk(a) = 1 k+ m
and for any b E A - a, Iimk+mnk(b)= 0. For any word w E A* we have limk+mn k ( W ) = 1 iff w E a*, and = 0 otherwise. For any k 2 0,denote by &(x) the average length of X with respect to r k a Then = d(X)nk(C,(x))IIk(C~(X)).
But also, by definition
Since X is finite, this sum is over a finite number of terms, and going to the limit, = C 1x1 Iim R k ( X ) . lim &(x) k+m
xeX
Since limk+mnk(x)= 0 unless x
E a*, we
k+m
have
lim nk(x)= n, k+m
where n is the order of a. O n the other hand,
318
VI. DENSITIES
The words in C,(x) are right factors of words in X.Since X is finite, C,(x) is finite. Thus, going to the limit, we have lim nk(C,(x)) = k-m
C
lim
V E C,(x)
?rk(u)
= Card(C,(x) n a*).
k-. m
Similarly Iim nk(C,(x)) =
C
lirn Kk(V) = Card(C,(x) n a*).
vsC,(x)k+m
k-m
Thus
n
= d(X)
9
Card(C,(x) n a*)Card(C,(x)n a*)
This proves that d ( X )divides n. 0
EXERCISES SECTION 1 1.1. Let X be a recognizable subset of A* and let K be a positive Bernoulli
distribution. (a) Show that px = co iff X is finite. (b) Show that, if X is infinite, then p x is a pole of the rational function fx. SECTION 2 2.1. 2.2.
Show that any recognizable subset of A* has a density. Let M be a monoid, and let p, v be two probability measures over M. The convolution of p and v is defined as the probability measure given by p * v(m) =
c
uv=m
(a) Show that
)*
lim p n (n-m
v = lim ( p n* v). a-m
(b) Let K be a positive Bernoulli distribution on A*. For n 2 0, let d")the probability measure defined by a(")(L) = n ( t n A")
for L
t
A*. Show that #+1)
= x(4*,+1).
(c) Let p:A* + M be a morphism onto a well-founded monoid. Let x be as above and let v = 6q-l be the probability measure over
3’9
EXERCISES
M defined in Proposition 2.7. Show that v is idempotent, i.e, v * v = v. 2.3. Let d = (Q, i, T) be a deterministic automaton and let n be a Bernoulli distribution on A*. A function Y: Q+Co,11 is a stationarv distribution if
Y ( A * ) . Show that if d has finite minimal rank, and if all states of ,d are recurrent, it admits a unique stationary distribution given for p E Q by
Y(P) = l/W.D),
where Xp is the prefix code such that X; = Stab(p). (Hint:show that for a stationary distribution y
where L, is the set recognized by the automaton (Q,p, q). (b) Let d be the minimal rank of M, with the hypotheses of (a). Let I be the set of minimal images of d.Let 9 be the deterministic automaton with states 8 and with the action induced by d .Let y be the stationary distribution of d,and (T that of 9.Show that
Where the sum runs over the minimal images I E I such that q E 1.
SECTION 3 3.1. Let X c A + be a thin complete code. Let S ( X ) and E(X) be the sets of simplifying and extendable words defined in 11.8. Show that for w E A*, the following conditions are equivalent: (i) w E S ( X ) ; (ii) w E E(X); (iii) C,(w)is maximal among all C,(u),u E A*. 3.2. Use Proposition 11.8.6 to give another proof of formula (3.7.). 3.3. Let n:A* + [0,1] be a function. Then K is called an invariant distribution if (i) n(1) = 1; (ii) CaEA ~ ( w a=) I , A n(aw) ,= n(w) for w E A*.
VI. DENSITIES
320
Let X c A + be a code and a: B* 4A* a coding morphism for X, i.e., a(B) = X.Let K be an invariant distribution on B*. Show that the function K'" from A* into [0,1] defined by
with I(a)= CXEX Ixln(a- '(x)) is an invariant distribution on A*. Compare with the definition of the contextual probability.
NOTES
Theorem 1.5 is a special case of a theorem of Feller which can be formulated as follows in our context: "Let X c A + be a code, and let K be a positive Bernoulli distribution such that n ( X ) = 1. Let p be the g.c.d. of the lengths of the words in X.Then the sequence
n(X* n AnP)
(n 2 0) has a limit, which is 0 or p / I ( X ) ,according to A(X) = co or not" (see Feller, 1957).Theorem 1.5 is less precise on two points: (i) we only consider the case where A(X) < co and (ii) we only consider the limit in mean of the sequence n(X* n A"). The results of Section 2 and related results, can be found in Greenander (1963)and Martin-Lof, (1965).Theorem 3.1 is due to Schutzenberger(1965b). Theorem 3.3 is from Hansel and Perrin (1983). Exercise 2.1 belongs to the folklore of automata theory. The notion of stationary distribution introduced in Exercise 2.3 is a classical one in the theory of Markov processes. Further developments of the results presented in this chapter may be found in Blanchard and Perrin (1980)and Hansel and Perrin (1983).In particular these papers discuss the relationship of the concepts developped in this chapter with ergodic theory.
CHAPTER
VII
Conjugacy
0. INTRODUCTION
In this chapter we study a particular family of codes called'circular codes. The main feature of these codes is that they define a unique factorization of words written on a circle. The family of circular codes has numerous interesting properties. They appear in many problems of combinatorics on words, several of which will be mentioned here. In Section 1 we give the definition of circular codes and we characterize the submonoid generated by a circular code. We also describe some elementary properties of circular codes. In particular we characterize maximal circular codes (Theorem 1.8). In Section 2 we introduce successive refinementsof the notion of a circular code. For this we define two notions, namely (p, q)-limitedness and sychronization delay. We study the relations between these parameters. For finite codes we show that these notions allow us to classify completely the family of circular codes (Theorem 2.6). We then proceed to a more detailed study of (1,O)-limited codes. These codes are exactly the left factors of bisections studied in Sections4 and 5. In particular, we show (Proposition 2.7) that (1,O)limited codes correspond to ordered automata. Comma-free codes are defined as circular codes satisfying the strongest possible condition.
VII. CONJUGACY
322
Section 3 is concerned with length distributions of circular codes. Two important theorems are proved. The first gives a characterization of sequences of integers which are the length distribution of a circular code (Theorem 3.6). The second shows that for each odd integer n there exists a system of representatives of conjugacy classes of primitive words of length n which not only is circular but even comma-free (Theorem 3.8). The proofs of these results use similar combinatorial constructions. As a matter of fact they are based on the notion of factorization of free monoids studied in the following sections. The main result of Section 4 (Theorem 4.1) characterizes factorizations of free monoids. It shows in particular that the codes which appear in these factorizations are circular. The proof is based on an enumeration technique. For this, we define the logarithm in a ring of formal power series in noncommutative variables. The properties necessary for the proof are derived. We illustrate the factorization theorem by considering a very general family of factorizations obtained from sets called Lazard sets. Section 5 is devoted to the study of factorizations into finitely many submonoids. We first consider factorizations into two submonoids called bisections. The main result (Theorem 5.3) gives a method to construct all bisections. hthen examine trisections or factorizations into three submonoids. We prove a difficult result (Theorem 5.4) showing that every trisection can be obtained by “pasting” together factorizations into four factors obtained by successive bisections. 1. CIRCULAR CODES
We define in this section a new family of codes which take into account, in a natural way, the operation of conjugacy. By definition, a subset X of A+ is a circular code if for all n, m 2 1 and xl, xz,. . .,xn E X,y,, yz,. ..,y,,, E X,and p E A* and s E A’, the equalities
imply n=m,
p= 1
and
xI = y ,
(1 5 i S n )
(see Fig. 7.1). A circular code is clearly a code. The converse is false, as shown in Example 1.3. The asymmetry in the definition is only apparent, and comes from the choice of the cutting point on the circle in Fig. 7.1. Clearly, any subset of a circular code is also a circular code.
I . CIRCULAR CODES
323
Fig. 7.1 Two circular factorizations.
Note that a circular code X cannot contain two distinct conjugate words. Indeed, if ps, sp E X with s, p E A' then NPdP = (SP)(SP). Since X is circular, this implies p = 1 which yields a contradiction. Moreover, all words in X are primitive, since assuming u" E X with n 2 2, it follows that u(u")u"-' = u"u".
This implies u = 1 and gives again a contradiction. We shall now characterize in various ways the submonoids generated by circular codes. The first characterization facilitates the manipulation of circular codes. A submonoid M of A* is called pure if for all x E A* and n 2 1, x" E M * x
E
M
or equivalently XE
M*&E
M.
A submonoid M of A* is very pure if for all u, v E A*, uv, V U E M
=-
u, U E M.
( 1*4)
A very pure monoid is pure. The converse does not hold (see Example 1.3).
PROPOSITION 1.1 A submonoid M of A* is very pure ifl its minimal set of generators is Q circular code.
324
VII. CONJUCACY
Proof Let M be a very pure submonoid. We show that M is stable. Let m, m', xm, m'x E M. Then setting u = x, u = mm',we have uu, uu E M. This implies XEM.
Thus M is stable, hence M is free. Let X be its base. Assume that (1.1) and (1.2) hold. Set u = s, u = x2x3 x,p. Then uu, uu E M. Consequently s E M. Since ps, ~ 2 x 3 .x,p E M, the stability of M implies that p E M.From ps E X, it follows that p = 1. Since X is a code, this implies n = m and x i = yi for i = 1, ..., n. Conversely, let X be a circular code and M = X*.To show that M is very pure, consider u, u E A* such that uu, uu E M. Set a . 9
uu = X I X ~ " ' X , ,
uu = y 1 y , . . . y m ,
with xt, yj E X. There exists an integer i, 1 Ii In such that u =xIxz"'xi-lp,
with x i = ps, p
E A*,
u = SXi+i"'X,
s E A + . Then uu may be written in two ways:
sxi + 1 * * . X,Xl
x2
*
" X i - 1p
= y , y,
* * *
ym.
Since X is a circular code, this implies p = 1 and s = y,. Thus u, u E M, showing that M is very pure. 0
EXAMPLE 1.1 Let A = { a , b } and X = a*b. Then X* = A*b u 1. Thus if uu, uu E X*, the words u, u either are the empty word or terminate by a letter b; hence u, u E X*. Consequently X* is very pure and X is circular. EXAMPLE 1.2 Let A = {a}and X = {a'}. The submonoid X * clearly is not pure. Thus X is not a circular code. EXAMPLE 1.3 Let A = { a ,b } and X = {ab, ba}. The code X is not circular. However, X* is pure (Exercise 1.1). The following proposition characterizes the flower automaton of a circular code.
PROPOSITION 1.2 Let X c A + be a code and let cp be the representation associated with the flower automaton of X . The following conditions are equivalent: (i) X is a circular code. (ii) For all w E A', the relation q ( w ) has at most one fixed point.
Proof For convenience, let 1 denote the state (1,l) of the flower automaton d g ( X ) . (i) =- (ii) Let w E A + , and let p = (u, u), p' = (u', u ' ) be two states of d t ( X ) which are fixed points, i.e., such that (p,cp(w),p)= ( p ' , cp(w),p')= 1.
I . CIRCULAR CODES
325
Since w # 1, Proposition IV.2.2 shows that w E oX*u and w E u'X*u'. Thus both paths c: p A p and c': p' 4p' pass through the state 1. We may assume that u I D'. Let z, t E A* be the words such that u' = oz and w = ozt. Then the paths c, c' factorize as c: p A l L r A p ,
c':
ptLs+l&pf.
Thus there are also paths d:
l A r L p L 1 ,
d':
l L p ' 4 s A l
showing that zto, tuz E X*. Since X* is very pure, it follows that z, to E X*. Consequently, there is a path e: 1 A 1 -%1 . By unambiguity, d = e, whence r = 1 . Thus 1 4 p a 1 which compared to d' gives p = p'. This proves that rp(w) has at most one fixed point. (ii) +-(i) Let u, u E A* be such that uu, uu E X*.Then there are two paths 1 Y,p A 1 and 1 A q A 1. Thus the relation rp(uo) has two fixed points, namely 1 and 4.This implies q = 1, and thus u, o E X*. 0 COROLLARY 1.3 A thin circular code is synchronous. Proof Let rp be the representation associated with the flower automaton dM X ) of X. Let e E cp(A+) be an idempotent with positive minimal rank. According to Proposition 1.2, the rank of e is 1. Thus d(X) = 1. 0
We now give a characterization of circular codes in terms of conjugacy. For this, the following terminology is used: Let X c A + be a code. Two words w, w' E X* are called X-co,rjugate if there exist x, y E X* such that w = xy,
w' = yx.
The word x E X* is called X-primitive if x = y" with y E X* implies n = 1. The X-exponent of x E X + is the unique integer p 2 1 such that x = y p with y an X-primitive word. Let a: B --t A* be a coding morphism for X. It is easily seen that w, w' E X* are X-conjugateiff a-'(w) and a-'(w') are conjugate in B*. Likewise,x E X* is X-primitive iff a-'(x) is a primitive word of B*. Thus, X-conjugacy is an equivalence relation on X*. Of course, two words in X* which are X-conjugate are conjugate. Likewise, a word in X* which is primitive is also X-primitive. When X = A, we get the usual notions of conjugacy and primitivity. PROFQSITION 1.4 Let X c A + be a code. The following conditions are equivalent. (i) X is a circular code;
326
VII. CONJUCACY
(ii) X* is pure, and any two words in X* which are conjugate are also Xconjugate.
Proof (i) => (ii) Since X* is very pure, it is pure. Next let w, w' E X* be conjugate words. Then w = uu, w' = uu for some u, u E A*. By (1.4),u, u E X*, showing that w and w' are X-conjugate. (ii) (i) Let u, u E A* be such that uu, uu E X*. If u = 1 or u = 1, then u, u E XI.Otherwise, let x, y be the primitive words which are the roots of uu and uu: then uu = x", uu = y" for some n 2 1. Since X* is pure, we have x, y E X*. Next uu = x" gives a decomposition x = rs, u = xpr, u = sxQfor some r E A*, s E A + , and p + q + 1 = n. Reporting this in the equation uu = y", yields y = sr. Since x, y are conjugate, they are X-conjugate. But for primitive words x, y, there exists a unique pair (r',s') E A* x A + such that x = r's', y = s'r'. Consequently r, s E X*. Thus u, u E X*, showing that X* is very pure. 0 PROPOSITION1.5 Let X c A be a code and let C c A" be a conjugacy class that meets X*. Then +
"21
1 1 -Card(X" n C ) 2 -Card(C). m n
Further, the equality holds iff the following two conditions are satisfed: (i) The exponent of the words in C n X* is equal to their X-exponent(ii) C n X* is a class of X-conjugacy.
Proof Let p be the exponent of the words in C. Then Card(C) = n/p. The set C n X* is a union of X-conjugacy classes. Let D be such a class, and set C' = C - D. The words in D all belong to Xkfor the same k, and all have the same X-exponent, say q. Then Card@) = k/q. Since C = C' u D, the left side of (1.5) is 1 " 1 -Card(X" n D). -Card(X" n C') + m=l m m
f
In the second sum, all terms vanish except for m ( l / k )Card(Xk n D ) = l/q. Thus " m=l
= k. Thus
this sum equals
1 1 " 1 -Card(X" n C) = - + -Card(X" n C'). m 4 m=lm
(1.6)
Since q Ip , we have l / q 2 l / p = ( l / n )Card(C). This proves formula (1.5). Assume now that (i) and (ii) hold. Then p = q, and D = C n X*.Thus C' n X* = 0.Thus the right side of (1.6) is equal to l/p, which shows that equality holds in (1.5).Conversely,assuming the equality sign in ( 1 3 , it follows
I . CIRCULAR CODES
327
from (1.6) that 1 p
1
- =-
+
"
1
- Card(Xmn C') m=lm
1 1 2 - 2 -, 4 P
which implies p = q and C' n X* = 0. 0 The proposition has the following consequence: PROPOSITION 1.6 Let X c A + be a code. The following conditions are equivalent: (i) X is a circular code. (ii) For any integer n 2 1 and for any conjugacy class C c A" that meets X*, we have 1 1 - Card(Xmn C)= - Card(C). mz1 m n Proof
By Proposition 1.4, the code X is circular iff we have
(iii) X* is pure. (iv) Two conjugate words in X* are X-conjugate. Condition (iii)is equivalent to: the X-exponent of any word in X* is equal to its exponent. Thus X is circular iff for any conjugacy class C meeting X*, we have (v) The exponent of words in C n X* equals their X-exponent. (vi) C n X* is a class of X-conjugacy. In view of Proposition 1.5, conditions (v) and (vi) are satisfied iff the conjugacy class C n A" satisfies the equality (1.7). This proves the proposition. [I. We now prove a result which is an analogue of Theorem 1.5.1.
PROPOSITION 1.7 Let X c A + be a circular code. If X is maximal as a circular code, then X is complete. Proof If A = { a } , then X = { a } . Therefore, we assume Card(A) 2 2. Suppose that X is not complete. Then there is a word, say w, which is not a factor of a word in X*. By Proposition 0.3.6, there is a word u E A* such that y = wu is unbordered. Set Y = X u y . We prove that Y is a circular code. For this, let x i (1 5 i In) and yi (1 s i I m) be words in Y, let p E A*, s E A + such that
sx2xJ"'x"p = y , y , . * * y ,
328
VII. CONJUCACY
If all xi (1 I i 5 n) are in X , then also all y j are in X , because y is not a factor of a word in X*.Since X is circular, this then implies that n = m,
p = 1,
xi = yi
( 1 Ii In).
(1.8)
Suppose now that xi = y for some i E { 1,. . .,n} and suppose first that i # 1. Then x i is a factor of y1yz * * y,. Since y 4 F ( X * ) , and since y is unbordered, this implies that there is a j E { 1,2,. .. ,m} such that y j = y, and
-
SX;Z"'Xi-1
=y,y,.*.yj-,,
x i + l * * * x n=p y j + l " ' y , .
This in turn implies sxz ' * xi- ,xi+ 1 * * * x , p = y , y , - - - yj- 1yj+1 * . . ym, and (1.8) follows by induction on the length of the words. Consider finally the case where i = 1, i.e., x 1 = y. Since X1Xz"'XnP
= PY1Y2"*Ym,
we have y x , * - . x , , p= py,y,...y, . Now p is a right factor of a word in Y*; further y 4 F ( X * ) and y is unbordered. Thus p = 1 and y, = y. This again gives (1.8)by induction on the length of the words. Thus if X is not complete, then Y = X u y is a circular code. Since y 4 X,X is not maximal as a circular code. 0 The preceding proposition and Theorem 1.5.7 imply THEOREM 1.8 Let X be a thin circular code. The three following conditions are equivalent.
(i) X is complete. (ii) X is a maximal code. (iii) X is maximal as a circular code. 0 Observe that a maximal circular code X c A+ is necessarily infinite, except when X = A. Indeed, assume that X is a finite maximal circular code. Then by Theorem 1.8, it is a maximal code. According to Proposition 1.5.9, there is, for each letter a E A, an integer n 2 1 such that a" E X . Since Xis circular, we must have n = 1, and consequently a E X for all a E A. Thus X = A. We shall need the following property which allows us to construct circular codes.
PROPOSITION 1.9 Let Y, Z be two composable codes, and k t X = Y 0 2. ff Y and Z are circular, then X is circular.
Proof Let u:B* + A * be a morphism such that X = Y O a Z . Let u, u E A* be such that uu, uu E X*.Then uu, uu E Z*, whence u, u E Z* because Z* is
329
2. LIMITED CODES
very pure. Let s = ct- '(u), t = a-'(u). Then st, ts E Y*. Since Y* is very pure, s,t E Y*, showing that u, v E X*. Thus X * is very pure. 0 2. LIMITED CODES
We introduce special families of circular codes which are defined by increasingly restrictive conditions concerning overlapping between words. The most special family is that of comma-free codes which is the object of an important theorem proved in the next section. Let p,q 2 0 be two integers. A submonoid M of A* is said to satisfy condition C(p , q) if for any sequence uo, Ul,...,Up+q
of words in A*, the assumptions ui-,uiEM
(lrirp+q)
imply up E M (see Fig. 7.2). For example, the condition C(1,O)simply gives U V E M ~ U E M , i.e., M is suffix-closed, and condition C(1 , l ) is uv, uw E M
=. v E M .
It is easily verified that a submonoid M satisfying C ( p , q ) also satisfies conditions C(p',q') for p' 2 p , q' 2 q.
PROPOSITION 2.1 Let p , q 2 0 and let M be a submonoid of A*. If M satisjes condition C(p,q), then M is very pure. Proof
p
Let u, v E A* be such that uu, uu E M. Define words ui (0 I i I
+ q) to be alternatingly u and u. Then assumption (2.1) is satisfied and
Fig. 7.2 The condition C( p, q ) (for p odd and q even).
330
VII. CONJUGACY
consequently either u or u is in M.Interchanging the roles of u and u, we get that both u and v are in M. 0 Let M be a submonoid satisfying a condition C(p,q). By the preceding proposition, M is very pure. Thus M is free. Let X be its base. By definition, X is called a (p, q)-limited code. A code X is limited if there exist integers p , q 2 0 such that X is (p, &limited.
PROPOSITION2.2 Any limited code is circular. 0 EXAMPLE 2.1 The only (0,0)-limited code over A is X = A. EXAMPLE 2.2 A (p, 0)-limited code X is prefix. Assume indeed X is (p, 0)1imited.Ifp = OthenX = A.Takeu, = ... = = l.Thenforanyu,-,,u,, we have Up-l,Up-lUpEX* * UPEX* showing that X* is right unitary. Likewise, a (0,&limited code is suffix. However, a prefix code is not always limited, since it is not necessarily circular.
EXAMPLE 2.3 The code X = a*b is (1,O)-limited. It satisfies even the stronger condition UUEX
=$
UEXU1.
EXAMPLE 2.4 Let A = {a, b,c} and X = ab*c u b. The set X is a biprefix code. It is neither (1, 0)-limitednor (0, 1)-limited.However, it is (2,O)- and (0,2)limited. EXAMPLE 2.5 Let A = { a , I i 2 0 } and X = {a,ai+lI i 2 O } . The code X is circular, as it is easily verified. However, it is not limited. Indeed, set ui = a, for 0 Ii In. Then ui- lui E X for i E { 1,2,. ..,n}, but none of the u, is in XI. This example shows that the converse of Proposition 2.2 does not hold in general. However it holds for finite codes, as we shall see below (Theorem 2.6). It also holds for recognizable codes (Exercise 2.5). One of the reasons which makes the use of (p, &limited codes convenient,is that they behave well with respect to composition. In the following statement, we do not use the notation X = Y 0 Z because we do not assume that every word of Z appears in a word in X. PROPOSITION 2.3 Let Z be a code over A, let /3: B* + A* be a coding morphism for Z, and let Y be a code over B. If Y is ( p , &limited and Z is (r, 2)limited, then X = /3( Y ) is ( p r, q t)-limited.
+
Proof
+
Let uo, u1,..., u ~ + , + E~A* + ~be such that ui-lu,EX* (1 Ii s p
+ r + q + t)
2. LIMITED CODES
Fig. 7.3 X is not ( p , &limited for p
+ q 5 3.
Since X c Z* and Z is (r,t)-limited, it follows from (2.2) that ur, ur+ 1, - * ur+p+ q E Z * Since Y is (p,q)-limited, (2.3) and (2.2) for r 1 S i I p u , , ~E X * . Thus X is ( p r,q + t)-limited. 0 9
+
+
(2.3) + q + r show that
EXAMPLE 2.6 Let A = {a,b, c, d } and X = {ba,cd, db, cdb, dba}. Then
x = z, 0 z, 0 z, 0 z, with Z4 = { b, c, d,'ba},
Z , 0 Z4 = {c,d, ba,db}, Z,
0
Z , 0 Z , = { d ,ba, db, cd, cdb}.
The codes Z3 and Z4 are (0, 1)-limited.The code Z 3 0 Z , is not (0, 1)-limited, but it is (0,2)-limited,in agreement with Proposition 2.3. The codes 2, and Z , are (0, 1)-limited. Thus X is (2,2)-limited. It is not (p, q)-limited for any ( p , q ) such that p + q I3, as shown by Fig. 7.3. Let X c A + be a code. Then X is called uniformly synchronous if there is an integer s 2 0 such that
XEX', U , U E A*, u x u ~ X * * U X , X U E X * (2.4) If (2.4) holds for an integer s, it also holds for s' 2 s. The smallest integer s such that (2.4) holds is called the synchronization delay of X . It is denoted by Observe that the synchronization delay is greater than or equal to the deciphering delay. Indeed, assume that a(X) = s. Then for all y E X , x E X", and u E A* yxu E
x* =-xu E x*
showing that the deciphering delay of X is at most s. The term "uniformly synchronous" is justified by
PROPOSITION2.4 A code X is uniformly synchronous if there exists an integer s 2 0 such that any pair (x, y ) E X" x X " is a synchronizing pair.
332
VII. CONJUCACY
Proof
Let s = a(X), and consider a pair ( x , y ) E X 9 x X’.To show that
(x, y ) is synchronizing, let u, u E A* be such that uxyu E X * . Then indeed, by (2.4),ux, yu E X * . Conversely, assume that any pair in X s x X s is synchronizing’and set t = 2s. Let x E X‘, and u, u E A* such that uxu E X*. Set x = x’x’’ with x’, x” E X’.Since the pair (x’,x”) is synchronizing, ux’x’’u E X* implies ux’, x”u E X*.Consequently ux, xu E X*.Thus o ( X ) I t . 0
EXAMPLE 2.7 Let A = { a , b } and X = a*b. Then X has synchronizing delay 1. Indeed X + = A*b; if x E X and uxu E X*,then ux and xu are in A*b. To verify that a code X is uniformly synchronous,we may use the following formulation:Call a word x E X* a constant for X if the set of its contexts is the direct product of the sets of its left contexts and of its right contexts, i.e., C(x) = C,(x) x C,(x).
Then a word x E X ssatisfies condition (2.4)iff x is a constant. Thus a(X) I; s iff the set X sis composed of constants.
PROPOSITION 2.5 Any unijiormly synchronous code is limited. Proof Let X c A + be a uniformly synchronous code, and let s = a(X) be its synchronization delay. We show that X is (2s, 2s)-limited. (see Fig. 7.4). Consider indeed words uo, u1,.
. .,u4s E A*,
and assume that for 1 I i I 4s, Ui-IUi E
x*.
Set, for 1 I i I 2s, xi = uzi-2uzi-1,
Yi
= uzi-iuzi
Let y = y , y z - - . y sand x = x l x z - * * x s . X _ # - - - _
\
/
Q* \
‘\ y1... \
\
k + hw
\
ys
Y
,/‘
/
/’
---_--/
Fig. 7.4 An X-factorization with s = o(X).
Y25
333
2. LIMITED CODES
Assume first that yi # 1 for all i = 1,. , .,s. Then y E X"X*. Since uOyuZs+ E X*,the uniform synchronization shows that uoy E X*. Since uoy = xuZsrthis is equivalent to XUZ, E
x*.
(2.5)
Next, consider the case that yi = 1 for some i E { 1,2,. . .,s}. Then uzi- = uzi = 1. It follows that y i + l " * y ,= uzi+l"'uzs = UZi"'UZ" = Xi+l"'X,UZ".
Thus, in this case also xuzsis in X*. Setting y' = y,+ . y z s , we prove in the same manner that uzsy' E x*.
(2.6)
Since X* is stable, (2.5)and (2.6) imply that uzsE X*.This shows that X is (2s,2s)-limited. 0
EXAMPLE2.8 Consider the (2,2)-limited code X = { ba, cd, db, cdb, dba} of Example 2.5. We have a(X)= 1. Indeed, the contexts of words in X are C ( b 4 = {(1,1),(4
1)}3
C(cd) = { ( 1 , 1 ) , w o }
C ( d 4 = {(1,1)9(194,(c, Mc,a)} C ( c W = {(I, 1),(1,41 C ( d W = {(1,1),(c, I)} Thus each x E X is a constant, and therefore a(X)= 1.
EXAMPLE 2.9 Let X = ab*c u b be the limited code of Example 2.4. It is not uniformly synchronous. Indeed, for all s 2 0, one has b" E X"and ab'c E X. However ub", b"c 4 X. This example shows that the converse of Proposition 2.5 does not hold. We now prove that in the case of finite codes, the concepts introduced coincide.
THEOREM2.6 Let X be
a finite code. The following conditions are
equivalent.
(i) X is circular. (ii) X is limited. (iii) X is uniformly synchronous. Proof We have already established the implications (iii) * (ii) without the finiteness assumption. Thus it remains to prove (i) * (iii).
=. (i)
334
VII. CONJUGACY
Let X c A + be a finite circular code, and let .dt(X) = (P,1,l) be the flower automaton of X with the shorthand notation 1 for the state (1,l). By Proposition 1.2, each element in cp,(A+) has at most one fixed point. In particular, every nonzero idempotent in cp,,(A+) has rank 1. Let M = cpD(A*), and let J be its O-minimal ideal. Since X is finite, the monoid M is finite. Let s = Card(M - J). We show that cp,(w) E J for every word w of length s. Indeed, set w = a 1 a 2 * * * 4with , a, E A, and set m, = cpD(a,az..*ai)for i = 0,.. .,s. If the m, are pairwise distinct, then at least one of them is in J , and consequently cp,(w) E J . On the contrary, there are indices i, j, with 0 I i < j 5 s, such that rn, = mj. Set u = cpD(a,+,-**a,). Then miu = mi. Let e be the idempotent which is a positive power of u. (Recall that M is finite). Then m,e = mj. Further, e has minimal rank; thus e E J . Consequently, mj E J and also cpD(w)E J. This proves the claim. We now show that a(X) I s. Indeed, let x E X sand let u, u E A* be such that uxu E X*.Since 1x1 2 s, one has cp&) E J . Thus cp,(~) has rank 1. Consequently, there are two matrices a E {O,l}pxrand fl E (0,l}'"' such that cp,(x) = afl. Since uxu E X*, there is a path 1 4 p -%q Y, 1. Hence patpq. From x E X*,one gets latpl. It follows that patpl and latflq. Therefore p A 1 and l a q , showing that ux, x u ~ X * This . shows that X is uniformly synchronous. 0 EXAMPLE 2.10
Let A = (a,,
X
Qz,.
I . ,Q Z k } and
= (a,ajI 1 5 i < j 4
2k).
We show that X is uniformly synchronous and a(X) = k. First, o(X)2 k since also (ala2)*'.(4zk-lazk) E x* and h o w e v e r ~ , a ~ . . . a ~$X*.Next,assumethatx~X~,anduxu ~-, E X*.If uand u have even length, then they are in X*.Therefore we assume the contrary. Then u = d o j , u = a,u' with aj, al E A and u', o' E X*.Moreover ujxaIE X*. Since x E X*,we have il < i 2 , i3 < i, . ..,izk- c i2k; since Set x = a,, aLZk. ajxa, E X*, we have j < i l , iz < i3 ,...,i2, < 1. Thus 1 5 j < i , < i z < < i2kiZk c I 5 2k, which is clearly impossible. Thus u and u have even length, showing that a(X) 5 k. This proves the equality.
(aza,)(a*U~)"'(azk-za2k-l) E Xk-' and
-=
Compare this example with Example 2.5, which is merely Example 2.10 with
k = 03. The infinite code was circular but not limited, hence not uniformly synchronous. We now give a characterization of (1,O)-limited codes by means of automata. These codes intervenein Section 5. For that, say that a deterministic automaton at = (Q, 1,l) is an ordered automatonif Q is a partially ordered set, and if the two following conditions hold: (i) For all q E Q, q 5 1.
335
2. LIMITED CODES
(ii) For all p , q E Q, and a E A, p s q
*
p . a ~ q * a .
PROPOSITION 2.7 Let X c A + be a prejx code. The following conditions are equivalent: (i) X is (1,O)-limited,i.e., uu E X* * u E X*. (ii) X* is recognized by some ordered automaton.
Proof (i) (ii) Let d ( X * ) = (Q, 1,l) be the minimal automaton of X*. Define a partial order on Q by p
~
qiff
-
L , c L,,
where for each state p , L, = { u E A* I p u = l}, This defines an order on Q, since by the definition of a minimal automaton, L, = L, =$ p = q. Next let q E Q, and let u E A* be such that 1 u = q. Then u E L, iff uu E X*. Since X is (1,O)-limited, uu E X* implies u E X*, or also u E L , . Thus L, c L ,, and therefore q I1. Further, if p , q E Q with p Iq, and a E A, let u E L,.a. Then au E L,, hence au E L,, and thus u E L,.a. This proves that d ( X * ) is indeed an ordered automaton for this order. (ii) =. (i)Let d = (Q, 1,l)be an ordered automaton recognizing X*. Assume that uu E X*, for some u,u E A*. Then 1 uu = 1. Since 1 u I1, we have 1 uu I1 u. Thus 1 I 1 u I 1, whence 1 v = 1. Consequently u E X*. 0
-
.
-
EXAMPLE 2.1 1 Consider the automaton (Q, 1,l) given in Fig. 7.5. The set Q = { 1,2,3,4} is equipped with the partial order given by 3 < 2 < 1 and 4 < 1. For this order, the automaton (Q, 1,l) is ordered. It recognizes the submonoid X* generated by X = (b2b*a)*{a,ba}. Consequently, this is a (1,O)-limited code. 0
b a
b
Fig. 7.5 An ordered automaton.
336
VII. CONJUGACY
The following proposition gives another characterization of (1, 0)-limited codes.
PROPOSITION2.8 A prefix code X c A + is (1,0)-limited iff the set = A* - XA*
R
of words having no left factor in X is a submonoid. Proof By Theorem 11.1.4,A* = I(*R. Suppose first that X is (1,O)-limited. Let u, u' E R, and set uu' = xr
with x E X*,r E R. Arguing by contradiction, suppose that x # 1. Then x is not a left factor of u. Consequently x = uu, ur = u' for some u E A*. Since X is (1,O)-limited,one has v E X*;this implies that u = 1, since v is a left factor of u'. Thus x = u, a contradiction. Consequently x = 1 and uu' E R. Conversely,suppose that R is a submonoid. Then, being prefix-closed,R is a left unitary submonoid. Thus R = Y* for some suffix code Y. From the power series equation, we get
A*
=
$*I*.
Multiplicationwith 1 - y o n the right gives &* = A* - A*y.Thus X * is the complement of a left ideal. Consequently X* is suffix-closed. Thus X is (1,O)limited. 0
EXAMPLE 2.12 The code X R
= A*
= (b2b*a)*{a,ba}of Example 2.11 gives, for
- XA*, the submonoid R
=
{ b ,b2a}*.
We end this section with the definition of a family of codes which is the most restrictive of the families we have examined: a code X c A + is called comma-free if: (i) X is biprefix; (ii) a(X)= 1. Thus X is comma-free iff for all x E X', u, u E A*,
UXUEX* *
U,UEX*.
(2.7)
Comma-free codes are those with the easiest deciphering: if in a word w E X*, some factor can be identified to be in X,then this factor is one term of the unique X-factorization of w.
PROPOSI~ON2.9 A code X c A + is commafree p , q withp + q = 3,and if A'XA' n X = 0.
i r
it is (p , q)-limited for all
3.
337
LENGTH DISTRIBUTIONS
Proof First suppose that X is comma-free. Let uo, ul, u2, u3 E A* be such that uoul, ulu2, u2u3 EX*.If u1 = u2 = 1, then uo, u3 EX*. Otherwise ulu2 E X + and uoul ~ 2 E X+. ~ 3Thus by (2.7) uo, u3 E X*. Since X is prefix, uo, uoul E X * implies that u1 E X*, and X being suffix, ~ 2 ~ uj3 E, X implies that u2 is in X*. Thus u 0 , ul,u 2 ,u3 E X*. Consequently, Xis (p, +limited for all p , q 2 0 with p + q = 3. Furthermore A+XA+ n X = 0.Indeed assume that uxu, x E X. Then by (2.7) u, u E X*, whence u = u = 1. Conversely, let u,u E A* and x E X + be such ,that uxu E X*. Since A + x A + n X = 0,there exists a factorization x = ps, with p, s E A*, such that up, su E X*. From up, ps, su E X* it follows, by the limitedness of X, that u, p , s, v E X*. Thus (2.7) holds. 0
PROPOSITION 2.10 Let X, 2 be two composable codes and let
x =YOZ. If Y and Z are comma-free, then X is comma-free. Proof Let u, v E A* and x E X t be such that uxu E X*. Since X c Z*, we have uxu E Z*, x c 2'. Since Z is comma-free, it follows that u, u E Z*. Since Y is comma-free, this implies that u, u are in X*.Thus X is comma-free by (2.7). 0
EXAMPLE 2.13 Let A = {a,b } and X = {aab,bab}. The code X is biprefix. Next C(aab)= C(bab)= { ( 1 , l)}. Thus o(X) = 1. This shows that X is commafree. 3. LENGTH DISTRIBUTIONS
In this section, A is supposed to be a finite alphabet. Let X be a subset of A . For n 2 1, let a,, = Card(X n A"). (3.1) The sequence a = (an)n2lis called the length distribution of X. We shall determine, in this paragraph, the length distributions of various families of codes. If X is a code, then by Corollary 1.4.3,the inequality +
holds, where k = Card(A). We begin with the following result which could, of course, have been proved much earlier since its proof is quite elementary. PROPOSITION 3.1
For each sequence of integers a = (a,,)n2
that zr a,k-" I 1, there exists a prejix code X over a k letter alphabetsuchhaving length distribution a.
338
VII. CONJUGACY
Take A a k letter alphabet. We define a sequence (X,Jnr such that,
Proof
(i) X,, c A", (ii) Card(X,,) = a,,, (iii) the union of the X,, is prefix. First choose Xl c A such that Card(X,) = ul .This is possible because a1 I k. Next suppose that Xmis constructed for 1 5 m 5 n - 1. Set
Y(")= x, u
x, u
* * *
u
x"-l
and We have
c a,k"-"' k" 5 1, we have 1 - c:l;=', rx,k-' 2 a,k-". Thus n-1
Card(R(")n A") = k" -
=
m= 1
Since
c",,
amk-m
Card(R(")n A") 2 a,,. Thus we may choose an elements in R(")n A" to form the set X,,. By construction, X,, u Y(")is prefix. 0 Having proved that a sequence a is the length distribution of a code if it satisfies (3.2), we now determine a stronger inequality for the length distribution of a circular code. The enumeration of classes of X-conjugacy plays a central role in this discussion. Given a code X c A + , we denote by L,(X) the number of X-conjugacy classes of X-primitive words in X* n A".
PROPOSITION 3.2 Let X c A + be a circular code. Then for all n 2
1,
LAXI &(A) (3.3) and the equality holds iff each primitive conjugacy class in A" meets X * . Proof Let 9Y be the set of X-primitive X-conjugacy classes in X* n A", and let d be the set of primitive'conjugacyclasses in A". Then by definition
L,(X) = Card(%),
L,(A) = Card(&).
Let C E %. Since the words in Care X-conjugate,they are conjugate, and since X is circular, they are primitive. Thus each class in % is a subset of a class in d . This defines a function cp: % + d which associates, with C E 9Y,the class C' E d containing C.This function is injective since two words in X* which are
3.
339
LENGTH DISTRIBUTIONS
conjugate are also X-conjugate. This proves (3.3). The function cp is surjective iff each element in d contains an element in 3. This proves the second claim. 0 We now compute the numbers L,(X), starting with the length distribution a of X. This will yield an inequality analogous to (3.3) but depending only upon a.
Let a = (a,,)n2 be a sequence of numbers. Define for i 2 1
c
=
mi + m l +
.(+ m i
amiam~*.*am~* =n
Of course, the indices mk satisfy 0 < mk I n, and clearly a f ) = 0 for i 2 n + 1. If a is the length distribution of a code X c A', then a:) = Card(X' n A")
and since a!)is the coefficient of t" in the series(Cn2 since X i = (X)' further
(3.4) Set
and
where p denotes the Mobius function (see Section 0.3). PROPOSITION 3.3 Let X c A + be a code and let a be its length distribution. The number L , ( X ) of X-conjugacy classes of X-primitive words in X* n A" is given by L A X ) = Ma)* Proof
Set for n 2 1 =
c dLd(X)
dln
We shall prove that SJX) = s,(a) by evaluating in two different ways the sum p,, =
" n :Card(X' n A").
'=I 1
(nli)a!). Hence pn = s,(a) by (3.5). To show First, by (3.4), we have pn = that pn = S,,(X), we introduce two notations: for x E X*,we define llxll to be the unique integer i such that x E Xi. Next, we denote by r? the set of X -
340
VII. CONJUCACY
primitive words in X*.With this notation, we may express pn as
Since each word w in X* n A" is the power of a unique word x in 8, and conversely, one has
Next we observe that
since the X-conjugacy class of a word x € 3 has exactly llxll elements. Consequently Pn =
cdLd(X)
= &i(x)*
dln
Thus, we have shown that sn(a)= Sn(X).In other words, we have for n 2 1, &(X) = Sn(a).In View of Mobius' inversion formula (Proposition 0.3.4), this is equivalent to
According to (3.6), this yields Ln(X)= ln(a). 0 Observe that in the case where X = A, the length distribution of X is a = (k,O,O, .. .) with k = Card A. In this case also sd(a)= kd;thus formula (3.6) becomes
ln(k)= A n din xp(i)kd with the convenient notation b ( k ) = ln(ac). Proposition 3.3 thus gives as a special case, Proposition 0.3.5. Proposition 3.3 shows that the numbers L n ( X )depend only on the length distribution of X.The expression of ln(a)allows us to give an explicit form for the inequalities of Proposition 3.2. If a is the length distribution of a circular code over a k letter alphabet, then for all n 2 1
Ma)
ln(k)-
(3.7)
3.
34'
LENGTH DISTRIBUTIONS
TABLE 7.1 Length Distributions of Circular Codes (k = 2) a1 a
0
,
1 1
1
1
0 2
1
0
a,
a
1
2 ,
0
1 0 2 1
1 0 1 2
1 0 0 3
0 1 2 3
For n = 1,2,3,4, we obtain
sl(a) = a19
+ a:, s3(a) = 3a3 + 3a1a, + a:, s4(a)= 4a4 + 2(a; + 2ala3) + 4a:a, + a:. s,(a) = 2a,
The inequalities (3.7) become a1 I
2a,
k
+ a: - a1 I k2 - k
3a3 + 2a1a,
+ a: - a1 I k 3 - k,
4a4 + 2(a2 + 2alct3) + 4cr:a,
+ a': - 2a, - a:
I k4 - k 2 .
Table 7.1 gives, for k = 2, the sequences (aJ1s,,s4 such that the inequalities above are satisfied.We have only reported maximal sequences in the sense that if a: I a,, ( 1 I n I 4), then only (a,,) is represented since (a:) satisfies the inequalities a fortiori. We shall need in the sequel several arithmetic properties of the functions I,,. They are gathered together in
PROPOSITION 3.4 Let a = (a,,),,2 and numbers. 1. If a,,, = B,,, for 1 I m I n - 1, then
fl = (fin)nr be
two sequgnces of
ln(a) - 1AB) = an - B n .
2. If a,,, = 0 for 1 I rn I n - 1, then /,,(a) = a,,. 3. Let n 2 1 be an odd integer. If a,,, = 0 for all odd integers rn with 1 Irn I n - 2, then l,,(a) = a;.
342
Proof n. Then
VII. CONJUCACY
1. By definition, I&)
= (l/n)Ll,p(n/d)sd(a). Let
d be a divisor of
In each monomial am,am2- * a,, , all indices m, satisfy m, s n - 1 except if d = n and i = 1. By the hypothesis, we have am,am,...ami= fl,,flm2...flm, for each of these monomials except for d = n and i = 1. Hence sd(a) = sd(B) for dl n and d # n, and s,,(a) - s,,(fl) = a,, - fl,,. Hence (,(a)- I,,@) = a,, - fl,,. Claim 2 is a consequence of 1 for B = 0. 3. Let d be a divisor of n. Since n is odd, d is odd. All the monomials in the sum (3.8) vanish except for d = n and i = 1. In fact, in the other cases, all the indices mI satisfy mi In - 1 and at least one of them has to be odd since their sum is d. Hence I&) = s,,(a) = a,,. 0 We shall now prove that the inequalities (3.7) characterize the length distributions of circular codes. For this, we say that a finite or infinite sequence (x,),~ of words in A + is a Hall sequence over A if it is obtained in the following way: Let X, = A. Then xl is an arbitrary word in X,.If xi and Xiare defined, then set
Xi+1 = x?(Xi - xi), and xi+ ,is an arbitrary chosen element in Xi+,satisfying 2 IXll. The sequence (Xi)i2 is the sequence of codes associated with the sequence lXi+lI
(Xi),r 1 .
PROFQSITION 3.5 Let associated sequence of codes.
,be a Hall sequence ouer A and let (Xi)l-, be the
1. For all i 2 1, the set Xi is a (i - 1,O)-limited code. 2. For all n 2 1, each primitiue word w E A + such that IWI
> IXll
has a conjugate in Xr+ 3. Far all n > IxJ, &(Xi + 1) = LAA)
Proof
1.
X, = A is (0,O)-limited. Next
xi+,= T o X , ,
(3.9)
3.
343
LENGTH DISTRIBUTIONS
where T is a code of the form b*(B - b). Clearly T is (1,0)-limited. Assuming by induction that Xi is ( i - 1,O)-limited, the conclusion follows from Proposition 2.3. 2. Define x,, = 1. We prove that the claim holds for all i 2 0 by induction on i. For i = 0, the claimjust states that any primitive word is in A*. Thus assume i 2 1, and let w E A + be a primitive word of length IwI > Ixil. Since lxil 2 lxi-ll, one has Iw( Ixi-ll. By the induction hypothesis, there is a conjugate word w f of w which is in X:. The word w’ is not in x: since w’ is primitive and lw’l > IxJ. Thus wf factorizes into
=-
w f = uxu
for some u, u E X: and x E Xi - xi.Then the conjugate W f f= uux
of w’ is in X:(Xi - xi) c X:+ 1. Thus a conjugate of w is in X:+ Claim 3 Follows from claim 2 and Proposition 3.2. 0
1.
Theorem 3.6 Let A be an alphabet with k 2 1 letters. For any sequence of integers such that
a = (a,),,>
(n 2 11,
&(a)Iin(k)
(3.10)
there exists a circular code X over A with length distribution a. Proof
Set for n, m 2 1 En
= 4(k) - Ma)
(3.1 1)
c
(3.12)
and m
(rm=
En.
n= 1
Define inductively a Hall sequence x1,xz,.
..,x,, . ..
as follows. Assume that a Hall sequence (x1, XZ,.
. . ,XqJ
of words of length at most m is already constructed, in such a way that the Y = X,,,+ satisfies length distribution fl of , /?, = a ,
(1 In I; m).
(3.13)
According to Proposition 3.5(3), we have L+1(Ym)= C + l ( k ) ,
(3.14)
344
since m
VII. CONJUGACY
+ 1 2 Ixuml.In view of Proposition 3.4(1),and using (3.13),we obtain 8m+1 -am+1
=L+l(B)-
L+l(a)
= Lrn + 1 (Ym) = Im+l(k)
L + 1 (a)
- L+I(a)
- &,+la This implies that the code Y, contains at least E , + ~ words of length m + 1. Extend the Hall sequence already obtained by choosing a sequence of E,+ words of length m + 1 in Y,, denoted by (Xum+ 1, x u m + 2 , .
.
- 9
xum+
,I.
,
Let y be the length distribution of the associated code Y,+ = X,
Y,, = B,,
,.Then
(1 s n I m).
Indeed, since the sequence of lengths lxil is nondecreasing, the codes X i and Xi+ coincide on all words of length less than Ixil. Consequently, the codes .. .,X , ++,, possess the same words of length at most m.' X, Furthermore, one has y,+ = a,+ 1 , since in the process of constructing Y,+ 1 , one has suppressed, in Y,, E,+ words of length m + 1, implying that
,
Ym+l
=
Em+1
= arn+l-
This shows that y,,=a,,,
15n<m+l.
Starting with the empty Hall sequence,we have thus constructed a sequence (Yo, Y1,* * * , Ym, * * *)
of circular codes such that Y, and Y,+ have the same words of length at most m,and such that Card( Y, n A") = a, . Define Y by
Y n A m= Y, n A". Then Y is a circular code with length distribution a.
0
EXAMPLE3.1 Table 7.2 gives circular codes with length distributions agreeing with those of Table 7.1. They are constructed by the method of Theorem 3.6.
3.
LENGTH DISTRIBUTIONS
345 TABLE 7.2
Circular Codes with Prescribed Length Distribution a, b
b a6 a2b a’b
6
b
a2b,ab2 a’b
a2b a’b, ab’
b ab a3b, a2b2
b
a’b, ab3 a2b2
ab a’b, bab a’b, ba2b b2ab
COROLLARY 3.7 Let A be an alphabet with k 2 1 letters. For all m 2 1, there exists a circular code X c A” such that Card(X) = l,,,(k). Proof Let a = (an)n2l be the sequence with all terms zero except for a, which is equal to l,,,(k). By Proposition 3.4(2), one has ],(a) = 0 for 1 4 n 5 m - 1 and I,(a) = a,. Thus ln(a)I; ln(k)for 1 I; n I; m. According to the proof of Theorem 3.6, this suffices to ensure the existence of a circular code X having a, words of length m. Thus X n A” satisfies the claim. 0
Corollary 3.7 can be formulated in the followingway: It is possible to choose a system X of representatives of the primitive conjugacy classes of words of length m in such a manner that X is a circular code. The following example shows that this does not hold for an arbitrary set of representatives of the primitive conjugacy classes within the words of the same length.
EXAMPLE3.2 Let X be a subset of A 2 - {a2 I a E A} and let 8 be the relation over A defined by a8b
iff
abEX.
Then X is a circular code iff the reflexive and transitive closure 8* of 8 is an order relation. Indeed, assume first that 8 is not an order. Then a1 a29 a203 3 *
* * 9
an - I a n , anal
E
X
for some n 2 1, and a,, ...,u,EA. If n is even, then setting u =a,, u = a2***an, one has uu, uu E X* and u 4 X*. If n is odd, then (ala2.~~an)2 E X* but not a, a, * .a,. Thus X is not circular. Assume conversely that 8 is an order. Then A can be ordered in such a way thatA= {al,a,, ..., a,}andaiOaj*i<j.ThenX c {a,a,Ii<j},andinview of Example 2.10, the set X is a circular code. This statement shows that codes like {ab,bc,ca} are not circular codes, despite the fact that they are composed of representatives of primitive conjugacy classes of words of the same length.
346
VII. CONJUCACY
Not only are the codes X c A" in Corollary 3.7 circular, but they also are uniformly synchronous by Theorem 2.6. We may want to. determine the synchronization delay of these codes. More precisely, we may look for the minimum delay which is realizable. The following result shows that for odd integers m, delay 1 is realizable. 3.8 For any alphabet A with k letters and for any odd integer THEOREM m 2 1, there exists a comma-free code X c A" such that
Card(X) = l,(k). It follows from Example 3.2 that a circular code X c A' having l,(k) elements has the form X = {aia,I i < j } for some numbering of the alphabet. If k is even, then according to Example 2.10, the code X has synchronization delay k/2. Consequently, a result like Theorem 3.8 does not hold for even integers m. To prove Theorem 3.8, we construct a Hall sequence (xi)iL1 and the sequence (XJiL of associated codes by setting
XI= A,
X i + , = $ ( X i - xi),
i 2 1,
(3.15)
where xi is an element of Xi of minimal odd length. By construction, ( x i ) i z is indeed a Hall sequence. Set
Thus Y is the set of words of even length in U,and Z = { x i l i 2 I}.
For any word u E U,we define
I 6(u) = sup{i E N I u E Xi}.
v(u) = min{i E N u E Xi}- 1,
Thus v(u) denotes the last time before u appears in some Xi and 6(u) is the last time u appears in a X i . Observe that Y = { u E U I6(u) = + 00). Next, note that 6(xi) = i, and if v(u) = q for some u E U - A, then u E Xl +, and u 4 X,. Consequently u = xqu for some u E X,, 1. Further, for all u E U and n 2 1, we have v(u) I n < 6(u) * x,u E U. (3.16) We shall prove by a series of lemmas that for any odd integer m, the code Z n A" satisfies the conclusion of Theorem 3.8. 3.9 For all odd integers m, we haue LEMMA
Card(Z n A") = lm(k).
3.
347
LENGTH DISTRIBUTIONS
Proof Let n be the smallest integer such that Jx,J= m. Then by construction of the Hall sequence (xi), we have Z n A" = {x,,,x,,+l,...,x,,+p} for some integer p. Then 2 n A"' = X,,n A", since for all k 2 1, words in X,+k which are not in X, have length strictly greater than IX, , ~ . Next, by the definition of n, we have m > Ix,- l(. According to Proposition 3.5(3), we have L m ( X n ) = Lm(A)By Proposition 3.3, L,(A) = /&), and L,(X,,) = /,,(a), where c1 is the length By Proposition 3.4(3), /,(a) = Card(X,, n A"), because distribution of X,,. ai = 0 for all odd integers i I m - 2. I] LEMMA 3.10 Each word w E A* admits a unique factorization w = YZ1Z2"'Zn
(3.17)
with y E Y*, z1 E 2, n 2 0, and d(Z1)
Proof
2 d(z,) 2
*..
2 d(z,)
First we show that for n 2 1
x,*= x:+,x,*Indeed, by definition X,+ = x,*(X,, - x,,). The product of x,* with X,,- x, is is a code. Thus one has in terms of formal power series unambiguous since X,, = z3xn - 3,).
&I+ I
Consequently, X,,+= x,*& -
-5 :
1
(3.18)
,and
= x,*& - &,* = &,*(X,, - 1).
Formula (3.18)follows by inversion. By successive substitutions in (3.18),starting with A* = X:, one gets for all n21
A*
= x,*+,x,*x,*-, *.ex:
(3.19)
Now let w E A* and set p = IwI. Let n be an integer such that X,,+ contains no word of odd length 5 p . By (3.19)there exists a factorization of w as w =yz1z2'**zk
with 6 ( z l ) 2 6(zJ 2 2 d(z,), zi E 2 and y E X,*+ 1 . Since lyJI p , the choice + of even length. Consequently of n implies that y is a product of words in X,, y E Y*. This proves the existence of one factorization (3.17).Assume that there
348
VII. CONJUCACY
is second factorization of the same type, say, w = y'z;z;*--z;.
Let rn be an integer greater than 6 ( z l ) and 6(z;),and large enough to ensure y, y' E X$+ . Such a choice is possible since all even words of some code X,are also in the codes XI. for 1' 2 1. Then according to (3.19), both factorizations of w are the same. 0
,
Now, we characterize successively the form of the factorization (3.17), for words which are left factors and for words which are right factors of words in U. LEMMA 3.1 1 Each proper left factor w of a word in U admits a factorization (3.17) with y = 1. Proof Each of the codes X, is a maximal prefix code. This follows by iterated application of Proposition 11.4.10. Consequently for n 2 0,
where P, + = X,, + A - is the set of proper left factors of words of X,+ Comparing this equation with (3.19), we get
,.
(3.20)
Let now w be a proper left factor of some word u in U . Then u E X,,+for some n 2 0 and consequently w E P,+ By Eq. (3.20), w admits a factorization of the desired form. 0 LEMMA 3.12 For all n, p 2 1, we have X,X,,+~ E Y*. Further for z E Z and y E Y , we have z y E Y*Z. The first formula is shown by induction on p . For p = 1, we have V ( X , , + ~ ) I n since x,+, EX,,,.Thus according to formula (3.16), we have X,X,,+~ E U . Since x,,x,,+ has even length, x,x,,+ E Y. Assume that the property holds up to p - 1, and set q = V ( X , + ~ ) . We distinguish two cases. First assume q I n. Then by (3.16), with x, + p playing the role of u, we have X,,X,,+~E U . This word has even length. Thus X,,X,,+~ E Y. Next suppose that n c q. Then x , + ~E U - A. Consequently x , , + ~= xqu for some u E U . Since q c n + p (= C~(X,,+~)), we have x,xq E Y* by the induction hypothesis. Next u has even length(because IX"~,Ixq)are both odd). Thus u E Y , whence x,,x,,+, E Y*. Let us prove the second formula. Set n = 6(z)and q = v( y).Then z = x, and y = xqx, for some t. If n c q, then x,xq E Y* by the preceding argument, and consequently z y E Y*Z. On the contrary, assume q < n. Then by (3.16), x,,x,x, = x,y E U . Since it has odd length this word is in Z . 0 Proof
3.
349
LENGTH DISTRIBUTIONS
LEMMA 3.13 Any right factor w of a word in U admits a factorization (3.17) with n = 0 or n = 1. Proof Given a word u E U , we prove that all its right factors are in Y*Z u Y*, by induction on lul. The case (uI = 1 is clear, and clearly it suffices to prove the claim for proper right factors of words in U . Assume l u l - 2 2. Set n = v(u). Since u E U - A, we have u = x,u’ for some u‘ E
u.
Let w be a proper right factor of u. If w is a right factor of u’, then by the induction hypothesis, w is in Y*Z u Y * . Thus we assume that w = w’u’, with w’a proper right factor of x,. By induction, w’ is in Y*Z u Y*.If w‘ E Y*, then w‘u’ E Y*(Y u 2 ) and the claim is proved. Thus it remains the case where w’ E Y*Z.In this case, set w’ = yxk with y E Y *, k 2 1. Observe that k I n since lxkl I lw‘l I Ix,( (see Fig. 7.6).
u:
-
X”
0
-
WI
W:
0
W’:
0
--
Y
xk
U’ 0
U’
0
’
Fig. 7.6
We now distinguish two cases: First, assume u’ E Y.Then by Lemma 3.12, xku‘ E Y*Z. Consequently, w = yxku’ E Y * Z . Second, suppose that u’ E 2. Then u’ = x, for some m. We have x, E X,, 1, implying that rn > n. Since k In, we have k < m and by Lemma 3.12, xkx, E Y*. Thus w = y x k x , E Y*. This concludes the proof. 0 Proof of Theorem 3.8 Let m be an odd integer and let X = Z n A”. Let x , x’, x” E X.Assume that for some u, u E A’, XX‘
= ux”u
(3.21)
Then for some w, t E A’, we have x = uw, x” = wt, x’ = tu. Since x” has odd length, one of the words wand t must have even length. Assume that the length of w is even (see Fig. 7.7). Since w is a proper left factor of x” F Z, we have by Lemma 3.11, a factorization w = z1z2*--z, with zl, z2,.. .,z, E Z and 6(z,) 2 -. 2 6(z,). On the other hand, the word w is a right factor of x E 2, and according to Lemma 3.13, we have w E Y * Z u Y * . Since w has even length, w E Y * . Thus n = 0 and w = 1, a contradiction. 0
VII. CONJUGACY
Fig. 7.7
EXAMPLE 3.3 Let A = {a,b}. A sequence (X& of the construction given above is
satisfying the conditions
XI = {a, b}, X, = {b, ab, a2b, a3b,a4b,...>,
X, = {ab,a2b,a3b,a4b,... bab, ba2b,ba3b,.. . b2ab, b2a2b,.. . b3ab,.. .},
X, = {ab,bab,a3b, a4b,... ba2b,ba3b,... b2ab, b2a2b,.. . b3ab,.. . a2bab,.. .},
X, = {ab,
a3b, a4b,... ba2b,ba3b,.. . b2ab, b2a2b,... b3ab,,.. a2bab,.. . babab, ...}.
We have represented only words of length at most five. Words of the same length are written in a column. Taking the words of length five in X,,we obtain all words of length five in the code 2. Thus the following is a commafree code X c A’:
X
= {a4b,ba3b,b2a2b,b3ab,a2bab,babab}.
4.
35'
FACTORIZATIONS OF FREE MONOIDS
It has Card(X) = 1,(5) = 6 elements. The words of length three in X , give the comma-free code of Example 2.13. 4. FACTORIZATIONS OF FREE MONOIDS
Several times in the previous sections, we have used special cases of the notion of factorization which will be defined here. We shall see in this section that these factorizations are closely related to circular codes. Let I be a totally ordered set and let (T)icr be a family of subsets of A + indexed by I. An orderedfuctorizution of a word w E A* is a factorization w = X1X,"'Xn, (4.1) with n 2 0, xi E Xji such thatj, 2 j 2 2 * * * j , . A family (XJiEris afuctorizution of the free monoid A* if each word w E A* has exactly one ordered factorization. If ( X J i EisI a factorization, then each Xi is a code, since otherwise the unique factorization would not hold for words in XF.We shall see later (Theorem 4.1) that each Xi is in fact a circular code. f Let us give a formulation in formal power series. Consider a family ( c i ) i sof formal power series over an alphabet A with coefficients in a semiring K, indexed by a totally ordered set I. Assume furthermore that the family ( b i ) i e is f locally finite. Let J = { j , , j 2 , . . .,j.} be a finite subset of I, with j, 2 j , 2 * .. 2 j,. Set ZJ
= (7j,(7j2"'(7j,.
Then for all w E A*, (55, W )
= X,XZ'
C
' ' X"
..
( a j , ,Xl)(Ojl, ~ 2 ) . =w
(oj.5
xn)*
(4.2)
Let d be the set of all finite subsets of I. Then the family ( t j ) j E dis locally finite. Indeed, for each word w E A*, the set F(w) of factors of w is finite. For each x E F(w),the set I, of indices i E I such that (ui,x) # 0 is finite. From (4.2), it follows that if ( t J w, ) # 0, then J c UXEF(,,,) 1,. Consequently there are only finitely many sets J such that (tJ,w ) # 0. These considerations allow us to define the product (7
=fl(l icf
+
(7J
by the formula fJ =
tJ. J € d
352
VII. CONJUGACY
If I is finite, we obtain the usual notion of a product of a sequence of formal power series, and the latter expression is just the expanded form obtained by distributivity. Consider a family (XJieIof subsets of A + indexed by a totally ordered set 1. If the family is a factorization of A*, then
Conversely, if the sets X , are codes and if the semigroups X: are pairwise disjoint, then the product fliplXf is defined and (4.3) implies that the family (Xi)ieris a factorization of A*.
EXAMPLE 4.1 Formula (3.19) states that the family (X,+l,x,,...,xl) is a factorization of A* for all n 2 1. Lemma 3.10 says that the family (Y, . ..,x,, x,- l,. ..,xl) is a factorization of A*. The main result of this section is the following theorem.
THEOREM 4.1 Let (Xi)iEI be a family of subsets of A + indexed b y a totally ordered set I . Two of the three following conditions imply the third. (i) Each word w E A* has at least one ordered factorization. (ii) Each word w E A* has at most one ordered factorization. (iii) Each of the Xi ( i E I ) is a circular code and each conjugacy class of nonempty words meets exactly one among the submonoids X:. The proof is based on an enumeration technique. Before we give the proof, we need some results concerning the logarithm of a formal power series in commuting or noncommuting variables. We shall consider a slightly more general situation, namely, the formal power series defined over monoids which are direct products of a finite number of free monoids. Let M be a monoid which is a direct product of finitely many free monoids. The set S=QM
of functions from M into the field C? of rational numbers is equipped with the structure of an algebra as it was done for formal series over a free monoid. In particular if 0, z E S, the product nz given by (az,m) =
C (0,u)(z, v )
uv=m
is well defined since the set of pairs (u,u ) with uu = m is finite. As in the case of
4.
FACTORIZATIONS OF FREE MONOIDS
353
formal power series over a free monoid, a family of elements of S is locally finite if for all m E M, the set { i E I I (ai, m) # 0} is finite. Define S'" = {a E S I (a,1) = O}.
For a E S(l),the family (d')nLO of powers of Q is locally finite. Indeed, for each m E M , (a",m) = 0 for all n greater than the sum of the lengths of the components of m. This allows us to define for all 0 E S").
+ a3/3 - .. + ( - 1)" a2 a" exp(a) = 1 + a + - + + - + 2! n!
log(1 + a) = a - a2/2
+. .
+ 'o"/n
* * *
'
(4.4) (4.5)
Let M and N be monoids which are finite direct products of free monoids. Let S = Q M and T = QN. A morphism u:
M+T
from the monoid M into the multiplicative monoid T is called continuous iff the family (u(m)),,,sMis locally finite. In this case, the morphism u can be extended into a morphism, still denoted by u, from the algebra S into the algebra T by the formula
This sum is well defined since the family (a(m)),,,EMis locally finite. The extended morphism a is also called a continuous morphism from S into T. For any locally finite family ( o ~ ) ~of , , elements of S, the family ( u ( o ~ ) ) ~ also € , is locally finite and (4.71
According to formula 4.7, a continuous morphism a: S -,T is entirely determined by its definition on M, thus on a set X of generators for M. Furthermore, a is continuous iff a(X - { l}) c T ( ' )and the family (a(x)),,x is locally finite. This is due to the fact that each m E M has only finitely many factorizations m = x1x2**.xk with x1,x2,.. . , x k E X - 1. It follows from (4.6) that if a E S('), then .(a) E T(').From (4.7), we obtain log(1
+ a(a))= a(log(1 + a)),
(4.8)
exp(a(4) = a(exp(4). (4.9) According to classical results from elementary analysis, we have the following formulas in the algebra Q[[s]] of formal power series in one
354
VII. CONJUCACY
variable s: exp(log(1 + s)) = 1
+ s,
Furthermore, in the algebra Q [ [ s , t ] ] commuting variables s, t, we have exp(s
+ t) = exp(s)exp(t),
log(exp(s)) = s.
(4.10)
of formal power series in two
log((1 + s)(l + t)) = log(1 + s) + log(1 + t). (4.1 1)
Let $M$ be a monoid which is a finite direct product of free monoids and let $S = \mathbb{Q}^{M}$. Let $\sigma \in S^{(1)}$ and let $\alpha$ be the continuous morphism from the algebra $\mathbb{Q}[[s]]$ into $S$ defined by $\alpha(s) = \sigma$. Then by formulas (4.8)-(4.10), we have

$$\exp(\log(1 + \sigma)) = 1 + \sigma, \qquad \log(\exp(\sigma)) = \sigma, \qquad (4.12)$$

showing that exp and log are inverse bijections of each other from the set $S^{(1)}$ onto the set $1 + S^{(1)} = \{1 + \tau \mid \tau \in S^{(1)}\}$. Now consider two series $\sigma, \tau \in S^{(1)}$ which commute, i.e., such that $\sigma\tau = \tau\sigma$. Since the submonoid of $S$ generated by $\sigma$ and $\tau$ is commutative, the function $\alpha$ from $s^* \times t^*$ into $S$ defined by $\alpha(s^p t^q) = \sigma^p \tau^q$ is a continuous morphism from $\mathbb{Q}[[s, t]]$ into $S$ and by (4.11),

$$\exp(\sigma + \tau) = \exp(\sigma)\exp(\tau), \qquad \log((1 + \sigma)(1 + \tau)) = \log(1 + \sigma) + \log(1 + \tau). \qquad (4.13)$$
These formulas do not hold when $\sigma$ and $\tau$ do not commute. We shall give a property of the difference of the two sides of (4.13) in the general case. A series $\sigma \in \mathbb{Q}^{A^*} = \mathbb{Q}\langle\langle A\rangle\rangle$ is called cyclically null if for each conjugacy class $C \subset A^*$ one has

$$(\sigma, C) = \sum_{w \in C} (\sigma, w) = 0.$$
Clearly any sum of cyclically null series still is cyclically null.

PROPOSITION 4.2 Let $A$ be an alphabet and let $S = \mathbb{Q}\langle\langle A\rangle\rangle$. Let $\alpha: S \to S$ be a continuous morphism. For each cyclically null series $\sigma \in S$, the series $\alpha(\sigma)$ is cyclically null.

Proof Let $T \subset A^*$ be a set of representatives of the conjugacy classes of $A^*$. Denote by $C(t)$ the conjugacy class of $t \in T$. Let

$$\tau = \sum_{t \in T} \sum_{w \in C(t)} (\sigma, w)(w - t).$$

The family of polynomials $((\sigma, w)(w - t))_{w \in C(t),\, t \in T}$ is locally finite. Thus the sum is well defined. Next

$$\sigma = \tau + \sum_{t \in T} (\sigma, C(t))\, t.$$

Since $\sigma$ is cyclically null, the second series vanishes and consequently $\sigma = \tau$. It follows that

$$\alpha(\sigma) = \sum_{t \in T} \sum_{w \in C(t)} (\sigma, w)(\alpha(w) - \alpha(t)).$$

In order to prove the claim, it suffices to show that each series $\alpha(w) - \alpha(t)$ for $w \in C(t)$ is cyclically null. For this, consider $w \in C(t)$. Then $t = uv$, $w = vu$ for some $u, v \in A^*$. Setting $\mu = \alpha(u)$, $\nu = \alpha(v)$, one has $\alpha(w) - \alpha(t) = \nu\mu - \mu\nu$.
Next

$$\nu\mu = \sum_{x, y \in A^*} (\nu, x)(\mu, y)\, xy.$$

Thus,

$$\nu\mu - \mu\nu = \sum_{x, y \in A^*} (\nu, x)(\mu, y)(xy - yx).$$

Since each polynomial $xy - yx$ clearly is cyclically null, the series $\nu\mu - \mu\nu$ and hence $\alpha(\sigma)$ is cyclically null. $\square$
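The fact used at the end of this proof, that every polynomial $xy - yx$ is cyclically null, is easy to test mechanically. A small sketch (ours, not from the book), where a polynomial is a dict from words to coefficients:

```python
from collections import defaultdict

def conjugacy_class(w):
    """The conjugacy class of w, as a frozenset of its rotations."""
    return frozenset(w[i:] + w[:i] for i in range(len(w))) or frozenset({''})

def is_cyclically_null(poly):
    """True iff the coefficients of poly sum to zero on every conjugacy class."""
    sums = defaultdict(int)
    for w, c in poly.items():
        sums[conjugacy_class(w)] += c
    return all(s == 0 for s in sums.values())

# xy - yx for x = ab, y = ba: the words abba and baab are conjugate,
# so their coefficients +1 and -1 cancel on the common class.
p = {'ab' + 'ba': 1, 'ba' + 'ab': -1}
```

`is_cyclically_null(p)` is `True`, whereas a single word with a nonzero coefficient, e.g. `{'ab': 1}`, is not cyclically null.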
PROPOSITION 4.3 Let $A = \{a, b\}$, and let $C$ be a conjugacy class of $A^*$. Then

$$(\log((1 + a)(1 + b)), C) = (\log(1 + a), C) + (\log(1 + b), C). \qquad (4.14)$$

In other words, the series $\log((1 + a)(1 + b)) - \log(1 + a) - \log(1 + b)$ is cyclically null.

Proof One has $(1 + a)(1 + b) = 1 + a + b + ab$ and

$$\log((1 + a)(1 + b)) = \sum_{m \ge 1} \frac{(-1)^{m+1}}{m}\,(a + b + ab)^m.$$

Let $w \in A^n$, and let $d$ be the number of times $ab$ occurs as a factor in $w$. Let us verify that

$$((a + b + ab)^m, w) = \binom{d}{n - m}. \qquad (4.15)$$

Indeed, $((a + b + ab)^m, w)$ is the number of factorizations

$$w = x_1 x_2 \cdots x_m$$
of $w$ in $m$ words, with $x_i \in \{a, b, ab\}$. Since $w$ has length $n$ and the $x_i$'s have length 1 or 2, there are exactly $n - m$ of the $x_i$ which are equal to $ab$. Each factorization of $w$ thus corresponds to a choice of $n - m$ factors of $w$ equal to $ab$ among the $d$ occurrences of $ab$. Thus there are exactly $\binom{d}{n-m}$ factorizations. This proves (4.15).

Now let $C$ be a conjugacy class, let $n$ be the length of the words in $C$ and let $p$ be their exponent. Then $\mathrm{Card}(C) = n/p$. If $C \subset a^*$, then $C = \{a^n\}$. Then formula (4.15) shows that $((a + b + ab)^m, a^n)$ equals 1 or 0 according to $n = m$ or not. Thus both sides of (4.14) in this case are equal to $(-1)^{n+1}/n$. The same holds if $C \subset b^*$. Thus we may assume that $C$ is not contained in $a^* \cup b^*$. Then the right-hand side of (4.14) equals 0.

Consider the left-hand side. Since each word in $C$ contains at least one $a$, there is a word $w$ in $C$ whose first letter is $a$. Let $d$ be the number of occurrences of $ab$ as a factor in $w$. Among the $n/p$ conjugates of $w$, there are $d/p$ which start with the letter $b$ and end with the letter $a$. Indeed, set $w = v^p$. Then the word $v$ has $d/p$ occurrences of the factor $ab$. Each of the $d/p$ conjugates of $w$ in $bA^*a$ is obtained by "cutting" $v$ in the middle of one occurrence of $ab$. Each of these $d/p$ conjugates has only $d - 1$ occurrences of $ab$ as a factor. The $(n - d)/p$ other conjugates of $w$ have all $d$ occurrences of the factor $ab$. According to formula (4.15), we have for each conjugate $u$ of $w$,

$$((a + b + ab)^m, u) = \binom{d - 1}{n - m} \quad\text{or}\quad \binom{d}{n - m},$$

according to whether $u$ starts with $b$ and ends with $a$ or not. Summation over the elements of $C$ gives

$$((a + b + ab)^m, C) = \frac{d}{p}\binom{d - 1}{n - m} + \frac{n - d}{p}\binom{d}{n - m}.$$

Since $\binom{d - 1}{n - m} = \frac{d - n + m}{d}\binom{d}{n - m}$, we obtain

$$((a + b + ab)^m, C) = \frac{m}{p}\binom{d}{n - m}.$$

Consequently

$$(\log((1 + a)(1 + b)), C) = \frac{1}{p}\sum_{m \ge 1} (-1)^{m+1}\binom{d}{n - m}. \qquad (4.16)$$
Since $n > d$ and $d \ne 0$, this alternating sum of binomial coefficients equals 0. $\square$

The following proposition is an extension of Proposition 4.3.

PROPOSITION 4.4 Let $(\sigma_i)_{i \in I}$ be a locally finite family of elements of $\mathbb{Q}\langle\langle A\rangle\rangle$ indexed by a totally ordered set $I$, such that $(\sigma_i, 1) = 0$ for all $i \in I$. Then the series
$$\log\Bigl(\prod_{i \in I} (1 + \sigma_i)\Bigr) - \sum_{i \in I} \log(1 + \sigma_i) \qquad (4.17)$$

is cyclically null.

Proof Set $S = \mathbb{Q}^{A^*} = \mathbb{Q}\langle\langle A\rangle\rangle$ and $S^{(1)} = \{\sigma \in S \mid (\sigma, 1) = 0\}$. Let $\sigma, \tau \in S^{(1)}$. The series

$$\delta = \log((1 + \sigma)(1 + \tau)) - \log(1 + \sigma) - \log(1 + \tau)$$

is cyclically null. Indeed, either $\sigma$ and $\tau$ commute and $\delta$ is null by (4.13), or the alphabet $A$ has at least two letters $a, b$. Consider a continuous morphism $\alpha$ such that $\alpha(a) = \sigma$, $\alpha(b) = \tau$. The series

$$d = \log((1 + a)(1 + b)) - \log(1 + a) - \log(1 + b)$$

is cyclically null by Proposition 4.3. Since $\delta = \alpha(d)$, Proposition 4.2 shows that $\delta$ is cyclically null. Now let $\tau_1, \tau_2, \ldots, \tau_n \in S^{(1)}$. Arguing by induction, assume that

$$\log\Bigl(\prod_{i=1}^{n-1} (1 + \tau_i)\Bigr) - \sum_{i=1}^{n-1} \log(1 + \tau_i)$$

is cyclically null. In view of the preceding discussion, the series

$$\log\Bigl(\prod_{i=1}^{n} (1 + \tau_i)\Bigr) - \sum_{i=1}^{n} \log(1 + \tau_i)$$
is cyclically null. This proves (4.17) for finite sets $I$.

For the general case, we consider a fixed conjugacy class $C$. Let $n$ be the length of words in $C$ and let $B = \mathrm{alph}(C)$. Then $B$ is finite and $C \subset B^n$. Define an equivalence relation $\sim$ on $S$ by

$$\sigma \sim \tau \quad\text{iff}\quad (\sigma, w) = (\tau, w) \text{ for all } w \in B^{[n]}.$$

(Recall that $B^{[n]} = \{w \in B^* \mid |w| \le n\}$.) Observe first that $\sigma \sim \tau$ implies $\sigma^k \sim \tau^k$ for all $k \ge 1$. Consequently $\sigma \sim \tau$ and $\sigma, \tau \in S^{(1)}$ imply $\log(1 + \sigma) \sim \log(1 + \tau)$. Consider the family $(\sigma_i)_{i \in I}$ of the statement. Let

$$I'' = \{i \in I \mid \sigma_i \sim 0\}, \qquad I' = I - I''.$$

Then $I'$ is finite. Indeed, for each $w \in B^{[n]}$ there are only finitely many indices $i$ such that $(\sigma_i, w) \ne 0$. Since $B$ is finite, the set $B^{[n]}$ is finite and therefore $I'$ is finite.
Next observe that

$$\prod_{i \in I} (1 + \sigma_i) \sim \prod_{i \in I'} (1 + \sigma_i), \qquad (4.18)$$

since in view of (4.2), we have $(\sigma_J, w) = 0$ for $w \in B^{[n]}$ except when $J \subset I'$. It follows from (4.18) that

$$\log\Bigl(\prod_{i \in I} (1 + \sigma_i)\Bigr) \sim \log\Bigl(\prod_{i \in I'} (1 + \sigma_i)\Bigr).$$

Consequently

$$\Bigl(\log\prod_{i \in I} (1 + \sigma_i),\, C\Bigr) = \Bigl(\log\prod_{i \in I'} (1 + \sigma_i),\, C\Bigr).$$

Next, since $\sigma_i \sim 0$ for $i \in I''$, we have $\log(1 + \sigma_i) \sim 0$ for $i \in I''$. Thus

$$\Bigl(\sum_{i \in I} \log(1 + \sigma_i),\, C\Bigr) = \Bigl(\sum_{i \in I'} \log(1 + \sigma_i),\, C\Bigr).$$

From the finite case, one obtains

$$\Bigl(\log\prod_{i \in I'} (1 + \sigma_i),\, C\Bigr) = \Bigl(\sum_{i \in I'} \log(1 + \sigma_i),\, C\Bigr).$$

Putting all this together, we obtain

$$\Bigl(\log\prod_{i \in I} (1 + \sigma_i),\, C\Bigr) = \Bigl(\sum_{i \in I} \log(1 + \sigma_i),\, C\Bigr).$$

Thus the proof is complete. $\square$

To prove Theorem 4.1, we need a final lemma which is a reformulation of Propositions 1.5 and 1.6.
PROPOSITION 4.5 Let $X \subset A^+$ be a code.
1. For each conjugacy class $C$ meeting $X^*$, we have

$$(\log \underline{X}^*, C) \ge (\log \underline{A}^*, C),$$

and the equality holds if $X$ is a circular code.
2. Conversely, if $(\log \underline{X}^*, C) = (\log \underline{A}^*, C)$ for all conjugacy classes that meet $X^*$, then $X$ is a circular code.

Proof We have $\underline{X}^* = (1 - \underline{X})^{-1}$. Thus $\log(\underline{X}^*(1 - \underline{X})) = 0$. Since the series $\underline{X}^*$ and $1 - \underline{X}$ commute, we have $0 = \log \underline{X}^* + \log(1 - \underline{X})$, showing that $\log \underline{X}^* = -\log(1 - \underline{X})$. Thus

$$\log \underline{X}^* = \sum_{m \ge 1} \frac{\underline{X}^m}{m}.$$
In particular, if $C \subset A^n$ is a conjugacy class, then

$$(\log \underline{X}^*, C) = \sum_{m \ge 1} \frac{1}{m}\,\mathrm{Card}(X^m \cap C).$$

For $X = A$, the formula becomes

$$(\log \underline{A}^*, C) = \frac{1}{n}\,\mathrm{Card}(C).$$

With these computations, the proposition is a direct consequence of Propositions 1.4 and 1.5. $\square$
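These two formulas are easy to evaluate on small examples. The following sketch (ours, not from the book) computes $(\log \underline{X}^*, C)$ by brute-force enumeration of $X^m \cap C$ and compares it with $\mathrm{Card}(C)/n$; the code $\{ab, ba\}$ is not circular, and the inequality of Proposition 4.5 is strict on the conjugacy class of $abab$:

```python
from fractions import Fraction
from itertools import product

def conjugacy_class(w):
    return {w[i:] + w[:i] for i in range(len(w))}

def log_star_coeff(X, C):
    """(log X*, C) = sum over m >= 1 of Card(X^m ∩ C)/m.

    Words in C have length n, so only m <= n can contribute."""
    n = len(next(iter(C)))
    total = Fraction(0)
    for m in range(1, n + 1):
        hits = sum(1 for t in product(X, repeat=m) if ''.join(t) in C)
        total += Fraction(hits, m)
    return total

C = conjugacy_class('abab')       # C = {abab, baba}, n = 4, Card(C) = 2
log_A = Fraction(len(C), 4)       # (log A*, C) = Card(C)/n = 1/2
```

One finds `log_star_coeff(['ab', 'ba'], C) == 1`, strictly larger than `1/2` since $abab$ has two circular factorizations over $\{ab, ba\}$, whereas for the circular code `['ab']` the two sides agree.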
Proof of Theorem 4.1 Assume first that conditions (i) and (ii) are satisfied, i.e., that the family $(X_i)_{i \in I}$ is a factorization of $A^*$. Then the sets $X_i$ are codes and by formula (4.3), we have

$$\underline{A}^* = \prod_{i \in I} \underline{X_i}^*. \qquad (4.19)$$

Taking the logarithm on both sides, we obtain

$$\log \underline{A}^* = \log \prod_{i \in I} \underline{X_i}^*. \qquad (4.20)$$
By Proposition 4.4, the series

$$\delta = \log \underline{A}^* - \sum_{i \in I} \log \underline{X_i}^* \qquad (4.21)$$

is cyclically null. Thus for each conjugacy class $C$,

$$(\log \underline{A}^*, C) = \sum_{i \in I} (\log \underline{X_i}^*, C). \qquad (4.22)$$
In view of Proposition 4.5, we have for each $i \in I$ and for each $C$ that meets $X_i^*$ the inequality

$$(\log \underline{A}^*, C) \le (\log \underline{X_i}^*, C). \qquad (4.23)$$

Formulas (4.22) and (4.23) show that for each conjugacy class $C$, there exists a unique $j \in I$ such that $C$ meets $X_j^*$. For this index $j$, we have

$$(\log \underline{A}^*, C) = (\log \underline{X_j}^*, C). \qquad (4.24)$$

Thus if some $X_j^*$ meets a conjugacy class, no other $X_i^*$ ($i \in I - j$) meets this conjugacy class. Since (4.24) holds, each of the codes $X_i$ is a circular code by Proposition 4.5. This proves condition (iii).

Now assume that condition (iii) holds. Let $C$ be a conjugacy class and let $i \in I$ be the unique index such that $X_i^*$ meets $C$. Since $X_i$ is circular, (4.24) holds by Proposition 4.5 and furthermore $(\log \underline{X_j}^*, C) = 0$ for all $j \ne i$. Summing up
all equalities (4.24), we obtain Eq. (4.22). This proves that the series $\delta$ defined by (4.21) is cyclically null. Let $\alpha$ be the canonical morphism from $\mathbb{Q}\langle\langle A\rangle\rangle$ onto the algebra $\mathbb{Q}[[A]]$ of formal power series in commutative variables in $A$. The set of words in $A^*$ having the same image by $\alpha$ is a union of conjugacy classes, since $\alpha(uv) = \alpha(vu)$. Since the series $\delta$ is cyclically null, the series $\alpha(\delta)$ is null. Since $\alpha$ is a continuous morphism, we obtain by applying $\alpha$ to both sides of (4.21)

$$0 = \log \alpha(\underline{A}^*) - \sum_{i \in I} \log \alpha(\underline{X_i}^*).$$

Hence

$$\log \alpha(\underline{A}^*) = \sum_{i \in I} \log \alpha(\underline{X_i}^*). \qquad (4.25)$$

Next, condition (iii) ensures that the product $\prod_{i \in I} \underline{X_i}^*$ exists. By Proposition 4.4, the series

$$\log \prod_{i \in I} \underline{X_i}^* - \sum_{i \in I} \log \underline{X_i}^*$$

is cyclically null. Thus its image by $\alpha$ is null, whence

$$\log \alpha\Bigl(\prod_{i \in I} \underline{X_i}^*\Bigr) = \sum_{i \in I} \log \alpha(\underline{X_i}^*).$$

This together with (4.25) shows that

$$\log \alpha(\underline{A}^*) = \log \alpha\Bigl(\prod_{i \in I} \underline{X_i}^*\Bigr).$$

Since log is a bijection, this implies

$$\alpha(\underline{A}^*) = \alpha\Bigl(\prod_{i \in I} \underline{X_i}^*\Bigr).$$

This means that $\alpha(\varepsilon) = 0$, where

$$\varepsilon = \underline{A}^* - \prod_{i \in I} \underline{X_i}^*.$$

Observe that condition (i) means that all the coefficients of $\varepsilon$ are negative or zero. Condition (ii) says that all coefficients of $\varepsilon$ are positive or zero. Thus in both cases, all the coefficients of $\varepsilon$ have the same sign. This together with the condition $\alpha(\varepsilon) = 0$ implies that $\varepsilon = 0$. Thus if condition (iii) and either (i) or (ii) hold, then the other one of conditions (i) and (ii) also holds. $\square$

A factorization $(X_i)_{i \in I}$ is called complete if each $X_i$ is reduced to a singleton $x_i$. The following result is a consequence of Theorem 4.1.
COROLLARY 4.6 Let $(x_i)_{i \in I}$ be a complete factorization of $A^*$. Then the set
$X = \{x_i \mid i \in I\}$ is a set of representatives of the primitive conjugacy classes. In particular, for all $n \ge 1$,

$$\mathrm{Card}(X \cap A^n) = \ell_n(k) \quad\text{with}\quad k = \mathrm{Card}(A). \qquad (4.26)$$
Proof According to condition (iii) of Theorem 4.1, each conjugacy class intersects exactly one of the submonoids $x_i^*$. In view of the same condition, each code $\{x_i\}$ is circular and consequently each word $x_i$ is primitive. This shows that $X$ is a system of representatives of the primitive conjugacy classes. Formula (4.26) is an immediate consequence. $\square$

Now we describe a systematic procedure to obtain a large class of complete factorizations of free monoids. These include the construction used in Section 3. A Lazard set is a totally ordered subset $Z$ of $A^+$ satisfying the following property: For each $n \ge 1$, the set $Z \cap A^{[n]} = \{z_1, z_2, \ldots, z_k\}$ with $z_1 < z_2 < \cdots < z_k$ satisfies

$$z_i \in Z_i \quad\text{for } 1 \le i \le k, \qquad\text{and}\qquad Z_{k+1} \cap A^{[n]} = \emptyset,$$

where the sets $Z_1, \ldots, Z_{k+1}$ are defined by

$$Z_1 = A, \qquad Z_{i+1} = z_i^*(Z_i - z_i) \quad (1 \le i \le k).$$

(Recall that $A^{[n]} = \{w \in A^* \mid |w| \le n\}$.)
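This elimination scheme can be carried out mechanically once a rule for choosing $z_i$ is fixed. The sketch below (ours, not from the book) always eliminates the smallest remaining word in the alphabetic order (Python's string comparison coincides with that order) and then checks, as Proposition 4.7 below guarantees, that every short word has a unique nonincreasing factorization over the words produced:

```python
def lazard_elimination(alphabet, n):
    """Compute z_1 < z_2 < ... via Z_1 = A, Z_{i+1} = z_i*(Z_i - z_i),
    every set being truncated to words of length <= n."""
    Z, zs = set(alphabet), []
    while Z:
        z = min(Z)                  # one possible choice of z_i
        zs.append(z)
        nxt = set()
        for u in Z - {z}:
            w = u                   # enumerate z^p u for p = 0, 1, 2, ...
            while len(w) <= n:
                nxt.add(w)
                w = z + w
        Z = nxt
    return zs

def ordered_factorization_count(w, zs):
    """Number of factorizations w = z_{i_1} ... z_{i_m} with i_1 >= ... >= i_m."""
    rank = {z: i for i, z in enumerate(zs)}
    def count(v, max_rank):
        if v == '':
            return 1
        return sum(count(v[len(z):], rank[z])
                   for z in zs if rank[z] <= max_rank and v.startswith(z))
    return count(w, len(zs))

zs = lazard_elimination('ab', 3)    # -> ['a', 'aab', 'ab', 'abb', 'b']
```

With this particular selection rule the words produced are exactly the Lyndon words of length at most 3 discussed at the end of this section, and every word of length at most 3 factorizes uniquely in nonincreasing order over them.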
EXAMPLE 4.2 Let $(x_n)_{n \ge 1}$ be a Hall sequence over $A$ and let $Z = \{x_n \mid n \ge 1\}$ be the subset of $A^+$ ordered by the indices. Then $Z$ is a Lazard set.

EXAMPLE 4.3 Let $(x_n)_{n \ge 1}$ be the sequence used in the proof of Theorem 3.7. Recall that we start with $X_1 = A$ and

$$X_{i+1} = x_i^*(X_i - x_i), \qquad i \ge 1,$$

where $x_i$ is a word in $X_i$ of minimal odd length. Denote by $Y$ the set of words of even length in $\bigcap_{i \ge 1} X_i$. Now set $Y_1 = Y$ and for $i \ge 1$,

$$Y_{i+1} = y_i^*(Y_i - y_i),$$

where $y_i \in Y_i$ is chosen with minimal length. Let

$$T = \{x_i, y_i \mid i \ge 1\}$$

ordered by

$$x_1 < x_2 < \cdots < x_i < \cdots < y_1 < y_2 < \cdots < y_j < \cdots.$$
The ordered set $T$ is a Lazard set. Indeed, let $n \ge 1$ and

$$T \cap A^{[n]} = \{x_1, x_2, \ldots, x_r, y_1, y_2, \ldots, y_s\}.$$

Set

$$Z_i = X_i, \quad 1 \le i \le r + 1, \qquad Z_{r+i+1} = y_i^*(Z_{r+i} - y_i), \quad 1 \le i \le s.$$

We show by induction on $i$ that

$$Z_{r+i} \cap A^{[n]} = Y_i \cap A^{[n]} \qquad (1 \le i \le s + 1). \qquad (4.27)$$

Indeed, words in $X_{r+1} = Z_{r+1}$ of length at most $n$ all have even length (since the words with odd length are $x_1, x_2, \ldots, x_r$). Thus all these words are in $Y = Y_1$. Conversely, any word of even length $\le n$ is already in $X_{r+1}$, since $|x_{r+1}| > n$. Next, consider $y \in Y_{i+1} \cap A^{[n]}$. Then $y = y_i^p y'$ for some $y' \in Y_i - y_i$. Since $|y'| \le n$, we have by the induction hypothesis $y' \in Z_{r+i}$, whence $y \in Z_{r+i+1}$. The converse is proved in the same way. Equation (4.27) shows that $y_i \in Z_{r+i}$ for $1 \le i \le s$ and that $Z_{r+s+1} \cap A^{[n]} = \emptyset$. Thus $T$ is a Lazard set.
PROPOSITION 4.7 Let $Z \subset A^+$ be a Lazard set. Then the family $(z)_{z \in Z}$ is a complete factorization of $A^*$.

Proof Let $w \in A^*$ and $n = |w|$. Set $Z \cap A^{[n]} = \{z_1, z_2, \ldots, z_k\}$ with $z_1 < z_2 < \cdots < z_k$. Let $Z_1 = A$ and $Z_{i+1} = z_i^*(Z_i - z_i)$ for $i = 1, 2, \ldots, k$. Then $z_i \in Z_i$ for $i = 1, 2, \ldots, k$ and $Z_{k+1} \cap A^{[n]} = \emptyset$. As in the proof of Lemma 3.10, we have for $1 \le i \le k$,

$$\underline{Z_i}^* = \underline{Z_{i+1}}^*\,\underline{z_i}^*,$$

whence by successive substitutions

$$\underline{A}^* = \underline{Z_{k+1}}^*\,\underline{z_k}^* \cdots \underline{z_1}^*. \qquad (4.28)$$
Thus there is a factorization $w = y z_{i_1} z_{i_2} \cdots z_{i_m}$ with $y \in Z_{k+1}^*$ and $i_1 \ge i_2 \ge \cdots \ge i_m$. Since $Z_{k+1} \cap A^{[n]} = \emptyset$, we have $y = 1$. This proves the existence of an ordered factorization. Assume there is another factorization, say $w = t_1 t_2 \cdots t_m$ with $t_i \in Z$, $t_1 \ge t_2 \ge \cdots \ge t_m$. Then $t_i \in Z \cap A^{[n]}$ for each $i$. Thus by (4.28) both factorizations coincide. $\square$

We conclude this section with an additional example of a complete factorization. Consider a totally ordered alphabet $A$. Define a relation called the alphabetic ordering on $A^*$ by setting $u < v$ if $u$ is a proper left factor of $v$, or if $u = ras$,
$v = rbt$, $a < b$ for $a, b \in A$ and $r, s, t \in A^*$. The alphabetic ordering has the property

$$u < v \implies wu < wv.$$

By definition, a Lyndon word is a primitive word which is minimal in its conjugacy class. In an equivalent way, a word $w \in A^+$ is a Lyndon word iff $w = uv$ with $u, v \in A^+$ implies $w < vu$. Let $L$ denote the set of Lyndon words. We shall show that $(l)_{l \in L}$ is a complete factorization of $A^*$. For this we establish lemmas which are interesting on their own.
PROPOSITION 4.8 A word is a Lyndon word iff it is smaller than all its proper nonempty right factors.

Proof The condition is sufficient. Let $w = uv$, with $u, v \in A^+$. Since $w < v$ and $v < vu$, we have $w < vu$. Consequently $w \in L$.

Conversely, let $w \in L$ and consider a factorization $w = uv$ with $u, v \in A^+$. First, let us show that $v$ is not a left factor of $w$. Assume the contrary. Then $w = vt$ for some $t \in A^+$. Since $w \in L$, we have $w < tv$. But $w = uv$ implies $uv < tv$. This in turn implies $u < t$ whence, multiplying on the left by $v$, $vu < vt = w$. Since $w \in L$ gives $w < vu$, this is a contradiction. Suppose now that $v < uv$. Since $v$ is not a left factor of $w$, this implies that $vu < uv$ and $w \notin L$, a contradiction. Thus $uv < v$, and the proof is completed. $\square$
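Both characterizations of Lyndon words are easy to compare by brute force. A sketch (ours, not from the book); note that Python's string comparison is exactly the alphabetic order used here:

```python
from itertools import product

def is_primitive(w):
    n = len(w)
    return not any(n % d == 0 and w == w[:d] * (n // d) for d in range(1, n))

def lyndon_by_definition(w):
    """Primitive and minimal in its conjugacy class."""
    return is_primitive(w) and all(w <= w[i:] + w[:i] for i in range(len(w)))

def lyndon_by_suffixes(w):
    """Smaller than all proper nonempty right factors (Proposition 4.8)."""
    return all(w < w[i:] for i in range(1, len(w)))

words = [''.join(t) for n in range(1, 8) for t in product('ab', repeat=n)]
agreement = all(lyndon_by_definition(w) == lyndon_by_suffixes(w) for w in words)
```

Here `agreement` is `True` over all nonempty words of length at most 7 on $\{a, b\}$, in accordance with the proposition.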
PROPOSITION 4.9 Let $l, m$ be two Lyndon words. If $l < m$, then $lm$ is a Lyndon word.

Proof First we show that $lm < m$. If $l$ is a left factor of $m$, let $m = lm'$. Then $m < m'$ by Proposition 4.8. Thus $lm < lm' = m$. If $l$ is not a left factor of $m$, then the inequality $l < m$ directly implies $lm < m$.

Let $u$ be a nonempty proper right factor of $lm$. If $u$ is a right factor of $m$, then by Proposition 4.8, $m \le u$. Hence $lm < m \le u$. Otherwise $u = u'm$ for some proper nonempty right factor $u'$ of $l$. Then $l < u'$ and consequently $lm < u'm = u$. Thus in all cases $lm < u$. By Proposition 4.8, this shows that $lm \in L$. $\square$
THEOREM 4.10 The family $(l)_{l \in L}$ is a complete factorization of $A^*$.

Proof We prove that conditions (i) and (iii) of Theorem 4.1 are satisfied. This is clear for condition (iii) since $L$ is a system of representatives of the primitive conjugacy classes. For condition (i), let $w \in A^+$. Then $w$ has at least one factorization $w = l_1 l_2 \cdots l_n$ with $l_i \in L$. Indeed, each letter is already a Lyndon word. Consider a factorization $w = l_1 l_2 \cdots l_n$
into Lyndon words with minimal $n$. Then this is an ordered factorization. Indeed, otherwise, there would be some index $i$ such that $l_i < l_{i+1}$. But then $l = l_i l_{i+1} \in L$ by Proposition 4.9, and $w$ would have a factorization into $n - 1$ Lyndon words. Thus condition (i) is satisfied. $\square$
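The ordered factorization into Lyndon words whose existence and uniqueness was just established can be computed in linear time by Duval's algorithm, which is not presented in the book; a sketch (ours), together with a check of formula (4.26) via the classical Mobius-function expression $\ell_n(k) = \frac{1}{n}\sum_{d \mid n}\mu(d)k^{n/d}$:

```python
def chen_fox_lyndon(w):
    """Duval's algorithm: the nonincreasing factorization of w into Lyndon words."""
    factors, i = [], 0
    while i < len(w):
        j, k = i + 1, i
        while j < len(w) and w[k] <= w[j]:
            k = k + 1 if w[k] == w[j] else i
            j += 1
        while i <= k:                     # emit copies of the found Lyndon factor
            factors.append(w[i:i + j - k])
            i += j - k
    return factors

def mobius(n):
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0                  # square factor
            result = -result
        d += 1
    return -result if n > 1 else result

def primitive_class_count(n, k):
    """l_n(k): the number of primitive conjugacy classes of length n, cf. (4.26)."""
    return sum(mobius(d) * k ** (n // d) for d in range(1, n + 1) if n % d == 0) // n

def is_lyndon(w):
    return all(w < w[i:] for i in range(1, len(w)))
```

For instance `chen_fox_lyndon('abaabab')` returns `['ab', 'aabab']`, a nonincreasing product of Lyndon words, and over a two-letter alphabet the counts of Lyndon words of lengths 1, 2, 3, ... come out as 2, 1, 2, 3, 6, 9, ..., matching $\ell_n(2)$.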
It can be seen (Exercises 4.3, 4.4) that the set $L$ is a Lazard set.

5. FINITE FACTORIZATIONS
In this section we consider factorizations $(X_i)_{i \in I}$ with $I$ a finite set. These are families $X_n, X_{n-1}, \ldots, X_1$ of subsets of $A^+$ such that

$$\underline{A}^* = \underline{X_n}^*\,\underline{X_{n-1}}^* \cdots \underline{X_1}^*. \qquad (5.1)$$
According to Theorem 4.1, each $X_i$ is a circular code and each conjugacy class meets exactly one of the $X_i^*$. The aim of this section is to refine these properties. We shall see that in some special cases the codes $X_i$ are limited. The question whether all codes appearing in finite factorizations are limited is still open.

We start with the study of bisections, that is, factorizations of the form $(X, Y)$. Here $X$ is called the left factor and $Y$ the right factor of the bisection. Then

$$\underline{A}^* = \underline{X}^*\,\underline{Y}^*. \qquad (5.2)$$
EXAMPLE 5.1 Let $A = \{a, b\}$. The pair $(a^*b, a)$ is a bisection of $A^*$. More generally, if $A = A_0 \cup A_1$ is a partition of $A$, the pair $(A_0^* A_1, A_0)$ is a bisection of $A^*$.

Formula (5.2) can be written as

$$\underline{A} + \underline{Y}\,\underline{X} = \underline{X} + \underline{Y}. \qquad (5.3)$$

Indeed, (5.2) is equivalent to $1 - \underline{A} = (1 - \underline{Y})(1 - \underline{X})$ by taking the inverses. This gives (5.3). Equations (5.2) and (5.3) show that a pair $(X, Y)$ of subsets of $A^+$ is a bisection iff the following are satisfied:

$$A \subset X \cup Y, \qquad (5.4)$$

$$X \cap Y = \emptyset, \qquad (5.5)$$

$$YX \subset X \cup Y, \qquad (5.6)$$

each $z \in X \cup Y$, $z \notin A$, factorizes uniquely into $z = yx$ with $x \in X$, $y \in Y$. (5.7)

We shall see later (Theorem 5.4) that a subset of these conditions is already enough to ensure that a pair $(X, Y)$ is a bisection.
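For Example 5.1 the defining property (5.2) can be checked mechanically up to any length bound. A sketch (ours, not from the book), using the fact that over $\{a, b\}$ the monoid $(a^*b)^*$ consists of the empty word and the words ending in $b$:

```python
from itertools import product

def in_X_star(w):       # X = a*b, so X* = 1 + A*b
    return w == '' or w.endswith('b')

def in_Y_star(w):       # Y = a, so Y* = a*
    return all(c == 'a' for c in w)

def factorization_count(w):
    """Number of ways to write w = uv with u in X* and v in Y*."""
    return sum(1 for i in range(len(w) + 1)
               if in_X_star(w[:i]) and in_Y_star(w[i:]))

unambiguous = all(factorization_count(''.join(t)) == 1
                  for n in range(0, 8) for t in product('ab', repeat=n))
```

`unambiguous` comes out `True`: every word of length at most 7 has exactly one factorization as a block in $(a^*b)^*$ followed by a block in $a^*$.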
Before doing that, we show that for a bisection $(X, Y)$ the code $X$ is $(1,0)$-limited and the code $Y$ is $(0,1)$-limited. Recall that a $(1,0)$-limited code is prefix and that by Proposition 2.8 a prefix code $X$ is $(1,0)$-limited iff the set $R = A^* - XA^*$ is a submonoid. Symmetrically, a suffix code $Y$ is $(0,1)$-limited iff the set $S = A^* - A^*Y$ is a submonoid.

PROPOSITION 5.1 Let $X, Y$ be two subsets of $A^+$. The following conditions are equivalent:
(i) $(X, Y)$ is a bisection of $A^*$.
(ii) $X, Y$ are codes, $X$ is $(1,0)$-limited and $Y^* = A^* - XA^*$.
(iii) $X, Y$ are codes, $Y$ is $(0,1)$-limited and $X^* = A^* - A^*Y$.
Proof (i) $\Rightarrow$ (ii) From $\underline{A}^* = \underline{X}^*\,\underline{Y}^*$ we obtain by multiplication on the left by $1 - \underline{X}$ the equation $(1 - \underline{X})\underline{A}^* = \underline{Y}^*$, showing that $\underline{Y}^* = \underline{A}^* - \underline{X}\,\underline{A}^*$. The number of left factors in $X$ of any word $w \in A^*$ is $(\underline{X}\,\underline{A}^*, w)$. The equation shows that this number is 0 or 1, according to $w \in Y^*$ or $w \notin Y^*$. This proves that $X$ is a prefix code. This also gives the set relation $Y^* = A^* - XA^*$. Thus $A^* - XA^*$ is a submonoid and by Proposition 2.8, the code $X$ is $(1,0)$-limited.

(ii) $\Rightarrow$ (i) By Theorem II.1.3, we have $\underline{A}^* = \underline{X}^*\,\underline{R}$ with $R = A^* - XA^*$. Since $R = Y^*$ and $Y$ is a code, we have $\underline{R} = \underline{Y}^*$. Thus $\underline{A}^* = \underline{X}^*\,\underline{Y}^*$. Consequently (i) and (ii) are equivalent. The equivalence between (i) and (iii) is shown in the same manner. $\square$
COROLLARY 5.2 The left factors of bisections are precisely the $(1,0)$-limited codes. $\square$

Observe that for a bisection $(X, Y)$, either $X$ is maximal prefix or $Y$ is maximal suffix. Indeed, we have $Y^* = A^* - XA^*$. If $Y^*$ contains no right ideal, then $XA^*$ is right dense and consequently $X$ is maximal prefix. Otherwise, $Y^*$ is left dense and thus $Y$ is maximal suffix.
PROPOSITION 5.3 Let $M, N$ be two submonoids of $A^*$ such that

$$\underline{A}^* = \underline{M}\,\underline{N}.$$

Then $M$ and $N$ are free and the pair $(X, Y)$ of their bases is a bisection of $A^*$.

Proof Let $u, v$ be in $A^*$ such that $uv \in M$. Set $v = mn$ with $m \in M$, $n \in N$. Similarly set $um = m'n'$ for some $m' \in M$, $n' \in N$ (see Fig. 7.8). Then $uv = m'(n'n)$. Since $uv \in M$, unique factorization implies $n = n' = 1$. Thus $v \in M$. This shows that $M$ satisfies condition C(1,0). Thus $M$ is generated by a $(1,0)$-limited code $X$. Similarly $N$ is generated by a $(0,1)$-limited code $Y$. Clearly $(X, Y)$ is a factorization. $\square$
EXAMPLE 5.2 Let $M$ and $N$ be two submonoids of $A^*$ such that $M \cap N = \{1\}$, $M \cup N = A^*$.
Fig. 7.8 Factorizations.
We shall associate a special bisection of $A^*$ with the pair $(M, N)$. For this, let

$$R = \{r \in A^* \mid r = uv \implies v \in M\}$$

be the set of words in $M$ having all their right factors in $M$. Symmetrically, define

$$S = \{s \in A^* \mid s = uv \implies u \in N\}.$$

The set $R$ is a submonoid of $A^*$ because $M$ is a submonoid. Moreover, $R$ is suffix-closed. Consequently the base of $R$, say $X$, is a $(1,0)$-limited code. Similarly $S$ is a free submonoid and its base, say $Y$, is a $(0,1)$-limited code.

We prove that $(X, Y)$ is a bisection. In view of Proposition 5.1, it suffices to show that $Y^* = A^* - XA^*$. First, consider a word $y \in Y^* = S$. Then all its left factors are in $N$. Thus no left factor of $y$ is in $X$. This shows that $Y^* \subset A^* - XA^*$. Conversely, let $w \in A^* - XA^*$. We show that any left factor $u$ of $w$ is in $N$ by induction on $|u|$. This holds clearly for $|u| = 0$. Next, if $|u| \ge 1$, then $u$ cannot be in $R = X^*$ since otherwise $w$ would have a left factor in $X$. Thus there exists a factorization $u = u'v'$ with $v' \notin M$. Hence $v' \in N$ and $v' \ne 1$. By the induction hypothesis, $u' \in N$. Since $N$ is a submonoid, $u = u'v' \in N$. This proves that $w \in S = Y^*$.

A special case of this construction is obtained by considering a morphism $\varphi: A^* \to \mathbb{Z}$ into the additive monoid $\mathbb{Z}$ and by setting

$$M = \{m \in A^* \mid \varphi(m) > 0\} \cup \{1\}, \qquad N = \{n \in A^* \mid \varphi(n) \le 0\}.$$
Given a word $w \in A^*$, we obtain a factorization $w = rs$ with $r \in R$, $s \in S$ as follows. The word $r$ is the shortest left factor of $w$ such that $\varphi(r)$ is maximal in the set of values of $\varphi$ on the left factors of $w$ (see Fig. 7.9).

The construction of Example 5.2 can be considered as a special case of the following very general result.

THEOREM 5.4 Let $(P, Q)$ be a partition of $A^+$. There exists a unique bisection $(X, Y)$ of $A^*$ such that $X \subset P$ and $Y \subset Q$. This bisection is obtained as follows.
Fig. 7.9 $\varphi(a) = 1$, $\varphi(b) = -1$, $w = abaabab$.
Let

$$X_1 = P \cap A, \qquad Y_1 = Q \cap A \qquad (5.8)$$

and for $n \ge 2$,

$$Z_n = \bigcup_{i=1}^{n-1} Y_i X_{n-i}, \qquad (5.9)$$

$$X_n = Z_n \cap P, \qquad Y_n = Z_n \cap Q. \qquad (5.10)$$

Then

$$X = \bigcup_{n \ge 1} X_n \qquad\text{and}\qquad Y = \bigcup_{n \ge 1} Y_n. \qquad (5.11)$$
Proof We first prove uniqueness. Consider a bisection $(X, Y)$ of $A^*$ such that $X \subset P$ and $Y \subset Q$. We show that for $n \ge 1$, we have $X \cap A^n = X_n$, $Y \cap A^n = Y_n$, with $X_n$ and $Y_n$ given by (5.8) and (5.10). Arguing by induction, we consider $n = 1$. Then $X \cap A \subset P \cap A = X_1$. Conversely we have $A \subset X \cup Y$ by (5.4) and $P \cap Y = \emptyset$. Consequently $P \cap A \subset X$. Thus $X \cap A = X_1$. For $n \ge 2$, we have $Z_n \subset YX \cap A^n$ by the induction hypothesis. Thus by (5.6), $Z_n \subset (X \cup Y) \cap A^n$. This implies that $Z_n \cap P \subset X \cap A^n$ and $Z_n \cap Q \subset Y \cap A^n$. Conversely, let $z \in (X \cup Y) \cap A^n$. Then by (5.7) $z = yx$ for some $y \in Y$, $x \in X$. By the induction hypothesis, $y \in Y_i$ and $x \in X_{n-i}$ for $i = |y|$. In view of (5.9), we have $z \in Z_n$. This shows that $(X \cup Y) \cap A^n \subset Z_n$. Hence $X \cap A^n \subset Z_n \cap P$ and $Y \cap A^n \subset Z_n \cap Q$.

To prove the existence of a bisection, we consider the pair $(X, Y)$ given in (5.11). We proceed in several steps. Define $Z_1 = A$ and set $Z = \bigcup_{n \ge 1} Z_n$. In view of (5.8) and (5.10) we have $Z = X \cup Y$. Observe first that by formula (5.9)

$$YX \cup A = X \cup Y. \qquad (5.12)$$
Clearly (5.12) implies $YX \subset X \cup Y$. By induction, we obtain

$$Y^*X^* \subset X^* \cup Y^*. \qquad (5.13)$$
Next, we have

$$A^* = X^*Y^*. \qquad (5.14)$$

Indeed, let $w \in A^*$. Since $A \subset Z$, the word $w$ has at least one factorization $w = z_1 z_2 \cdots z_n$ with $z_i \in Z$. Choose such a factorization with $n$ minimal. Then we cannot have $z_i \in Y$, $z_{i+1} \in X$ for some $1 \le i \le n - 1$, since this would imply that $z_i z_{i+1} \in Z$ by (5.12), contradicting the minimality of $n$. Consequently there is some $j \in \{0, 1, \ldots, n\}$ such that $z_1, \ldots, z_j \in X$ and $z_{j+1}, \ldots, z_n \in Y$, showing that $w \in X^*Y^*$.

Now we prove that $X^*$ is suffix-closed. For this, it suffices to show that

$$uv \in X \implies v \in X^*. \qquad (5.15)$$
Indeed, assuming (5.15), consider a word $w = rs \in X^*$. Then $r = r'u$, $s = vs'$ for some $r', s' \in X^*$ and $uv \in X \cup 1$. By (5.15), $v$ is in $X^*$, and consequently $s \in X^*$, showing that $X^*$ is suffix-closed.

We prove (5.15) by induction on the length of $x = uv$. Clearly the formula holds for $|x| = 1$. Assume $|x| \ge 2$. Then by (5.12) $x = y_1 x_1$ for some $y_1 \in Y$, $x_1 \in X$. If $y_1$ is not a letter, then again by (5.12), $y_1 = y_2 x_2$ for some $y_2 \in Y$, $x_2 \in X$. Iterating this operation, we obtain a factorization $x = y_k x_k \cdots x_2 x_1$ with $y_k \in Y \cap A$ and $x_1, \ldots, x_k \in X$. Each proper right factor $v$ of $x$ has the form $v = v_p x_{p-1} \cdots x_1$ for some right factor $v_p$ of $x_p$ and $1 \le p \le k$. By the induction hypothesis, $v_p \in X^*$. Consequently $v \in X^*$. This proves (5.15). An analogous proof shows that $Y^*$ is prefix-closed.

Next we claim that

$$X^* \cap Y^* = \{1\}, \qquad (5.16)$$

and prove this claim by induction, showing that $X^* \cap Y^*$ contains no word of length $n \ge 1$. This holds for $n = 1$ because $X \cap Y = \emptyset$. Assume that for some $w \in A^n$, there are two factorizations $w = x_1 x_2 \cdots x_p = y_1 y_2 \cdots y_q$ with $x_i \in X$, $y_i \in Y$. Since $Y^*$ is prefix-closed, we have $x_1 \in Y^*$. Since $X^*$ is suffix-closed, $y_q \in X^*$. Thus $x_1 \in X \cap Y^*$ and $y_q \in X^* \cap Y$. By the induction hypothesis this is impossible if $x_1$ and $y_q$ are shorter than $w$. Thus we have $p = q = 1$. But then $w \in X \cap Y = \emptyset$, a contradiction. This proves (5.16).

Now we prove that $X$ is prefix. For this, we show by induction on $n \ge 1$ that no word in $X$ of length $n$ has a proper left factor in $X$. This clearly holds for $n = 1$. Consider $uv \in X \cap A^n$ with $n \ge 2$ and suppose that $u \in X$. In view of (5.12), we have $uv = yx$ for some $y \in Y$, $x \in X$. The word $u$ cannot be a left factor of $y$, since otherwise $u$ would be in $X \cap Y^*$ because $Y^*$ is prefix-closed, and this is impossible by (5.16). Thus there is a word $u' \in A^+$ such that $u = yu'$, $u'v = x$.
By (5.15), $u' \in X^*$. Moreover $|x| < n$. By the induction hypothesis, the equation $x = u'v$ implies $v = 1$. Thus $u = uv$, showing the claim for $n$. Consequently $X$ is prefix. A similar proof shows that $Y$ is suffix.

We now are able to show that $(X, Y)$ is a bisection. Equation (5.14) shows that any word in $A^*$ admits a factorization. To show uniqueness, assume that $xy = x'y'$ for $x, x' \in X^*$ and $y, y' \in Y^*$. Suppose $|x| \ge |x'|$. Then $x = x'u$ and $uy = y'$ for some word $u$. Since $X^*$ is suffix-closed and $Y^*$ is prefix-closed, we have $u \in X^* \cap Y^*$. Thus $u = 1$ by (5.16). Consequently $x = x'$ and $y = y'$. Since $X$ and $Y$ are codes, this completes the proof. $\square$

Theorem 5.4 shows that the following method allows us to construct all bisections.

1. Partition the alphabet $A$ into two subsets $X_1$ and $Y_1$.
2. For each $n \ge 2$, partition the set $Z_n = \bigcup_{i=1}^{n-1} Y_i X_{n-i}$ into two subsets $X_n$ and $Y_n$.
3. Set $X = \bigcup_{n \ge 1} X_n$ and $Y = \bigcup_{n \ge 1} Y_n$.

In other words, it is possible to construct the components of the partition $(P, Q)$ progressively during the computation. A convenient way to represent the computations is to display the words in $X$ and $Y$ in two columns when they are obtained. This is illustrated by the following example.
EXAMPLE 5.3 Let $A = \{a, b\}$. We construct a bisection of $A^*$ by distributing iteratively the products $yx$ ($x \in X$, $y \in Y$) into two columns as shown in Fig. 7.10. All the remaining products are put into the set $R$. This gives a defining equation for $R$, since from $A \cup YX = X \cup Y$ and $X = \{a, ba\} \cup R$ we obtain $R = \{b, b^2a\}R \cup b^2a\{a, ba\}$. Thus $R = \{b, b^2a\}^* b^2a\{a, ba\}$ or also $R = (b^2b^*a)^* b^2b^*a\{a, ba\}$. Consequently $X = (b^2b^*a)^*\{a, ba\}$, which is the code of Example 2.11.
Fig. 7.10 A bisection of A*.
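The three-step method above is easy to run mechanically when the partition is given by a rule. The sketch below (ours, not from the book) takes $P$ to be the words ending in $b$ and $Q$ the words ending in $a$, and recovers, up to length 6, the bisection $(a^*b, a)$ of Example 5.1:

```python
N = 6
Xs = {1: {'b'}}                                  # X_1 = P ∩ A: letters ending in b
Ys = {1: {'a'}}                                  # Y_1 = Q ∩ A: letters ending in a
for n in range(2, N + 1):
    # Z_n is the union of the products Y_i X_{n-i} for 1 <= i <= n-1, cf. (5.9)
    Zn = {y + x for i in range(1, n) for y in Ys[i] for x in Xs[n - i]}
    Xs[n] = {w for w in Zn if w.endswith('b')}   # Z_n ∩ P
    Ys[n] = {w for w in Zn if w.endswith('a')}   # Z_n ∩ Q
X = set().union(*Xs.values())
Y = set().union(*Ys.values())
```

Here `X` comes out as the truncation of $a^*b$ and `Y` as $\{a\}$; distributing the sets $Z_n$ differently at each step, as in Example 5.3, produces other bisections.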
The following convention will be used for the rest of this section. Given a code $X$ over $A$, a pair $(U, V)$ of subsets of $A^*$ will be called a bisection of $X^*$ if

$$\underline{X}^* = \underline{U}^*\,\underline{V}^*.$$

To fit into the ordinary definition of bisection, it suffices to consider a coding morphism for $X$. A trisection of $A^*$ is a triple $(X, Y, Z)$ of subsets of $A^+$ which form a factorization of $A^*$,

$$\underline{A}^* = \underline{X}^*\,\underline{Y}^*\,\underline{Z}^*. \qquad (5.17)$$
We shall prove the following result which gives a relationship between bisections and trisections.

THEOREM 5.5 Let $(X, Y, Z)$ be a trisection of $A^*$. There exist a bisection $(U, V)$ of $Y^*$ and a bisection $(X', Z')$ of $A^*$ such that $(X, U)$ is a bisection of $X'^*$ and $(V, Z)$ is a bisection of $Z'^*$:

$$\underline{A}^* = \underline{X}^*\,\underline{Y}^*\,\underline{Z}^* = (\underline{X}^*\,\underline{U}^*)(\underline{V}^*\,\underline{Z}^*) = \underline{X'}^*\,\underline{Z'}^*.$$

Before giving the proof we establish some useful formulas.
PROPOSITION 5.6 Let $(X, Y, Z)$ be a trisection of $A^*$.
1. The set $X^*Y^*$ is suffix-closed and the set $Y^*Z^*$ is prefix-closed.
2. One has the inclusions

$$Y^*X^* \subset X^* \cup Y^*Z^*, \qquad (5.18)$$

$$Z^*Y^* \subset Z^* \cup X^*Y^*. \qquad (5.19)$$

3. The codes $X$, $Y$, and $Z$ are $(2,0)$-, $(1,1)$-, and $(0,2)$-limited, respectively.
3. The codes X , Y , and Z are (2,O)-,(1,l)-, and (0,2)-lirnited,respectively. Proof We first prove 1. Let w E X*Y*, and let u be a right factor of w (see Fig. 7.11). Then w = uv for some u. Set u = xyz with x E X * , y E Y*, and z E Z*. Set also uxy = x’y’z’ with x’ E X*, y‘ E Y* and z’ E Z*. Then w = uu = uxyz = x’y’(z’2).
Fig. 7.11 The set $X^*Y^*$ is suffix-closed.
Fig. 7.12 $Y^*X^* \subset X^* \cup Y^*Z^*$.
Uniqueness of factorization implies $z' = z = 1$. This shows that $u \in X^*Y^*$ and proves that $X^*Y^*$ is suffix-closed. Likewise $Y^*Z^*$ is prefix-closed.

We now verify (5.18). Let $x \in X^*$ and $y \in Y^*$. Set $yx = x'y'z'$ with $x' \in X^*$, $y' \in Y^*$, and $z' \in Z^*$. If $x' = 1$, then $yx \in Y^*Z^*$. Thus assume that $x' \ne 1$. The word $x'$ cannot be a left factor of $y$ since $y \in Y^*Z^*$ and $Y^*Z^*$ is prefix-closed and $X^* \cap Y^*Z^* = \{1\}$. Thus there is a word $u$ such that $x' = yu$ and $x = uy'z'$ (see Fig. 7.12). Since $u$ is a right factor of $x' \in X^*Y^*$, it is itself in $X^*Y^*$. Thus $u = x''y''$ for some $x'' \in X^*$ and $y'' \in Y^*$. Thus $x = x''y''y'z'$. Uniqueness of factorization implies $y'' = y' = z' = 1$. Consequently $yx = x'y'z' = x' \in X^*$. This proves (5.18). Formula (5.19) is proved symmetrically.

The code $X$ is $(2,0)$-limited. Indeed, let $u, v, w \in A^+$ be words such that $uv, vw \in X^*$. Since $v$ and $w$ are right factors of words in $X^*$ and since $X^*Y^*$ is suffix-closed, both $v$ and $w$ are in $X^*Y^*$. Thus

$$v = x'y', \qquad w = xy$$

for some $x, x' \in X^*$, $y, y' \in Y^*$ (see Fig. 7.13). The word $y'x$ is a right factor of $uvx \in X^*$. Thus by the same argument, $y'x$ is in $X^*Y^*$ and consequently $y'x = x''y''$ for some
Fig. 7.13 The code $X$ is $(2,0)$-limited.
$x'' \in X^*$ and $y'' \in Y^*$. Thus $vw = x'x''y''y$. Since by assumption $vw \in X^*$, uniqueness of factorization implies that $y'' = y = 1$. Thus $w = x \in X^*$. This proves that $X$ is $(2,0)$-limited. Likewise $Z$ is $(0,2)$-limited.

To show that $Y$ is $(1,1)$-limited, consider words $u, v, w \in A^*$ such that $uv, vw \in Y^*$. Then $v \in X^*Y^*$ because $v$ is a right factor of the word $uv$ in $X^*Y^*$, and also $v \in Y^*Z^*$ as a left factor of the word $vw$ in $Y^*Z^*$. Thus $v \in X^*Y^* \cap Y^*Z^*$. Uniqueness of factorization implies that $v \in Y^*$. This completes the proof. $\square$
S = { s E Y * ~ s X * c X *Y*}. u
(5.20)
Indeed, consider a word s E Y*. If sX* c X* u Y*, then clearly, sX* c X*Y*.Assume conversely that sX* c X*Y*.Since s E Y* we have sX* c Y*X* and it follows by (5.18) that sX* c X * u Y*Z*.Thus sx* c X*Y * n(X*u Y *Z*)= (X*Y*nX*)u (X*Y* n Y * Z * ) = x*u
Y * by uniqueness of factorization. This proves (5.20). Next, S is a submonoid. Indeed, 1 E S and if s,t E S, then stX* c sX*Y*c X*Y*.We show that the monoid S, considered as a monoid on the alphabet Y satisfiescondition C(1,O). In other words, s, t E Y*and st E Simply t E S. Indeed, consider x E X*.Since t x is a right factor of stx E X*Y* and since X*Y*is suffix closed, t x E X*Y*.Thus t . S.~This shows that S is a free submonoid of Y* generated by some (1,O)-limited code U c Y +.Note that U is (1,O)-limited as a code over Y. According to Proposition 5.1, the code U is the left factor of some bisection ( U , V) of Y*,with V* = Y* - UY*.We shall give another definition of V. For this, set
$$R = \{r \in Y^* \mid rX^* \cap Z^* \ne \emptyset\}.$$

Clearly $R \cap S = \{1\}$. We prove that

$$R^* = V^*. \qquad (5.21)$$

First, we show that $R \subset V^*$. Let $r \in R - \{1\}$. Set $r = st$ with $s \in Y^+$, $t \in Y^*$. Since $r \in R$, we have $stx \in Z^*$ for some $x \in X^*$. By (5.18), we have $tx \in X^* \cup Y^*Z^*$. If $tx \in Y^*Z^*$, then $stx \in Y^+Z^*$, which is impossible since $stx \in Z^*$. Consequently $tx \in X^*$. Thus $s \in R$. Since $R \cap S = \{1\}$, it follows that $s \notin S$. Thus no left factor $s \in Y^+$ of $r$ is in $S$. In other words, no left factor of $r$ is in the code $U$. This proves that $r$ is in $V^*$.

Second, we prove that $V^* \subset R^*$. We proceed by induction on the length of words in $V^*$, the case of the empty word being trivial. Let $v \in V^+$. Since $(U, V)$ is a factorization, we have $U^* \cap V^* = \{1\}$. Consequently $v \notin U^* = S$. Thus by (5.20), there is some $x \in X^*$ such that $vx \notin X^* \cup Y^*$. Since $v \in Y^*$, we have by
(5.18) $vx \in Y^*Z^*$, and by a previous remark even $vx \in Y^*Z^+$. Set $vx = yz$ with $y \in Y^*$, $z \in Z^+$. Then $z$ cannot be a right factor of $x$, since otherwise $z$ would be in $X^*Y^* \cap Z^+$, which is impossible. Thus there is some word $w \in A^+$ such that

$$v = yw, \qquad wx = z.$$

Since $w$ is a right factor of $v \in X^*Y^*$, we have $w \in X^*Y^*$. Similarly $w$ is a left factor of $z \in Y^*Z^*$. Thus $w \in Y^*Z^*$. Uniqueness of factorization implies $w \in Y^*$. The word $y$ is in $V^*$. Indeed, $y \in Y^*$ is a left factor of $v$, and since $V^*$ is prefix-closed as a subset of $Y^*$, $y \in V^*$. Since $|y| < |v|$, we have $y \in R^*$ by the induction hypothesis. On the other hand, $w \in Y^*$ and $wx = z \in Z^*$ imply $w \in R$. Thus $v = yw \in R^*$. This completes the proof of (5.21).

Up to now, we have proved that

$$\underline{A}^* = \underline{X}^*\,\underline{U}^*\,\underline{V}^*\,\underline{Z}^*, \qquad (5.22)$$

with $\underline{Y}^* = \underline{U}^*\,\underline{V}^*$, $S = U^*$ and $R^* = V^*$. To finish the proof, it suffices to show that the products $M = X^*U^*$ and $N = V^*Z^*$ are submonoids. Indeed, since the product (5.22) is unambiguous, we have $\underline{M} = \underline{X}^*\,\underline{U}^*$ and $\underline{N} = \underline{V}^*\,\underline{Z}^*$, whence $\underline{A}^* = \underline{M}\,\underline{N}$. By Proposition 5.3 the monoids $M$ and $N$ then are free and their bases constitute the desired bisection $(X', Z')$.

To show that $X^*U^*$ is a submonoid it suffices to show that $U^*X^* \subset X^* \cup U^*$. Thus, let us consider words $x \in X^*$ and $s \in U^* = S$. Then by (5.20) $sx \in X^* \cup Y^*$. But $sx \in Y^*$ implies $sx \in S$ because $sxX^* \subset sX^* \subset X^* \cup Y^*$. Thus $sx \in X^* \cup S$, showing that $X^*U^*$ is a submonoid.

Finally we show that $V^*Z^*$ is a submonoid. For this, we show that

$$Z^*R \subset R \cup Z^*. \qquad (5.23)$$
This will imply that Z*R* c R* u Z* which in turn proves the claim in view of (5.21). To show (5.23), let z E Z* and r E R. Since r E Y*, formula (5.19)impliesthat ZT E Z* u X*Y*. Next, by definition of R, r x E Z* for some x E X*. Thus zrx E Z*. Since Y*Z* is prefix-closed, we have z E Y*Z*. Thus uniqueness of factorization shows that zr E Z* u Y*. If zr E Y*, then zr E R, since zrx E Z*. Thus zr E Z* u R and this.proves (5.23). 0 Theorem 5.5 shows that all trisections can be built by “pasting” together quadrisections obtained by a sequence of bisections. The following example shows that it is false for any trisection to be obtained by two bisections.
EXAMPLE5.4 Let A = { a , b ) . The suffix code 2’ = {b,ba,ba2)is (0,l)limited. Thus Z’ is the right factor of the bisection (X’,Z’) of A* with XI* = A* - A*Z’. The equation
374
VII. CONJUGACY
derived from (5.3) gives follows that
A - z’ = (1 - Z’)X’ whence X’= Z‘*(A - z’).It
Z’ = Z’*(U- ba - ba2) = (Z’* - - Z‘*b - Z‘*ba)a
+ Z’*(b + ba + ba’) - Z’*b - Z’*ba)a = (1 + Z’*ba2)a.
= (1
Thus
X’= Z‘*ba3 u {a}. Next define
U
= (ba)*ba3,
V = ba,
Z = {b,ba2}(ba)*.
The pair (V, 2)is clearly a bisection of Z’*. Moreover, by inspection U c X’. This inclusion shows that, over the alphabet X’,the set U is the right factor of the bisection (X,V) of X’* with X = U*(X’ - V). Moreover, U*V* = {ba,ba3}*.Then setting
Y = {ba, b d ) ,
(V,V ) is a bisection of Y*. Thus we have obtained A* - = X’*Z’* - - = X*V*_V*Z* = X*_Y*Z*, and (X,Y , Z ) is a trisection of A*. Neither X*Y* nor Y*Z* is a submonoid. Indeed, ba E Y and a E X (since a E X’- V).However, ba2 E Z and consequently ba2 # X*Y*.Similarly b E 2 and ba3 E Y but b2a3E X whence b2a3# Y*Z*. This means that the trisection (X, Y, 2)cannot be obtained by two bisections.
EXERCISES SECTION 1
Let A = {a, b}. Show that {ab,ba}* is pure. Let X = {x, y} be a code with two elements. Show that if X* is not pure, then the set x * y u y * x contains a word which is not primitive. 1.3. Deduce from Exercise 1.2 that if x = uu and y = uu are conjugate primitive words, then X* = {x, y}* is pure.
*1.1. *1.2.
SECTION 2 2.1.
Let X c A + be a finite code and let d = (Q,1,1)be an unambiguous trim automaton recognizing X*. Let cp be the associated represen-
375
EXERCISES
2.2.
tation. Show that X* is pure iff the monoid cp(A*) contains no nontrivial groups (Hint: Use Proposition 2.3.) Let A be a finite alphabet and let X c A be a finite code. Show that Xis circular iff there exist finite subsets T, U,V , W of A + such that +
X* = T u (UA* n A*V) - A*WA*. 2.3.
2.4.
2.5.
(Hint: Use Proposition 2.3.) A set X c A + is called (p, q)-constrained for some p, q 2 0 if for each sequenceu,,,~,, ..., up+,of wordstheconditionui-,uiEXfor 1 Ii 5 p + q implies up E X*. (a) Show that, for p + q I 2, a set X is (p,q)-constrained iff it is (p, @-limited. (b) Let A = {a, b) and X = {a,ab). Show that X is (3,0)-constrained but not (3,0)-limited. Let X c A + be a maximal prefix code. Show that the following conditions are equivalent. (i) a(X) = 1, (ii) A*X c x*, (iii) X is a semaphore code such that S = X - A'X satisfies SA* n A*S = S v SA*S. Show that a recognizable code is limited iff it is circular.
SECTION3 3.1.
3.2.
Show that the set L, of Lyndon words of length n over a k letter alphabet is a circular code. Show that L, is comma-free iff n = 1 or n = 2, k I 3 or (n = 3,4 and k I 2). Let A be a k letter alphabet and let s E A + be a word of length p. Let R be the finite set
R={WEA*ISWEA*S,IW~
Let X be the semaphore code X = A*s denote by aT the series aT =
- A*sA+. For
all T c A*, we
C Card(T n A")t". "20
Using Proposition 11.7.5, show that ax =
tP
tP + (1 - kt)C1R
z.
Now let 2 = (sA+ n A's) - A+sA+. Show that s + 44 = X + Let U = Zs-'. Show that for all n 2 p, the code U n A" is comma-free and
3 76
VII. CONJUCACY
that the generating series of the length distribution of U is
au =
(kt - 1)
tP
+ (1 - kt)aR + 1.
SECTION 4 4.1. 4.2.
Let A = {1,2,... , n } and f o r j E A, let Xj= j { j + 1, ..., n}*. Show that the family (Xi), is a factorization of A*. Let cp: A* + R be a morphism into the additive monoid. For r E R, let
C, = { u E A
+
1 q(u) = r } ,
B, = c, -
0,
c s ) A +.
<
4.3.
Show that if n(w) = (I, m) and n(m) = (p, q), then p 1 < m. The standard factorization of a Lyndon word w E L - A is defined as the pair
n(w) = (Lm)
4.4.
of words in A + such that w = lm and m is the longest proper right factor of w that is in L. Show that if n(w) = (I, m) and n(m) = (p, q), then p 5 1 < m. Let L be the set of Lyndon words over A. Set L n A" = {zl,z2 ,. . .,zk} with z1 < z2 < < zk and let Z , = A, Z i + , = z:(Z,
- zi)
(1 I i I k).
(a) Show that Zi contains all z, such that n(z,) = ( z , , z t ) with s
Show that if a factorization A* = XXXX-,...X:is obtained by a composition of bisections, then X,is a (i - 1, n - +limited code. 5.2. Let X be a (2,O)-limited code over A. Let M c A* be the submonoid generated by the right factors of words in X. Show that M is rightunitary. Let U be the prefix code generating M.Show that there exists a bisection of A* of the form (U, Z ) . Show that X , considered as a code over U is (1,0)-limited.Derive from this a trisection ( X , Y ,Z ) of A*. This shows that any (2,O)-limitedcode is a left factor of some trisection. 5.3. Let A = {a,b, c, d, e, f,g } and let Y = { d, eb, f a , ged, dac}. Show that Y is (1, 1)-limited.Show that there is no trisection of A* of the form ( X , Y,Z ) . 5.1.
NOTES
5.4.
377
Let y E A be an unbordered word. Show that there exists a trisection of A* of the form (X, y, Z). Show that a left (resp., right) factor of y is in Z* (resp., X * ) . (Hint: First construct a bisection (X’, 2)of A* such that X’* is the submonoid generated by the right factors of y.) +
NOTES
The definition of limited codes is from Schutzenberger (1965c), where limited codes are definedby some condition eS(p, q) for p < 0 < q which is our condition C( - p , q). The notions of synchronization delay and of uniformly synchronous code have been introduced in Golomb and Gordon (1965). Theorem 1.8 is from De Luca and Restivo (1980). Theorem 2.6 is in Restivo (1975). See also Lassez (1976) where the term “circular code” is used. The inequality (3.2) was already seen in Chapter 1. This and its converse (Proposition 3.1) are due to Kraft and McMillan (McMillan, 1956). Schutzenberger (1967) has proved a result which is stronger than Proposition 3.1 and gives a necessary and sufficient condition on a sequence of integers to be the length distribution of a synchronous prefix code. The explicit computation of the sequence L,(X) as a function of the length distribution of X is given in a slightly different form in Golomb and Gordon (1965). Theorem 3.6 has been conjectured by Golomb and Gordon. It has been proved by Schutzenberger(196%) and by Scholtz (1966). The story of comma-free codes is a pleasant one. They were introduced in Golomb et al. (1958). Some people thought at that time that the biological code is comma-free (Crick’s hypothesis). The number of amino acids appearing in proteins is 20. They are coded by words of length three over the alphabet of bases A, C, G, U.Now, the number 4(4)which is the maximum number of elements in a comma-free (or circular) code composed of words of length three over a four-letter alphabet is precisely 20. Unfortunately for mathematics, it appeared several years later with the work of Niernberg that the biological code is not even a code in the sense of this book. Several triples of bases may encode the same acid (see YEas, 1969 or Stryer, 1975). This disappointment does not weaken the interest of circular codes, we believe. Theorem 3.8 has been conjectured by Golomb et al. and proved by Eastman (1965). 
A simple construction has been given by Scholtz (1969), on which the proof given here is based. Other constructions which are possible are described in Devitt and Jacks03 (1981). For even length, no formula is known giving the maximal number of elements of a comma-free code (See Jiggs, 1963). The notion of a factorization has been introduced by Schutzenberger (1965a) in the paper where he proves Theorem 4.1. The factorizations of free
378
VII. CONJUCACY
monoids are very closely related with decompositionsin direct sums of free Lie algebras. This topic is not mentioned here. A complete treatment can be found in Viennot (1978) and in Lothaire (1983). Proposition 4.3 is a special case of a statement known as the Baker-Campbell-Hausdorff formula (see, e.g., Lothaire, 1983). The notion of a Lazard set is due to Viennot (1978). A series of examples of other factorizations and a bibliography on this field can be found in Lothaire (1983). Finite factorizations were studied by Schutzenberger and Viennot. Theorem 5.3 is from Schutzenberger (1965a). Theorem 5.4 is due to Viennot (1974). Viennot (1974) contains other results on finite factorizations. Among them, there is a necessary and sufficient condition in terms of the construction of Theorem 5.3 for the factors of a bisection to be recognizable. He also gives a construction of trisections analoguous to that of bisections given in Theorem 5.3. Exercises 1.2 and 1.3 are from Lentin and Schutzenberger(1967). Exercises 2.1 and 2.2 are from Restivo (1974) (see also Hashiguchi, Honda 1976b). These statements have a natural place within the framework of the theory of varieties of monoids (see Eilenberg, 1976 or Pin, 1984). Exercise 2.1 may be formulated in a more general way. For this a subset X of A + is called aperiodic if its syntactic monoid contains no nontrivial subgroup. Given an aperiodic code X,the monoid X* is pure iff X* is aperiodic (see Perrot, 1977 and Pin, 1979). Exercise 3.2 is from Guibas and Odlyzko (1978). The codes introduced in this exercise were defined by Gilbert (1960) and named prefix-synchronized. Gilbert has conjectured that U n A" has maximal size when the word s is chosen unbordered and of length log, n. This conjecture has been settled by Guibas and Odlyzko. It holds for k = 2, 3,4, but is false for k 2 5. The factorization of Example 4.2 is due to Spitzer (see Lothaire, 1983). Exercises 5.2 and 5.3 are from Viennot (1974).
CHAPTER
VIII
Polynomial of a Code
0. INTRODUCTION
The main object of this final chapter is the study of the polynomial associated with a finite maximal code obtained using commutative variables. The main result (Theorem 1.1) links properties of the code with factorization properties of the polynomial. The proof is difficult. However, the result is just one step in the direction of analysis of the characteristic polynomial of a code. This study is far from being complete at the present time. The main result is stated in Section 1 which also contains the notion of a factorizingcode. In Section 2, it is shown that the commutative polynomial of a code is equal to the determinant of a matrix naturally associated with the code. In Section 3 a decomposition of the state space into invariant subspaces is defined. This induces a factorization of the polynomial of the code. In Section 4 the factors of the polynomial are evaluated by showing the connection with left, right, and two-sided contexts. For this, results on measures from Chapter VI are used. Section 5 contains the end of the proof and a detailed description of two examples. The commutative polynomial. of a code contains all the information concerning the commutative properties of a code, such as its length distribution. In Section 6 we consider another commutative property, namely that of being commutatively prefix. We show that this property is equivalent 319
380
VIII. POLYNOMIAL OF A CODE
to the fact that the coefficients of some polynomial are nonnegative. We also give an example of a code which is not commutatively prefix. Section 7 presents a property of biprefix codes related to the decomposition of the state space described in Section 3. It concerns a property of the syntactic representation which is the analogue of the minimal automaton in linear algebra. This representation is shown to be completely reducible iff the associated code is biprefix.
1.
FACTORIZING CODES A code X over an alphabet A is a factorizing code iff there exist two subsets
P and Q of A* such that each word w E A* factorizes uniquely into w = qxp
(1.1)
with P E P , q E Q, x E X*. In terms of formal power series, (1.1) can be expressed as
A* = QX*P. -
(1.2)
The pair (P,Q)is called a jactorization of the code X.Observe that 1 E P and 1 EQ. Taking the inverses in (1.2),we obtain 1 - X = p(1 - A)Q -
(1.3)
or also
X
- 1 = f " Q- - PQ. This equation shows that each word in X can be written in at least one way as x = paq
with p E P, a E A, q E Q. If a subset X c A + satisfies(1.4)for some sets P, Q,then X is a code. Indeed, (1.4) implies that A* = Q(X)*P, which in turn shows that (X)*has only coefficients 0 or 1. In view of (1.2) the composition of factorizing codes is again a factorizing code. A prefix code X is factorizing since we have A* = X * p for P = A* - XA* being the set of word having no left factors in X. Conversely, if in a factorization (1.2), the set Q is {l}, then the code X is prefix. Indeed, if u, uu E X*, then setting u = x p with x E X * and p E P, we obtain (ux)p E X * , which implies p = 1 by the uniqueness of factorization. Thus X* is right unitary.
381
I . FACTORIZING CODES
Symmetrically, for a suffix code X,A* = Q&* with Q = A* If X is a biprefix code, then simultaneousry
A*=X*P - XA* and Q = A*
and
- A*X.
A*=Q&*
with P = A* - A * X . This shows in particular that a code may have several factorizations (see also Exercise 1.5). Recall that for a thin maximal biprefix code X,we have
X - 1 = d ( X ) ( A- 1) + ( A - 1)T(A - 1). Hence A* = X * p = Q - X * with
Q = d ( X ) + T ( 4 - 1). (1.5) Let X c A + be a factorizing code and let ( P ,Q)be a factorization of X . If P and Q are thin, then X is a thin maximal code. Indeed, Eq. (1.4) shows that X c PAQ. Since P, A , Q are thin, the product PAQ is thin also and consequently X is thin. Furthermore, X is complete. Indeed, let u E P(Q) and
P = d ( X ) + ( A - 1)T,
v E F(P). For each w in A* the word uwv is in Q X * P . By the choice of u and u, it follows that w is in F(X*). Thus X is complete. As a special case, note that if P and Q are finite, then X is a finite maximal code. We shall see later that if, conversely X is a finite maximal code, then P and Q are finite. There exist finite codes which are not factorizing. An example will be given in Section 6. However, no finite maximal code is known which is not factoring. Whether any finite maximal code is factorizing is still unknown. This is one of the main problems in the theory of codes. We shall prove, by passing to commutative variables a weakened version of this statement. Given a subset X c A + we denote by & the series in commutative variables which is the image of the series X under the canonical morphism a:
QP((4
+
Q"AI1.
Thus a(&) = y. For a E A, we write indifferently a or 8. THEOREM 1.1 Let X c A be a finite maximal code. The polynomial 1 - X is divisible by 1 - 4.Zf the quotient +
r = (1 - X)/(1 - 4) is irreducible in Q [ [ A ] ] , then one of the following three cases holds:
(i) X is prefix and synchronous; (ii) X is sufix and synchronous; (iii) X is biprefix. The proof of Theorem 1.1 is given in Sections 2-4. We first discuss the relationship between Theorem 1.1 and factorizing codes.
VIII. POLYNOMIAL OF A CODE
382
Let X be a finite maximal factorizing code and let (P,Q) be a factorization of
x,
1 - X = l'(1
- A)Q. -
The first statement of the theorem, namely that 1 - A divides 1 - X , shows that the product r = EQ is a polynomial. Consequently P and Q are finite sets. If the polynomial r is irreducible, then P = { I } or Q = { 1 ) . This implies by a previous remark that X is prefix or suffix. Thus we easily obtain a weakened version of the theorem for factorizingcodes. However, this gives no indication whether the code is synchronous or not. EXAMPLE 1.1
Let A = {a,b}, and let
X = {a4,ab, aba6,abajb, aba3ba2,aba2ba,aba2ba3,aba2b2,aba2b2a2,b, ba'}. The set X is a factorizing code. Indeed, an easy computation gives 1 - X = ( 1 + a + aba2(1
+ a + b))(l - a - b)(l + a2).
Thus this is a factorization (P,Q) with P = { 1, a, aba2,aba3,aba'b},
Q = { 1, u'}.
Since P and Q are finite, X is a maximal code. We may verify that X is indecomposable. This is the smallest known example of a finite maximal indecomposable code which is neither prefix nor suffix (see Example 1.6.4 and Exercise 1.4). 2. DETERMINANTS
In this section we prove two results about the determinant of a matrix which is associated in a natural way to an automaton. In the first statement, formula (2.1)gives an expression of the polynomial 1 - & for a finite code X.
PROPOSITION 2.1 Let X c A+ be a finite code and let d = (Q, 1,l) be a finite unambiguous trim automaton recognizing X*.Let t be the Q x Q-matrix with elements in Q[A] given by, tP4 = 4 P 4 where A , , is the set of letters a E A such that p A q. Then
1 - $ = det(l - t).
(2.1)
Proof There exists no path q A q with q # 1 and w E A + that does not pass through state 1. Otherwise uw*v c X for words u, u such that l A q 4 1 , contradicting the finiteness of X. Thus we can set Q = { 1,2,. ..,n} in such a way that whenever i &j for a E A, j # 1, then
2. DETERMINANTS
i cj.Define for i, j E Q, r..IJ = 6.. - A- IJ. . IJ
3
where 6, is the Kronecker symbol. Let A be the polynomial
where E(O) = & 1 denotes the signature of the permutation well-known formula for determinants we have
Q.
According to the
det(1 - t) = a(A). (Recall that a maps &P((A))on Q [ [ A ] ] ) .Thus it suffices to show that
so that A=
C
(-l)e(a)Aa.
U s e ,
Consider a permutation o E 6,such that Aa # 0. If o # 1, then it has at least ) length k 2 2. Since A,, # 0, by (2.2) the sets A i , i 2 , one cycle ( i l i 2 * * . i k of A i Z i 3., .. ,Aiki1are nonempty. This implies that the cycle (i,, .. .,ik) contains state 1. Consequently each permutation Q with A,, # 0 is composed of fixed points and of one cycle containing 1. If this cycle is (i,, i 2 , . . .,ik) with i, = 1, then 1 c i2 < * * * c ik by the choice of the ordering of states in d .Set Xu = Ali2AiZi3 ... Aik,,.Then Aa = (- l)k&,, and also (- l)e(a)= (- l)k+ The set Xuis composed of words u1u2*..akwith a, E A and such that
i2+i3A.**
+ i k 5 1 .
These words are in X. Denote by S the set of permutations Q E 6 - 1 having just one nontrivial cycle, namely, the cycle containing 1. Then X = Cues&,, since each word in X is the label of a unique path (1, i 2 , . . .,ikr1) with 1 c i2 < ". < i k . It follows that A=1
+ 1(-
l)e(u)A,,
acS
=1-&.
0
VIII. POLYNOMIAL OF A CODE
384
In the sequel, we shall need several canonical correspondences, in addition to the morphism, a: Q QCCAII. -+
Let n 2 0 be an integer, For 1 Ii,j I n, let "4.j:
Q n X n ( ( A ) ) + Q
be the Q-linear function that associates, to a formal power series a with coefficients in the set of n x n matrices over Q, the series
Let 8: Qn "((A))-+ (Q((A>>)"" be the function defined by 8(a)i,,= ~,,,(a).It is easily seen that 8 is an isomorphism of Q-algebras. The morphism a is extended in the natural way to Q n X n ( ( A )and ) to (Q((A>>)" '". This leads to the following commutative diagram:
To avoid cumbersome notation, we have used the same symbol 8 to denote the two horizontal bijections. Observe also that with this notation, the matrix t of Proposition 2.1 is expressed as
PROPOSITION 2.2 Let cp:
A*+QnX"
be a morphism from A* into the monoid 69" ". Let 1 E Q" be a row vector, let C E Q" be a column vector and let ~EQ((A)) be the series defned by (a,w) = Iq(w)c. Set t = cp(u)e). Then a(a)det(l - t ) is a polynomial.
8(LeA
cwEA.
Proof Set F = z E A c p ( a ) and g E= cp(w)w. Then E = F*, since cp is a morphism. Consequently E(I - F ) = I . (2.4)
3.
STATE SPACE
385
Applying 0 to (2.4) we obtain e(E)(I - e ( F ) ) = I .
(2.5)
Let p = a(8(E)).Since a(d(F))= t, Eq. (2.5)implies p(I - t) = I.
This shows that p is the inverse of matrix I - t. The usual formula from linear algebra then gives Pij pi*j
- t)
=
for some polynomials pi,j E Q [ A ] . Thus det(l - t)p E (Q[A])"
".
Next, for all w E A* (a, w ) = lq(w)c. Hence a=
c
1 lq(w)cy.
(6, w)w =
WEA*
WEA*
An easy computation shows that D
= lO(E)c. Consequently
(2.8)
a(a) = lpc.
Multiply both sides of (2.8) by det(l - t). According to (2.7), we have a(a)det(I - t ) E QCA]. 0
Remark If the series 6 in Proposition 2.2 has the special form where X c A + is a recognizable code, then by the proposition,
6=
g*,
$*det(I - t) = p , for some polynomial p E Q [ A ] . This implies that det(I - t) = (1
- &)p.
In view of Proposition 2.1, the polynomial p equals 1 for a finite code X.This can also be shown directly (Exercise 2.1).
3. STATE SPACE
In this section, we consider a fixed maximal recognizable code X c A +.We also fix a finite unambiguous trim automaton d = (Q, 1 , l ) recognizing X*. Set
Q = {1,2 ,..., n}.
386
VIII. POLYNOMIAL OF A CODE
Let cp be the representation associated with the automaton d and let = cp(A*). Let J be the minimal ideal of M and let r (resp. A) be the set of minimal right ideals (resp. minimal left ideals) in M . Observe that according to the assumptions, X is complete and cp(X*) n J f fa. Let E=QnX1, E'=Q'"",
M
The vector space E is called the state space of the automaton d.A matrix m E M acts on the left on E and on the right on E'. For a subspace F of E we denote as usual by F' the subspace of E',
F'
={I E
I
E' VC E F, IC = O}.
This is the orthogonal of F. Symmetrically,we define the orthogonal F'' of a subspace F' of E'. Observe that since E has finite dimension, a classical result from linear algebra says that (F')' = F. A subspace F of E is called invariant if for all c E F and m E M we have mc E F. Likewise a subspace F' of E' is invariant if 1 E F' and m E M imply lm E F'. Observe that F is invariant iff 'F is invariant. Indeed, assume F c E invariant and consider 1 E F'. Then for all c E F and rn E M , we have mc 6 F and consequently (1m)c = I(rnc) = 0. Thus lrn is in F'. We now define a sequence of subspaces of E and E'. For each R E r, set
Thus cRis the sum of the first columns of the elements in R. Likewise for L E A, set Let E2 be the subspace of E generated by the vectors cR
(REr)
and let E ; be the subspace of E' generated by the vectors 1L
(LE A).
Let El denote the subspace of E2 generated by the vectors cR - cR,
(R,R' E r)
and let E ; = E i . Let E ; denote the subspace of E ; generated by the vectors 1, - IL,
and let E , = S;'.
(L,L' E A)
3.
STATE SPACE
387
Finally, for convenience, set E, = {0},
Eb = E',
E , = E,
Ek = {O}. With these wonderful notations, we have E f = E: for i = 0, 1, 3,4. However, we do not have E i = E ; in general (see Example 5.2).
PROPOSITION 3.1 Let R E r and L E A. 1. For all m E M , meR = c,R,
1,m = lLm,
mlrcR= Card(mR n cp(X*)), lLm+, = Card(Lm n cp(X*)).
2. For all m E J,
ml,
E E;,m,l E E 3 .
(3.4)
3. We have
l,cR # 0.
(3.5)
Proof The function r H mr is a bijection of R onto the right minimal ideal mR E r. Thus
C rel
mcR = m (..R
)
The relation l,m = lLm is shown in the same way. Next mlrCR
= (mcR)l* = (cmR)l*
= Card(mR n cp(X*)).
This proves (3.2), formula (3.3) is shown in the same way. Now, let m E J . Then mR = mM, showing that ml.cR is independent of R. Consequently ml*(cR - CR') = 0
(3.6)
388
VIII. POLYNOMIAL OF A CODE
for R, R' E r.Thus m,, is orthogonal to the generators of El and consequently m,, E E ; . Similarly m,, E E,. To show ( 3 . 3 , note that Card(Lm n cp(X*)).
~ L c R= meR
Since X is a complete code, cp(X*) n J # 0 and therefore at least one term in the sum above does not vanish. 0
PROPOSITION 3.2 The following inclusions hold Eo c El
Ek
c
9 E2 c E3 c E4,
E; E E ; c E ; c Eb.
The subspaces Ed and E ; are invariant. Proof The invariance of the subspacesis a direct consequenceof formulas (3.1). The inclusion El c E2 is clear. Next E, c E,. Consider indeed a vector cR = r,, for some R E r.Since by formula (3.4), all rrl are in E , , we have cRE E,. Thus all cR (R E r) are in E,. This proves the inclusion. The same argument holds for the E;. It remains to prove that El # E2.For this, we assume that some cRis in El. Then cRis orthogonal to all elements in E ; hence also to all elements in E ; . In particular lLcR= 0 for L E A contradicting (3.5);Thus no cR,R E I-, is in El. 0
ElER
Now let B = B1 u B, u B, u B4 be a base of E such that
B1
is a base of
El,
B, u 8,
is a base of
E,,
B, u B2 u 8,
is a base of
E,.
Set nk = dim(Ek)for 0 5 k 5 4 and Sl = { L 2 , . . . , n 1 } , S3 = { n ,
+ 1,...,n,},
S, = { n l
+ 1, ...,nz},
S4 = { n ,
+ 1,...,n4}.
With these notations, we may assume that for 1 5 k I4, B k = {bi I i E s k ) . Let K be the matrix
I 1
K = b l b 2 * * * b. , L
J
Let $: A* -P Q"'" be the morphism defined by
+(w) = K-'cp(w)K.
3.
389
STATE SPACE
Since the subspaces Eiare invariant, the matrix $(w)has the form
where $k(W) is the restriction of $(w)to the set s k x S,. Let B' = {b',, b;,...,br} be the base of E' dual to the base B. Then 1 I i, j bibj = a, where 6 is the Kronecker symbol. Set
h, = { b : l i E S , } ,
s n,
&= {b;li~S~}.
Since B, u B, u B, is a base of E, and since each b: E h4 is orthogonal to the vectors in this set, h4 c E:. Thus h4 c E ; . Next dim&) = n dim(E,) = n - n, = Card(h4). This shows that h4is a base of E ; . Similarly, since E ; = E i , the set {bnl+ l,...,b;} is a base of E;. This means that h, is the base of some supplementary space of E ; in E'. The base B4 of EL can be completed by sets h3and h2in such a way that '
h4 u 8, h4 u 8, u h2
is a base of
E;,
is a base of
E;.
Finally, since h1is the base of some supplemptary subspace of E ; ! the set u 3, u 8, is a base of E'. Set bi = bi for 1 I i I n, and for n3 + 1 I i 5 n. Next let n; = dim@;). Set
8 = 8, u
s, =' {n, + 1,. ..&},
s, = {n; + 1,. . ,?I3},
h2 = {biI i E s,},
h3= {&Ii E i,}.
A
*
and A
To have homogeneous notation, set A
s 1
= s,,
A
s4
=
s,.
390
Observe that S, u S, = $, u
*_[ 6,
VIII. POLYNOMIAL OF A CODE
g3.As before define a matrix. 61
6" and define a morphism $': A*
+ Q" ' I t
by setting
$'(w) = i c p ( w ) R - ' .
The matrices $'(w) have the form
7 1j
...........
VAW)
$"w) =
.....
where $:(w) is the restriction of V ' ( w ) to the set &. By construction, the and $; are closely related. In fact, we have the following result.
PROPOSITION 3.3 The following equality holds: $1
=
*;
$4
=
$ b e
Furthermore, the matrices $ J W ) , $;(w) both are the identity matrix of dimension one.
Proof
By definition K$(w) = cp(w)K and $'(w)K RK$(w) = $'(w)RK.
= K p ( w ) . Thus
(3.7)
Let us compute RK.We claim the R K has the form
where the syGbol * denotes blocks which are not described in more detail. Indeed, first, b,bJ = ~ 3 for , ~ i E S, and j E { 1,. ..,n). This gives the form of the
3.
39'
STATE SPACE
erst row in (3.8). The same argument holds for the last row. Finally, we have bibj = 0 for i E S, u S3(= s^, u i3) a n d j E S1, since in this case bi is orthogonal to El. From (3.8) and from the expression of $(w) and $'(w), we derive the forms
In view of (3.7), it follows that I&(w) = $;(w) and also &+(w)= I,K,(w). To prove the second claim, recall that in view of the definition of El and E,, we have dim(E,) s dim(E,) + 1. Indeed, let R , E r be fixed. Then each R E r satisfies C R = c R 0 + ( c R - c R 0 ) E cR0 + E 2 . Since E, # El, by Proposition 3.2, we get that dim(E,) = dim(El) + 1. Thus B, = ( b }for some vector b in E,. Since E, is generated by the cR (R E r),we have b = CREr t l R C R for some ctR E Q. Let w be a fixed word and rn = cp(w). Then q(w)b = uRc,,,R = b + d,
1
REF
where d = C R E r c t R ( C , , , R - cR)E El. This shows that $,(w) = 1. Similarly, we show that i&(w) = 1, arguing in this case on E; and E;. 0 For 1 Ii I4, define
where 0 is the canonical isomorphism (of convenient dimension) defined in Section 2. Furthermore, set pi = det(Z - s:)
pi = det(l - si),
for i = 1, 2, 3,4.
PROPOSITION 3.4 The following equalities hold: P1
=
Pi,
P2
= Pk,
P3
l-A=p,.
=
Pi,
P4
=Pi,
(3.9)
(3.10)
Moreover, the polynomials pi have integral coefficients and they have constant term 1. Proof
Set
392
VIII. POLYNOMIAL OF A CODE
Since $(a) = K-'cp(a)K, we have I - s = K - ' ( I
- t ) K . This shows that
det(l - t) = det(I - s).
(3.11)
Similarly, since $'(a) = k $ ( a ) k ' , we have det(Z - t ) = det(I - s').
(3.12)
The matrices +(a) being upper triangular by blocks d e w - 4 = PlP2P3P4
(3.13)
and similarly det(l - s') = p;p;p;pk whence p1p2p3p4 = p',p;p;pk. The polynomials p i and pi as they are defined clearly have a constant term 1. Next by Proposition 3.3, t,h1 = $', and t,h4 = $.; Consequently p1 = p; and p4 = p i . The same proposition shows that s2 = A, hence p2 = 1 - A. Likewise, p; = 1 - 4. Since p l p 2 p 3 p 4 = p;p;p;pk and none of these polynomials vanishes, we have p3 = p;. Since the p i have a constant term equal to 1 and since the product of the pi has integral coefficients, the lemma of Gauss shows that the p i have integral coefficients. 0
PROPOSITION 3.5 Let w E cp- ' ( J ) and let
u = w-'x*,
U' = x * w - ' .
Then
u = PkP;P;U,
u' = P l P 2 P 3 V '
are polynomials with integral coe@cients.
Proof Let z E A*. Then cp(zw)l,l= ( u ' , z )since zw E X* iff z E U'. Using Proposition 2.2, we set 1 = [lO...O] E Q l X and"y = c p ( ~ ) , Then ~. cp(ZW)l,l
=W
) Y =w(z)K-'Y.
The matrix $ ( z ) can be partitioned as
By Proposition 3,1(ii),the vector y is in E3.Thus the coefficients of K - ' y with indices in S4 vanish and K-'y =
[i].
3.
393
STATE SPACE
Setting we obtain
We now apply Proposition 3.3 to the series det(l - 6
v' and the morphism v. Since
1 v(a)a)) = PlPZP3,
(..A
Proposition 2.2 shows that u' = p1p2p3u' is a polynomial. It has integral coefficientssince pl ,p z , p 3 , and g' have integral coefficients.The proof of the other formula is symmetric. 0
PROFJOSITION 3.6 Let D = X * ( A * ) - ' , G = ( A * ) - ' X * be the sets of words right, left completable. Let w E cp-'(J) n D and let V = Dw-'. Then v = PlPZY
is a polynomial with integral coeficients. Let w' E cp-'(J) n G and let V' = w'- ' G . Then v' = p;pky'
is a polynomial with integral coeficients.
Proof Let
The set R = cp(wA*)is a minimal right ideal which intersects cp(X*).
k = Card(R n cp(X*)).
For each R' E r,either Card(R' n cp(X*) = 0 or this number is k, according to R' n cp(X*) = 0 or not. By Proposition 3.1. we have for z E A*, (dz)cR)l
= d z ) I * c R = Card(dz)R
cp(x*)).
Next cp(z)R n cp(X*) # 0 iff z E V. Indeed, assume z E V. Then zwz' E X* for some z' E A* and consequently ~ ( z w z 'E) cp(z)R n cp(X*). Conversely, if cp(z)Rn cp(X*) # 0,then there exists some r = ~ ( w z 'E)R such that zwz' E X*. It follows from this that ((p(Z)cR)l= k(y,z). Set 1 = [l O...O] E Q' '". Then ((P(z)cR)~
= I ~ ( z ) c= , lKcp(z)K-'cR.
Set Q) =
.. . . .. [...*b'.z.'..j..&)I*
394
VIII. POLYNOMIAL OF A CODE
Since cR E E 2 , we have
with F E Qnz
'. Setting 1K = [T *]
gives as before
-
= lV(2)F.
Again we apply Proposition 2.2. In this case we get that U = k y p , p , is a polynomial. This of course implies that u = yp1p2 is a polynomial. It has integral coefficients since pl, p 2 , and y have integral coefficients. This completes the proof of the first claim. The second one is proved symmetrically. 0
4. EVALUATION OF THE POLYNOMIALS
In this section we keep the notations of the preceding section and assume further that the code X is finite. Let A be a finite alphabet and let π: A* → [0, 1] be a Bernoulli distribution. Set A = {a₁, ..., aₙ}. For each polynomial p = p(a₁, a₂, ..., aₙ) ∈ Q[A], we denote by π(p) the real number

π(p) = p(π(a₁), π(a₂), ..., π(aₙ)).

This defines a function, again denoted by π: Q[A] → R, which is a morphism of Q-algebras. Moreover, if L ⊂ A* is a finite set, then π(L) = π(L̲). Now consider a series σ ∈ Q[[A]], and let σₙ be the homogeneous component of degree n of σ (i.e., the sum of the monomials of total degree n). The density δ(σ) of σ with respect to π is defined as the limit in mean of the numbers π(σₙ), provided it exists,

δ(σ) = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} π(σᵢ).

With this definition, it is easy to see that δ(L̲) exists for L ⊂ A* iff δ(L) exists, and that δ(L̲) = δ(L).
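The limit in mean above can be computed numerically for a concrete series (an illustrative sketch, not part of the text: the code X = {a, ba, bb} and the uniform distribution π(a) = π(b) = 1/2 are chosen here for the example). For σ = X̲*, the numbers π(σₙ) satisfy an obvious linear recurrence obtained by removing the last code word.

```python
# Estimate the density delta(X*) for X = {a, ba, bb} under the uniform
# Bernoulli distribution pi(a) = pi(b) = 1/2.  Here s[n] = pi(sigma_n),
# the pi-value of the homogeneous component of degree n of X*, and
# s(n) = pi(a) s(n-1) + (pi(ba) + pi(bb)) s(n-2).

def density_estimate(N=3000):
    s = [0.0] * N
    s[0] = 1.0
    s[1] = 0.5                      # only the word 'a' of length 1 is in X*
    for n in range(2, N):
        s[n] = 0.5 * s[n - 1] + 0.5 * s[n - 2]
    return sum(s) / N               # Cesàro mean (1/N) sum_{n<N} pi(sigma_n)

# The mean length of X is 1*(1/2) + 2*(1/4) + 2*(1/4) = 3/2, and the
# estimate approaches its inverse 2/3.
print(density_estimate())
```

The estimate converges to 2/3, the inverse of the mean length of X, as expected for a thin complete code.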
For any k ∈ Q, σ ∈ Q[[A]], we have δ(kσ) = kδ(σ). We also have, for w ∈ A*, the relation δ(w̲σ) = π(w)δ(σ). It follows that for any polynomial p ∈ Q[A],

δ(pσ) = π(p)δ(σ).    (4.1)
The following notation is useful. Let a ∈ A be a letter. Then

ζ_a: Q[A] → Q[A − {a}]

is the morphism defined by

ζ_a(b) = b for b ∈ A − {a},    ζ_a(a) = 1 − Σ_{b ∈ A−{a}} b.
The following result will be used several times in the sequel.
PROPOSITION 4.1 Let p ∈ Q[A] be a polynomial and let a ∈ A be a letter. The following three conditions are equivalent:

(i) p is divisible by the polynomial 1 − A̲;
(ii) ζ_a(p) = 0;
(iii) π(p) = 0 for each positive Bernoulli distribution π.

Proof The implications (i) ⇒ (ii) ⇒ (iii) are clear. To prove (iii) ⇒ (i), set A = {a₁, ..., aₙ}. Consider p as a polynomial in the variable a₁ with coefficients in Q[a₂, ..., aₙ]. Likewise, consider A̲ − 1 = a₁ − k as a polynomial in a₁, where k = 1 − (a₂ + ⋯ + aₙ). The Euclidean division of p by a₁ − k gives

p = q(a₁ − k) + r

with q ∈ Q[A] and r ∈ Q[a₂, ..., aₙ]. Since π(p) = 0 and π(a₁ − k) = 0 for each positive Bernoulli distribution π, the polynomial r vanishes at all points (z₂, ..., zₙ) ∈ Q^{n−1} such that zᵢ ≥ 0 and z₂ + ⋯ + zₙ < 1. It follows that r vanishes and consequently 1 − A̲ divides p. □

Remark Given a finite maximal code X ⊂ A⁺, it follows from Theorem 1.5.10 that π(X) = 1 for any positive Bernoulli distribution π. By Proposition 4.1, this implies that 1 − A̲ divides 1 − X̲. This is part of Proposition 3.4, proved in a completely different way.

We now come back to the notation of the previous sections. We shall give information about the polynomials

pᵢ = det(I − Sᵢ), with 1 ≤ i ≤ 4.
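Condition (iii) of Proposition 4.1 can be observed numerically on a concrete maximal code (an illustrative sketch; the code below is the one used later in Example 5.1, everything else is our choice): since π(X̲) = 1 for every positive Bernoulli distribution, π(1 − X̲) = 0, and 1 − A̲ must divide 1 − X̲.

```python
# Check that pi(X) = 1 for the finite maximal code X = {aa, ba, baa, bb, bba}
# under several positive Bernoulli distributions on {a, b}.

X = ["aa", "ba", "baa", "bb", "bba"]

def pi_of_set(words, pa):
    # pi of a finite set is the sum of the pi-values of its words
    pb = 1.0 - pa
    return sum((pa ** w.count("a")) * (pb ** w.count("b")) for w in words)

for pa in (0.1, 0.3, 0.5, 0.9):
    print(pa, pi_of_set(X, pa))    # each value equals 1.0 up to rounding
```

Since π(1 − X̲) vanishes at every such point, Proposition 4.1 (iii) applies with p = 1 − X̲.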
Since X is finite, we have by Proposition 2.1 and formulas (3.11), (3.13),

1 − X̲ = p₁p₂p₃p₄.

We have already seen that p₂ = 1 − A̲. We now examine the polynomial p₁. Recall that, given a word w ∈ A*, the set C_r(w) of right contexts of w (with respect to a code X) is the set of words u ∈ A* such that

wu = x₁x₂⋯xₙ

with xᵢ ∈ X and |u| < |xₙ|. A set of words C ⊂ A* is called a set of right contexts if C = C_r(w) for some w ∈ A*. In this case, we have C̲X̲* = U̲ with U = w⁻¹X*. The same relations hold symmetrically for left contexts.
PROPOSITION 4.2 Let C and C' be maximal sets of right and left contexts, respectively, with respect to X. Then there exist polynomials u and u' in Z[A] such that

p₁u = C̲,    p₄u' = C̲'.

Moreover, for each positive Bernoulli distribution π,

π(u) = π(u') = 1 and d(X) = π(p₃).

Proof Set D = X*(A*)⁻¹ and G = (A*)⁻¹X*. Let w₁ ∈ A* be a word such that C = C_r(w₁) and consider a word x ∈ X* ∩ φ⁻¹(J), where J is the minimal ideal of M = φ(A*). Let w = xw₁. Since φ(x)M is a minimal right ideal, there exists a word t ∈ A* such that φ(xw₁t) = φ(x). Thus xw₁t = wt ∈ X* and consequently w ∈ D ∩ φ⁻¹(J). Next C_r(w₁) ⊂ C_r(xw₁), and thus C_r(w₁) = C_r(w) since C_r(w₁) is maximal. Thus C = C_r(w) for some w ∈ φ⁻¹(J) ∩ D. Similarly we have C' = C_l(w') for some w' ∈ φ⁻¹(J) ∩ G. Set

U = w⁻¹X*,    V = Dw⁻¹.

According to Proposition 3.5 (and Proposition 3.4),

u = p₂p₃p₄U̲    (4.2)

is a polynomial with integral coefficients. According to Proposition 3.6,

v = p₁p₂V̲    (4.3)

is a polynomial with integral coefficients. Next

V ∩ φ⁻¹(J) = D ∩ φ⁻¹(J).    (4.4)
Indeed, let z ∈ V ∩ φ⁻¹(J). Then zw ∈ D and consequently z ∈ D. Thus z ∈ D ∩ φ⁻¹(J). Conversely, let z ∈ D ∩ φ⁻¹(J). Since the right ideal φ(z)M is minimal, there exists z' ∈ A* such that φ(zwz') = φ(z). Thus zwz' ∈ D, whence zw ∈ D. Thus z ∈ V ∩ φ⁻¹(J), proving (4.4).

Let π be a positive Bernoulli distribution. In view of Proposition VI.2.7, formula (4.4) implies

δ(V̲) = δ(D̲).    (4.5)

Consequently δ(V) = δ(D). Furthermore, it follows from (4.3), observing that p₂ = 1 − A̲, that vA̲* = p₁V̲. Applying (4.1), we obtain π(v) = π(p₁)δ(V̲), hence π(v) = π(p₁)δ(D). In view of Proposition VI.3.4, π(C̲)δ(D) = 1. Thus

π(v)π(C̲) = π(p₁).    (4.6)
Next U̲ = C̲X̲*. It follows that U̲(1 − X̲) = C̲, which implies by (4.2), u(1 − X̲) = p₂p₃p₄C̲. Since 1 − X̲ = p₁p₂p₃p₄, we obtain

up₁ = C̲.    (4.7)

This proves the first claim. Next, substituting π(C̲) from (4.7) in formula (4.6), we obtain

π(u)π(v) = 1.    (4.8)
This formula holds for any positive Bernoulli distribution π. According to Proposition 4.1, ζ_a(uv − 1) = 0, hence ζ_a(u)ζ_a(v) = 1. This equality holds in the ring of polynomials Z[A − {a}], because u and v have integral coefficients. It follows that ζ_a(u) = ζ_a(v) = ±1. We shall see that ζ_a(u) = ζ_a(v) = 1. For this, let

γ: Q[A] → Q[a]

be the morphism defined by

γ(a) = a,    γ(b) = 0 for b ∈ A − {a}.
Since X is a finite maximal code, X ∩ a* = {aⁿ} for some integer n ≥ 1. For this integer n, we have γ(X̲) = aⁿ. Taking the image of 1 − X̲ = p₁p₂p₃p₄ by γ, and observing that γ(p₂) = 1 − a, we obtain

1 − aⁿ = γ(p₁)(1 − a)γ(p₃p₄).

Setting q = γ(p₁), this formula becomes

1 + a + ⋯ + aⁿ⁻¹ = qγ(p₃p₄).    (4.9)

The left side of (4.9) takes strictly positive values for each nonnegative value of a. Thus the polynomial q = q(a) does not vanish and does not change its sign
for values in R₊. Now q(0) = 1, since p₁ has constant term 1. This implies in particular that q(1) > 0. Observe that q(1) is the value of ζ_a(p₁) at the point (0, ..., 0). Now, by (4.7),

ζ_a(u)ζ_a(p₁) = ζ_a(C̲).

The polynomial on the right-hand side takes the value Card(C ∩ a*) at the point (0, ..., 0). Consequently the value of ζ_a(u) at this point is positive. Since it is a constant, it follows that ζ_a(u) = 1. By Proposition 4.1, this implies that π(u) = 1. Symmetrically we can prove that p₄u' = C̲' and that π(u') = 1.

For the last formula, we again consider the expression 1 − X̲ = p₁p₂p₃p₄ and multiply it by A̲*X̲*. Since A̲*(1 − A̲) = 1, we obtain A̲* = p₁p₃p₄X̲*. According to formula (4.1) this gives

1 = π(p₁)π(p₃)π(p₄)δ(X̲*).

Next, by Theorem VI.3.1, δ(X̲*) = (1/d(X))δ(G)δ(D). Since δ(G)π(C̲') = 1, δ(D)π(C̲) = 1, and also π(p₁) = π(C̲), π(p₄) = π(C̲'), it follows that

d(X) = π(p₃). □

5. PROOF OF THE THEOREM
Now we are able to give a proof of Theorem 1.1. Let X be a finite maximal code over A and let

1 − X̲ = p₁(1 − A̲)p₃p₄

be the factorization of Proposition 3.4.
PROPOSITION 5.1 The following equivalences hold:

1. X is prefix ⟺ p₁ = 1.
2. X is suffix ⟺ p₄ = 1.
3. X is synchronous ⟺ ζ_a(p₃) = 1.

Proof 1. Assume that X is prefix. Then the set C = {1} is a maximal set of right contexts. Indeed, if 1 ∈ C_r(w), then w ∈ X*, and any u ∈ C_r(w) satisfies wu ∈ X*, whence u ∈ X*, which in turn implies u = 1. By Proposition 4.2, the polynomial p₁ divides the polynomial C̲ = 1. Since the constant term of p₁ is 1, we have p₁ = 1.

Conversely, assume p₁ = 1. Again by Proposition 4.2, any maximal set C of right contexts satisfies π(C̲) = 1 for any Bernoulli distribution π. This implies that if 1 ∈ C, then C = {1}. This implication holds for any set of right contexts, since any of them is contained in a maximal one. Now consider x, y ∈ A* with x, xy ∈ X*. Then 1 ∈ C_r(x) and therefore C_r(x) = {1}. This implies that the X-factorization of xy passes between x and y. Thus y ∈ X*. This shows that X* is right unitary and hence that X is prefix.
2. The proof is analogous.

3. In view of Proposition 4.2, we have π(p₃) = d(X) for any Bernoulli distribution. By Proposition 4.1, ζ_a(p₃ − d(X)) = 0. Consequently ζ_a(p₃) = d(X). Thus the equivalence is proved. □

Proof of Theorem 1.2 The polynomial r = (1 − X̲)/(1 − A̲) is a product

r = p₁p₃p₄.

This polynomial cannot be irreducible in Z[A] unless two of these three polynomials are constants. Since the constant term of these polynomials is 1, the irreducibility of r cannot be realized unless one of the three following cases holds:

(i) p₁ = p₃ = 1. This implies that X is prefix and synchronous.
(ii) p₃ = p₄ = 1. This implies that X is suffix and synchronous.
(iii) p₁ = p₄ = 1. This implies that X is biprefix. □
EXAMPLE 5.1 Let A = {a, b} and let X be our friend X = {aa, ba, baa, bb, bba}. We shall illustrate on this code the computations made in the previous sections. We know already that X is a maximal code and that X* is recognized by the automaton given in Fig. 8.1. The associated representation φ is given by the transition matrices of this automaton. We have already seen (Example IV.5.3) that the minimal ideal of the monoid φ(A*) is composed of elements of rank 1. They are all idempotents. The minimal ideal is represented in Fig. 8.2, with α = φ(a), β = φ(b). It is formed of two minimal right ideals R₁ and R₂ and of two minimal left ideals L₁ and L₂.
Fig. 8.1

Fig. 8.2 The minimal ideal.
Set c₁ = c_{R₁} and c₂ = c_{R₂}, and similarly l₁ = l_{L₁}, l₂ = l_{L₂}. Then

E₁ = Qc₁ + Qc₂,    E₁' = Ql₁ + Ql₂,    E₂' = Q[1 1 −1].

Since E₂' has dimension 1, we have dim(E₂) = 2. Since E₂ ⊂ E₃, we have E₂ = E₃. Consider the base B = {b₁, b₂, b₃}, chosen so that b₁ is a base of E₁ and {b₁, b₂} is a base of E₂. In this base, the matrices ψ(a) and ψ(b) become block triangular. Thus the matrix

I − (a̲ψ(a) + b̲ψ(b))

gives the polynomials

p₁ = 1 + a̲,    p₄ = 1 + b̲,

and

1 − X̲ = (1 + a̲)(1 − a̲ − b̲)(1 + b̲).    (5.1)

This is the decomposition described in the previous sections. Of course, (5.1) may be verified by direct computation.
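The direct computation can be carried out mechanically (an illustrative sketch; the dict-of-exponent-pairs representation of commutative polynomials is our choice):

```python
# Verify (5.1) in commuting variables.  A polynomial in Q[a, b] is a
# dict mapping an exponent pair (i, j) (powers of a and b) to a coefficient.

def mul(p, q):
    r = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            key = (i + k, j + l)
            r[key] = r.get(key, 0) + c * d
    return {k: c for k, c in r.items() if c != 0}

one_plus_a = {(0, 0): 1, (1, 0): 1}
one_plus_b = {(0, 0): 1, (0, 1): 1}
one_minus_A = {(0, 0): 1, (1, 0): -1, (0, 1): -1}

lhs = mul(mul(one_plus_a, one_minus_A), one_plus_b)

# 1 - X for X = {aa, ba, baa, bb, bba}; each word contributes its
# commutative image (number of a's, number of b's) with coefficient -1.
rhs = {(0, 0): 1, (2, 0): -1, (1, 1): -1, (2, 1): -1, (0, 2): -1, (1, 2): -1}
print(lhs == rhs)   # True
```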
In fact, formula (5.1) holds in noncommuting variables because X is a factorizing code:

1 − X̲ = (1 + b)(1 − A̲)(1 + a).

To obtain this formula, we may observe that, setting Z = {a, ba, bb}, we have

1 − Z̲ = (1 + b)(1 − A̲),    1 − X̲ = (1 − Z̲)(1 + a),

which yields the desired formula. □
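The noncommutative identity can likewise be checked by expanding the product over words (an illustrative sketch; the dict-of-strings representation is our choice):

```python
# Verify the noncommutative factorization for X = {aa, ba, baa, bb, bba}.
# A polynomial in noncommuting variables is a dict mapping a word to its
# coefficient; multiplication is convolution by concatenation.

def mul(p, q):
    r = {}
    for u, c in p.items():
        for v, d in q.items():
            r[u + v] = r.get(u + v, 0) + c * d
    return {w: c for w, c in r.items() if c != 0}

f1 = {"": 1, "b": 1}                 # 1 + b
f2 = {"": 1, "a": -1, "b": -1}       # 1 - A
f3 = {"": 1, "a": 1}                 # 1 + a

lhs = mul(mul(f1, f2), f3)

rhs = {"": 1}
for x in ["aa", "ba", "baa", "bb", "bba"]:
    rhs[x] = -1                      # 1 - X
print(lhs == rhs)   # True
```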
EXAMPLE 5.2 Let A = {a, b} and consider the maximal prefix code Z = a ∪ bA². Let X = Z². The code X is represented in Fig. 8.3 (cf. also Example IV.6.7, where the code Z⁴ was considered). The minimal automaton 𝒜(X*) is represented in Fig. 8.4. The matrices φ(a) and φ(b) are the 0-1 transition matrices of this six-state automaton.

Fig. 8.3 The maximal prefix code X = Z².

Let us compute the minimal ideal of the monoid of functions φ(A*). It is
Fig. 8.4 The automaton d ( X * ) .
represented in Fig. 8.5 with the following conventions (justified by Proposition IV.5.8). Each ℋ-class is identified by the nuclear equivalence characterizing its ℛ-class and by the image characterizing its ℒ-class. Here α = φ(a), β = φ(b). The minimal ideal is composed of a single minimal right ideal R and of three minimal left ideals. The minimal rank is 2, and the group of the code X consequently is Z/2Z. Easy computation shows that the first column of the matrices in L₂ ∪ L₃ vanishes.
Next, setting lᵢ = l_{Lᵢ}, we have

l₁ = [1 0 0 1 0 0],
l₂ = [0 1 0 0 1 0],
l₃ = [0 0 1 0 0 1].
Thus E₂' = Ql₁ + Ql₂ + Ql₃ has dimension 3. The space E₃' is generated by

l₁ − l₂ = [1 −1 0 1 −1 0],    l₁ − l₃ = [1 0 −1 1 0 −1],

and dim(E₄) = 0. Set

b₁ = [1 1 1 1 1 1]ᵗ,    b₂ = [1 0 0 1 0 0]ᵗ,    [b₃, b₄, b₅, b₆: further column vectors].

These vectors are linearly independent. Since b₁ = c_R, the set {b₁} is a base for E₁. Since {b₁, b₂, b₃, b₄} all are in the annihilator of E₃' (as is easily verified), they form a base of E₂. In the new base, the matrices ψ(a) and ψ(b) take a block triangular form, and hence we obtain the matrix

I − (a̲ψ(a) + b̲ψ(b))

in block triangular form as well.
This gives

p₁ = 1,    p₂ = 1 − A̲,    p₃ = (1 + A̲)(1 − b̲ + b̲A̲) = 1 + a̲ + b̲A̲²,    p₄ = 1 + b̲ + b̲A̲.

Of course 1 − X̲ = p₁p₂p₃p₄. Also ζ_a(p₃) = 2. Since X = Z² and

1 − Z̲ = (1 + b + bA̲)(1 − A̲),

we obtain, using the formula 1 − X̲ = (1 + Z̲)(1 − Z̲),

1 − X̲ = (1 + a + bA̲²)(1 + b + bA̲)(1 − A̲).    (5.2)
By passing to commutative variables in formula (5.2), the first factor gives p₃, the second one gives p₄, and the last p₂. Observe that the code X admits a second decomposition. Indeed, all words in X have even length. Consequently X = Y ∘ A² for some code Y. This leads to the factorization

1 − X̲ = (1 + ab + bA̲ + bA̲²b)(1 − A̲²).    (5.3)

By passing to commutative variables in (5.3), the first factor of (5.3) is the product of p₄ with the second factor of p₃, and the second factor of (5.3) is the product of p₂ with the first factor of p₃. If we consider the invariant subspace generated by b₁ and b₂, the factorization of det(I − (a̲ψ(a) + b̲ψ(b))) which corresponds to this decomposition of the state space leads to expression (5.3).
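Formula (5.2) can be verified by expanding the noncommutative product over words (an illustrative sketch; the representation is the same one used for Example 5.1 above and is our choice):

```python
# Verify (5.2) for Z = a ∪ bA², X = Z², in noncommuting variables.

def mul(p, q):
    r = {}
    for u, c in p.items():
        for v, d in q.items():
            r[u + v] = r.get(u + v, 0) + c * d
    return {w: c for w, c in r.items() if c != 0}

A2 = ["aa", "ab", "ba", "bb"]
Z = ["a"] + ["b" + w for w in A2]            # Z = a ∪ bA²
X = [z1 + z2 for z1 in Z for z2 in Z]        # X = Z² (25 words)

f1 = {"": 1, "a": 1}
for w in A2:
    f1["b" + w] = 1                          # 1 + a + bA²  (this is 1 + Z)
f2 = {"": 1, "b": 1, "ba": 1, "bb": 1}       # 1 + b + bA
f3 = {"": 1, "a": -1, "b": -1}               # 1 - A

lhs = mul(mul(f1, f2), f3)
rhs = {"": 1}
for x in X:
    rhs[x] = -1                              # 1 - X
print(lhs == rhs)   # True
```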
6. COMMUTATIVE EQUIVALENCE
Recall that α denotes the canonical morphism that associates to a formal power series its commutative image. By definition, for each σ ∈ Q⟨⟨A⟩⟩ and w ∈ A*,

(α(σ), α(w)) = Σ_{u ∈ K(w)} (σ, u),

where K(w) is the commutativity class of w, K(w) = {u ∈ A* | α(u) = α(w)}. Two series σ, τ ∈ Q⟨⟨A⟩⟩ are called commutatively equivalent if α(σ) = α(τ). Two subsets X and Y of A* are commutatively equivalent if their characteristic series X̲ and Y̲ are so, which means that α(X̲) = α(Y̲). In an equivalent manner, X and Y are commutatively equivalent iff there exists a bijection γ: X → Y such that γ(x) ∈ K(x) for all x ∈ X.
A subset X of A* is called commutatively prefix if there exists a prefix subset Y of A* that is commutatively equivalent to X.

EXAMPLE 6.1 Any suffix code is commutatively prefix. More generally, any code obtained by a sequence of compositions of prefix and suffix codes is commutatively prefix. In particular, our friend X = {aa, ba, baa, bb, bba} is commutatively prefix.

EXAMPLE 6.2 Let A = {a, b} and let
X = {aa, ba, bb, abab, baab, bbab, a³b², a³ba², ab²ab, aba²b, a³babab}.

This set is easily verified to be a code, by computing, for instance, the sets Uᵢ of Section 1.3:

U₁ = {abb, aba², ab²ab, aba³b, (ab)³, ab},    U₂ = {ab},    U₃ = {ab}.

Further, X is maximal since, for π(a) = π(b) = 1/2, we obtain π(X) = 1. Finally, X is commutatively prefix since

Y = {aa, ba, bb, abab, abba, abbb, abaab, aba⁴, aba³b², aba³ba², aba³bab}

is a prefix code with α(Y̲) = α(X̲).
THEOREM 6.1 For each subset X of A* the following conditions are equivalent:

(i) X is commutatively prefix;
(ii) the series (1 − α(X̲))/(1 − α(A̲)) has nonnegative coefficients;
(iii) for each commutativity class K intersecting XA*, we have

Card(K) ≥ (X̲A̲*, K).    (6.1)

Proof (i) ⇒ (ii). First assume that α(X̲) = α(Y̲) for some prefix set Y. Let P = A* − YA*. Then A* = Y*P, hence 1 − Y̲ = P̲(1 − A̲). Thus 1 − α(Y̲) = α(P̲)(1 − α(A̲)). Clearly, α(P̲) = (1 − α(X̲))/(1 − α(A̲)) has nonnegative integral coefficients.

To prove (ii) ⇒ (iii), we show that for all w ∈ A*,

(α(X̲A̲*), α(w)) = (X̲A̲*, K),    (6.2)

where K is the commutativity class of w. Indeed

(α(X̲)α(A̲*), α(w)) = (α(X̲A̲*), α(w)) = Σ_{u∈K} (X̲A̲*, u) = (X̲A̲*, K).
Next, observe that since (1 − A̲)⁻¹ = A̲*, we have

(1 − X̲)/(1 − A̲) = A̲* − X̲A̲*.

Thus the assumption that this series has nonnegative coefficients implies that for all w ∈ A*,

(A̲*, K) ≥ (X̲A̲*, K).

Since (A̲*, K) = Card(K(w)), this gives, in view of (6.2), Card(K) ≥ (X̲A̲*, K), which is the desired inequality.

Finally, assume that (iii) holds. For n ≥ 1, let

Xₙ = X ∩ Aⁿ,    X₍ₙ₎ = X ∩ A⁽ⁿ⁾.

(Recall that A⁽ⁿ⁾ = 1 ∪ A ∪ A² ∪ ⋯ ∪ Aⁿ⁻¹.) We shall define by induction on n a sequence Y₍ₙ₎ of prefix codes such that

α(Y̲₍ₙ₎) = α(X̲₍ₙ₎).

For n = 1, set Y₍₁₎ = X₍₁₎. Let n ≥ 2 and assume a prefix code Y₍ₙ₎ constructed, satisfying α(Y̲₍ₙ₎) = α(X̲₍ₙ₎). Let K be a commutativity class in Aⁿ that meets X. Then by (6.2),

(X̲₍ₙ₎A̲*, K) = (Y̲₍ₙ₎A̲*, K)    (6.3)

since α(Y̲₍ₙ₎) = α(X̲₍ₙ₎). Next, (6.1) shows that

Card(K) ≥ (X̲A̲*, K) ≥ (X̲₍ₙ₊₁₎A̲*, K) = (X̲₍ₙ₎A̲*, K) + Card(K ∩ Xₙ).

Using (6.3), we obtain

Card(K) ≥ (Y̲₍ₙ₎A̲*, K) + Card(K ∩ Xₙ).

Since Y₍ₙ₎ is prefix,

(Y̲₍ₙ₎A̲*, K) = Card(Y₍ₙ₎A* ∩ K).

Thus Card(K) ≥ Card(K ∩ Y₍ₙ₎A*) + Card(K ∩ Xₙ), or equivalently,

Card(K ∩ (A* − Y₍ₙ₎A*)) ≥ Card(K ∩ Xₙ).

This inequality shows that we can choose, in the set K, a set Y_K of Card(K ∩ Xₙ) words which have no left factor in Y₍ₙ₎. Since Y_K ⊂ K, the sets Y_K and K ∩ Xₙ are commutatively equivalent. Set Yₙ = ⋃_K Y_K, where K runs over the commutativity classes contained in Aⁿ that meet X. Then Yₙ is commutatively equivalent to ⋃_K (K ∩ Xₙ) = Xₙ.
Further, setting Y₍ₙ₊₁₎ = Y₍ₙ₎ ∪ Yₙ, this set is a prefix code. This completes the proof. □
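The induction in the proof is effective. A small implementation over a two-letter alphabet, applied to the code of Example 5.1 (an illustrative sketch; the greedy choice of the sets Y_K, and everything else in the code, is ours):

```python
from itertools import product
from collections import Counter

# Build a prefix set Y commutatively equivalent to a finite set X over
# {a, b}: proceed by increasing length, and in each commutativity class
# pick words having no left factor already in Y.

def commutative_image(w):
    return (w.count("a"), w.count("b"))

def has_prefix_in(w, Y):
    return any(w.startswith(y) for y in Y)

def commutatively_prefix_version(X):
    Y = []
    for n in range(1, max(map(len, X)) + 1):
        # how many words of each commutativity class of A^n are in X
        classes = Counter(commutative_image(x) for x in X if len(x) == n)
        for w in ("".join(t) for t in product("ab", repeat=n)):
            k = commutative_image(w)
            if classes.get(k, 0) > 0 and not has_prefix_in(w, Y):
                Y.append(w)
                classes[k] -= 1
    return Y

X = ["aa", "ba", "baa", "bb", "bba"]
Y = commutatively_prefix_version(X)
print(Y)    # e.g. ['aa', 'ab', 'bb', 'baa', 'bab']
```

The result is a prefix code whose multiset of commutative images equals that of X, as Theorem 6.1 guarantees for a commutatively prefix set.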
EXAMPLE 6.3 Let A = {a, b} and let X ⊂ a*ba*. Then X is commutatively prefix iff for all n ≥ 0,

Card(X ∩ A⁽ⁿ⁺¹⁾) ≤ n.    (6.4)

Indeed, the only commutativity class K ⊂ Aⁿ that intersects XA* is K = {aⁱbaʲ | i + j = n − 1}. Since Card(K) = n and since

(X̲A̲*, K) = Card(X ∩ A⁽ⁿ⁺¹⁾),

formula (6.4) is a translation of formula (6.1).

COROLLARY 6.2 A factorizing code is commutatively prefix.

Proof Let X ⊂ A⁺ be a factorizing code and let (P, Q) be a factorization of X. Then by definition 1 − X̲ = P̲(1 − A̲)Q̲. Passing to commutative variables gives

(1 − α(X̲))/(1 − α(A̲)) = α(P̲)α(Q̲).

Since α(P̲)α(Q̲) has nonnegative coefficients, the conclusion follows from Theorem 6.1. □
Now we give an example of a code which is not commutatively prefix.

EXAMPLE 6.4 Let X ⊂ a*ba* be the set given in Table 8.1, with the convention that aⁱbaʲ ∈ X iff the box (i, j) contains a dot. Clearly X ⊂ A⁽¹⁶⁾ and Card(X) = 16. Thus, according to Example 6.3, X is not commutatively prefix. To show that X is a code, assume the contrary (see Fig. 8.6). Then there are in X four words of the form

x = aⁱbaʲ,    z = xaᵏ = aⁱba^{j+k},    y = a^{k+l}ba^m,    t = aˡba^m.

The dots representing x and z are in the same row in Table 8.1. Thus necessarily k ∈ {1, 2, 4, 6, 7, 12, 13, 14}, since these are the distances separating two dots in a same row. Next, the dots representing y and t are in rows whose difference of indices is k. Thus k ∈ {3, 5, 8, 11}. This gives the contradiction.

We can observe that X is a code with deciphering delay 1. Also, by Corollary 6.2, the code X is not a factorizing code. We have mentioned in Section 1 that all finite maximal codes known up to now are factorizing. A fortiori they are commutatively prefix. If this property holds for all finite maximal codes, then it follows that the code of Example 6.4 cannot be completed into a finite maximal code. As a matter of fact, we do not know at present whether it is decidable whether a finite code can be included in a finite maximal code.
TABLE 8.1 A Code X Which Is Not Commutatively Prefix

Fig. 8.6 If X were not a code.
7. COMPLETE REDUCIBILITY
Let A be an alphabet. Let σ ∈ Q⟨⟨A⟩⟩ be a series and let u ∈ A* be a word. We define a series σ·u by (σ·u, v) = (σ, uv) for all v ∈ A*. The following formulas hold:

σ·1 = σ,    (σ·u)·v = σ·uv.
Let V_σ be the subspace of the vector space Q⟨⟨A⟩⟩ generated by the series σ·u for u ∈ A*. For each word w ∈ A*, we denote by ψ_σ(w) the linear function from V_σ into itself (acting on the right) defined by

ψ_σ(w): ρ ↦ ρ·w.

The formula (ρ·u)ψ_σ(w) = ρ·uw is straightforward. It follows that ψ_σ is a morphism

ψ_σ: A* → End(V_σ)

from A* into the monoid End(V_σ) of linear functions from V_σ into itself. The morphism ψ_σ is called the syntactic representation of σ.
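For a recognizable set, V_σ is finite dimensional, and its dimension can be observed by truncating the series (an illustrative sketch, not from the text: the example X = A², for which σ·u depends only on |u| mod 2 and dim V_σ = 2, and the rank computation over truncated coefficient vectors are our choices).

```python
from fractions import Fraction

# For sigma = X* with X = A^2, (sigma·u, v) = 1 iff |u| + |v| is even.
# Approximate each series sigma·u by its coefficient vector on words of
# length <= L and compute the rank of the family by Gaussian elimination.

def coeff_vector(shift, L):
    # one entry per word of length l <= L (2**l words of length l);
    # the coefficient only depends on l and on shift = |u| mod 2
    lengths = [l for l in range(L + 1) for _ in range(2 ** l)]
    return [Fraction(1) if (shift + l) % 2 == 0 else Fraction(0)
            for l in lengths]

def rank(rows):
    rows = [r[:] for r in rows]
    rk, col, ncols = 0, 0, len(rows[0])
    while rk < len(rows) and col < ncols:
        piv = next((i for i in range(rk, len(rows)) if rows[i][col] != 0), None)
        if piv is None:
            col += 1
            continue
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][col] != 0:
                f = rows[i][col] / rows[rk][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        col += 1
    return rk

# sigma·u for u of lengths 0, 1, 2, 3: only the parity of |u| matters.
vectors = [coeff_vector(k % 2, L=4) for k in range(4)]
print(rank(vectors))   # 2 = dim V_sigma
```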
PROPOSITION 7.1 Let Y be a subset of A* and let σ = Y̲. Let φ be the canonical morphism from A* onto the syntactic monoid of Y. Then for all u, v ∈ A*,

φ(u) = φ(v) ⟺ ψ_σ(u) = ψ_σ(v).

In particular the monoid ψ_σ(A*) is isomorphic to the syntactic monoid of Y.

Proof Assume first that ψ_σ(u) = ψ_σ(v). Then for all r ∈ A*,

σ·ru = (σ·r)ψ_σ(u) = (σ·r)ψ_σ(v) = σ·rv.

Thus also for all s ∈ A*,

(σ, rus) = (σ·ru, s) = (σ·rv, s) = (σ, rvs).

That means that rus ∈ Y iff rvs ∈ Y, which shows that φ(u) = φ(v). Conversely, assume φ(u) = φ(v). Since the vector space V_σ is generated by the series σ·r (r ∈ A*), it suffices to show that for r ∈ A*,

(σ·r)ψ_σ(u) = (σ·r)ψ_σ(v).

Now for all s ∈ A*,

((σ·r)ψ_σ(u), s) = (σ·ru, s) = (σ, rus) = (σ, rvs) = ((σ·r)ψ_σ(v), s). □
The preceding result gives a relationship between the syntactic representation of the characteristic series σ of a set Y ⊂ A* and the syntactic monoid of Y. It should be noted (see Example 7.2) that the monoid ψ_σ(A*) is not always equivalent to the transition monoid of the minimal automaton of Y, considered as a function monoid over the vector space generated by the states. However, it can be shown that the vector space V_σ has finite dimension iff Y is recognizable (Exercise 7.2).

EXAMPLE 7.1 Let σ = A̲*. Then σ·u = σ for all u ∈ A*. Consequently V_σ = Qσ is a vector space of dimension 1.
4‘0
VIII. POLYNOMIAL OF A CODE
EXAMPLE 7.2 Let A = {a, b} and let X ⊂ A⁺ be the biprefix code of Fig. 8.7. Let σ = X̲*. We shall see that the vectors σ, σ·a, σ·a², and σ·b form a base of the vector space V_σ.

Fig. 8.7

Indeed, the formulas

σ·a³ = σ·ab = σ,    σ·ba = σ·a²,    σ·b² = σ·a²b = σ·a + σ·a² − σ·b

show that the four vectors σ, σ·a, σ·a², and σ·b generate V_σ. A direct computation shows that they are linearly independent. The matrices of the linear mappings ψ_σ(a) and ψ_σ(b) in this base follow from these relations.

The relation between ψ_σ and the minimal automaton of X* is now to be shown. The minimal automaton has five states, which may be written as 1, 1·a, 1·a², 1·b, 1·b². Let V be the Q-vector space formed of formal linear combinations of these five states. The linear function α: V → V_σ defined by α(1·u) = σ·u satisfies the equality α(q·u) = α(q)·u, and moreover we have α(1·a + 1·a² − 1·b − 1·b²) = 0. Thus V has dimension 5 and V_σ has dimension 4.
Let V be a vector space over Q and let N be a submonoid of the monoid End(V) of linear functions from V into itself. The action of elements of End(V) will be written on the right.
A subspace W of V is invariant under N if for ρ ∈ W, n ∈ N, we have ρn ∈ W. The submonoid N is called reducible if there exists a subspace W of V which is invariant under N and such that W ≠ {0}, W ≠ V. Otherwise, N is called irreducible. The submonoid N is completely reducible if for each subspace W of V which is invariant under N, there exists a subspace W' of V which is a supplementary space of W and is invariant under N.

If V has finite dimension, a completely reducible submonoid N of End(V) has the following form. There exists a decomposition of V into a direct sum of invariant subspaces W₁, W₂, ..., W_k,

V = W₁ ⊕ W₂ ⊕ ⋯ ⊕ W_k,

such that the restrictions of the elements of N to each of the Wᵢ form an irreducible submonoid of End(Wᵢ). In a base of V composed of bases of the subspaces Wᵢ, the matrix of an element n of N has a diagonal form by blocks.

Let M be a monoid and let V be a vector space. A linear representation ψ of M over V is a morphism from M into the monoid End(V). A subspace W of V is called invariant under ψ if it is invariant under ψ(M). Similarly, ψ is called reducible, irreducible, or completely reducible if this holds for ψ(M). The syntactic representation of a series σ is an example of a linear representation of a free monoid. The aim of this section is to study cases where this representation is completely reducible. We recall that all the vector spaces considered here are over the field Q of rational numbers. The following result is a classical one.
THEOREM 7.2 A linear representation of a finite group is completely reducible.

Proof Let V be a vector space over Q. It suffices to show that each finite subgroup of the monoid End(V) is completely reducible. Let G be a finite subgroup of End(V) and let W be a subspace of V which is invariant under G. Let W₁ be any supplementary space of W in V. Let π: V → V be the linear function which associates to ρ ∈ V the unique ρ₁ ∈ W₁ such that ρ = ρ₁ + ρ' with ρ' ∈ W. Then π(ρ) = 0 for all ρ ∈ W and π(ρ) = ρ for ρ ∈ W₁. Moreover, ρ − π(ρ) ∈ W for all ρ ∈ V. Let n = Card(G). Define a linear function θ: V → V by setting, for ρ ∈ V,

θ(ρ) = (1/n) Σ_{g∈G} π(ρg)g⁻¹.

Let W' = θ(V). We shall see that W' is an invariant subspace of V under G which is a supplementary space of W. First, for ρ ∈ W,

θ(ρ) = 0.    (7.1)

Indeed, if ρ ∈ W, then ρg ∈ W for all g ∈ G since W is invariant under G. Thus π(ρg) = 0 and consequently θ(ρ) = 0. Second, for ρ ∈ V,

ρ − θ(ρ) ∈ W.    (7.2)

Indeed, ρ − θ(ρ) = (1/n) Σ_{g∈G} (ρg − π(ρg))g⁻¹, and by definition of π each ρg − π(ρg) is in W for g ∈ G. Since W is invariant under G, also (ρg − π(ρg))g⁻¹ ∈ W. This shows formula (7.2).

By (7.1) we have W ⊂ Ker(θ), and by (7.2), Ker(θ) ⊂ W, since ρ ∈ Ker(θ) implies ρ − θ(ρ) = ρ. Thus W = Ker(θ). Formula (7.1) further shows that θ² = θ. Indeed, θ(ρ) − θ²(ρ) = θ(ρ − θ(ρ)). By (7.2), ρ − θ(ρ) ∈ W. Hence θ(ρ) − θ²(ρ) = 0 by (7.1). Since θ² = θ, the subspaces W = Ker(θ) and W' = Im(θ) are supplementary. Finally, W' is invariant under G. Indeed, let ρ ∈ V and h ∈ G. Then

θ(ρ)h = (1/n) Σ_{g∈G} π(ρg)g⁻¹h.

The function g ↦ k = h⁻¹g is a bijection from G onto G, and thus

θ(ρ)h = (1/n) Σ_{k∈G} π(ρhk)k⁻¹ = θ(ρh) ∈ W'.

This completes the proof. □
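A minimal instance of this averaging construction (illustrative, not from the text: the group, the space, and the subspaces are chosen for the example): G = {I, S}, with S the swap of coordinates, acts on Q² on the right, and W = Q(1, 1) is invariant.

```python
from fractions import Fraction

# theta(p) = (1/n) * sum over g of pi(p g) g^{-1}, with pi the projection
# onto W1 = Q(1, 0) along W = Q(1, 1).  Row vectors act on the right.

I = ((1, 0), (0, 1))
S = ((0, 1), (1, 0))
G = [I, S]

def apply(p, m):
    return tuple(sum(p[i] * m[i][j] for i in range(2)) for j in range(2))

def proj(p):
    # p = (x - y, 0) + (y, y): component in W1 along W
    x, y = p
    return (x - y, Fraction(0))

def theta(p):
    acc = (Fraction(0), Fraction(0))
    for g in G:                      # g^{-1} = g for both elements here
        t = apply(proj(apply(p, g)), g)
        acc = (acc[0] + t[0], acc[1] + t[1])
    n = len(G)
    return (acc[0] / n, acc[1] / n)

p = (Fraction(3), Fraction(1))
print(theta(p))                      # lies on y = -x, i.e. in W' = Q(1, -1)
print(theta(theta(p)) == theta(p))   # theta is idempotent
```

The image W' = Q(1, −1) is invariant under the swap, and Ker(θ) = W, exactly as in the proof.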
THEOREM 7.3 Let X ⊂ A⁺ be a thin biprefix code. The syntactic representation of X̲* is completely reducible.

In the case of group codes, this theorem is a direct consequence of Theorem 7.2. For the general case, we need the following proposition in order to be able to apply Theorem 7.2.

PROPOSITION 7.4 Let X ⊂ A⁺ be a thin prefix code and let ψ = ψ_{X̲*} be the syntactic representation of X̲*. The monoid M = ψ(A*) contains an idempotent e such that

(i) e ∈ ψ(X*);
(ii) the set eMe is the union of the finite group G(e) and of the element 0, provided 0 ∈ M.
Proof Let S be the syntactic monoid of X* and let φ: A* → S be the canonical morphism. Consider also the minimal automaton 𝒜(X*) of X*. Since X is prefix, the automaton 𝒜(X*) has a single final state, which is the initial state (Proposition II.2.2). Let μ be the morphism associated with 𝒜(X*). We claim that for all u, v ∈ A*,

μ(u) = μ(v) ⟺ ψ(u) = ψ(v).    (7.3)

Indeed, in view of Proposition 0.4.3, we have

μ(u) = μ(v) ⟺ φ(u) = φ(v),

and by Proposition 7.1,

φ(u) = φ(v) ⟺ ψ(u) = ψ(v).

Formula (7.3) shows that there exists an isomorphism β: μ(A*) → ψ(A*) = M defined by β ∘ μ = ψ. In particular, ψ(X*) = β(μ(X*)). Thus the monoid M shares all properties of the monoid μ(A*). In particular, by Theorem IV.5.6, the monoid M has a unique 0-minimal or minimal ideal, say J, according to whether M does or does not have a zero. There exists an idempotent e in J which is also in ψ(X*). The ℋ-class of this idempotent is a finite group that is isomorphic to the group of X. □

Proof of Theorem 7.3 For convenience, set V = V_σ with σ = X̲*, and denote by ψ the syntactic representation ψ_σ. Let M = ψ(A*). By Proposition 7.4, there exists an idempotent e ∈ ψ(X*) such that eMe is the union of 0 (if 0 ∈ M) and of the group G(e). Let L = Me and define S = {ρe | ρ ∈ V}. Since e² = e, we have τe = τ for all τ ∈ S. Next, for all l ∈ L, we have le = l, since l = me for some m ∈ M and consequently le = me² = me = l. Thus for all l ∈ L,

Vl ⊂ S.    (7.4)
Let W be a subspace of V which is invariant under M. We shall see that there exists a supplementary space of W which is invariant under M. For this, set T = W ∩ S and G = G(e). The group G acts on S. The subspace T of S is invariant under G. Indeed, let τ ∈ T and let g ∈ G. Then τg ∈ W since W is invariant under M, and τg ∈ S by (7.4) since g = ge ∈ L. By Theorem 7.2, there exists a subspace T' of S which is supplementary to T in S and which is invariant under G. Set

W' = {ρ ∈ V | ∀l ∈ L, ρl ∈ T'}.

We shall verify that W' is a supplementary space of W invariant under M. First observe that W' clearly is a subspace of V. Next, it is invariant under M, since for ρ ∈ W' and m ∈ M we have, for all l ∈ L, (ρm)l = ρ(ml) ∈ T' (because ml ∈ L), and consequently ρm ∈ W'.
Next we show that

T' ⊂ W'.    (7.5)

Indeed, let τ' ∈ T'. Then τ' ∈ S and thus τ'e = τ'. Hence τ'l = τ'el for all l ∈ L. Since el ∈ eMe and since T' is invariant under G, it follows that τ'l ∈ T'. Thus τ' ∈ W'.

Now we verify that V = W + W'. For this, set σ = X̲* and first observe that

σe = σ.    (7.6)

(Note that σ ∈ V and e acts on V.) Indeed, let x ∈ X* be such that ψ(x) = e. Since X* is right unitary, we have for all u ∈ A* the equivalence xu ∈ X* ⟺ u ∈ X*. Thus (σe, u) = (σ·x, u) = (σ, xu) = (σ, u), which proves (7.6).

In view of (7.6), we have σ ∈ S. Since S = T + T', there exist τ ∈ T and τ' ∈ T' such that σ = τ + τ'. Then for all m ∈ M, σm = τm + τ'm. For each m ∈ M, τm ∈ Tm ⊂ Wm ⊂ W, whence τm ∈ W. Using (7.5), also τ'm ∈ T'm ⊂ W'm. Since W' is invariant under M, we obtain τ'm ∈ W'. Thus σm ∈ W + W'. Since V is generated by the vectors σm for m ∈ M, this proves that V = W + W'.

Finally, we claim that W ∩ W' = {0}. Indeed, let ρ ∈ W ∩ W'. Then for all l ∈ L,

ρl = 0.    (7.7)

Indeed, let l ∈ L. Then ρl ∈ W, since W is invariant under M, and ρl ∈ S by Eq. (7.4). Thus ρl ∈ W ∩ S = T. Further, ρl ∈ T' by the definition of W' and by the fact that ρ ∈ W'. Thus ρl ∈ T ∩ T' = {0}.

Since V is generated by the series σ·u (u ∈ A*), there exist numbers a_u ∈ Q (u ∈ A*), with only a finite number of them nonzero, such that

ρ = Σ_{u∈A*} a_u(σ·u).

Again, let x ∈ X* be such that ψ(x) = e. Since X* is left unitary, we have, as above, (σ, w) = (σ, wx) for all w ∈ A*. Thus for all v ∈ A*,

(ρ, v) = Σ_{u∈A*} a_u(σ, uv) = Σ_{u∈A*} a_u(σ, uvx) = Σ_{u∈A*} a_u(σ·u, vx) = (ρ, vx) = (ρ·vx, 1).

Thus, setting m = ψ(v), we have (ρ, v) = (ρme, 1), and since me ∈ L, we have ρme = 0 by (7.7). Consequently (ρ, v) = 0 for all v ∈ A*. Thus ρ = 0. This shows that W ∩ W' = {0} and completes the proof. □
EXAMPLE 7.2 (continued) The subspace W of V = V_σ generated by the vector ρ = σ + σ·a + σ·a² is invariant under ψ_σ. Indeed, we have

ρ·a = ρ,    ρ·b = ρ.

We shall exhibit a supplementary space of W invariant under ψ_σ. It is the subspace generated by

σ − σ·a,    σ − σ·a²,    σ − σ·b.

Indeed, in the base ρ, σ − σ·a, σ − σ·a², σ − σ·b, the linear mappings ψ_σ(a) and ψ_σ(b) have a block diagonal form. We can observe that there are no other nontrivial invariant subspaces.

We now give a converse of Theorem 7.3 for the case of complete codes. The result does not hold in general if the code is not complete (see Example 7.3).
THEOREM 7.5 Let X ⊂ A⁺ be a thin complete code. If the syntactic representation of X̲* is completely reducible, then X is biprefix.

Proof Let 𝒜 = (Q, 1, 1) be a trim unambiguous automaton recognizing X*. Let φ be the associated representation and let M = φ(A*). Set σ = X̲* and also V = V_σ, ψ = ψ_σ. Let μ be the canonical morphism from A* onto the syntactic monoid of X*. By Proposition 0.4.3, we have for u, v ∈ A*,

φ(u) = φ(v) ⟹ μ(u) = μ(v).

Since, by Proposition 7.1, μ(u) = μ(v) ⟺ ψ(u) = ψ(v), it follows that

φ(u) = φ(v) ⟹ ψ(u) = ψ(v).

Thus we can define a linear representation θ: M → End(V) by setting, for m ∈ M, θ(m) = ψ(u), where u ∈ A* is any word such that φ(u) = m. If ψ is completely reducible, then this holds also for θ. For notational ease, we shall write, for ρ ∈ V and m ∈ M, ρ·m instead of ρ·u, where u ∈ A* is such that φ(u) = m. With this notation, we have, for m = φ(u),

ρ·u = ρψ(u) = ρθ(m) = ρ·m.

Observe further that, with m = φ(u),

(σ·m, 1) = (σ·u, 1) = (σ, u).
4’6
VIII. POLYNOMIAL OF A CODE
Hence

(σ·m, 1) = 1 if m ∈ φ(X*), and 0 otherwise.    (7.8)

Finally, we have, for ρ ∈ V and m, n ∈ M, (ρ·m)·n = ρ·mn. For ρ ∈ V and for a finite subset K of M, we define

ρ·K = Σ_{m∈K} ρ·m.

In particular, (7.8) gives

(σ·K, 1) = Card(K ∩ φ(X*)).    (7.9)

The code X being thin and complete, the monoid M has a minimal ideal J that intersects φ(X*). Further, J is a 𝒟-class. Its ℛ-classes (resp. ℒ-classes) are the minimal right ideals (resp. minimal left ideals) of M (see Chap. IV, Sect. 5). Let R be an ℛ-class of J and let L be an ℒ-class of J. Set H = R ∩ L. For each m ∈ M, the function h ↦ hm induces a bijection from H onto the ℋ-class Hm = Lm ∩ R. Similarly, the function h ↦ mh induces a bijection from H onto the ℋ-class mH = L ∩ mR.
for all pairs H, K of #-classes of J contained is the same W-class. We shall first prove that W = (0). The space W is invariant under M.Indeed, let H and K be two %‘-classes contained in some W-class R of J. Then for m E M , (a H ) m = a (Hm) since the right multiplication by m is a bijection from H onto Hm. Thus (a H - o K ) m = a (Hm) - am ( K m ) and the right-hand side is in W since Hm,K m c R . Next for all p E Wand m E J,
-
- -
p-m=O
(7.11)
Indeed, let H and K be two %‘-classes contained in an W-class R of J. Then for m E J, Hm, K m c R n Mm. Since R n M m is an %‘-class, we have Hm = K m = R n Mm.Thus
-
(a H - a K) m = 0. 9
Since p E W is a linear combination of series of the form (7.10) this proves (7.11). Since the representation of M over Vis completely reducible, there exists a supplementaryspace W’ of W which is invariant under M.Set CT = p + p’ with p E W, p‘ E W’. Let H, K be two &-classes of J contained in an W-class R .
7. COMPLETE REDUCIBILITY
We shall prove that

σ·H = σ·K.   (7.12)

We have

σ·H − σ·K = (p·H − p·K) + (p′·H − p′·K).

Since p ∈ W and H, K ⊂ J, it follows from (7.11) that

p·H = p·K = 0.   (7.13)

Next, there exist numbers α_m ∈ Q (m ∈ M), almost all of which vanish, such that p′ = Σ_{m∈M} α_m (σ·m). Since left multiplication is a bijection on ℋ-classes, we have

(σ·m)·H − (σ·m)·K = σ·(mH) − σ·(mK).

Thus, since mH, mK ⊂ mR, the right-hand side is in W. Thus also p′·H − p′·K ∈ W. Since W′ is invariant under M, this element is also in W′. Consequently it vanishes, and

p′·H = p′·K.   (7.14)

Thus (7.12) follows from (7.13) and (7.14). In view of (7.9), formula (7.12) shows that if φ(X*) intersects some ℋ-class H in J, then it intersects all ℋ-classes which are in the ℛ-class containing H. In view of Proposition IV.5.8, this is equivalent to X being suffix.
We conclude by showing that X is prefix. Let T be the subspace of V composed of the elements p ∈ V such that (p·H, 1) = (p·K, 1) for all pairs H, K of ℋ-classes of J contained in a same ℒ-class. The subspace T is invariant under M. Indeed, if p ∈ T and H, K ⊂ L, then for all m ∈ M,

(p·m)·H = p·(mH),   (p·m)·K = p·(mK).   (7.15)

Since mH, mK are in the ℒ-class L, we have by definition ((p·m)·H, 1) = ((p·m)·K, 1). Thus p·m ∈ T. Next, for all m ∈ J and p ∈ V,

p·m ∈ T.   (7.16)
Indeed, let m ∈ J and let H, K be two ℋ-classes contained in the ℒ-class L ⊂ J. Then mH = mK. Thus by (7.15), ((p·m)·H, 1) = ((p·m)·K, 1). Thus p·m ∈ T.
Let T′ be a supplementary space of T which is invariant under M. Again, set

σ = p + p′,

this time with p ∈ T, p′ ∈ T′. Let H, K be two ℋ-classes in J both contained in some ℒ-class L. Then

(σ·H, 1) − (σ·K, 1) = ((p·H, 1) − (p·K, 1)) + ((p′·H, 1) − (p′·K, 1)).

By definition of T, we have (p·H, 1) − (p·K, 1) = 0. In view of (7.16), we have p′·H, p′·K ∈ T, whence p′·H − p′·K ∈ T ∩ T′ = {0}. Thus (σ·H, 1) = (σ·K, 1). Interpreting this equality using (7.9), it follows that if φ(X*) meets some ℋ-class of J, then it intersects all ℋ-classes contained in the same ℒ-class. By Proposition IV.5.8, this shows that X is prefix. □
EXAMPLE 7.3 Let A = {a, b} and let X = {a, ba}. The code X is prefix but not suffix. It is not complete. Let σ = X*. The vectors σ and σ·b form a basis of the vector space V, since

σ·a = σ,   σ·ba = σ,   σ·bb = 0.

In this basis, the matrices of ψ_σ(a) and ψ_σ(b) are

ψ_σ(a) = ( 1 0 )      ψ_σ(b) = ( 0 1 )
         ( 1 0 )               ( 0 0 )

The representation ψ_σ is irreducible. Indeed, ψ_σ(ba) = E₁₁, ψ_σ(b) = E₁₂, ψ_σ(a) − ψ_σ(ba) = E₂₁ and ψ_σ(ab) − ψ_σ(b) = E₂₂, where the E_{ij} denote the matrix units. Thus the matrices ψ_σ(u), u ∈ A*, generate the whole algebra Q^{2×2}. Thus no nontrivial subspace of V is invariant under A*. This example shows that Theorem 7.5 does not hold in general for codes which are not complete.
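The generation of the full matrix algebra in Example 7.3 can be checked mechanically: close the two matrices under products and compute the dimension of their linear span. A minimal Python sketch (the matrices come from the relations stated above; the helper names are ours, not the book's notation):

```python
from itertools import product
from fractions import Fraction

# Matrices of psi(a), psi(b) in the basis (sigma, sigma.b); row i is the
# image of the i-th basis vector, read off from sigma.a = sigma,
# sigma.ba = sigma, sigma.bb = 0.
A = ((1, 0), (1, 0))
B = ((0, 1), (0, 0))
I = ((1, 0), (0, 1))

def mul(m, n):
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

# Close {I, A, B} under products: this is the finite monoid psi(A*).
elems = {I, A, B}
changed = True
while changed:
    changed = False
    for m, n in list(product(elems, repeat=2)):
        p = mul(m, n)
        if p not in elems:
            elems.add(p)
            changed = True

def rank(vectors):
    # Gaussian elimination over Q.
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

flat = [tuple(x for row in m for x in row) for m in elems]
print(rank(flat))  # 4: the psi(u) span all of Q^{2x2}, so psi is irreducible
```

The closure has six elements, and its span is 4-dimensional, which is exactly the irreducibility claim of the example.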
EXERCISES
SECTION 1

1.1. A code X ⊂ A⁺ is called separating if there is a word x ∈ X* such that each w ∈ A* admits a factorization w = uv with xu, vx ∈ X*.
(a) Show that a separating code is complete and synchronous.
(b) Show that a separating code is factorizing and that its factorization is unique.

1.2. Let b ∈ A be a letter and let X ⊂ A⁺ be a finite maximal code such that |x|_b ≤ 1 for all x ∈ X. Let A′ = A − b and let X′ = X ∩ A′*. Show that there is a factorization (P, Q) of X′, considered as a code over A′, such that X = X′ ∪ PbQ.

1.3. Let A = {a, b}. Use Exercise 1.2 to show that a finite code X ⊂ a* ∪ a*ba* is maximal iff X = a^n ∪ PbQ with n ≥ 1 and P, Q ⊂ a* satisfying, as an equality of polynomials,

PQ = 1 + a + ··· + a^{n-1}.

**1.4. Let X, Y ⊂ A⁺ be two distinct finite maximal prefix codes such that X ∩ Y ≠ ∅. Let P = A* − XA*, Q = A* − YA*, and let R ⊂ (X ∩ Y)* be a finite set satisfying uv ∈ R, v ∈ (X ∩ Y)* ⇒ v ∈ R. (This means that R is suffix-closed, considered as a set of words over the alphabet X ∩ Y.)
(a) Show that there is a unique finite maximal code Z ⊂ A⁺ such that Z − 1 = (X ∩ Y − 1)R.
(b) Show that there exists a unique finite maximal code T ⊂ A⁺ such that

T − 1 = (P + wQ)(A − 1)R,

where w is a word of maximal length in Z.
(c) Show that the code T is indecomposable under the following three assumptions: (i) Z is separating; (ii) Card(P ∪ wQ) and Card(R) are prime numbers; (iii) R is not suffix-closed (over the alphabet A).
(d) Compare with Example 1.1, by taking P = {1, a}, Q = {1, a, b}, R = {1, aa}, w = abaa.

1.5. Let A = {a, b} and let

X = (A^2 − b^2) ∪ b^2A,   Y = A^2a ∪ b.

(a) Verify that X is a maximal prefix code and that Y is a maximal suffix code.
(b) Show that the code Z defined by Z* = X* ∩ Y* satisfies

Z − 1 = (1 + A + b^2)((A − 1)a(A − 1) + A − 1)(1 + a + Aa).

[Hint: Show that Z − 1 = (X − 1)P = Q(Y − 1) for some P ⊂ X*, Q ⊂ Y*.]
(c) Show that Z is synchronous but not separating.
(d) Show that Z has two distinct factorizations.
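A concrete instance of Exercise 1.3 can be checked by machine: with n = 2, P = {1, a}, Q = {1} we have PQ = 1 + a, so X = a^2 ∪ PbQ = {aa, b, ab} should be a maximal code. A hedged Python sketch (the Sardinas–Patterson test and the measure criterion for maximality of a thin code — π(X) = 1 for a positive Bernoulli distribution, cf. Chapter I — are standard; the helper names are ours):

```python
from fractions import Fraction

def sardinas_patterson(X):
    """Return True iff X is a code (Sardinas-Patterson criterion)."""
    X = set(X)

    def residuals(U, V):
        # nonempty words w such that uw = v for some u in U, v in V
        return {v[len(u):] for u in U for v in V
                if v.startswith(u) and len(v) > len(u)}

    U = residuals(X, X)
    seen = set()
    while U and not (U & X) and frozenset(U) not in seen:
        seen.add(frozenset(U))
        U = residuals(U, X) | residuals(X, U)
    return not (U & X)

# n = 2, P = {1, a}, Q = {1}: PQ = 1 + a, and X = a^2 u PbQ.
X = {'aa', 'b', 'ab'}
assert sardinas_patterson(X)

# Uniform Bernoulli distribution pi(a) = pi(b) = 1/2.
pi = sum(Fraction(1, 2) ** len(x) for x in X)
print(pi)  # 1, consistent with X being maximal
```

The same test reports that, e.g., {a, ab, ba} is not a code (aba has two factorizations), so the choice of P and Q matters.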
SECTION 2

2.1. Let X ⊂ A⁺ be a finite code. Let 𝒜 = (Q, 1, 1) be an unambiguous trim automaton recognizing X*.
(a) Show that 𝒜 is finite.
(b) Let φ = φ_𝒜 and let t be the matrix defined as in Proposition 2.2. Show that the determinant of the restriction of the matrix I − t to (Q − 1) × (Q − 1) is equal to 1. Deduce from this that det(I − t) = 1 − X. [Hint: Compute the coefficient (1, 1) of the inverse of I − t.]

2.2. Let P ∈ Q⟨A⟩ be a polynomial with (P, 1) = 0. Show that there exist an integer n ≥ 1 and a morphism φ: A* → Q^{n×n} such that (P, w) = φ(w)_{1,n} for all w ∈ A⁺.
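The identity det(I − t) = 1 − X of Exercise 2.1(b) can be spot-checked numerically: substituting a Bernoulli weight for each letter turns both sides into rational numbers. A sketch for X = {a, ba} with π(a) = π(b) = 1/2 (the two-state automaton below is one possible unambiguous choice, not the book's):

```python
from fractions import Fraction

half = Fraction(1, 2)  # Bernoulli distribution pi(a) = pi(b) = 1/2

# Unambiguous trim automaton for X* with X = {a, ba}:
# state 1 (initial and final), state 2; edges 1-a->1, 1-b->2, 2-a->1.
# t is the edge matrix with each letter replaced by its weight.
t = [[half, half],
     [half, Fraction(0)]]

# det(I - t) for a 2x2 matrix.
det = (1 - t[0][0]) * (1 - t[1][1]) - t[0][1] * t[1][0]

pi_X = half + half * half   # pi(a) + pi(ba)
print(det, 1 - pi_X)        # 1/4 1/4
```

Both sides evaluate to 1/4, as the exercise predicts.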
SECTION 3

The notation of Section 3 is assumed here and in the following exercise.

3.1. Let 𝒜 = (Q, 1, 1) be a complete deterministic trim automaton and let Q = {1, 2, ..., n}.
(a) Show that for each R ∈ Γ the vector c_R is colinear to the vector … .
(b) Show that for each L ∈ Λ the vector l_L is colinear to any vector … for m ∈ L.

3.2. Let e be an idempotent in the minimal ideal J of M, let S = Fix(e), and let e = cl be the column-row decomposition of e. Set R = Me, L = eM.
(a) Show that c_R is colinear to the vector Σ_{s∈S} c_s.
(b) Show that l_L is colinear to the vector Σ_{s∈S} l_s.
SECTION 4

**4.1. Let X ⊂ A⁺ be a set. A word x ∈ X is said to be a pure square for X if (i) x = w^2 for some w ∈ A⁺, and (ii) X ∩ wA* ∩ A*w = {x}.
(a) Let X ⊂ A⁺ be a finite maximal prefix code and let x = w^2 be a pure square for X. Set G = Xw^{-1} and D = w^{-1}X. Show that the polynomial

δ = (1 + w)(X − 1 + (G − 1)w(D − 1)) + 1

is the characteristic series of a finite maximal prefix code denoted by δ_w(X). [Hint: First show that δ has positive coefficients by setting G₁ = G − w, D₁ = D − w and observing that

δ = (X − wD₁ − Gw) + G₁wD + w(X − Gw) + wGwD.]

Show that the polynomial

γ = (X − 1 + (G − 1)w(D − 1))(1 + w) + 1

is the characteristic series of a finite maximal code denoted by γ_w(X). [Hint: Show that there exists a set P ⊂ A* such that γ − 1 = P(A − 1)(w + 1).]
(b) Let X ⊂ A⁺ be a finite maximal prefix code. Show that if x = w^2 is a pure square for X, then x^2 is a pure square for γ_w(X).
(c) Let X ⊂ A⁺ be a finite maximal prefix code and let x = w^2 be a pure square for X. Show that the codes Y = γ_w(X) and Z = δ_w(X) have the same degree. [Hint: Show that there is a bijection between Y-interpretations and Z-interpretations of a word.]
(d) Let X be a finite maximal biprefix code, let x = w^2 be a pure square for X, and let Y = γ_w(X). Show that d(X) = d(Y). [Hint: Show that {1, w} is a maximal set of left contexts and use Proposition 4.2.]
(e) Let X be a finite maximal biprefix code, let x = w^2 be a pure square for X, and let Y = γ_w(X). By (b) the word x^2 is a pure square for Y. Let Z = δ_{w^2}(Y). Show that d(Z) = d(X). [Hint: Show that {1, w, w^2, w^3} is a maximal set of left contexts for T = γ_{w^2}(Y). Deduce from Proposition 4.2 that d(T) = d(X). Then use (c) to show that d(Z) = d(T).]
(f) Show that if d(X) is a prime number and d(X) > 2, then the code Z of (e) does not admit any decomposition over a suffix or a prefix code.
(g) Use the above construction to show that for each prime number d ≥ 3, there exist finite maximal codes of degree d which are indecomposable and are neither prefix nor suffix.
SECTION 6

6.1. Let X ⊂ A⁺ be a circular code. Show that the series log A* − log X* has nonnegative coefficients (use Proposition VII.1.5). Deduce that X is commutatively prefix.

6.2. Let X be the circular code X = {a, ab, c, acb}. Show that there is no bijection α: X → Y of X onto a prefix code Y such that α(x) is a conjugate of x for all x ∈ X.

6.3. Let X ⊂ A⁺ be a finite set and assume that there is a letter a ∈ A such that X ∩ a* = a^n for some n ≥ 1. Let b ∈ A − {a} be another letter.
(a) Show that if X is a code, then Card(X ∩ a*ba*) ≤ n.
(b) Show that if X is complete, then Card(X ∩ a*ba*) ≥ n.
(c) Show that if X is a maximal code, then Card(X ∩ a*ba*) = n.
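Exercise 6.3 can be illustrated on maximal codes of the shape given in Exercise 1.3: for X = a^n ∪ PbQ with PQ = 1 + a + ··· + a^{n-1}, one expects exactly n words in a*ba*. A small Python sketch (the particular instances and all helper names are ours):

```python
import re
from fractions import Fraction

def bayonet_code(n, P, Q):
    """X = a^n u PbQ, where P and Q are sets of powers of a,
    given as lists of exponents, with PQ = 1 + a + ... + a^(n-1)."""
    return {'a' * n} | {'a' * p + 'b' + 'a' * q for p in P for q in Q}

def check(n, P, Q):
    X = bayonet_code(n, P, Q)
    # Words of X lying in a*ba* ("bayonet" words).
    bayonets = [x for x in X if re.fullmatch(r'a*ba*', x)]
    # Uniform Bernoulli measure; it equals 1 for these maximal codes.
    measure = sum(Fraction(1, 2) ** len(x) for x in X)
    return len(bayonets), measure

print(check(3, [0, 1, 2], [0]))   # PQ = (1 + a + a^2) * 1
print(check(4, [0, 1], [0, 2]))   # PQ = (1 + a)(1 + a^2)
```

In both instances the bayonet count equals n and the measure equals 1, matching parts (b) and (c) of the exercise.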
SECTION 7

7.1. Let 𝒜 = (Q, i, T) be a finite automaton. The aim of this exercise is to construct the syntactic representation of the series σ = |𝒜| ∈ N⟨⟨A⟩⟩. Let φ be the representation associated with 𝒜 and let M = φ(A*). We may assume Q = {1, 2, ..., n} and i = 1. Let E = Q^n, let E₀ be the subspace of E generated by the vectors m₁ (the first row of m), for m ∈ φ(A*). Let E₋₁ be the subspace of E₀ composed of all vectors l ∈ E₀ such that for all m ∈ M, Σ_{t∈T} (lm)_t = 0. Show that the linear function α: E₀ → V_σ defined by α: φ(u)₁ ↦ σ·u has kernel E₋₁. Deduce from this a method for computing a basis of V_σ and the matrices of ψ_σ(a) in this basis, for a ∈ A.

7.2. Let S ⊂ A⁺ and let σ = S (as a series). Show that V_σ has finite dimension iff S is recognizable (use Exercise 7.1).

7.3. Let K be a commutative field and let σ ∈ K⟨⟨A⟩⟩. The syntactic representation of σ over K is defined as in the special case K = Q. Recall that the characteristic of a field K is the greatest common divisor of all integers n such that n·1 = 0 in K. Let X be a very thin biprefix code and let K be a field whose characteristic either is 0 or is prime to the order of G(X). Show that the syntactic representation of X* over K is completely reducible.

7.4. Let X be a very thin biprefix code. Show that if X is synchronous, then ψ_σ(A*), where σ = X*, is irreducible.
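For Exercises 7.1–7.2, the dimension of V_σ can be estimated mechanically: dim V_σ equals the rank of the infinite matrix ((σ, uv))_{u,v ∈ A*}, and for a recognizable series a finite block of words already attains it. A hedged Python sketch for σ = X* with X = {a, ba}, where the membership test uses the fact that X* consists exactly of the words with no factor bb that do not end in b (the block size L = 3 is our choice, not part of the exercise):

```python
from itertools import product
from fractions import Fraction

A = 'ab'

def in_Xstar(w):
    # w in {a, ba}*  iff  w has no factor 'bb' and does not end in 'b'
    return 'bb' not in w and not w.endswith('b')

L = 3
words = [''.join(p) for k in range(L + 1) for p in product(A, repeat=k)]

# Finite block of the matrix ((sigma, uv)); row u is sigma.u sampled on words.
H = [[1 if in_Xstar(u + v) else 0 for v in words] for u in words]

def rank(rows):
    # Gaussian elimination over Q.
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(r, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col]:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

print(rank(H))  # 2, matching the basis (sigma, sigma.b) of Example 7.3
```

The rank stabilizes at 2 already for this small block, in agreement with the two-dimensional space V computed in Example 7.3.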
NOTES

Theorem 1.1 is due to Schützenberger (1965b). The proof given here is essentially that of this paper, with some additions from Hansel et al. (1984).
The problem of characterizing commutatively prefix codes has an equivalent formulation in terms of the optimality of prefix codes with respect to certain cost functions, namely the average length of the code for a given weight distribution on the letters. In this context it has been treated in several papers, in particular in Carter and Gill (1961) and Karp (1961). The codes of Example 6.3 have been studied more recently under the name of bayonet codes (Hansel, 1982; Pin and Simon, 1982; De Felice, 1983). Example 6.4 is due to Shor (1983). It is a counterexample to a conjecture of Perrin and Schützenberger (1981). A particular case of commutatively prefix codes is studied in Mauceri and Restivo (1981).
Results of Section 7 are due to Reutenauer (1981). The syntactic representation appears for the first time in Schützenberger (1961a). It has been developed more systematically in Fliess (1974) and in Reutenauer (1980). Theorem 7.2 is Maschke's theorem. The property, for an algebra of matrices, of being completely reducible is equivalent to that of being semisimple (see, e.g., Herstein, 1969). Thus Theorem 7.3 expresses that the syntactic algebra ψ_σ(A*), for σ = X*, X a thin biprefix code, is semisimple. This theorem can be considered a generalization of Maschke's theorem.
Exercises 1.1 and 1.5 are from Boë (1981), 1.3 from Restivo (1977), 1.4 from Boë (1978), 2.2 from Hansel et al. (1984), 4.1 from Perrin (1977a) and Vincent (1982), 6.1 from Perrin and Schützenberger (1977) and 6.3 from Perrin and Schützenberger (1976). A new proof of Theorem 1.1 has been obtained recently by Reutenauer (1983a). Further developments can be found in Reutenauer (1984a,b), where the following result is proved: for any finite maximal code X ⊂ A⁺ there exist polynomials P, Q ∈ Z⟨A⟩ such that

1 − X = P(1 − A)Q.

If one could prove that, in fact, P, Q can be chosen in N⟨A⟩, it would imply that any finite maximal code is factorizing and, therefore, commutatively prefix. This is the main question left open in the theory of codes.
References
Aho, A., Hopcroft, J., and Ullman, J. 1975. "The Design and Analysis of Computer Algorithms." Reading, Massachusetts: Addison-Wesley.
Apostolico, A., and Giancarlo, R. 1984. Pattern matching implementation of a fast test for unique decipherability. Inform. Process. Lett. (to appear).
Arnold, A., and Latteux, M. 1981. A new proof of two theorems about rational transductions. Theoret. Comput. Sci. 8, 261-263.
Ash, R. B. 1965. "Information Theory." New York: Wiley (Interscience).
Bandyopadhyay, G. 1963. A simple proof of the decipherability criterion of Sardinas and Patterson. Inform. and Control 6, 331-336.
Berstel, J. 1979. "Transductions and Context-Free Languages." Stuttgart: Teubner.
Berstel, J., and Reutenauer, C. 1984. "Les séries rationnelles et leurs langages." Paris: Masson.
Berstel, J., Perrin, D., Perrot, J. F., and Restivo, A. 1979. Sur le théorème du défaut. J. Algebra 60, 169-180.
Blanchard, F., and Perrin, D. 1980. Relèvement d'une mesure ergodique par un codage. Z. Wahrsch. Verw. Gebiete 54, 303-311.
Blum, E. K. 1965. Free subsemigroups of a free semigroup. Michigan Math. J. 12, 179-182.
Bobrow, L. S., and Hakimi, S. L. 1969. Graph theoretic prefix codes and their synchronization properties. Inform. and Control 15, 70-94.
Boë, J. M. 1972. Sur les codes à inclusion. Inform. and Control 4, 291-299.
Boë, J. M. 1976. "Représentations des monoïdes: applications à la théorie des codes." Thèse de 3e cycle, Montpellier.
Boë, J. M. 1978. Une famille remarquable de codes indécomposables. In "Automata, Languages and Programming," Lecture Notes in Comput. Sci., 105-112.
Boë, J. M. 1981. Sur les codes synchronisants coupants. In "Non-commutative Structures in Algebra and Geometric Combinatorics" (A. De Luca, ed.), Quaderni de la ricerca scientifica 109, 7-10.
Boë, J. M., Boyat, J., Bordat, J. P., and Césari, Y. 1979. Une caractérisation des sous-monoïdes libérables. In "Théorie des codes" (D. Perrin, ed.), 9-20. Paris: LITP.
Boë, J. M., De Luca, A., and Restivo, A. 1980. Minimal completable sets of words. Theoret. Comput. Sci. 12, 325-332.
Brzozowski, J. A. 1967. Roots of star events. J. Assoc. Comput. Mach. 14, 466-477.
Carter, L., and Gill, J. 1961. Conjectures on uniquely decipherable codes. IRE Trans. Inform. Theory IT-7, 27-38.
Černý, J. 1964. Poznámka k homogénnym experimentom s konečnými automatmi. Mat.-Fyz. Čas. SAV 14, 208-215.
Césari, Y. 1972. Sur un algorithme donnant les codes bipréfixes finis. Math. Systems Theory 6, 221-225.
Césari, Y. 1974. Sur l'application du théorème de Suschkevitch à l'étude des codes rationnels complets. In "Automata, Languages and Programming," Lecture Notes in Comput. Sci., 342-350. Berlin and New York: Springer-Verlag.
Césari, Y. 1979. Propriétés combinatoires des codes bipréfixes. In "Théorie des codes" (D. Perrin, ed.), 20-46. Paris: LITP.
Choffrut, C. 1979. Une caractérisation des codes à délai borné par leur fonction de décodage. In "Théorie des codes" (D. Perrin, ed.). Paris: LITP.
Clifford, A. H., and Preston, G. B. 1961. "The Algebraic Theory of Semigroups," Vol. 1. Providence, Rhode Island: Amer. Math. Soc.
Cohn, P. M. 1962. On subsemigroups of free semigroups. Proc. Amer. Math. Soc. 13, 347-351.
Cohn, P. M. 1971. "Free Rings and Their Relations." New York: Academic Press.
Csiszár, I., and Körner, J. 1981. "Information Theory: Coding Theorems for Discrete Memoryless Systems." New York: Academic Press.
De Felice, C. 1983. A note on the triangle conjecture. Inform. Process. Lett. 14, 197-200.
De Felice, C., and Restivo, A. 1984. Some results on finite maximal codes. RAIRO Inform. Théor. (to appear).
De Luca, A. 1976. A note on variable length codes. Inform. and Control 32, 263-271.
De Luca, A., and Restivo, A. 1980a. On some properties of very pure codes. Theoret. Comput. Sci. 10, 157-170.
Devitt, J. S., and Jackson, D. M. 1981. Comma-free codes: an extension of certain enumerative techniques to recursively defined sequences. J. Combin. Theory Ser. A 30, 1-18.
Eastman, W. L. 1965. On the construction of comma-free codes. IEEE Trans. Inform. Theory IT-11, 263-267.
Ehrenfeucht, A., and Rozenberg, G. 1978. Elementary homomorphisms and a solution to the D0L sequence equivalence problem. Theoret. Comput. Sci. 7, 169-184.
Ehrenfeucht, A., and Rozenberg, G. 1983. Each regular code is included in a regular maximal code. RAIRO Inform. Théor. (to appear).
Eilenberg, S. 1974. "Automata, Languages, and Machines," Vol. A. New York: Academic Press.
Eilenberg, S. 1976. "Automata, Languages, and Machines," Vol. B. New York: Academic Press.
Elgot, C., and Mezei, G. 1965. On relations defined by generalized finite automata. IBM J. Res. Develop. 9, 47-65.
Feller, W. 1957. "An Introduction to Probability Theory and Its Applications." New York: Wiley.
Fliess, M. 1974. Matrices de Hankel. J. Math. Pures Appl. 53, 197-222.
Gilbert, E. N. 1960. Synchronization of binary messages. IRE Trans. Inform. Theory IT-6, 470-477.
Gilbert, E. N., and Moore, E. F. 1959. Variable-length binary encodings. Bell System Tech. J. 38, 933-967.
Golomb, S. W., and Gordon, B. 1965. Codes with bounded synchronization delay. Inform. and Control 8, 355-377.
Golomb, S. W., Gordon, B., and Welch, L. R. 1958. Comma-free codes. Canad. J. Math. 10, 202-209.
Grenander, U. 1963. "Probabilities on Algebraic Structures." New York: Wiley.
Guibas, L. J., and Odlyzko, A. 1978. Maximal prefix-synchronized codes. SIAM J. Appl. Math. 35, 401-418.
Hansel, G. 1982. Baïonnettes et cardinaux. Discrete Math. 39, 331-335.
Hansel, G., and Perrin, D. 1983. Codes and Bernoulli partitions. Math. Systems Theory 16, 133-157.
Hansel, G., Perrin, D., and Reutenauer, C. 1984. Factorizing the commutative polynomial of a code. Trans. Amer. Math. Soc. 285, 91-105.
Hashiguchi, K., and Honda, N. 1976a. Homomorphisms that preserve star height. Inform. and Control 30, 247-266.
Hashiguchi, K., and Honda, N. 1976b. Properties of code events and homomorphisms over regular events. J. Comput. System Sci. 12, 352-367.
Herstein, I. N. 1969. "Noncommutative Rings," Carus Mathematical Monographs. New York: Wiley.
Huppert, B. 1967. "Endliche Gruppen." Berlin and New York: Springer-Verlag.
Huppert, B., and Blackburn, N. 1982. "Finite Groups II and III." Berlin and New York: Springer-Verlag.
Jiggs, B. H. 1963. Recent results in comma-free codes. Canad. J. Math. 15, 178-187.
Karhumäki, J. 1984. A property of three-element codes. In "STACS 84," Lecture Notes in Comput. Sci. 166, 305-313.
Karp, R. M. 1961. Minimum-redundancy codes for the discrete noiseless channel. IRE Trans. Inform. Theory IT-7, 27-38.
Knuth, D. 1973. "The Art of Computer Programming. Vol. 1: Fundamental Algorithms." Reading, Massachusetts: Addison-Wesley.
Lallement, G. 1979. "Semigroups and Combinatorial Applications." New York: Wiley.
Lallement, G., and Perrin, D. 1981. A graph covering construction of all the finite complete biprefix codes. Discrete Math. 36, 261-271.
Lassez, J. L. 1973. Prefix codes and isomorphic automata. Internat. J. Comput. Math. 3, 309-314.
Lassez, J. L. 1976. Circular codes and synchronization. Internat. J. Comput. System Sci. 5, 201-208.
Lentin, A. 1972. "Équations dans les monoïdes libres." Paris: Gauthier-Villars.
Lentin, A., and Schützenberger, M. P. 1967. A combinatorial problem in the theory of free monoids. In "Combinatorial Mathematics and Its Applications" (R. C. Bose and T. A. Dowling, eds.), 128-144. Chapel Hill: Univ. of North Carolina Press.
Le Rest, E., and Le Rest, M. 1980. Une représentation fidèle des groupes d'un monoïde de relations sur un ensemble fini. Semigroup Forum 21, 167-172.
Levenshtein, V. I. 1964. Some properties of coding and self-adjusting automata for decoding messages. Problemy Kibernet. 11, 63-121. [Russian]
Levi, F. W. 1944. On semigroups. Bull. Calcutta Math. Soc. 36, 141-146.
Lothaire, M. 1983. "Combinatorics on Words." Reading, Massachusetts: Addison-Wesley.
McEliece, R. 1977. "The Theory of Information and Coding." Reading, Massachusetts: Addison-Wesley.
McMillan, B. 1956. Two inequalities implied by unique decipherability. IRE Trans. Inform. Theory IT-2, 115-116.
MacWilliams, F. J., and Sloane, N. J. A. 1977. "The Theory of Error-Correcting Codes." Amsterdam: North-Holland.
Makanin, G. S. 1976. On the rank of equations in four unknowns in a free semigroup. Mat. Sb. (N.S.) 100, 285-311; English translation: Math. USSR-Sb. 29, 257-280.
Margolis, S. W. 1982. On the syntactic transformation semigroup of a language generated by a finite biprefix code. Theoret. Comput. Sci. 21, 225-230.
Markov, Al. A. 1962. On alphabet coding. Soviet Phys. Dokl. 6, 553-554. [Russian]
Martin-Löf, P. 1965. Probability theory on discrete semigroups. Z. Wahrsch. 4, 78-102.
Mauceri, S., and Restivo, A. 1981. A family of codes commutatively equivalent to prefix codes. Inform. Process. Lett. 12, 1-4.
Moore, E. F. 1956. Gedanken experiments on sequential machines. In "Automata Studies," Ann. of Math. Stud. 34, 129-153.
Nivat, M. 1966. Éléments de la théorie générale des codes. In "Automata Theory" (E. Caianiello, ed.), 278-294. New York: Academic Press.
Perles, M., Rabin, M., and Shamir, E. 1963. The theory of definite automata. IEEE Trans. Electronic Computers 12, 233-243.
Perrin, D. 1975. "Codes bipréfixes et groupes de permutations." Thèse, Université Paris 7.
Perrin, D. 1977a. Codes asynchrones. Bull. Soc. Math. France 105, 385-404.
Perrin, D. 1977b. La transitivité du groupe d'un code bipréfixe fini. Math. Z. 153, 283-287.
Perrin, D. 1978. Le degré minimal du groupe d'un code bipréfixe fini. J. Combin. Theory Ser. A 25, 163-173.
Perrin, D. 1979. La représentation ergodique d'un automate fini. Theoret. Comput. Sci. 9, 221-241.
Perrin, D. 1982. Completing biprefix codes. In "Automata, Languages and Programming," Lecture Notes in Comput. Sci. 140, 397-406.
Perrin, D., and Schützenberger, M. P. 1976. Codes et sous-monoïdes possédant des mots neutres. In "Theoretical Computer Science," Lecture Notes in Comput. Sci. 48, 270-281.
Perrin, D., and Schützenberger, M. P. 1981. A conjecture on sets of differences of integer pairs. J. Combin. Theory Ser. B 30, 91-93.
Perrot, J. F. 1972. "Contribution à l'étude des monoïdes syntaxiques et de certains groupes associés aux automates finis." Thèse, Université de Paris.
Perrot, J. F. 1974. Groupes de permutations associés aux codes préfixes finis. In "Permutations," 19-35. Paris: Gauthier-Villars.
Perrot, J. F. 1977. Informatique et algèbre: la théorie des codes à longueur variable. In "Theoretical Computer Science," Lecture Notes in Comput. Sci. 48, 27-44.
Pin, J. E. 1978. "Le problème de la synchronisation; contribution à l'étude de la conjecture de Černý." Thèse de 3e cycle, Université Paris 6.
Pin, J. E. 1979. Une caractérisation de trois variétés de langages bien connues. In "Theoretical Computer Science," Lecture Notes in Comput. Sci. 67, 233-243.
Pin, J. E. 1984. "Variétés de langages formels." Paris: Masson.
Pin, J. E., and Sakarovitch, J. 1983. Some operations and transductions that preserve rationality. In "Theoretical Computer Science," Lecture Notes in Comput. Sci. 145, 277-288.
Pin, J. E., and Simon, I. 1982. A note on the triangle conjecture. J. Combin. Theory Ser. A 32, 106-109.
Reis, C., and Thierrin, G. 1979. Reflective star languages and codes. Inform. and Control 42, 1-9.
Restivo, A. 1974. On a question of McNaughton and Papert. Inform. and Control 25, 1.
Restivo, A. 1975. A combinatorial property of codes having finite synchronization delay. Theoret. Comput. Sci. 1, 95-101.
Restivo, A. 1977. On codes having no finite completions. Discrete Math. 17, 309-316.
Reutenauer, C. 1980. Séries formelles et algèbres syntaxiques. J. Algebra 66, 448-483.
Reutenauer, C. 1981. Semisimplicity of the algebra associated to a biprefix code. Semigroup Forum 23, 327-342.
Reutenauer, C. 1983. Sur un théorème de Schützenberger. C. R. Acad. Sci. Paris 296, 619-621.
Reutenauer, C. 1984a. Sulla fattorizzazione dei codici. Ricerche di Matematica XXXII, 115-130.
Reutenauer, C. 1984b. Noncommutative factorization of variable-length codes. J. Pure Appl. Algebra (to appear).
Riley, J. A. 1967. The Sardinas-Patterson and Levenshtein theorems. Inform. and Control 10, 120-136.
Rindone, G. 1983. "Groupes finis et monoïdes syntaxiques." Thèse de 3e cycle, Université Paris 7.
Rodeh, M. 1982. A fast test for unique decipherability based on suffix trees. IEEE Trans. Inform. Theory IT-28, 648-651.
Salomaa, A. 1981. "Jewels of Formal Language Theory." Washington, D.C.: Computer Science Press.
Sardinas, A. A., and Patterson, C. W. 1953. A necessary and sufficient condition for the unique decomposition of coded messages. IRE Internat. Conv. Rec. 8, 104-108.
Scholtz, R. A. 1966. Codes with synchronization capability. IEEE Trans. Inform. Theory IT-12, 135-142.
Scholtz, R. A. 1969. Maximal and variable-length comma-free codes. IEEE Trans. Inform. Theory IT-15, 300-306.
Schützenberger, M. P. 1955. "Une théorie algébrique du codage." Séminaire Dubreil-Pisot, 1955-1956, Exposé No. 15.
Schützenberger, M. P. 1956. On an application of semigroup methods to some problems in coding. IRE Trans. Inform. Theory IT-2, 47-60.
Schützenberger, M. P. 1961a. On the definition of a family of automata. Inform. and Control 4, 245-270.
Schützenberger, M. P. 1961b. On a special class of recurrent events. Ann. Math. Statist. 32, 1201-1213.
Schützenberger, M. P. 1961c. On a family of submonoids. Publ. Math. Inst. Hungar. Acad. Sci. Ser. A VI, 381-391.
Schützenberger, M. P. 1964. On the synchronizing properties of certain prefix codes. Inform. and Control 7, 23-36.
Schützenberger, M. P. 1965a. On a factorization of free monoids. Proc. Amer. Math. Soc. 16, 21-24.
Schützenberger, M. P. 1965b. Sur certains sous-monoïdes libres. Bull. Soc. Math. France 93, 209-223.
Schützenberger, M. P. 1965c. Sur une question concernant certains sous-monoïdes libres. C. R. Acad. Sci. Paris 261, 2419-2420.
Schützenberger, M. P. 1966. On a question concerning certain free submonoids. J. Combin. Theory 1, 437-442.
Schützenberger, M. P. 1967. On synchronizing prefix codes. Inform. and Control 11, 396-401.
Schützenberger, M. P. 1979. A property of finitely generated submonoids of free monoids. In "Algebraic Theory of Semigroups" (G. Pollák, ed.), 545-576. Amsterdam: North-Holland.
Shevrin, L. N. 1960. On subsemigroups of free semigroups. Soviet Math. Dokl. 1, 892-894.
Shor, P. W. 1983. A counterexample to the triangle conjecture. J. Combin. Theory Ser. A.
Spehner, J. C. 1975. Quelques constructions et algorithmes relatifs aux sous-monoïdes d'un monoïde libre. Semigroup Forum 9, 334-353.
Spehner, J. C. 1976. "Quelques problèmes d'extension, de conjugaison et de présentation des sous-monoïdes d'un monoïde libre." Thèse, Université Paris 7.
Stryer, L. 1975. "Biochemistry." San Francisco: Freeman.
Tilson, B. 1972. The intersection of free submonoids is free. Semigroup Forum 4, 345-350.
van Lint, J. H. 1982. "Introduction to Coding Theory." Berlin and New York: Springer-Verlag.
Viennot, G. 1974. "Algèbres de Lie libres et monoïdes libres." Thèse, Université Paris 7.
Viennot, G. 1978. "Algèbres de Lie libres et monoïdes libres." Lecture Notes in Mathematics 691.
Vincent, M. 1982. "Une famille de codes indécomposables." Thèse de 3e cycle, Montpellier.
Wielandt, H. 1964. "Finite Permutation Groups." New York: Academic Press.
Ycas, M. 1969. "The Biological Code." Amsterdam: North-Holland.
Index
A
Alphabet, 4 Alphabetic ordering, 362 Alternating group, 32, 281 Aperiodic set, 378 Automaton, 10 asynchronous, 16 behavior of, 183 complete, 12 deterministic, 11 edges of, 11 literal, 91 minimal, 13, 92 ordered, 334 path in, 11, 183 of a prefix code, 90 reduced, 12 representation associated with, 183 square of, 187 state space of, 386 transition monoid of, 14 trim, 11, 184 trim part of, 184 unambiguous, 183 Average length, 124
B
Bayonet codes, 423 Bernoulli distribution, 54, 160, 172, 290, 394 positive, 54 Biological code, 377 Biprefix code, 41, 141 degree of, 152 derived, 158 group of, 261 indicator of, 142 insufficient, 173 kernel of, 163, 280 maximal, 144 tower over, 154 Biprefix set, 40 Bisection, 364
C
Catalan numbers, 105 Chain, 96 Circular code, 322 Code, 38 average length of, 295
Code (continued) biprefix, 41 circular, 322 coding morphism for, 38 comma-free, 336, 346 commutatively prefix, 405, 421 complete, 62 deciphering delay of, 128, 407 degree of, 234, 309 elementary, 286 factorization of, 380 finite deciphering delay, 128, 270 group of, 234 indecomposable, 74 limited, 330 maximal, 41 prefix, 41 separating, 418 suffix, 41 synchronous, 239, 248, 325, 381, 398 thin, 65 two elements, 49, 374 uniform, 39 uniformly synchronous, 331 very thin, 224 Codes composable, 71 composition of, 71 Comma-free code, 336, 346 Commutatively prefix, 405, 421 Commutativity class, 405 Completely reducible, 411 Complete set, 62 Composed code, 197 Congruence, 3 Conjugacy class, 7, 326 primitive, 338 Conjugacy equivalence, 7 Conjugate words, 7, 323 Contexts, 312 left, 396 right, 131, 396 Contextual probability, 314, 320 Cylindric measure, 121
D
𝒟-class, 20, 206 regular, 23 Deciphering delay, 128, 270, 407 Decoding function, 200 Defect theorem, 48 Dense, 62 Density, 291, 292, 394 Derived code, 158 Dihedral group, 216, 247 Dyck code, 48, 59, 62, 65, 68, 80, 126, 135, 153
E
Entropy, 137 Ergodic representation, 273 Extendable word, 131, 316
F
Factor, 5 left, 5 right, 6 Factorization, 6 of the free monoid, 351 complete, 360 finite, 364 Fibonacci number, 184 Flower automaton, 190, 324 Free hull, 48 Frobenius group, 266, 275 Function image of, 215 nuclear equivalence of, 215
G
Group code, 47, 70, 262 Group induced, 241 Group of units, 3, 22 Group alternating, 32 dihedral, 216, 247 free, 10, 47 primitive, 35, 246 symmetric, 32 transitive, 32
H
ℋ-class, 19 Hadamard product, 32 Hall sequence, 342 Huffman encoding, 137
I
Ideal 0-minimal, 18 left, 18 minimal, 18 right, 18 two-sided, 18 Idempotent, 3, 203 column-row decomposition of, 204 monoid localized at, 204 Image, 230 Imprimitivity equivalence, 34, 241 Imprimitivity quotient, 34, 241 Index of a cyclic monoid, 3 of a subgroup, 32, 70, 262 Induced groups, 34 Initial part of a set, 85 Internal factor, 151, 269 Internal transformation, 148, 170 Interpretation, 141, 236 Interpretations adjacent, 236 disjoint, 236 Invariant distribution, 319
J
𝒥-class, 19
K
Kernel, 163, 280
L
ℒ-class, 19 Lazard set, 361 Left completable word, 309 Left contexts, 315 Length distribution, 337 of a circular code, 340 Letter, order of, 68, 117, 169, 317 Locally finite, 30 Logarithm, 352 Lyndon word, 363 standard factorization of, 376
M
Maschke theorem, 423 Mathieu group, 287 Maximal decomposition, 243 Measure of a set, 55 Minimal image, 230 Monoid, 2 cyclic, 3 𝒟-class in, 20, 206 free, 5 ℋ-class in, 19 of K-relations, 29 ℒ-class in, 19 linear representation of, 411 prime, 24, 221 ℛ-class in, 19 syntactic, 14, 258, 259, 264, 378, 413 of unambiguous relations, 202, 203 equivalent, 207 ℒ-representation of, 212 minimal rank of, 220 ℛ-representation of, 212 transitive, 203 well founded, 300 Morphism, 2 continuous, 353 Möbius function, 8, 339 Möbius Inversion Formula, 9, 340
N
Nuclear equivalence, 230
P
Parse, 141 Passing system, 206 Path, 11, 182 end of, 11 label of, 11, 182 null, 182 origin of, 11 simple, 189 successful, 11, 183 Period of a cyclic monoid, 3 Permutation group, 32 degree of, 33 doubly transitive, 36, 278 k-fold transitive, 35
492
INDEX
Permutation group (continued) minimal degree of, 266 primitive, 35, 246, 268 realizable, 279 regular, 35, 250, 263 transitive. 32 Polynomial, 29 Prefix code, 41, 84 maximal, 98 synchronous, 115, 137 Prefix set, 40 Prefix-closed set, 6 Prefix-synchronized code, 378 Prime monoid, 24, 221 Probability measure, 303 idempotent, 318 Probability measures, convergence of, 304
Q Quasipower, 178
R %-class, 19 Radius of' convergence, 293 Rank, 116. 217 Rational set. 17 Recognizable set, 15, 69, 136, 176, 255, 318, 375, 422 Reduction, 193 Regular permutation group, 35, 250, 263 Relation minimal decomposition of, 217 rank of, 217 unambiguous, 202 Right completable word, 98, 309 Right complete set, 98 Right contexts, 314 Right cosets, 32 Right dense set, 98 Right ideal, 18 base of', 85 S
Sandwich matrix, 257 Semaphore code, 107, 126, 136, 159, 248, 259, 375
Semigroup, 3 depth, 269 nil-simple, 269 Semiring, 26 Boolean, 26 complete, 27 ordered, 26 Series, 29 characteristic, 31 commutatively equivalent, 404 cyclically null, 354 density, 292, 394 logarithm of, 352 support, 30 syntactic representation of, 409 Simplifying word, 130, 316 Stable submonoid, 43, 131, 256, 324 Stabilizer, 95 State, 10 accessible, 11, 184 coaccessible, 11, 184 initial, 10 terminal, 10 States, inseparable. 12 Stationary distribution, 319 Submonoid, 2 base of, 43 biunitary, 45 completely reducible, 41 1 left unitary, 45 pure, 323, 374, 375 right unitary, 45 srable, 43, 131, 256, 324 very pure, 323 Subspace invariant, 386 orthogonal, 386 Suffix set, 40 Suschkewitch group, 224, 234 Synchronization delay, 331, 346 Synchronizing pair, 240, 331 Synchronizing word, 115 Syntactic congruence, 14 Syntactic monoid, 14, 258, 259, 264, 378, 413 Syntactic representation, 409 System of coordinates, 210
T Thin set, 65, 2Y3
INDEX
433
Transducer, 260 Transition function, I 1 Trisection, 370
U Unambiguous automaton, 180 Unambiguous product, 31 Unambiguous relation, 202 decomposition of, 218 fixed point of, 203 minimal decomposition of, 218
W Weak Euclidian algorithm, 179 Word, 4 completable, 61, 309 empty, 5
exponent of, 7 full, 173 factor of, 5 interpretation of, 236 left factor of, 5 ordered factorization of, 351 palindrome, 176 point in, 142 primitive, 6 reverse of, 6 right factor of, 6 root of, 7 unbordered, 9, 62, 129, 178 X-exponent, 325 X-factorization of, 6 X-primitive, 325 Words conjugate, 323 X-conjugate, 325
Pure and Applied Mathematics
A Series of Monographs and Textbooks

Editors: Samuel Eilenberg and Hyman Bass
Columbia University, New York

RECENT TITLES
CARL L. DEVITO. Functional Analysis
MICHIEL HAZEWINKEL. Formal Groups and Applications
SIGURDUR HELGASON. Differential Geometry, Lie Groups, and Symmetric Spaces
ROBERT B. BURCKEL. An Introduction to Classical Complex Analysis: Volume 1
JOSEPH J. ROTMAN. An Introduction to Homological Algebra
C. TRUESDELL AND R. G. MUNCASTER. Fundamentals of Maxwell's Kinetic Theory of a Simple Monatomic Gas: Treated as a Branch of Rational Mechanics
BARRY SIMON. Functional Integration and Quantum Physics
GRZEGORZ ROZENBERG AND ARTO SALOMAA. The Mathematical Theory of L Systems
DAVID KINDERLEHRER AND GUIDO STAMPACCHIA. An Introduction to Variational Inequalities and Their Applications
H. SEIFERT AND W. THRELFALL. A Textbook of Topology; H. SEIFERT. Topology of 3-Dimensional Fibered Spaces
LOUIS HALLE ROWEN. Polynomial Identities in Ring Theory
DONALD W. KAHN. Introduction to Global Analysis
DRAGOS M. CVETKOVIC, MICHAEL DOOB, AND HORST SACHS. Spectra of Graphs
ROBERT M. YOUNG. An Introduction to Nonharmonic Fourier Series
MICHAEL C. IRWIN. Smooth Dynamical Systems
JOHN B. GARNETT. Bounded Analytic Functions
EDUARD PRUGOVECKI. Quantum Mechanics in Hilbert Space, Second Edition
M. SCOTT OSBORNE AND GARTH WARNER. The Theory of Eisenstein Systems
K. A. ZHEVLAKOV, A. M. SLIN'KO, I. P. SHESTAKOV, AND A. I. SHIRSHOV. Rings That Are Nearly Associative; Translated by Harry Smith
JEAN DIEUDONNÉ. A Panorama of Pure Mathematics; Translated by I. Macdonald
JOSEPH G. ROSENSTEIN. Linear Orderings
AVRAHAM FEINTUCH AND RICHARD SAEKS. System Theory: A Hilbert Space Approach
ULF GRENANDER. Mathematical Experiments on the Computer
HOWARD OSBORN. Vector Bundles: Volume 1, Foundations and Stiefel-Whitney Classes
K. P. S. BHASKARA RAO AND M. BHASKARA RAO. Theory of Charges
RICHARD V. KADISON AND JOHN R. RINGROSE. Fundamentals of the Theory of Operator Algebras, Volume I
EDWARD B. MANOUKIAN. Renormalization
BARRETT O'NEILL. Semi-Riemannian Geometry: With Applications to Relativity
LARRY C. GROVE. Algebra
E. J. MCSHANE. Unified Integration
STEVEN ROMAN. The Umbral Calculus
JOHN W. MORGAN AND HYMAN BASS (Eds.). The Smith Conjecture
SIGURDUR HELGASON. Groups and Geometric Analysis: Integral Geometry, Invariant Differential Operators, and Spherical Functions
ISAAC CHAVEL. Eigenvalues in Riemannian Geometry
W. D. CURTIS AND F. R. MILLER. Differential Manifolds and Theoretical Physics
JEAN BERSTEL AND DOMINIQUE PERRIN. Theory of Codes

IN PREPARATION
ROBERT B. BURCKEL. An Introduction to Classical Complex Analysis: Volume 2
RICHARD V. KADISON AND JOHN R. RINGROSE. Fundamentals of the Theory of Operator Algebras, Volume II
A. P. MORSE. A Theory of Sets, Second Edition
E. R. KOLCHIN. Differential Algebraic Groups