Jean Berstel Christophe Reutenauer
Rational Series and Their Languages November 6, 2006
c 2006 Jean Berstel and Christophe Reutenauer
c English version Springer-Verlag 1988
c French edition Les s´eries rationnelles et leurs langages Masson 1984
ii
Preface to the electronic edition This electronic edition of the English edition is at the date of November 6, 2006, an almost literal transposition of the original text. The modifications concern notation, several proofs, one new statement, and some new exercises. It is planned that the electronic edition will progressively be modified, with inclusion of some new material, but will remain basically of the same size and of the same algebraic style. Notation Alphabets are named A, B, C, . . . instead of X, Y, Z, . . ., letters are a, b, c, . . . instead of x, y, z, . . .. Terminology prefix, suffix replaces left, right factor. New statements Corollary III.2.2 is a new statement. New exercises Exercise VIII.2.3 is new.
Marne-la-Vall´ee — Montr´eal Jean Berstel
November 6, 2006 Christophe Reutenauer
iii
iv
Preface
Preface to the first English edition This book is an introduction to rational formal power series in several noncommutative variables and their relations to formal languages and to the theory of codes. Formal power series have long been used in all branches of mathematics. They are invaluable in enumeration and combinatorics. For this reason, they are useful in various branches of computer science. As an example, let us mention the study of ambiguity in formal grammars. It has appeared, for the past twenty years, that rational series in noncommutative variables have many remarkable properties which provide them with a rich structure. Knowledge of these properties makes them much easier to manipulate than, for instance, algebraic series. The depth and number of results for rational series are similar to those for rational languages. The aim of this text is to present the basic results concerning rational series. The point of view adopted here seems to us to be a natural one. Frequently one observes that a set of results becomes a theory when the initial combinatorial techniques are progressively replaced by more algebraic ones. We have tried wherever possible to substitute an algebraic approach for a combinatorial description. This has made it possible for us to give a unified and more complete presentation that is hopefully also easier to understand. We feel that, in this manner, the fundamental mechanisms and their interactions are easier to grasp. The first part of the book, comprising the first two chapters, illustrates very well how the introduction of an algebraic concept, namely syntactic algebra, can give a unified presentation. These two chapters contain the most important general results and discuss in particular the equality between rational and recognizable series and the construction of the reduced linear representation. The following two chapters are devoted to the two applications which seemed most important to us. First, we describe the relationship with the families of formal languages studied in theoretical computer science. Next, we establish the correspondence with the rational functions in one variable as studied in number theory. Chapter V presents arithmetic properties of rational series and their relations to the nature of their coefficients. These results are fairly profound, and there is a constant interaction with number theory. Let us mention the analytic characterization of N-rational series, which is the first result of this kind. The next chapter presents several results on decidability. We describe only some positive results which are of increasing importance. Those given here are v
vi
Preface
directly related to the Burnside problem. The last two chapters are devoted to the study of polynomials in noncommutative variables, and to their application to coding theory. Because of noncommutativity, the structure of polynomials is much more complex that it would be in the case of commutativity, and the results are rather delicate to prove. We present here basic properties concerning factorizations, without trying to be complete. The main purpose of Chapter VII is to prepare the ground for the final chapter which contains the generalization of a result of M.-P. Sch¨ utzenberger concerning the factorization of a polynomial associated with a finite code. Exercises are provided for most chapters and also short bibliographical notes. The algebraic and arithmetic approach adopted in this book implies a choice in the set of possible applications. We do not describe several important applications, such as the use of polynomials in control theory, where formal series in noncommutative variables are employed to represent the behavior of systems and replace the Volterra series (Fliess 1981, Isidori 1985). Another area of application is combinatorial graph theory. Enumeration of graphs by wellchosen encodings leads to systems of equations in noncommutative formal series whose solutions give the desired enumeration. Cori (1975) gives an introduction to the topic. The analysis of algorithms also leads to the study of formal series in a somewhat larger context (see Steyaert and Flajolet 1983, Berstel and Reutenauer 1982). This book issued from an advanced course held several times by the authors, at the University Pierre et Marie Curie, Paris and at the University of Saarbr¨ ucken. Parts of the book were also taught at several different levels at other places. Any concept from algebra that might not be familiar to the reader can be found in S. Lang’s Algebra (Lang 1984). Finally, thanks are due to Rosa de Marchi who carefully typed the manuscript. Paris — Montr´eal August 1988
Jean Berstel Christophe Reutenauer
Note to the reader Following usual notation, items such as sections, theorems, corollaries, etc. are numbered within a chapter. When cross-referenced the chapter number is omitted if the item is within the current chapter. Thus “Theorem 1.1” means the first theorem in the first section of the current chapter, and “Theorem II.1.3” refers to the equivalent theorem in Chapter II. Exercises are numbered accordingly and the section number should help the reader to find the section relevant to that exercise.
Contents Chapter I Rational Series 1 Semirings . . . . . . . . . . 2 Formal Series . . . . . . . . 3 Topology . . . . . . . . . . 4 Rational Series . . . . . . . 5 Recognizable Series . . . . . 6 The Fundamental Theorem Exercises . . . . . . . . . . Notes . . . . . . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
1 1 2 3 4 7 12 15 19
Chapter II Minimization 1 Syntactic Ideals . . . . . . . . . . 2 Reduced Linear Representations 3 The Reduction Algorithm . . . . Exercises . . . . . . . . . . . . . Notes . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
21 21 26 29 34 34
Chapter III Series and Languages 1 The Theorem of Kleene . . . . 2 Series and Rational Languages 3 Support . . . . . . . . . . . . . 4 Iteration . . . . . . . . . . . . . 5 Complementation . . . . . . . . Exercises . . . . . . . . . . . . Notes . . . . . . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
35 35 37 40 43 44 47 48
Chapter IV Rational Series in One Variable 1 Rational Functions . . . . . . . . . . 2 The Exponential Polynomial . . . . 3 A Theorem of P´ olya . . . . . . . . . 4 A Theorem of Skolem, Mahler, Lech Exercises . . . . . . . . . . . . . . . Notes . . . . . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
49 49 54 58 62 70 70
Chapter V Changing the Semiring 1 Rational Series over a Principal Ring . . . . . . . . . . . . . . . . 2 Positive Rational Series . . . . . . . . . . . . . . . . . . . . . . . 3 Fatou Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . .
71 71 74 83
. . . . . . . .
. . . . . . . .
vii
. . . . . . .
viii
Contents Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
Chapter VI Decidability 1 Problems of Supports 2 Growth . . . . . . . . Exercises . . . . . . . Notes . . . . . . . . .
86 88
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
89 89 91 96 96
Chapter VII Noncommutative polynomials 1 The Weak Algorithm . . . . . . . . 2 Continuant Polynomials . . . . . . 3 Inertia . . . . . . . . . . . . . . . . 4 Gauss’s Lemma . . . . . . . . . . . Exercises . . . . . . . . . . . . . . Notes . . . . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
99 99 102 106 111 113 114
Chapter VIII Codes and Formal Series 1 Codes . . . . . . . . . . . . . 2 Completeness . . . . . . . . . 3 The Degree of a Code . . . . 4 Factorization . . . . . . . . . Exercises . . . . . . . . . . . Notes . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
115 115 119 123 124 131 132
. . . .
. . . .
. . . .
. . . .
. . . .
. . . . . .
. . . .
. . . . . .
. . . . . .
References
133
Index
138
Chapter I
Rational Series This chapter contains the definitions of the basic concepts, namely rational and recognizable series in several noncommutative variables. It also gives a short account of some preliminary notions that will appear frequently throughout the book. We start with the definition of a semiring, followed by the notation for the usual objects in free monoids and formal series. The topology on formal series is only treated to the extent required for later reference. Section 4 contains the definition of rational series, together with some elementary properties and the fact that certain morphisms preserve the rationality of series. Recognizable series are introduced in Section 5. An algebraic characterization is given. We also prove (Theorem 5.1) that the Hadamard product preserves recognizability. The fundamental theorem of Sch¨ utzenberger (equivalence between rational and recognizable series, Theorem 6.1) is the concern of the last section. This theorem is the starting point for the developments given in the subsequent chapters.
1
Semirings
Recall that a monoid is a set equipped with an associative binary operation and having a neutral element for this law. A semiring is, roughly speaking, a ring without subtraction. More precisely, it is a set K equipped with two operations + and · (sum and product) such that the following properties hold: (i) (ii) (iii) (iv)
(K, +) is a commutative monoid with neutral element denoted by 0. (K, ·) is a monoid with neutral element denoted by 1. The product is distributive with respect to the sum. For all a in K, 0a = a0 = 0.
The last property is not a consequence of the others, as is the case for rings. A semiring is commutative if its product is commutative. A subsemiring of K is a subset of K containing 0 and 1, which is stable for the operations of K. 1
2
Chapter I. Rational Series A semiring morphism is a function f : K → K′
of a semiring K into a semiring K ′ that maps the 0 and 1 of K into the corresponding elements of K ′ and that respects sum and product. Let us give some examples of semirings. Among them are, of course, fields and rings. Next, the set N of natural numbers, the sets Q+ of nonnegative rational numbers and R+ of nonnegative real numbers are semirings. The Boolean semiring B = {0, 1} is completely described by the relation 1 + 1 = 1 (see Exercise 1.1). If M is a monoid, the set of its subsets is naturally equipped with the structure of a semiring: the sum of two subsets X and Y of M is simply X ∪ Y and their product is {xy | x ∈ X, y ∈ Y } . Let K be a semiring and let P, Q be two finite sets. We denote by K P ×Q the set of P × Q-matrices with coefficients in K. The sum of such matrices is defined in the usual way, and if R is a third finite set, a product K P ×Q × K Q×R → K P ×R is defined in the usual manner. In particular, K Q×Q thus becomes a semiring. If P = {1, . . . , m} and Q = {1, . . . , n}, we will write K m×n for K P ×Q ; moreover, K 1×1 will be identified with K. For the rest of this chapter, we fix a semiring K.
2
Formal Series
Let A be a finite, nonempty set called alphabet . The free monoid A∗ generated by A is the set of finite sequences a1 · · · an of elements of A, including the empty sequence denoted by 1. This set is a monoid, the product being the concatenation defined by (a1 · · · an ) · (b1 · · · bp ) = a1 · · · an b1 · · · bp and with neutral element 1. An element of the alphabet is called a letter , an element of A∗ is a word , and 1 is the empty word . The length of a word w = a1 · · · an is n; it is denoted by |w|. The length |w|a relative to a letter a is defined to be the number of occurrences of the letter a in w. We denote by A+ the set A∗ \ 1. A language is a subset of A∗ . A formal series (or formal power series) S is a function A∗ → K . The image by S of a word w is denoted by (S, w) and is called the coefficient of w in S. The support of S is the language supp(S) = {w ∈ A∗ | (S, w) 6= 0} .
3
3. Topology
The set of formal series over A with coefficients on K is denoted by KhhAii. A structure of a semiring is defined on KhhAii as follows. If S and T are two formal series, their sum is given by (S + T, w) = (S, w) + (T, w) , and their product by X (ST, w) = (S, x)(T, y) . xy=w
Observe that this sum is finite. Furthermore, two external operations of K on KhhAii, one acting on the left, the other on the right, are defined, for k ∈ K, by (kS, w) = k(S, w),
(Sk, w) = (S, w)k .
There is a natural injection of the free monoid into KhhAii as a multiplicative submonoid; the image of a word w is still denoted by w. Thus the neutral element of KhhAii for the product is 1. Similarly, there is an injection of K into KhhAii as a subsemiring: to each k ∈ K is associated k · 1 = 1 · k, simply denoted by k. Thus we identify A∗ and K with their images in KhhAii. A polynomial is a formal series with finite support. The set of polynomials is denoted by KhAi. It is a subsemiring of KhhAii. The degree of a polynomial is the maximal length of the words in its support (and is −∞ if the polynomial is zero). When A = {a} has just one element, one get the usual sets of formal power series Khhaii = K[[a]] and of polynomials Khai = K[a]. For the rest of this chapter, we fix an alphabet A.
3
Topology
We have seen that KhhAii is the set of functions A∗ → K. In other words, ∗
KhhAii = K A . Thus, if K is equipped with the discrete topology, the set KhhAii can be equipped with the product topology. This topology can be defined by an ultrametric distance. Indeed, let ω : KhhAii × KhhAii → N ∪ ∞ be the function defined by ω(S, T ) = inf{n ∈ N | ∃w ∈ A∗ , |w| = n and (S, w) 6= (T, w)} . For any real number σ with 0 < σ < 1, the function d : KhhAii × KhhAii → R d(S, T ) = σ ω(S,T )
4
Chapter I. Rational Series
is an ultrametric distance, that is d is a distance which satisfies the enforced triangular inequality d(S, T ) ≤ max(d(S, U ), d(U, T )) The function d defines the topology given above (Exercise 3.1). Furthermore, KhhAii is complete for this topology, and it is a topological semiring (that is sum and product are continuous functions). Let (Si )i∈I be a family of series. It is called summable if there exists a formal series S such that for all ε > 0, there exists a finite subset I ′ if I such that every finite subset J of I containing I ′ satisfies the inequality X d Sj , S ≤ ε . j∈J
The series S is then called the sum of the family (Si ) and it is unique. A family (Si )i∈I is called locally finite if for every word w there exists only a finite number of indices i ∈ I such that (Si , w) 6= 0. It is easily seen that every locally finite family is summable. The sum of such a family can also be defined simply for w ∈ A∗ by X (S, w) = (Si , w) , i∈I
observing that the support of this sum is finite because the family (Si ) is locally finite (all terms but a finite number in this sum are 0). However, it is not true that a summable family is always locally finite (see Exercise 3.2), but we shall need mainly the second concept. Let S be a formal series. Then the family of series ((S, w)w)w∈A∗ clearly is locally finite, since each of these series has a support formed of at most one single word, and supports are pairwise disjoint. Thus the family is summable, and its sum is just S. This gives the usual notation X S= (S, w)w . w∈A∗
It follows in particular that KhAi is dense in KhhAii which thus is the completion of KhAi for the distance d.
4
Rational Series
A formal series S ∈ KhhAii is proper if the coefficient of the empty word (that is the constant term of S) vanishes, thus if (S, 1) = 0. In this case, the family (S n )n≥0 is locally finite. Indeed, for any word w, the condition n > |w| implies (S n , w) = 0. Thus the family is summable. The sum of this family is denoted by S ∗ X S∗ = Sn , n≥0
and is called the star of S. Similarly, S + denotes the series X S+ = Sn . n≥1
5
4. Rational Series
The fact that KhhAii is a topological semiring and the usual properties of summable families imply that S∗ = 1 + S+
and S + = SS ∗ = S ∗ S .
From these, it follows that if K is a ring, then S ∗ is just the inverse of 1 − S since S ∗ (1 − S) = S ∗ − S ∗ S = S ∗ − S + = 1. This also implies the following classical result: a series is invertible if an d only if its constant term is invertible in K (still assuming K to be a ring); see Exercise 4.5. Let us return to the general case of a semiring. Lemma 4.1 Let T and U be formal series, with T proper. Then the unique solution S of the equation S = U + T S (of S = U + ST ) is the series S = T ∗ U (the series S = U T ∗ , respectively). Proof. One has T ∗ = 1 + T T ∗ , whence T ∗ U = U + T T ∗U . Conversely, since T is proper X lim T n = 0 and lim Ti = T∗ . n
n
0≤i≤n
From S = U + T S, it follows that S = U + T (U + T S) = U + T U + T 2 S and inductively S = (1 + T + · · · + T n )U + T n+1 S . Thus, going to the limit, and using the fact that KhhAii is a topological semiring, one gets S = T ∗ U . Definition The rational operations in KhhAii are the sum, the product, the two external products of K on KhhAii and the star operation. A subset of KhhAii is rationally closed if it is closed for the rational operations. The smallest subset containing a subset E of KhhAii and which is rationally closed is called the rational closure of E. Definition A formal series is rational if it is in the rational closure of KhAi. Observe that if K is a ring, then the rational closure of KhAi is the smallest subring of KhhAii containing KhAi and closed under inversion (in other words, the star operation and inversion play equivalent roles). Definition If L is a language, its characteristic series is the formal series X w. L= w∈L
In other words, (L, w) = 1 for w ∈ L, and (L, w) = 0 if w ∈ / L.
6
Chapter I. Rational Series
Example 4.1 The series A is proper and X An . A∗ = n≥0
n
Since A is the sum of all words of length n, it follows that X A∗ = w w∈A∗
is the characteristic series of A∗ . Thus, this series is rational. Consider now a letter a. The series A∗ aA∗ , as a product of A∗ , a, and A∗ , is also rational. By the definition of product, X (A∗ aA∗ , w) = (A∗ , x)(a, y)(A∗ , z) . xyz=w
Since (a, y) = 0 unless y =P a (and then (a, y) = 1), and since (A∗ , x) = (A∗ , z) = ∗ ∗ 1, one has (A aA , w) = xaz=w 1, which is the number of factorizations w = xaz, that is the number |w|a of occurrences of the letter a in w. Thus X A∗ aA∗ = |w|a w w
is a rational series. Let B be an alphabet, and let ρ be a function ρ : A → KhhBii . Then ρ extends to a morphism of monoids ρ : A∗ → KhhBii . If K is commutative, then ρ can be extended in a unique manner into a morphism of semirings ρ : KhAi → KhhBii with ρ|K = id. Indeed, it suffices, for any polynomial P = KhAi, to set X ρ(P ) = (P, w)ρ(w)
P
w∈A∗ (P, w)w
∈
w∈A∗
which is a finite sum since P is a polynomial. Then ρ isK-linear. Moreover, in view of the commutativity of K X X ρ(P )ρ(Q) = (P, x)ρ(x) (Q, y)ρ(y) x∈A∗
X
=
y∈A∗
(P, x)ρ(x)(Q, y)ρ(y) =
x,y∈A∗
=
X
X
x,y∈A∗
(P, x)(Q, y)ρ(xy)
x,y∈A∗
=ρ
X
x,y∈A∗
(P, x)(Q, y)xy = ρ(P Q) .
(P, x)(Q, y)ρ(x)ρ(y)
7
5. Recognizable Series
Assume now that for each letter a ∈ A, the series ρ(a) is proper. Then ρ : KhAi → KhhBii is uniformly continuous. Indeed, let P and Q be two polynomials with ω(P, Q) = n . Then, for any word x in B ∗ of length < n, X X (P, w)(ρ(w), x) (ρ(P ), x) = (P, w)(ρ(w), x) = w∈A∗
|w|
since (ρ(w), x) = 0 whenever |w| ≥ n by the hypothesis on ρ. Thus X (Q, w)(ρ(w), x) = (ρ(Q), x) (ρ(P ), x) = |w|
showing that ω(ρ(P ), ρ(Q)) ≥ n . Since KhhAii is the completion of KhAi (see Section 3), the function ρ extends uniquely to a morphism of semirings KhhAii → KhhBii which induces the identity mapping on K and which is continuous. Proposition 4.2 Suppose K is commutative. Let ρ : A → KhhBii be a function such that, for all a ∈ A, the series ρ(a) is a proper rational series. Then ρ extends uniquely to a morphism of semirings KhhAii → KhhBii which induces the identity on K and which is continuous. Moreover, the image of any rational series is again rational. Proof. It suffices to show the last claim. If P is a polynomial, then ρ(P ) = P (P, w)ρ(w) is a rational series since ρ(a) is a rational series for each letter a in A and since ρ is multiplicative. Next, if ρ(S) and ρ(T ) are rational series, then so are ρ(S + T ) and ρ(ST ). Finally, if S is a proper series and ρ(S) is rational, then ρ(S) is proper and X X ρ(S ∗ ) = ρ Sn = ρ(S n ) = ρ(S)∗ n≥0
n≥0
by the continuity of ρ, showing that ρ(S ∗ ) is rational. This proves that ρ preserves rationality.
5
Recognizable Series
Definition A formal series S ∈ KhhAii is called recognizable if there exists an integer n ≥ 1 and a morphism of monoids µ = A∗ → K n×n
8
Chapter I. Rational Series
(K n×n with its multiplicative structure) and two matrices λ ∈ K 1×n and γ ∈ K n×1 such that, for all words w, (S, w) = λµwγ . In this case, the triple (λ, µ, γ) is called a linear representation of S, and n is its dimension. For further purpose, we admit the representation of dimension 0, which corresponds to the null series S = 0. We shall need the notion of module over a semiring. A left K-module is a commutative monoid M with law denoted by + and neutral element 0, equipped with an external law K × M → M denoted by (k, x) 7→ kx such that, for all k, ℓ in K and x, y in M the following relations hold: k(x + y) = kx + ky (k + ℓ)x = kx + ℓx (kℓ)x = k(ℓx) 1x = x 0x = 0 k0 = 0 A submodule of M is a subset of M containing 0 and closed for the operations of M . A left K-module is finitely generated if there exists finitely many elements x1 , . . . , xn ∈ M such that any element in M can be written as a linear combination k1 x1 + · · · + kn xn
(ki ∈ K) .
The semiring KhhAii of formal power series is a left K-module, where the external law K × KhhAii → KhhAii is the law considered in Section 2: (k, S) 7→ kS . We now define an operation of A∗ on KhhAii. To each word x, and to each formal series S, we associate the series denoteed by x−1 S and defined by X x−1 S = (S, xw)w . w∈A∗
In other terms, for all words x and w, the coefficient of w in the series x−1 S is (S, xw), thus (x−1 S, w) = (S, xw) . A more combinatorial view of this fact is given in the case where S = y is a single word. Then x−1 y vanishes, unless y has x as a prefix, that is y = xy ′ . In this case, x−1 y = y ′ . Observe that this defines completely the operation S → x−1 S
9
5. Recognizable Series since the operation is additive, that is x−1 (S + T ) = x−1 S + x−1 T since it commutes with the external operation of K on KhhAii, that is x−1 (kS) = k(x−1 S), x−1 (Sk) = (x−1 S)k for all k in K, and since, finally, this operation is continuous. Example 5.1 (ab)−1 (a2 + aba2 + abab + ab2 + b) = a2 + ab + b .
The same remark shows that if P is a polynomial, then x−1 P is still a polynomial, with degree less than or equal to the degree of P . Furthermore, this operation of A∗ on KhhAii is associative in the following sense: (xy)−1 S = y −1 (x−1 S) as is easily verified. Another property is the following formula which holds for any series S: X S = (S, 1) + a(a−1 S) . (5.1) a∈A
This formula is indeed easily proved when S is a word, and then extended by linearity and continuity. A subset M of KhhAii is called stable if, for all S in M and x in A∗ , the series x−1 S is in M . Proposition 5.1 A formal series S ∈ KhhAii is recognizable if and only if there exists a stable finitely generated left submodule of KhhAii which contains S. Proof. Assume that S is recognizable, and let (λ, µ, γ) be a linear representation of S of dimension n. Consider the formal series S1 , . . . , Sn defined by (Si , w) = (µwγ)i for all words w. Let M be the left K-module generated by the series Si . Thus M is finitely generated. It contains S, since X X λi (Si , w) , λi (µwγ)i = (S, w) = λµwγ = i
showing that S =
P
i
i
λi Si . Next, M is stable. Indeed, let x be a word. Then
(x−1 Si , w) = (Si , xw) = (µ(xw)γ)i = (µxµwγ)i X X (µx)i,j (Sj , w) . (µx)i,j (µwγ)j = = j
j
Thus x−1 Si = j (µx)i,j Sj ∈ M . Hence M is stable, since the mapping T 7→ x−1 T is K-linear and sends the generators into M . P
10
Chapter I. Rational Series
Conversely, let M be a stable left submodule of KhhAii generated by S1 , . . . , Sn and containing S. Then X λi Si S= i
for some λi in K. Moreover, for any letter a, there exists a matrix µa ∈ K n×n such that, for all i, X (µa)i,j Sj . a−1 Si = j
Let µ : A∗ → K n×n be the morphism of monoids which extends this mapping. Then, for any word w, X (µw)i,j Sj . w−1 Si = j
Indeed, this relation holds for w = 1, and if it holds for some word w, then X (wa)−1 Si = a−1 (w−1 Si ) = a−1 (µw)i,k Sk k
=
X
(µw)i,k (a−1 Sk ) =
k
=
(µw)i,k
k
X
(µa)k,j Sj
j
k
X X j
X
X (µwa)i,j Sj , (µw)i,k (µa)k,j Sj = j
and consequently the relation holds for all words. Set γj = (Sj , 1) and let γ ∈ K n×1 be the matrix defined in this way. Then X (µw)i,j Sj , 1 (Si , w) = (w−1 Si , 1) = j
=
X
(µw)i,j (Sj , 1) =
X
(µw)i,j γj = (µwγ)i .
j
j
Consequently, λµwγ =
X
λi (µwγ)i =
i
X
λi (Si , w) = (S, w) ,
i
showing that S is recognizable. Example 5.2 We use Proposition 5.1 to give an example of a recognizable series. Let A = {a0 , a1 } and let K = N. For each word w, let hwi be the integer for which w is an expansion in base 2. More precisely, if w = aik · · · ai0 , then hwi = ik 2k + · · · + i1 2 + i0 .
11
5. Recognizable Series For w = 1, we set hwi = 0. Let X S= hwiw . w∈A∗
Thus S starts with S = a1 + a0 a1 + 2a1 a0 + 3a21 + a20 a1 + 2a0 a1 a0 + 3a0 a21 + 4a1 a20 + 5a1 a0 a1 + 6a21 a0 + 7a31 + · · · Given a word w, one has the relations (S, a0 w) = (S, w) and (S, a1 w) = 2|w| + (S, w) , where |w| denotes the length of w. In other words, −1 a−1 0 S = S and a1 S = T + S ,
where T is the series X T = 2|w| w . w
From the relations (T, a0 w) = 2(T, w), (T, a1 w) = 2(T, w) it follows that −1 a−1 0 T = a1 T = 2T .
This shows that the submodule M of NhhAii generated by S and T is stable under the operations U 7→ a−1 U
(a ∈ A) .
The associativity of the operation U 7→ w−1 U implies that the submodule is stable. Proposition 5.1 shows that S is recognizable. Corollary 5.2 Any linear combination of recognizable series is a recognizable series. Proof. Indeed, the sum of several stable submodules of KhhAii is a stable submodule, and the corollary follows from the proposition. Definition The Hadamard product of two formal series S and T is the series S ⊙ T defined by (S ⊙ T, w) = (S, w)(T, w) . Theorem 5.3 (Sch¨ utzenberger 1962a) Let K1 and K2 be two subsemirings of K such that each element of K1 commutes with each element of K2 . If S1 is a K1 -recognizable series and S2 is a K2 -recognizable series, then S1 ⊙ S2 is K-recognizable.
12
Chapter I. Rational Series
Proof. We apply Proposition 5.1. Let M1 (M2 ) be a left submodule of K1 hhAii (of K2 hhAii) which contains S1 (S2 ), is stable, and is generated by the series T11 , . . . T1n ∈ K1 hhAii (the series T21 , . . . , T2m ∈ K2 hhAii respectively). Let M be the left KhAi-submodule of KhhAii generated by M1 ⊙ M2 = {T1 ⊙ T2 | T1 ∈ M1 , T2 ∈ M2 }. Clearly, P S1 ⊙ Si2 is in M . Moreover, M is finitely generated. Indeed, if T1 = 1≤i≤n ki T1 ∈ M1 with ki ∈ K1 and P T2 = 1≤j≤m ℓj T2j ∈ M2 with ℓj ∈ K2 , then for any word w, (T1 ⊙ T2 , w) = (T1 , w)(T2 , w) = =
X
X
ki (T1i , w)ℓj (T2j , w)
i,j
ki ℓj (T1i , w)(T2j , w)
i,j
since (T1i , w) and ℓj commute. Thus T1 ⊙ T2 =
X i,j
ki ℓj T1i ⊙ T2j ,
showing that M is generated, as a K-module, by the series T1i ⊙ T2j . Finally, M is stable, since for any word x, and for series T1 ∈ M1 , T2 ∈ M2 , x−1 (T1 ⊙ T2 ) = (x−1 T1 ) ⊙ (x−1 T2 ) ∈ M . Example 5.3 For n ∈ N, we denote P by n the element 1 + · · ·+ 1 (n times) of K. Let a be a letter. Then the series w |w|a w is recognizable (it is also rational, as seen in Example 4.1). Indeedtheseries admits (λ, µ, γ) the linear representation 11 10 0 defined by λ = (1, 0), µa = , µb = , for b ∈ A \ a, and γ = . It 01 01 1 is indeed easily seen that for any word w, 1 |w|a µw = . 0 1 As an application, let P (t1 , . . . , tn ) be a commutative polynomial with coefficients in K. Then the formal series (over the alphabet A = {a1 , . . . , an } ) X S= P (|w|a1 , . . . , |w|an )w . w∈A∗
is recognizable. This P follows from Theorem 5.3, Corollary 5.2 and from the recognizability of |w|a w.
6
The Fundamental Theorem
Theorem 6.1 (Sch¨ utzenberger 1961a) A formal series is recognizable if and only if it is rational. We start with several lemmas which will be needed for the proof.
13
6. The Fundamental Theorem Lemma 6.2 Let S and T be formal series, and let a be a letter. Then a−1 (ST ) = (a−1 S)T + (S, 1)(a−1 T ) . If S is proper, then a−1 (S ∗ ) = (a−1 S)S ∗ . Proof. For any word w, (a−1 (ST ), w) = (ST, aw) =
X
(S, u)(T, v)
uv=aw
= (S, 1)(T, aw) +
X
(S, au)(T, v)
uv=w
= (S, 1)(T, aw) +
X
(a−1 S, u)(T, v)
uv=w
= (S, 1)(a−1 T, w) + ((a−1 S)T, w) . This proves the first relation. For the second claim, observe that S ∗ = 1 + SS ∗ , whence a−1 (S ∗ ) = −1 (a S)S ∗ , since (S, 1) = 0. Let m be an n × n-matrix with coefficients in KhhAii: m ∈ KhhAiin×n . The matrix is proper if, for all indices i and j, the series mi,j is proper. In this case, the star of m can be defined as X m∗ = mk . k≥0
The existence of m∗ can be verified by considering the product topology induced by KhhAii on KhhAiin×n (the details are left to the reader). It is easily seen that m∗ = 1 + mm∗ ,
(6.1)
where 1 is the identity matrix. Lemma 6.3 If m is a proper matrix with elements in KhhAii, then all coefficients of m∗ are in the rational closure of the coefficients of m. Proof. Let m be an n × n-matrix. If n = 1, the result is clear. Arguing by induction, assume n > 1 and consider a decomposition into blocks ab m= cd where a and d are square matrices, and set αβ m∗ = γ δ
14
Chapter I. Rational Series
where the blocks have the same dimensions as the corresponding blocks in m. By Eq. (6.1), we get α = 1 + aα + bγ
β = aβ + bδ
γ = cα + dγ
δ = 1 + cβ + dδ
Observe that Lemma 4.1 extend to matrix equations; thus we have β = a∗ bδ,
γ = d∗ cα ,
whence α = 1 + aα + bd∗ cα = 1 + (a + bd∗ c)α δ = 1 + ca∗ bδ + dδ = 1 + (ca∗ b + d)δ . Again, Lemma 4.1 gives α = (a + bd∗ c)∗ δ = (ca∗ b + d)∗ . Finally β = a∗ b(ca∗ b + d)∗ γ = d∗ c(a + bd∗ c)∗ . By the induction hypothesis, all coefficients of a∗ , d∗ are in the rational closure of the coefficients of m. The same holds for the coefficients of a + bd∗ c and ca∗ b + d, and using again the induction hypothesis, the coefficients of α, δ, and also those of β and γ, are in the rational closure. Proof of Theorem 6.1. In order to show that any rational series is recognizable, we use Proposition 5.1. If P is a polynomial, then w−1 P = 0 for any word w of length greater than deg(P ). Consequently, the set {w−1 P | w ∈ A∗ } is finite. Since it is stable, it generates a stable submodule which, moreover, is finitely generated and also contains P (because 1−1 P = P ). Thus P is recognizable. If S and T are recognizable, then there exist stable finitely generated submodules M and N of KhhAii with S ∈ M and T ∈ N . Then M + N contains S + T , is finitely generated and is stable, showing that S + T is recognizable. Next, let P be the submodule P = M T + N . Clearly, P contains ST , and according to Lemma 6.2, P is stable. It is finitely generated because M and N are finitely generated. Hence ST is recognizable. Assume now that S is proper. Let Q be the submodule Q = K + M S ∗ . Then Q contains S ∗ = 1 + SS ∗ , and Q is stable since, by Lemma 6.2, a−1 (S ′ S ∗ ) = a−1 (S ′ )S ∗ + (S ′ , 1)a−1 (S)S ∗ is in Q for all S ′ in M . Finally, Q is finitely generated. Hence S ∗ is recognizable. Conversely, let S be a recognizable series and let (λ, µ, γ) be a linear representation of S of dimension n. Consider the proper matrix X m= µaa ∈ K n×n hhAii . a∈A
15
6. The Fundamental Theorem
We use below the natural isomorphism between K n×n hhAii and KhhAiin×n . Then k X X X X X X m∗ = mk = µaa = µww = µww . k≥0
k≥0 a∈A
k≥0 w∈Ak
w∈A∗
Thus m∗i,j =
X
(µw)i,j w ,
w
is rational in view of Lemma 6.3. Since X λi m∗i,j γj , S= i,j
the series S is rational.
Exercises for Chapter I 1.1 Let K = {0, 1} be a semiring composed of two elements. Show that, according to the value of 1 + 1, K is either the field with two elements or the Boolean semiring. 1.2 Let K be a semiring. Acongruence in K is an equivalence relation ≡ which is compatible with the laws of K, that is for all a, b, c, d ∈ K, a ≡ b, c ≡ d =⇒ a + b ≡ b + c, ac ≡ bd . a) Show that K/≡ has a natural structure of a semiring. Such a semiring is called a quotient of K. b) Show that if K is a ring then there is a bijection between congruences and two-sided ideals in K. c) Show that any quotient semiring of N which is not isomorphic to N is finite. 1.3 The prime subsemiring of a semiring K is the semiring L generated by 1. Show that every element in L commutes with every element in K and that L either is isomorphic to N or is finite. 1.4 Let K be a commutative semiring. a) Define two operations on K × K by (a, b) + (a′ , b′ ) = (a + a′ , b + b′ ) (a, b)(a′ , b′ ) = (aa′ + bb′ , ab′ + ba′ ) Show that these operations make K × K a semiring with zero (0, 0) and unity (1, 0). Show that i : a 7→ (a, 0) is an injection of K into K × K. Show that the relation ≡ defined by (a, b) ≡ (a′ , b′ ) ⇐⇒ ∃c : a + b′ + c = a′ + b + c
16
Chapter I. Rational Series is a congruence on K × K. Show that L = K × K/≡ is a ring. b) Denote by p the canonical surjection p : K ×K → L. Show that p ◦ i : K → L is injective if and only if for all a, b, c ∈ K a + b = a + c =⇒ b = c . A semiring having this property is called regular . Show that K can be embedded into a ring if and only if it is regular. c) Show that the ring L is without zero divisors if and only if for all a, b, c, d ∈ K, the following condition holds: ac + bd = ad + bc =⇒ a = b or c = d . Show that K can be embedded into a field if and only if K is regular and this condition is satisfied. d) K is simplifiable if for all a, b, c ∈ K ab = ac =⇒ b = c or a = 0 .
Show that if K can be embedded into a field, then it is regular and simplifiable. e) Let a, b, c, d be commutative indeterminates and let I be the ideal of Z[a, b, c, d] generated by (a−b)(c−d). Show that the image K of N[a, b, c, d] in Z[a, b, c, d]/I is a regular and simplifiable semiring, but that K cannot be embedded into any field. 3.1 Give complete proofs for the claims in Sect. 3. 3.2 Let B be the Boolean semiring and for all n ∈ N, let Sn = 1. Show that the family (Sn )n∈N is summable, but not locally finite. 3.3 Let K, L be two semirings, and let A, B be two alphabets. A function f : KhhAii → LhhBii is a morphism of formal series if f is a morphism of semirings and moreover is uniformly continuous. a) Show that the mapping LhhBii → L S 7→ (S, 1) is a continuous morphism of semirings. Show that if f : KhhAii → LhhBii is a morphism of semirings which is continuous at 0, then (i) for all k ∈ K and a ∈ A, the elements f (k) and f (a) commute, (ii) the multiplicative subsemigroup of L generated by {(f (a), 1) | a ∈ A} is nilpotent.
17
6. The Fundamental Theorem
b) Let f : A ∪ K → LhhBii be a function satisfying conditions (i) and (ii) of a). Show that f extends in a unique manner to a morphism of formal series KhhAii → LhhBii . 3.4 Let M be a commutative monoid, with law denoted additively, having an ultrametric distance d which is subinvariant with respect to translation (that is such that d(a + c, b + c) ≤ d(a, b) for a, b, c ∈ M ). Show that every series that converges in M converges commutatively. 3.5 Assume that K is a commutative field. Recall that for any K-vector space E, for any subspace F and any vector v in E \ F , there exists a linear form h on E such that h(E) = 0 and h(v) 6= 0. We use here the identification of KhhAii and of the dual of KhAi (see beginning of Chap. II). a) For each subspace V of KhAi (subspace W of KhhAii), define its orthogonal in KhhAii (in KhAi) to be given by V ⊥ = {S ∈ KhhAii | ∀P ∈ V, (S, P ) = 0}
(W ⊥ = {P ∈ KhAi | ∀S ∈ W, (S, P ) = 0}, respectively) Show that if V is a subspace of KhAi, then V ⊥⊥ = V . b) Show that a linear form h on KhhAii is continuous (for the discrete ∗ topology on K and the product topology on K A ) iff Kerh contains all but a finite number of elements of A∗ . Show that the topological dual space of KhhAii can be identified with KhAi. Show that for any closed subspace W of KhhAii, and for any formal series S not in W , there exists a continuous linear form h on KhhAii such that h(S) 6= 0 and h(W ) = 0. Show from this that for any subspace W of KhhAii, W ⊥⊥ is the adherence of W . 4.1 Let S ∈ KhhAii, let c be its constant term and let T be a proper series with S = c + T .P P n a) Show that if S n converges in KhhAii, then c also converges in K (for the discreteP topology). P n b) Show that if cn converges in K, then S converges in KhhAii, and then X ∗ X X Sn = cn T cn n≥0
n≥0
n≥0
P n P n c) Show that if S is rational and if S converges, then S is rational. d) Show that if f : KhhAii → LhhBii is a morphism of formal series (see Exercise 3.3) such that f (S) is rational for all S ∈ K ∪ A, then f preserves rationality. 4.2 Let (Sn ) be a sequence of proper series. Show that if lim Sn = S, then S is proper and lim Sn∗ = S ∗ . 4.3 Recall that an element a of a ring K is called quasi-regular (in the sense of Jacobson) if there exists some b ∈ K such that a + b + ab = 0. Recall also that the radical R of K is the greatest two-sided ideal of K having only quasi-regular elements (it exists by (Herstein 1968) Th. 1.2.3). a) Show that S ∈ KhhAii is quasi-regular in KhhAii if and only if its constant term is quasi-regular in K.
18
Chapter I. Rational Series b) Show that the radical of KhhAii is {S ∈ KhhAii | (S, 1) ∈ R} .
4.4 Let k ≥ 2 be an integer and let A = {a0 , . . . , ak−1 }. For any word w over A, we denote by hwi the integer having w as an expansion in base k. For example ha0 a1 a1 a1 i = k 2 + k + 1 . Let S and T be the series defined by X X S= hwiw, T = k |w| w . w
w
Show that T =1+
X
kai T
i
(S, ai w) = (S, w) + i(T, w) . P Using the formula S = (S, 1) + ai (a−1 i S), show that
S = (a1 + 2a2 + · · · + (k − 1)ak−1 )T + (a0 + · · · + ak−1 )S
and show that S=
X i
ai
∗ X i
iai
X i
kai
∗
by applying Lemma 4.1. This also shows that S is rational. 4.5 Assume that K is a ring. Show that a series is invertible in KhhAii iff its constant term is invertible in K. 5.1 a) Suppose that K is a field with absolute value | |. Show that if S ∈ KhhAii is recognizable, then there is a constant C ∈ R such that for all w ∈ A∗ |(S, w)| ≤ C 1+|w| . b) Suppose that K is a (commutative) integral domain with quotient field F . Show that if S ∈ F hhAii is recognizable and Phas a linear representation (λ, µ, γ), then for some C ∈ K \ 0 the series w C 2+|w| (S, w)w is in KhhAii and is K-recognizable and has the linear representation (Cλ, Cµ, Cγ) (“Eisenstein’s criterion”). 5.2 Let w = a1 · · · an be a word (ai ∈ A). For any subset I = {i1 < · · · < ik } of {1, . . . , n}, define w|I to be the word ai1 · · · aik . Given two words x and y of length n and p respectively, define their shuffle product x x y to be the polynomial X xxy = w(I, J) where the sum is over all partitions {1, 2, . . . , n + p} = I ∪ J with |I| = n, |J| = p, and where w(I, J) is defined by w(I, J)|I = x, w(I, J)|J = y. Moreover, 1 x y = y x 1 = y. For example, ab x ac = abac + 2a2 bc + 2a2 cb + acab .
6. The Fundamental Theorem
19
Let K be a commutative semiring. Extend the shuffle product to KhhAii by linearity and continuity, that is X SxT = (S, x)(T, y)x x y . x,y∈A∗
Show that the shuffle product is commutative and associative. Show that the operator S 7→ a−1 S
(a ∈ A)
is a derivation for the shuffle, that is a−1 (S x T ) = (a−1 S) x T + S x(a−1 T )
(*)
Show that the shuffle product of two recognizable series is still recognizable. (Hint : Proceed as in the proof of Theorem 5.3 and use Eq.(*).)
Notes to Chapter I The theorem showing the equivalence between rationality and recognizability was first proved by Kleene (1956) for languages (which may be seen as series with coefficients in the Boolean semiring) and later extended by Sch¨ utzenberger (1961a, 1962a,b) to arbitrary semirings. Here we shall derive Kleene’s theorem from Sch¨ utzenberger’s (see Chapter III). The condition “recognizable” =⇒ “rational”, which is essentially Lemma 6.3, is proved by using an argument of Conway (1971). Other proofs are also given in Eilenberg (1974) and Salomaa and Soittola (1978). The characterization of recognizable series (Proposition 5.1) is taken from Jacob (1975) who extends to semirings a Hankel-like property given by Fliess (1974a) for fields. Closure under shuffle product (Exercise 5.2) is due to Fliess (1974b) and has many applications in Control Theory, see Fliess (1981). We do not consider algebraic formal series in this book; the reader may consult Salomaa and Soittola (1978) or Kuich and Salomaa (1986).
20
Chapter I. Rational Series
Chapter II
Minimization This chapter gives a presentation of well-known results concerning the reduction of linear representations of recognizable series. The central concept of this study is the notion of syntactic algebra, which is introduced in Section 1. Rational series are characterized by the fact that their syntactic algebras are finite dimensional (Theorem 1.1). The syntactic right ideal leads to the notion of rank and of Hankel matrix; the quotient by this ideal is the analogue for series of the minimal automaton for languages. Section 2 is devoted to the detailed study of reduced linear representations. The relations between representations and syntactic algebra are given. Two reduced representations are shown to be similar (Theorem 2.4), and an explicit form of the reduced representation is given (Corollary 2.3). The reduction algorithm is presented in Section 3. We start with a study of prefix sets. The main tool is a description of bases of right ideals of the ring of noncommutative polynomials (Theorem 3.2). Several important consequences are given. Among them are Cohn’s result on the freeness of right ideals, the Schreier formula for right ideals and linear recurrence relations for the coefficients of a rational series. A detailed description of the reduction algorithm completes the chapter. In this chapter, K denotes a commutative ring.
1
Syntactic Ideals
The algebra of polynomials KhAi is a free K-module having as a basis the free monoid A∗ . Consequently, the set KhhAii of formal series can be identified with the dual of KhAi. Each formal series S defines a linear form KhAi → K P 7→ (S, P ) =
X
(S, w)(P, w) ,
w∈A∗
the sum having a finite support because P is a polynomial. Thus, one may consider the kernel of S, denoted by KerS: KerS = {P ∈ KhAi | (S, P ) = 0} . 21
22
Chapter II. Minimization
Next, any multiplicative morphism µ : A∗ → M , where M is a K-algebra, can be extended uniquely to a morphism of algebras KhAi → M . This extension will also be denoted by µ. We shall use this convention tacitly in the sequel. Clearly X µ(P ) = (P, w)µ(w) . w∈A∗
Definition The syntactic ideal of a formal series S ∈ KhhAii is the greatest two-sided ideal of KhAi contained in the kernel of S. It is denoted by IS . Observe that this ideal always exists, since it is the sum of all ideals contained in KerS, X IS = I. I⊂KerS
Definition The syntactic algebra of a formal series S ∈ KhhAii, denoted by MS , is the quotient algebra of KhAi by the syntactic ideal of S, MS = KhAi/IS . The canonical morphism KhAi → MS is denoted by µS . Since KerµS = IS ⊂ KerS, the series S induces on MS a linear form denoted φS . Consequently S = φS ◦ µS .
Theorem 1.1 (Reutenauer 1978, 1980a) A formal series is rational if and only if its syntactic algebra is a finitely generated module over K. Proof. If S is rational, S is recognizable and has a linear representation (λ, µ, γ), with µ : A∗ → K n×n a morphism. Since A is finite, the subring L of K generated by the coefficients of λ, µ(a), (a ∈ A) and γ is a finitely generated ring. Note that a ring of polynomials in several commuting variables over Z is a Noetherian ring (Lang 1984, Cor. IV.2.4), and that each quotient ring of a Noetherian ring is again Noetherian (ibid. Prop. X.1.4). Therefore L is a Noetherian ring. Hence each submodule of any finitely generated module over L is finitely generated. Since Ln×n is a finitely generated module over L, this implies that so is µ(LhAi). In other words, for w in A∗ long enough, µw is a L-linear combination of µ(v) for shorter words v. This implies in turn that µ(KhAi) is a finitely generated K-module. Now Kerµ is an ideal contained in KerS. Thus by definition Kerµ ⊂ IS , and MS is a quotient of µ(KhAi). Hence it is a finitely generated module over K.
23
1. Syntactic Ideals
Conversely, suppose that the syntactic algebra of S is a finitely generated module over K. Consider, for each word w in A∗ , the K-endomorphism νw of MS defined by m 7→ µS (w)m . The function ν : A∗ → End(MS ) is a morphism, and moreover (S, w) = φS ◦ µS (w) = φS (µS (w)) = φS (νw(1)) . In order to conclude, it suffices to apply the following lemma and Theorem I.6.1. Lemma 1.2 (This lemma is true for any semiring K, even noncommutative.) Let M be a finitely generated right K-module, let φ be a K-linear form on M , let m0 be an element of M and let ν be a morphism A∗ → End(M ). Then the formal series X S= φ(νw(m0 ))w w∈A∗
is recognizable. More precisely, if M has a generating system of n elements, then S admits a linear representation of dimension n. Proof. Let m1 , . . . , mn be generators of M . Then for each letter a ∈ A, and each j in {1, . . . , n}, there exist coefficients αai,j such that νa(mj ) =
X
mi αai,j .
i
The matrices (αai,j )i,j ∈ K n×n define a function µ : A → K n×n which extends to a morphism µ : A∗ → K n×n . An induction shows that for any word w, X mi µ(w)i,j . νw(mj ) = i
Let λ ∈ K 1×n and γ ∈ K n×1 be given by λi = φ(mi ) and m0 = νw(m0 ) = νw
X j
XX mi µ(w)i,j γj , mj γj = j
P
j
mj γj . Then
i
thus φ(νw(m0 )) =
X
λi (µw)i,j γj = λµwγ ,
i,j
which completes the proof. Definition The syntactic right ideal of a formal series S ∈ KhhAii is the greatest right ideal of KhAi contained in KerS. It is denoted ISr .
24
Chapter II. Minimization
The existence of ISr is shown in the same manner as that of IS . We now introduce an operation of KhAi on KhhAii on the right. Recall that, since KhhAii is the dual of KhAi, each endomorphism f of the K-module KhAi defines an endomorphism, called the adjoint morphism, of the K-module KhhAii by the relation (S, f (P )) = (tf (S), P ) for every series S and polynomial P . The function f 7→ tf is an antimorphism: t
(g ◦ f ) = tf ◦ tg
(1.1)
Given a polynomial P , we consider the endomorphism Q 7→ P Q of KhAi and its adjoint morphism, denoted by S 7→ S ◦ P . Thus (S, P Q) = (S ◦ P, Q) . In particular, (S, xy) = (S ◦ x, y) .
(1.2)
Consequently, S ◦ x = x−1 S with the notation of Section I.5. Observe that the operation ◦ is already defined by Eq. (1.2); it suffices to extend it by linearity. In view of Eq. (1.1), one obtains (S ◦ P ) ◦ Q = S ◦ (P Q) .
(1.3)
Thus KhhAii is a right KhAi-module. Proposition 1.3 The syntactic right ideal of a series S is ISr = {P ∈ KhAi | S ◦ P = 0} . Proof. Since the operation ◦ defines on KhhAii a structure of right KhAi-module, it is clear that the right-hand side of the equation is a right ideal of KhAi. It is contained in KerS because S ◦ P = 0 implies (S, P ) = (S ◦ P, 1) = 0. It is the greatest right ideal with that property since, given a polynomial P , the relation P KhAi ⊂ KerS implies (S ◦ P, Q) = (S, P Q) = 0 for all polynomials Q, whence S ◦ P = 0. Corollary 1.4 KhAi/ISr is isomorphic to S ◦ KhAi as a right KhAi-module. This module is the analogue for series of the minimal automaton of a formal language. We suppose from now on that K is a field. Definition The rank of a formal series S is the dimension of the space S ◦KhAi.
1. Syntactic Ideals
25
Definition The Hankel matrix of a formal series S is the matrix H indexed by A∗ × A∗ defined by H(x, y) = (S, xy) for all words x, y. Theorem 1.5 (Carlyle and Paz 1971, Fliess 1974a) The rank of a formal series S is equal to the codimension of its syntactic right ideal, and is equal to the rank of its Hankel matrix. The series S is rational if and only if this rank is finite and in this case, its rank is equal to the minimum of the dimension of the linear representation of S. The theorem shows that the rank of a formal series could have been defined by an operation of KhAi on KhhAii on the left (analogue to ◦), or also by means of the syntactic left ideal (whose definition is straightforward). Indeed, the Hankel matrix is an object which is essentially unoriented. Recall that the rank of a matrix (even an infinite one) can be defined to be the greatest dimension of a nonvanishing subdeterminant, and that it is equal to the rank of the rows (and the rank of the columns). Proof. The first equality, namely rank(S) = codim(ISr ) is a direct consequence of Corollary 1.4. Next, the space S ◦KhAi has as set of generators {S ◦x | x ∈ A∗ }. Thus rank(S) is equal to the rank of this set. Since each S ◦ x can be identified with the row of index x in the Hankel matrix of S, the rank of S is equal to the rank of this matrix. If S is rational, it has a linear representation (λ, µ, γ) of dimension n. The right ideal J = {P ∈ KhAi | λµ(P ) = 0} is contained in KerS, and its codimension is ≤ n. Consequently, J is contained in ISr , showing that rank(S) = codim(ISr ) ≤ codim(J) ≤ n. Conversely, let n = rank(S) = dim(S ◦ KhAi). Let φ be the linear form S ◦ KhAi → K
T 7→ (T, 1) .
Then for any word w, (S, w) = (S ◦ w, 1) = φ(S ◦ w) .
(1.4)
Let µw be the matrix of the endomorphism of S ◦ KhAi which maps a series T on T ◦ w, in some basis of S ◦ KhAi. (Each element of S ◦ KhAi is represented by a vector K 1×n , and each endomorphism of S ◦ KhAi is represented by a matrix in K n×n ; then K n×n acts on the right on K 1×n .) In view of Eq. (1.3), one has (µx)(µy) = µ(xy) for any words x and y. Let λ be the row vector representing S in the chosen basis, and let γ be the column representing φ. Then Eq. (1.4) can be expressed as (S, w) = λµwγ showing that S is recognizable, with a linear representation of dimension n. The theorem justifies the following definition.
26
Chapter II. Minimization
Definition A reduced linear representation of a rational series S is a linear representation of S with minimal dimension among all its representations. Example 1.1 The only series of rank 0 is the null series. Example 1.2 Let S be a series of rank 1. It admits a representation (λ, µ, γ), with µ : KhAi → K a morphism of algebras and λ, µ ∈ K. Set αa = µ(a) for each letter a. For w = a1 · · · an (ai ∈ A), this gives µ(w) = αa1 · · · αan =
Y
a α|w| . a
a∈A
Consequently, (S, w) = λγ
Y
a . α|w| a
a∈A
Such a series is called geometric. It follows that S = λγ
X
αa a
a∈A
∗
−1 X = λγ 1 − αa a . a∈A
An example of a geometric series is the characteristic series of A∗ : S=
X
w∈A∗
w=
X ∗ X −1 a = 1− a . a∈A
a∈A
P Example 1.3 The series S = w∈A∗ |w|a w has rank 2. Indeed, it has a linear representation of dimension 2 (see Example (5.1)). Next, the subdeterminant of its Hankel matrix corresponding to the rows and columns 1 and a is 0 1 = −1 . 1 2 Thus, S has rank ≥ 2. In view of Theorem 1.5, the rank of S is 2.
2
Reduced Linear Representations
K denotes a (commutative) field. Proposition 2.1 A linear representation (λ, µ, γ) of dimension n of a series S is reduced if and only if, setting M = µ(KhAi), λM = K 1×n and M γ = K n×1 . In this case, ISr = {P | λµP = 0} .
2. Reduced Linear Representations
27
Proof. Suppose that (λ, µ, γ) is reduced, and let J = {P | λµP = 0}. Then J is a right ideal of KhAi and codim(J) = dim(λM ) ≤ n. Since J ⊂ KerS, one has J ⊂ ISr and codim(J) ≥ codim(ISr ) = n (Theorem 1.5). Consequently codim(J) = n, J = ISr and λM = K 1×n . The equality M γ = K n×1 is derived symmetrically. Conversely, assume λM = K 1×n and M γ = K n×1 . Then there exist words x1 , . . . , xn (y1 , . . . , yn ) such that λµx1 , . . . , λµxn (µy1 γ, . . . , µxn γ) is a basis of K 1×n (of K n×1 ). Consequently det(λµxi yj γ)1≤i,j≤n 6= 0 . Since λµxi yj γ = (S, xi yj ), the Hankel matrix of S has rank ≥ n. In view of Theorem 1.5, the representation (λ, µ, γ) is reduced. Corollary 2.2 If the linear representation (λ, µ, γ) of the formal series S is reduced, then the kernel of µ is exactly the syntactic ideal of S, and consequently µ(KhAi) is isomorphic to the syntactic algebra of S. Proof. Since Kerµ is contained in KerS, it is contained in IS . Conversely let P ∈ IS . Then QP R is in IS for all polynomials Q, R, and consequently (S, QP R) = 0. It follows that λµQP Rγ = 0 and in fact λµ(KhAi)µP µ(KhAi)γ = 0. In view of Proposition 2.1, this implies µP = 0, whence P ∈ Kerµ. Corollary 2.3 (Sch¨ utzenberger 1961a) If (λ, µ, γ) is a reduced representation of dimension n of a formal series S, then there exist polynomials P1 , . . . , Pn , Q1 , . . . , Qn such that, for every word w, µw = ((S, Pi wQj ))1≤i,j≤n . Proof. In view of Proposition 2.1, there are polynomials P1 , . . . , Pn , Q1 , . . . , Qn such that (λµPi )1≤i≤n is the canonical basis of K 1×n and similarly (µQj γ)1≤j≤n is that of K n×1 . Thus (µw)i,j = λµPi µwµQj γ = (S, Pi wQj ) . Two linear representations (λ, µ, γ) and (λ′ , µ′ , γ ′ ) are called similar if there exists an invertible matrix m such that λ′ = λm, µ′ w = m−1 µwm (for all words w), γ ′ = m−1 γ. Clearly they recognize the same series. Theorem 2.4 (Sch¨ utzenberger 1961a, Fliess 1974a) Two reduced linear representations are similar. Proof. Let (λ, µ, γ) be a reduced linear representation of a series S. Since, by Proposition 1.3 and 2.1, ISr = {P ∈ KhAi | λµP = 0} = {P ∈ KhAi | S ◦ P = 0} , the two right KhAi-modules S ◦ KhAi and K 1×n = λµ(KhAi) (with the action on K 1×n defined by (v, P ) = vµ(P )) are isomorphic. Consequently, there exists a K-isomorphism f : K 1×n → S ◦ KhAi
28
Chapter II. Minimization
such that, for any polynomial P , and any v ∈ K 1×n , f (vµP ) = f (v) ◦ P and, moreover f (λ) = S . Next, consider the linear form φ on S ◦ KhAi defined by φ(T ) = (T, 1). Then for v = λµP , one gets φ(f (v)) = φ(f (λµP )) = φ(f (λ)◦P ) = φ(S ◦P ) = (S ◦P, 1) = (S, P ) = λµP γ = vγ, which shows that φ◦f =γ if γ is set to be the linear form v → vγ. If (λ′ , µ′ , γ ′ ) is another reduced linear representation, there exists an analogous isomorphism f ′ . Thus there exists an isomorphism ψ = f −1 ◦ f ′ : K 1×n → K 1×n such that ψ(vµ′ P ) = ψ(v)µP, ψ(λ′ ) = λ, ψ(γ ′ ) = γ . It suffices to write these relations in matrix form to obtain the announced result. Corollary 2.5 (Sch¨ utzenberger 1961a) Let (λ, µ, γ) and (λ′ , µ′ , γ ′ ) be two linear representations of some series S, and assume the second representation is ¯ µ reduced. Then there exists a representation (λ, ¯ , γ¯) similar to (λ, µ, gamma) and having a block decomposition of the form µ1 0 0 0 ¯ = (×, λ′ , 0), µ λ ¯ = × µ′ 0 , γ¯ = γ ′ . × × µ2 × Proof. 1. Assume first that (λ, µ, γ) has the block decomposition µ1 0 0 0 λ = (λ1 , λ2 , 0), µ = × µ2 0 , γ = γ2 × × µ3 γ3
for some morphisms µi : A∗ → K ni ×ni , with the conditions
(i) λµ(KhAi) = K n1 × K n2 × {0}n3 (we write here K r for K r×1 , the set of row vectors), and (ii) if v ∈ K n2 and (0, v, 0)µ(KhAi)γ = 0, then v = 0. By using the block decomposition, we see that λµwγ = λ2 µ2 wγ2 , so that (λ2 , µ2 , γ2 ) is a representation of S, of dimension n2 . We show that it is reduced, by using Proposition 2.1. Using again the block decomposition, we obtain for P in KhAi, λµ(P ) = (×, λ2 µ2 (P ), 0). Thus (i) implies that λ2 µ2 (KhAi) = K n2 . Now, let v ∈ K n2 be such that vµ2 (KhAi)γ2 = 0. Then, since (0, v, 0)µ(P )γ = vµ2 (P )γ2 , we see
3. The Reduction Algorithm
29
by (ii) that v = 0. This implies that µ2 (KhAi)γ2 = K n2 ×1 , and Proposition 2.1 now shows that (λ2 , µ2 , γ2 ) is reduced. Applying Theorem 2.4, we deduce the corollary in this case. 2. Now consider any representation (λ, µ, γ) of S. Define V1 = λµ(KhAi) ∩ {v | vµ(KhAi)γ = 0}. Let V2 be a subspace of K 1×n such that V1 ⊕ V2 = λµ(KhAi) and V3 such that V1 ⊕ V2 ⊕ V3 = K 1×n . The subspaces V1 and V1 ⊕ V2 are both stable under the right action of the matrices in µ(KhAi). Moreover λ is in V1 ⊕ V2 and V1 γ = 0. This shows that, by a change of basis (which amounts to similarity), we are reduced to the form in 1. We verify that (i) and (ii) hold. Condition (i) is implied by the very definition of V1 and V2 . For (ii), let v ∈ V2 be such that vµ(KhAi)γ = 0; then v ∈ V1 , so that v = 0.
3
The Reduction Algorithm
We now give an effective procedure for computing a reduced linear representation of a recognizable series. Definition A prefix set is a subset C of A∗ such that x, xy ∈ C implies y = 1 for all words x and y. It is right complete if it meets every right ideal of A∗ . In other words, C is right complete if for every word w in A∗ , wA∗ meets CA∗ . Equivalently, each word w either has a prefix in C, or is a prefix of some word in C. Definition A subset P of A∗ is prefix-closed if xy ∈ P implies x ∈ P for all words x and y. In other words, a prefix-closed set contains all the prefixes of its elements, while a prefix set contains none of them. Proposition 3.1 There exists a bijection between prefix sets and prefix-closed sets. To a prefix set C is associated the prefix-closed set P = A∗ \ CA∗ , and the reciprocal bijection is defined by C = P A \ P . In this case, A∗ = C ∗ P . This bijection defines, by restriction, a bijection between finite right complete prefix sets and finite nonempty prefix-closed sets. Proof. The prefix order u ≤ v on A∗ is defined by the condition that u is a prefix of v. Clearly, a right ideal I of A∗ is generated, as a right ideal, by the set of its minimal elements for the prefix order. Evidently, this set is a prefix set. On the other hand, the complement of a right ideal is a prefix-closed set, and conversely. This proves the existence of the bijection. It shows also that if the prefix-closed set P and the prefix set C correspond to each other under this bijection, then P = A∗ \ CA∗ and I = A∗ \ P = CA∗ . Let w ∈ C; then w is minimal in I, hence w = ua, a ∈ A, and u ∈ A∗ \I = P , implying C ⊂ P A. The fact that P = A∗ \ CA∗ implies that P and C are disjoint, hence C ⊂ P A \ P . Conversely, if w ∈ P A \ P , then w ∈ A∗ \ P =⇒ w ∈ CA∗ . Thus w = xu = pa, a ∈ A, x ∈ C. Then x cannot be a prefix of p (otherwise I meets P ), hence p is a proper prefix of x and this implies x = pa, u = 1, hence w ∈ C.
30
Chapter II. Minimization
If P is finite, then C = P A \ P is finite. Moreover A∗ = P ∪ CA∗ , hence each long enough word is in CA∗ , implying that C is right complete. Conversely, suppose that C is right complete and finite. Let n be the length of the longest words in C. Since CA∗ ∩ wA∗ 6= ∅, any word w of length at least n is in CA∗ , hence not in P . Thus P is finite. Remark In order to illustrate Proposition 3.1, let us consider the tree representation of the free monoid A∗ . Let for instance A = {a, b}. Then A∗ is represented by
For instance, the circled node corresponds to aba. A finite right complete prefix set C then is represented by a finite tree of the shape
with the elements of the set being the tree’s leaves, and the prefix-closed set associated with C being represented by its interior nodes. Example 3.1 The tree
represents the prefix set C = a3 + a2 b + aba2 + abab + ab2 + b , with P = 1 + a + a2 + ab + aba . The white circles ◦ represent the elements of the set, and the black circles • the elements of P . This representation helps understanding the proof. In the following statement, K is assumed to be a commutative field.
3. The Reduction Algorithm
31
Theorem 3.2 Let I be a right ideal of KhAi. There exists a prefix closed set C with associated prefix-closed P set P , and coefficients αc,p (c ∈ C, p ∈ P ), such that the polynomials Pc = c − p∈P αc,p p (c ∈ C) generate freely I as a right KhAi-module and such that P defines a K-basis in KhAi/I. Proof. Let φ : KhAi → M = KhAi/I be the canonical morphism. Let P be a prefix-closed subset of A∗ such that the elements φ(p), for p ∈ P , are K-linearly independent in M , and maximal among the subsets of A∗ having this property. Let C = P A \ P . Then C is a prefix set (Proposition 3.1). For each c ∈ C, the set P ∪c is prefix-closed, and by the maximality of P , φ(c) is in the subspace of M spanned by φ(P ). Thus there exist coefficients αc,p ∈ K such that X Pc = c − αc,p p ∈ I . (3.1) p∈P
We now show that any polynomial R can be written as X X R= Pc Qc + βp p c∈C
(3.2)
p∈P
for some polynomials Qc (c ∈ C) and coefficients βp (p ∈ P ). It suffices to prove this for the case where R = w is a word, and even in the case where w ∈ / P. But then w = cx (c ∈ C) since A∗ \ P = CA∗ by Proposition 3.1. We argue by induction on the length of the word x. First, observe that by Eq. (3.1), X w = Pc x + αc,p px . p
Since each of the words px is either in P or of the form c′ x′ with |p| < |c′ |, whence |x′ | < |x|, the induction hypothesis completes the proof. If the polynomial R of Eq. (3.2) is in I, then X 0 = φ(R) = βp φ(p) . p
Consequently, βp = 0 for all p and X R= Pc Qc , c∈C
which shows P that the right ideal I is generated by the Pc . Let Pc Qc = 0 be a relation of KhAi-dependency between the Pc , and assume that not all Qc vanish. Then X X cQc = αc,p pQc . (3.3) c
c,p
Consider a word w for which there is a c0 ∈ C with (Qc0 , w) 6= 0, and which is a word of maximal length. For this word w, the coefficient of c0 w on the left-hand side of Eq. (3.3) is (Qc0 , w) 6= 0 because C is a prefix set. Thus X 0 6= (Qc0 , w) = αc,p (pQc , c0 w) . c,p
32
Chapter II. Minimization
However, px = c0 w implies that p is a proper prefix of c0 , thus c0 = py for some y 6= 1 and x = yw. Consequently, the right-hand side of the previous equality is X αc,p (Qc , yw) = 0 y6=1,c0 =py
in view of the maximality of w, a contradiction. Corollary 3.3 (Cohn 1969) Each right ideal of KhAi is a free right KhAimodule. Corollary 3.4 (Lewin 1969) Let I be a right ideal of KhAi of codimension n and rank d (as a right KhAi-module). Let r be the cardinality of A. Then d = n(r − 1) + 1 . Proof. Indeed, if P is a finite prefix-closed set, with associated prefix set C, then by Proposition 3.1, C = P A \ P . Now, each nonempty word in P is in P A. Thus we have the equality with disjoint unions: C ∪ P = P A ∪ {1}. Thus |C| + |P | = |P |· |A| + 1, implying d + n = nr + 1. We also obtain linear recurrence relations for rational series which generalize those for one-variable series (see Chapter IV). Corollary 3.5 For any rational series S of rank n, there exist a prefix-closed set P of n elements, with an associated prefix set C, and coefficients αc,p , (c ∈ C, p ∈ P ) such that, for all words w and all c ∈ C, (S, cw) =
X
αc,p (S, pw) .
(3.4)
p∈P
Proof. It suffices to apply Theorem 3.2 to the syntactic right ideal of S which has codimension n. Corollary 3.6 Let S be a rational series of rank ≤ n, such that (S, w) = 0 for all words w of length ≤ n − 1. Then S = 0. Proof. This is a consequence of Corollary 3.5. Indeed, |p| ≤ n − 1 and therefore (S, p) = 0 for all p ∈ P . Assume S 6= 0, and let w be a word with (S, w) 6= 0. Then w = cx for some c ∈ C. We choose w in such a way that the corresponding word x has minimal length. By Eq. (3.4), (S, cx) =
X
αc,p (S, px) ,
p∈P
and by the choice of x, one has (S, px) = 0 for all p ∈ P : indeed, either px ∈ P , or px = c′ y for some c′ ∈ C and y shorter than x. Thus (S, cx) = 0, a contradiction. A subset T of A∗ is suffix-closed if xy ∈ T implies y ∈ T for all words x and y.
3. The Reduction Algorithm
33
Corollary 3.7 Let S be a rational series of rank n. There exists a prefix-closed set P and a suffix-closed set T , both with n elements, such that det((S, pt))p∈P,t∈T 6= 0 . Proof. Let (λ, µ, γ) be a reduced linear representation of S. It has dimension n. In view of Theorem 3.2, there exists a prefix-closed set P such that λµ(P ) is a basis of K 1×n , and symmetrically, there is a suffix-closed set T such that µ(T )γ is a basis of K n×1 . Thus the determinant of the matrix (λµpµtγ)p,t does not vanish. This proves the corollary. A careful analysis of the preceding proofs shows how to compute effectively a reduced linear representation of a rational series S given by any of its linear representations. Indeed, let (λ, µ, γ) be such a representation, of dimension n. The first step consists in reducing the representation to satisfy K 1×n = λµ(KhAi). To do this, consider a prefix-closed subset P of A∗ such that the vectors λµp, for p ∈ P , are linearly independent, and which is maximal for this property. Then for each c in the prefix set C = P A \ P , there are coefficients αc,p such that X λµc = αc,p λµp . p
Consider, for each letter a, 1 ′ (µ a)p,q = αc,p 0
the matrix µ′ a ∈ K P ×P defined by if pa = q if pa = c ∈ C otherwise.
In other words, µ′ a is the matrix, in the basis λµP of λµ(KhAi), of the endomorphism v 7→ vµa. In this basis the matrix for λ is λ′ defined by λ′1 = 1, and λ′p = 0 for p 6= 1; the matrix for γ is γ ′ defined by γp′ = λµpγ = (S, p). Then (λ′ , µ′ , γ ′ ) is a linear representation of S, since for any word w, one has λµw ∈ λµ(KhAi), whence λµwγ = λ′ µ′ wγ ′ . Moreover, the representation (λ′ , µ′ , γ ′ ) satisfies K 1×P = λ′ µ′ (KhAi). Indeed, since λ′ µ′ p represents the vector λµp in the basis λµ(P ), one has λ′ µ′ p = (δp,q )q∈P , which shows that λ′ µ′ (KhAi) contains the canonical basis of K 1×P . If in the preceding construction, we assume moreover that µ(KhAi)γ = K n×1 , then also µ′ (KhAi)γ ′ = K P ×1 . Indeed, the first equality implies that every linear form on the space λµ(KhAi) is represented by a matrix of the form µ(R)γ for some R ∈ KhAi. In the new basis λ′ µ′ (P ) of λ′ µ′ (KhAi), this matrix becomes µ′ (R)γ ′ . Thus any linear form on K 1×P = λ′ µ′ (P ) is represented as some µ′ (R)γ ′ , which proves the claim. Now the work is almost done. In a first step, one reduces the representation to satisfy the condition µ(KhAi)γ = K n×1 , using a construction which is symmetric to the preceding one, based on suffix sets and suffix-closed sets. In a second step, the representation is transformed to satisfy in addition λµ(KhAi) = K 1×n , and (λ, µ, γ) is reduced by Proposition 2.1.
34
Chapter II. Minimization
Exercises for Chapter II 1.1 The reversal of a word w, denoted by w, ˜ is defined as follows. If w = 1, then w ˜ = 1; if w = a1 · · · an (ai ∈ A), then w ˜ = an · · · a1 . A word w is a palindrome if it is equal to its reversal. Let L be the set of palindrome words. a) Assume |A| ≥ 2. Show that if x, x1 , . . . , xn are words with |x| ≤ |x1 |, . . . , |xn |, and x 6= x1 , . . . , xn , then there exists y such that xy ∈ L, x1 y, . . . , xn y ∈ / L. ( Hint : Take y = ap bbap x ˜, where a and b are distinct letters and p = sup{|xi | − |x|}.) b) Let S ∈ KhhAii be such that (S, w) = 1 if w ∈ L and (S, w) = 0 for w ∈ / L. Show that all syntactic ideals of S are null (see (Reutenauer 1980a)). c) (K is a commutative semiring.) Let S ∈ KhhAii be a recognizable series. P Show that S ′ = (S, w)w ˜ is recognizable. w
1.2 A finitely generated K-algebra M is syntactic if there exists a formal series S whose syntactic algebra is isomorphic to M . a) Show that M is syntactic if and only if it contains a hyperplane which contains no nonnull two-sided ideal. b) Let M = K · 1 ⊕ K · α ⊕ K · β, with multiplication defined by α2 = αβ = βα = β 2 = 0 .
Show that M is not syntactic. c) Show that KhAi is syntactic (use Exercise 1.1). 1.3 Show that the converse of Lemma 1.2 holds, and that M may be chosen to be a free right K-module (K is any semiring).
Notes to Chapter II The notions of syntactic ideal and algebra are introduced in Reutenauer (1978, 1980a), which also contains Theorem 1.1. The notions of Hankel matrix and rank of a formal series, which are classical in the case of one variable, were introduced by Carlyle and Paz (1971) and Fliess (1974a). The reduced linear representation of a rational series was first studied by Sch¨ utzenberger (1961a,b), mainly in connection with the linear recurrence relations (Corollary 3.4). His methods are used here to prove Theorem 3.2 and the reduction algorithm. Observe that this construction is closely related to Schreier’s construction of a basis of a subgroup of a free group (see Lyndon and Schupp (1977), Proposition I.3.7).
Chapter III
Series and Languages This chapter describes the relations between rational series and languages. It contains a criterion for the support of a rational series to be a rational language, an also an iteration theorem for these supports. We start by Kleene’s theorem as a consequence of Sch¨ utzenberger’s theorem. Then we describe the cases where the support of a rational series is a rational language. The most important result states that if a series has finite image, then its support is a rational language (Theorem 2.8). The family of languages which are supports of rational series have closure properties given in Section 3. The iteration theorem for rational series is proved in Section 4. The last section is concerned with an extremal property of supports which forces their rationality.
1
The Theorem of Kleene
Definitions A language is a subset of A∗ . A congruence in a monoid is an equivalence relation which is compatible with the operation in the monoid. A language L is recognizable if there exists a congruence with finite index in A∗ that saturates L (that is L is union of equivalence classes). It is equivalent to say that L is recognizable if there exists a finite monoid M , a morphism of monoids φ : A∗ → M and a subset P of M such that L = φ−1 (P ). The product of two languages L1 and L2 is the language L1 L2 = {xy | x ∈ L1 , y ∈ L2 }. If L is a language, the submonoid generated by L is ∪n≥0 Ln . For this reason, we denote it by L∗ . Definition The set of rational languages (over A) is the smallest set of subsets of A∗ containing the finite subsets and closed under union, product, and submonoid generation. Theorem 1.1 (Kleene 1956) A language is rational if and only if it is recognizable. We will obtain this theorem as a consequence of Sch¨ utzenberger’s Theorem I.6.1.
35
36
Chapter III. Series and Languages
Lemma 1.2 Let K, L be two semirings, and let φ : K P→ L be a morphism of semirings. If S ∈ KhhAii is recognizable, then φ(S) = (φ((S, w))w ∈ LhhAii is recognizable. Proof. If indeed S has a linear representation (λ, µ, γ), then φ(S) admits the linear representation (φ(λ), φ ◦ µ, φ(γ)), where we still denote φ the extension of φ to matrices. Lemma 1.3 A language L is recognizable if and only if it is the support of some recognizable series S ∈ NhhAii. Proof. If L is recognizable, there exists a finite monoid M , a morphism of monoids φ : A∗ → M and a subset P of M such that L = φ−1 (P ). Consider the right regular representation of M ψ : M → NM×M defined by ψ(m)m1 ,m2 =
(
1 if m1 m = m2 , 0 otherwise.
It is easy to verify that ψ is a morphism of monoids. Define λ ∈ N1×M and γ ∈ NM×1 by λm = δm,1 , ( 1 if m ∈ P , γm = 0 otherwise. Then ψ(m)1,m′ = 1 if and only if m = m′ , and consequently λψ(m)γ = 1 if m ∈ P , and = 0 otherwise. Now let µ = ψ ◦ φ : A∗ → NM×M and let S be the recognizable series with representation (λ, µ, γ). Then S = P w∈L w, whence L = supp(S). Conversely, assume that S ∈ NhhAii is recognizable and let L = supp(S). Consider the Boolean semiring B = {0, 1} with 1 + 1 = 1. Then the function φ:N→B defined by φ(0) = 0 and φ(r) P = 1 for r ≥ 1 is a morphism of semirings. By Lemma 1.2, the series φ(S) = φ((S, w))w ∈ BhhAii is B-recognizable. Thus there exists a linear representation (λ, µ, γ) of φ(S) with µ : A∗ → Bn×n . Let M = Bn×n , and P = {m ∈ M | λmγ = 1}. Since M is finite, the language {w | µ(w) ∈ P } is recognizable, but this language is exactly supp(φ(S)) = supp(S) = L.
2. Series and Rational Languages
37
Lemma 1.4 A language L over A is rational if and only if it is the support of some rational series S ∈ NhhAii. Proof. The following relations hold for series S and T in NhhAii: supp(S + T ) = supp(S) ∪ supp(T ) supp(ST ) = supp(S) supp(T ) supp(S ∗ ) = (supp(S))∗ if S is proper. It follows easily that the support of a rational series in NhhAii is a rational language. For the converse, one can use the same relations, provided one has proved that any rational language can be obtained from finite sets by union, product, and submonoid generation restricted to proper languages (that is languages not containing the empty word). We shall prove a stronger result, namely that for any rational language L, the language L \ 1 can be obtained from the finite ∗ subsets of A+ = product and generation of subsemigroup (that S A n\ 1 by union, + A = AA∗ ). is A 7→ A = n≥1
Indeed, if L1 and L2 have this property, then clearly so does L1 ∪ L2 also, since (L1 ∪L2 )\1 = L1 \1∪L2 \1, and L1 L2 , since L1 L2 \1 = (L1 \1)(L2 \1)∪K, where K = L1 \ 1, L2 \ 1 = L1 \ 1 ∪ L2 \ 1 according to L2 , L1 or both contain the empty word. Finally, if L has the announced property, then so does L∗ , since L∗ \ 1 = (L \ 1)∗ \ 1 = (L \ 1)+ . Kleene’s Theorem 1.1 is now an immediate consequence of Lemmas 1.3, 1.4, and of Theorem I.6.1. Corollary 1.5 The complement of a rational language is rational. Proof. If L is saturated by a congruence with finite index, then A∗ \L is saturated by the same congruence. Corollary 1.6 A language L over A is rational if and only if the set of languages {w−1 L | w ∈ A∗ } is finite (with w−1 L = {x ∈ A∗ | wx ∈ L}). Proof. Note that a language L is rational if and only if its characteristic series over the Boolean semiring is rational. Hence the corollary is a consequence of Proposition I.5.1.
2
Series and Rational Languages
Proposition 2.1 Over any semiring, the characteristic series of a rational language is a rational series. Proof. This follows from the first part of the proof of Lemma 1.3, with “recognizable” replaced by “rational”, which can be done in view of Theorem 1.1 and Theorem I.6.1. P Given a language L ⊂ A∗ , we call generating function of L the series n≥0 αn xn , where αn = |L ∩ An |.
38
Chapter III. Series and Languages
P Corollary 2.2 A series n≥0 αn xn in Z[[x]] is the generating function of some rational language if and only if it is rational over the semiring N and has constant term 0 or 1. In particular, theP αn satisfy a linear recurrence relation, see Chapter IV. Proof. Suppose that αn xn is the generating function of the rational language L. By Proposition 2.1, the characteristic series L of L is rational over N. By sending each letter a of A onto x, we obtain a morphism KhhAii → K[[x]] which sends L onto an N-rational series in N[[x]] by Proposition I.4.2. Clearly, this series is the generating series of L, which therefore is N-rational. Conversely, let S be an N-rational series in N[[x]]. It is obtained from elements in N[x] by the rational operations. It has therefore a rational expression involving these operations. We may assume that the only scalar in the expression is 1 (by replacing n by 1 + 1 · · · + 1). We now replace in the expression each monomial xd by a1 a2 · · · ad , where ai are distinct letters, distinct also from the letters for each monomial. An inductive argument then shows that this rational expression defines an N-rational series T with coefficients 0 and 1. Hence T is the characteristic series of some rational language, whose generating series is S. P Example 2.1 Let S = (x+x2 )∗ = n≥0 Fn xn , where the Fn are the Fibonacci numbers (F0 = F1 = 1, Fn+2 = Fn+1 + Fn for n ≥ 0). Then S is the generating function of the rational language (a ∪ bc)∗ . Similarly, (x + 2x2 )∗ (1 + 2x) + x is the generating function of the rational language (a ∪ bc ∪ de)∗ (1 ∪ f ∪ g) ∪ h. Corollary 2.3 If S is a rational series and L is a rational language, then S ⊙ P L = w∈L (S, w)w is rational.
Proof. Let K1 be the prime semiring of K, that is the semiring generated by 1. Then by Proposition 2.1, the series L is K1 -rational. Since the elements of K1 and K commute, it suffices to apply Theorem I.5.3. Let S be a formal series, and let V be a subset of K. We denote as usual by S −1 (V ) the language S −1 (V ) = {w ∈ A∗ | (S, w) ∈ V }. Proposition 2.4 If K is finite and if S ∈ KhhAii is rational, then S −1 (V ) is rational for any subset V of K. In particular, supp(S) is rational. Proof. Since S is recognizable, it admits a linear representation (λ, µ, γ). Since K is finite, K n×n is finite, and S −1 (V ) is saturated by a congruence with finite index. Thus S −1 (V ) is recognizable, hence rational. Corollary 2.5 If S ∈ ZhhAii is a rational series and a, b ∈ Z, b 6= 0, then S −1 (a + bZ) is a rational language. Proof. Let φ : Z → Z/bZ be the canonical morphism. Then φ(S) is rational by Lemma 1.1. Since S −1 (a + bZ) = φ(S)−1 (φ(a)), the result follows from Proposition 2.4. Corollary 2.6 If S ∈ NhhAii is rational and if a ∈ N, then the languages S −1 (a), S −1 ({n | n ≥ a}), S −1 ({n | n ≤ a}) are rational.
2. Series and Rational Languages
39
Proof. Let ∼ be the congruence of the semiring N generated by the relation a + 1 ∼ a + 2; in this congruence, all integers n ≥ a + 1 are in a single class, and each n ≤ a is alone in its class. Let K be the quotient semiring and let φ : N → K be the canonical morphism. Then φ(S) is rational by Lemma 1.2, and it suffices to apply Proposition 2.4, K being finite. Corollary 2.7 Let S ∈ ZhhAii be a rational series. If there is an integer d ∈ N which divides none of the nonzero coefficients of S, then the support of S is a rational language. Proof. If this is true, then supp(S) = A∗ \ S −1 (dZ) and it suffices to apply Corollaries 2.5 and 1.5. We denote by Im(S) the set of coefficients of S. It is called the image of S. Theorem 2.8 (Sch¨ utzenberger 1961a, Sontag 1975) Assume that K is a commutative ring. If S ∈ KhhAii is a rational series with finite image, then S −1 (V ) is rational for any V ⊂ K. Thus in particular the support of S is rational. Proof. (i) We first show that S satisfies a linear recurrence relation close to that of Eq.(II.3.4). The result will be less precise because K is not assumed to be a field. Let (λ, µ, γ) be a linear representation of S. Consider the ring generated by the coefficients of the matrices λ, µa for a ∈ A, and γ. It shows that we may assume K to be finitely generated, and thus to be Noetherian (cf. the proof of Theorem III.1.1). Let n be the dimension of µ and consider a maximal submodule among the submodules of K 1×n generated by a set of the form {λµp | p ∈ P } with P a finite prefix-closed subset of A∗ (see Section II.3). Such a maximal submodule exists because K is Noetherian. Let C be the complete finite prefix set C = P A \ P (ibid.). Then P ∪ c being prefix-closed for P all c ∈ C, the maximality of the submodule shows that λµc = p∈P αc,p λµp for some αc,p ∈ K. Multiplying on the right by µwγ gives (S, cw) =
X
αc,p (S, pw) .
(2.1)
p∈P
(ii) We now consider the set E of sequences of words of the form (pw)p∈P . For each word x, define a function fx from E into E by fx ((pw)p ) = (pxw)p . Then fy ◦ fx = fyx since indeed fy ◦ fx ((pw)p ) = fy ((pxw)p ) = (pyxw)p = fyx ((pw)p ). Consider the image of E by S, that is the set F of sequences ((S, pw))p∈P . The functions fx induce functions on F (still denoted fx ) since if ((S, pw))p∈P = ((S, pw′ ))p∈P then also ((S, pxw))p∈P = ((S, pxw′ ))p∈P . It suffices to prove this claim for x = a ∈ A. In this case, either pa ∈ P and then (S, paw) = (S, paw′ ), or pa = c ∈ C, and (S, paw) = (S, paw′ ) by Eq. (2.1). (iii) We have defined a morphism of monoids of A∗ into the monoid M of function from F into F by x 7→ fx .
40
Chapter III. Series and Languages
We now apply the hypothesis. Since Im(S) is finite, the set F is finite, and consequently M is finite. Let Q be the subset of M composed of those functions that map the sequence ((S, p))p∈P onto an element F of the form (βp )p with β1 ∈ V . Since fx ((S, p)p∈P ) = ((S, px)p∈P ), we have fx ∈ Q ⇐⇒ (S, x) ∈ V ⇐⇒ x ∈ S −1 (V ) . This shows that S −1 (V ) is recognizable, whence rational.
3
Support
In this and the next section, we study properties of languages which are supports of rational series. These languages strongly depend on the underlying semiring. Thus we have seen in Sections 1 and 2 that the rational languages are exactly the supports of rational series when the semiring is N or is finite. This is not generally true. Example 3.1 Let K = Z, A = {a, b}, and let S be the series X S= (|w|a − |w|b )w . w
This series is rational (Example I.5.3). Its support is the language supp(S) = {w ∈ A∗ | |w|a 6= |w|b } and its complement is L = {w ∈ A∗ | |w|a = |w|b } . We shall prove that L is not a support of a rational series over Z. This shows that L is not a rational language, by Proposition 2.1, and shows also that supp(S) is not rational, by Corollary 1.5. Arguing by contradiction, we assume that L = supp(T ) for some rational series T having a linear representation (λ, µ, γ) of dimension n. Then the matrix µan is a linear combination of the matrices µ1, µa, . . . , µan−1 , and µan = α1 µ1 + · · · + αn µan−1 . Multiplying on the left by λ and on the right by µbn γ, one gets (T, an bn ) = α1 (T, bn ) + · · · + αn (T, an−1 bn ) . Since ai bn ∈ / L for i 6= n, the right-hand side of this equation vanishes, and the left-hand side is not zero, a contradiction. Example 3.2 Recall that a palindrome word w is a word which is equal to its reversal, that is w = w ˜ (see Exercise II.1.1). We show that the language L = {w ∈ A∗ | w 6= w} ˜ of words which are not palindromes is the support of a rational series over Z. Assume for simplicity that A = {a0 , a1 }, and consider the series X hwiw , w
41
3. Support
where hwi is the integer represented by w in base 2. This series is rational (see Example I.5.2). Consequently the series X hwiw ˜ w
also is rational (see Exercise II.1.1). Thus the series X (hwi − hwi)w ˜ w
is rational, and its support is L. By a technique analogous to that of Example 3.1, one can show that A∗ \ L is not a support of a rational series. For the rest of this section, we fix a subsemiring K of the field R of real numbers. We denote by K the family of languages which are supports of rational series, that is L ⊂ A∗ is in K if and only if L = supp(S) for some rational series S ∈ KhhAii. We shall see that K has all the closure properties usually considered in formal language theory, excepting complementation, as follows from Example 3.1. The morphisms considered in the next statement are morphisms from one free monoid into another. Theorem 3.1 (Sch¨ utzenberger 1961a, Fliess 1971) The family K contains the rational languages. Moreover, K is closed under finite union, intersection, product, submonoid generation, direct and inverse morphism. Proof. The first claim is a consequence of Proposition 2.1. Consider now a language L ⊂ A∗ in K, and let S ∈ KhhAii be a rational series with L = supp(S). If φ : B ∗ → A∗ is a morphism, then X φ−1 (S) = (S, φ(w))w w∈B ∗
is rational. Indeed, if (λ, µ, γ) is a linear representation of S, then clearly (λ, µ ◦ φ, γ) is a linear representation of φ−1 (S). Consequently φ−1 (L) = supp(φ−1 (S)) is in K. Next, let L′ ⊂ A∗ be another language in K, with L′ = supp(S ′ ), and S ′ rational. Then L ∩ L′ = supp(S ⊙ S ′ ) is also in K, by Theorem I.5.3. In order to show that the submonoid L∗ generated by L is also in K, observe first that L∗ = (L \ 1)∗ and that L \ 1 = L ∩ A+ is in K. Thus we may assume 1∈ / L, that is (S, 1) = 0. Next, we may suppose that S has only nonnegative coefficients, by considering S ⊙ S instead of S, which is possible in view of Theorem I.5.3. Under these conditions, L∗ = supp(S ∗ ) , showing that L∗ is in K. It is easily seen that K is closed by union and product, using the formulas supp(S + S ′ ) = supp(S) ∪ supp(S ′ ) supp(SS ′ ) = supp(S) supp(S ′ )
42
Chapter III. Series and Languages
which hold if S and S ′ have nonnegative coefficients. Finally, consider a morphism φ : A∗ → B ∗ . (i) First we assume that φ(A) ⊂ B + . In this case, the family of series (S, w)φ(w) w∈A∗ , with each of these series reduced to a monomial, is locally finite, and its sum, the series X φ(S) = (S, w)φ(w) w∈A∗
is rational by Proposition I.4.2. If moreover S has nonnegative coefficients, then supp(φ(S)) = φ(L) ,
showing that φ(L) is in K. (ii) Next, we assume that A = B ∪ {a}, with a ∈ / B, and that φ is the projection A∗ → B ∗ , that is φ|B = id, φ(a) = 1. Let n be the dimension of a linear representation (λ, µ, γ) of S, and set P = A∗ \ A∗ an A∗ . We claim that φ(L) = φ(L ∩ P ) .
(3.1) n
Let indeed w ∈ L. If w ∈ / P , then w = xa y for some words x and y. But the characteristic polynomial of µa shows that (S, xan y) is a linear combination of the (S, xai y) with 0 ≤ i ≤ n − 1. Consequently, there is such an i with (S, xai y) 6= 0, whence xai y ∈ L. Since φ(w) = φ(xai b), induction on the length completes the proof. Let ψ : B ∗ → KhAi be the morphism of monoids defined by ψ(b) = (1 + · · · + an−1 )b(1 + · · · + an−1 ) .
Further, recall that we may assume that S has nonnegative coefficients. Let T ∈ KhhBii be the rational series with the linear representation (λ, µ ◦ ψ, γ), with µ extended to KhAi by linearity. Let w = b1 · · · bm ∈ B ∗ . The coefficient of w in T is λ(µ ◦ ψw)γ. Since ψw is an N-linear combination of words of the form ai0 b1 ai1 · · · bm aim
(3.2)
and since any word of the form given by Eq. (3.2) with i0 , . . . , im ∈ {0, . . . , n−1} appears in ψw, by definition of ψ, it follows that (T, w) is an N-linear combination of coefficients of the form (S, ai0 b1 ai1 · · · ym aim ) . In view of Eq. (3.1), and by the fact that all coefficients are nonnegative, this implies that φ(supp(S)) = supp(T ) . (iii) Consider finally an arbitrary morphism φ : A∗ → B ∗ and L in K. We may assume that A and B are disjoint. Then φ = φ2 ◦ φ1 , where φ1 : A∗ → (A ∪ B)∗ is defined by φ1 (a) = aφ(a) for each letter a, and with φ2 : (A ∪ B)∗ → B ∗ defined by φ2 (a) = 1 for a ∈ A, and φ2 (b) = b for b ∈ B. In view of (i), φ1 (L) ∈ K. Moreover, φ2 can be factorized into a sequence of morphisms of the type considered in (ii). Thus φ2 (φ1 (L)) ∈ K, and φ(L) ∈ K.
4. Iteration
4
43
Iteration
In this section, we assume that K is a commutative field. We prove the following.
Theorem 4.1 (Jacob 1980) Let L be a language which is support of a rational series. There exists an integer N such that for any word w in L, and for any factorization w = xuy satisfying |u| ≥ N , there exists a factorization u = pvs such that the language L ∩ xpv ∗ sy . is infinite. We need a definition and a lemma. Definition A quasi-power of order 0 is any nonempty word. A quasi-power of order n + 1 is a word of the form xyx, where x is a quasi-power of order n. Example 4.1 If x 6= 1, then xyxzxyx is a quasi-power of order 2. Lemma 4.2 Sch¨ utzenberger (1961b) Let A be a (finite) alphabet. There exists a sequence of integers (cn ) such that any word on A of length at least cn has a factor which is a quasi-power of order n. Proof. Let d = |A|, c0 = 1 and inductively cn+1 = cn (1 + dcn ) . Suppose that any word of length cn contains a factor which is a quasi-power of order n. Let w be a word of length at least cn+1 = cn (1 + dcn ). Then w has a factor of the form x1 x2 · · · xr , with each xi of length cn and r = 1 + dcn . Since there are only dcn distinct words of length cn on A, two of the xi ’s are identical, and w has a factor xyx with |x| = cn . By the induction hypothesis, x = zx′ t with x′ a quasi-power of order n. Thus w has as a factor x′ tyzx′ which is a quasi-power of order n + 1. Proof of Theorem 4.1. Let S be a rational series with L = supp(S), let (λ, µ, γ) be a linear representation of S, with dimension n. Set N = cn where cn has the meaning of Lemma 4.2. Consider a word w = zut ∈ L, with |u| ≥ N . Then u contains a quasi-power of order n. Thus there exist words 1 6= x0 , x1 , . . . , xn , y1 , . . . , yn such that xn is a factor of u and, for each i = 1, . . . , n, xi = xi−1 yi xi−1 . Next n ≥ rank(µxi−1 ) ≥ rank(µxi−1 yi xi−1 ) ≥ rank(µxi ) . Consequently, there is an integer i such that rank(µxi−1 ) = rank(µxi−1 yi xi−1 ). Set p = µxi−1 and q = µyi . Let these matrices act on the right on K 1×n . From rank(p) = rank(pqp), it follows that Im(p) ∩ Ker(qp) = 0 .
(4.1)
44
Chapter III. Series and Languages
Moreover, rank(p) ≥ rank(qp) ≥ rank(pqp) = rank(p) , showing that rank(p) = rank(qp), and since Im(qp) ⊂ Im(p), it follows that Im(qp) = Im(p). By Eq. (4.1), this gives Im(qp) ∩ Ker(qp) = 0 . Since n = dim Ker(qp)+ dim Im(qp), the space K 1×n is the direct sum of Im(qp) and Ker(qp). In a basis adapted to this direct sum, the matrix qp has the form m0 0 0 where m is an invertible matrix. Consequently the minimal polynomial P (t) of qp is not divisible by t2 . This shows that u can be factorized into u = pvs, with v 6= 1, and where the characteristic polynomial P (t) = tr − a1 tr−1 − · · · − ar−1 t − ar of µv has at least one of the coefficients ar−1 or ar nonnull. Consider the sequence of numbers (bk ) defined by bk = (S, xpv k sy) = λµ(xp)(µv)k µ(sy)γ . For all k ≥ 0, the following relation holds: bk+r = a1 br+k−1 + · · · + ar−1 bk+1 + ar bk . Since w ∈ L, one has b1 = (S, xpvsy) = (S, w) 6= 0. The condition ar−1 6= 0 or ar 6= 0 implies that there exist infinitely many k for which bk 6= 0, whence xpv k sy ∈ L.
5
Complementation
In this section, K is a commutative field. We have seen that the complement of the support of a rational series is not the support of a rational series, in general. However, the following result holds. Theorem 5.1 (Restivo and Reutenauer 1984) If the complement of the support of a rational series is also the support of a rational series, then it is a rational language. For the proof, we use the following theorem. Theorem 5.2 (Ehrenfeucht et al. 1981) Let L be a language, and let n be an integer such that for any word w and any factorization w = ux1 · · · xn v, there exist i, j with 0 ≤ i < j ≤ n such that w ∈ L ⇐⇒ ux1 · · · xi xj+1 · · · xn v ∈ L . Then L is a rational language.
45
5. Complementation
Proof of Theorem 5.1. Let L = supp(S) and let L′ = A∗ \ L = supp(T ) be two complementary languages which are supports of the rational series S and T respectively. Consider linear representations (λ, µ, γ) and (λ′ , µ′ , γ ′ ) of S and T . Further, let n be an integer greater than the dimension of both representations. Let w = ux1 · · · xn v ∈ A∗ . (i) Assume that w is in L. Then 0 6= λµ(ux1 · · · xn v)γ and in particular λµu 6= 0. The n + 1 vectors λµu, λµux1 , . . . , λµux1 · · · xn belong to a space of dimension at most n. Consequently, there is an integer j with 1 ≤ j ≤ n such that λµux1 · · · xj is a linear combination of the vectors λµux1 · · · xi (0 ≤ i < j), say X λ(µux1 · · · xj ) = αi λµ(ux1 · · · xi ) 0≤i<j
for αi ∈ K. Multiplying on the right by µ(xj+1 · · · xn v)γ, one gets X (S, w) = αi (S, ux1 · · · xi xj+1 · · · xn v) . 0≤i<j
Since (S, w) 6= 0, there exists i with 0 ≤ i < j such that (S, ux1 · · · xi xj+1 · · · xn v) 6= 0 and hence ux1 · · · xi xj+1 · · · xn v ∈ L. (ii) Assume now that w ∈ / L, that is w ∈ L′ . A similar proof, this time ′ ′ ′ with (λ , µ , γ ), shows that there are integers i, j (0 ≤ i < j ≤ n) such that (T, ux1 · · · xi xj+1 · · · xn v) 6= 0, showing that ux1 · · · xi xj+1 · · · xn v ∈ L′ , whence ux1 · · · xi xj+1 · · · xn v ∈ / L. Thus we have shown that the language L satisfies the conditions of Theorem 5.2. Consequently, L is rational. For the proof of Theorem 5.2, we use without proof the well-known theorem of Ramsey. In order to state it simply, we introduce the following notation: For any set E, we denote by E(p) the set of subsets of p elements of E. Theorem 5.3 (Ramsey; see e.g. Ryser 1963 or Harrison 1978) For any integers m, p, r, there exists an integer N = N (m, p, r) such that for any set E of N elements and for any partition E(p) = X1 ∪ · · · ∪ Xr , there exists a subset F of E with m elements, such that F (p) is contained in one of the Xi ’s. Proof of Theorem 5.2. Let n be a fixed integer, and let L be the set of all languages L over A satisfying the hypotheses of Theorem 5.2 for this n. We prove below that L is finite. It is not difficult to show that for any L ∈ L and any word w, the language w−1 L = {x ∈ A∗ | wx ∈ L} is still in L. In view of Corollary 1.6, any language in L is rational.
46
Chapter III. Series and Languages
In order to show that L is finite, we use Ramsey’s theorem for m = 1 + n, p = 2, r = 2. Let N = N (m, 2, 2). Let L and K be two languages in L such that for all w of length < N − 1, w ∈ L ⇐⇒ w ∈ K .
(5.1)
We prove that then L = K. This clearly implies that L is finite. To prove the equality, we argue by induction on the lengths of words in A∗ . Let w be a word of length ≥ N − 1, let w = a1 a2 · · · aN −1 s
(ai ∈ A)
and E = {0, 1, . . . , N − 1}. Consider the partition E(2) = X ∪ Y , with X = {(i, j) | 0 ≤ i < j ≤ N − 1 and a1 · · · ai aj+1 · · · aN −1 s ∈ L} , Y = E(2) \ X . Observe that by the induction hypothesis, X = {(i, j) | 0 ≤ i < j ≤ N − 1 and a1 · · · ai aj+1 · · · aN −1 s ∈ K} . By Ramsey’s theorem, there exists a subset F of E with m = n + 1 elements such that F (2) ⊂ X
or F (2) ⊂ Y .
Cutting w into m + 1 = n + 2 factors u, x1 , . . . , xn , v according to the indices in F , one obtains a factorization w = ux1 · · · xn v such that (i) either, for all 0 ≤ i < j ≤ n, the word ux1 · · · xi xj+1 · · · xn v is both in L and K; (ii) or, for all 0 ≤ i < j ≤ n, the word ux1 · · · xi xj+1 · · · xn v is neither in L nor in K. Since L and K are in L, the first condition implies that w ∈ L and w ∈ K, and the second condition that w ∈ / L and w ∈ / K. The Eq. (5.1) is satisfied and the proof is complete. Theorem 5.1 is a special case of the following conjecture. Conjecture Let L and K be disjoint languages which are both support of some rational series. Then there exist two disjoint rational languages L′ and K ′ such that K ⊂ K ′ , L ⊂ L′ (that is K and L are rationally separated ).
47
5. Complementation
Exercises for Chapter III 1.1 Show that a subset of a∗ (where a is a letter) is rational if and only if it is the union of a finite set and of a finite set of arithmetic progressions (we identify a∗ = {an | n ∈ N} with N). 1.2 Let L be a language. The syntactic congruence of L, denoted by ∼, is the congruence on A∗ defined by u∼v
iff ∀x, y ∈ A∗ , xuy ∈ L ⇐⇒ xvy ∈ L .
a) Show that ∼ is the coarsest congruence on A∗ that saturates L (that is such that u ∼ v and u ∈ L imply v ∈ L). The monoid M = A∗ /∼ is called the syntactic monoid of L. b) Let K be a commutative ring and let S be the characteristic series of L. Let I be the syntactic ideal of S. Show that u ∼ v if and only if u − v ∈ I. Show that the canonical morphism of KhAi into N = KhAi/I admits the natural factorization KhAi
N
ρ K[M ] where ρ is the extension by linearity of the canonical morphism A∗ → M and where K[M ] is the K-algebra of the monoid M . Show that N is a quotient of K[M ]. Using for example the language L = (1 + a3 )(a4 )∗ , show that the dotted arrow is not an isomorphism in general (show that M = Z/4Z and 1 − a + a2 − a3 ∈ I) (see Reutenauer 1980a). 2.1 Let K be a commutative field. The set of rational series of KhhAii, equipped with the sum and the Hadamard product, is a K-algebra (Theorem I.5.3). Show that the idempotents of this algebra are precisely the characteristic series of the rational languages. 3.1 Denote by RK the set of supports of rational series with coefficients in the semiring K. Thus RN is the set of rational languages (cf. Section 1). a) Show that if K and L are (commutative) fields and L is an algebraic extension of K, then RK = RL . b) Show that if K is a finite field and t is a variable, then the support of the series over the field K(t) X ((t + 1)n − tn − 1)an n≥0
is not a rational language (use Exercise 1.1). c) Show that, given a commutative field K, one has RK = RN if and only if K is an algebraic extension of a finite field (use Example 3.1) (see Fliess 1971). 3.2 Let f, g : A∗ → B ∗ be two morphisms of a free monoid into another. Define the equality set of f and g as the language E(f, g) = {w ∈ A∗ | f (w) = g(w)} .
48
Chapter III. Series and Languages
Show that the complement of E(f, g) is the support of some rational series over Z (see Turakainen 1985). 4.1 Let up be a quasi-power of order p, with u0 6= 1 and ui = ui−1 vi ui−1 for i = 1, . . . , p. a) Show that there exist words w1 , . . . , wp such that for all i = 1, . . . , p, ui = u0 wi wi−1 · · · w1 . b) Use question (a) to prove that for all integers n and p, there is an integer ℓ such that for every morphism µ : A∗ → K n×n and for any word w of length at least ℓ, there exist nonempty words w1 , . . . , wp such that wp wp−1 · · · w1 is a factor of w and all the µwi ’s have the same kernel N and the same image I with N ∩I = 0 (and consequently belong to the same group contained in the multiplicative monoid K n×n ) (see Jacob 1978, Reutenauer 1980b).
Notes to Chapter III Theorem 2.8 is due to Sch¨ utzenberger (1961a) for fields, and to Sontag (1975) for rings. Theorem 3.1 is from Sch¨ utzenberger (1961a), except for the closure under direct morphism which is due to Fliess (1971) for K = R and to Reutenauer (1980b) for the general case. The proof of Jacob’s theorem (Theorem 4.1) is from Reutenauer (1980c); in this paper, another argument makes it possible to extend the result to infinite alphabets, and also to give a smaller bound N which depends only on the rank of the series (and not on the size of the alphabet). The cancellation property of Theorem 5.2 characterizes the rationality of a language: indeed, each rational language holds this property, for some n, as may be easily verified. Let us mention the following open problem (Salomaa and Soittola 1978). Does there exist a language which is support of a R-rational series without being support of a Q-rational series?
Chapter IV
Rational Series in One Variable This chapter gives a short introduction to some striking arithmetic properties of the expansion of rational functions. In the first section, the notions of rational series, Hankel matrix and rank are shown to coincide, in the case of series in one variable, with the classical definitions. The exponential polynomial is defined in Section 2, with emphasis on its algebraic aspects. As an application, we obtain Benzaghou’s theorem on the invertible series in the Hadamard algebra (Theorem 2.3). Section 3 is devoted to a theorem of G. P´ olya concerning arithmetic properties of the coefficients of a rational series. In the final section, we give an elementary proof, due to G. Hansel, of the famous Skolem-Mahler-Lech theorem on the positions of vanishing coefficients of a rational series.
1
Rational Functions
We consider a commutative field K and an alphabet consisting of a single letter x. We write, as usual, K[x] and K[[x]] instead of Khxi and Khhxii. An element S of K[[x]] is written as S=
X
an xn .
n≥0
Proposition 1.1 A series S is rational if and only if there exist polynomials P and Q in K[x] with Q(0) = 1 such that S is the power series expansion of the rational function P/Q. Note that Q(0) is the constant term of the denominator Q of P/Q. Proof. Let E be the set of series which are the power series expansion of the form described. Then clearly E is contained in the algebra of rational series. Moreover, E is a subalgebra of K[[x]] closed under inversion, since if S ∈ E, 49
50
Chapter IV. Rational Series in One Variable
and S = P (x)/Q(x) is invertible in K[[x]], then its constant term is invertible in K. This constant term is P (0)/Q(0) = P (0) = λ. Thus λ 6= 0 and S −1 =
λ−1 Q(x) ∈ E. λ−1 P (x)
The constant term of the denominator is 1. This shows that any rational series is in E. Let S be the rational series which corresponds to the rational function P (x)/Q(x). The quotient is called normalized if P and Q have no common factor in K[x] and if Q(0) = 1. In this case, Q is called the minimal denominator of S. The roots of Q, which are the poles of the rational function, are called the poles of S. P an xn and let What about the syntactic ideal of S? Set S = n≥0
R = xk + α1 xk−1 + · · · + αk ∈ K[x] be a polynomial. Since K is commutative, the syntactic ideal I of S and the syntactic right ideal coincide. Thus R ∈ I if and only if S ◦ R = 0 by Proposition II.1.3. Since X S ◦ xi = an+i xn n≥0
this gives the equivalence R ∈ I ⇐⇒ for all n ∈ N, an+k + α1 an+k−1 + · · · + αk an = 0 . Observe that in view of Theorem II.1.1, the series S is rational if and only if its syntactic ideal is not null, since a nonnull ideal in K[x] always has a finite codimension. This yields the classical result stating that a series is rational if and only if it satisfies a linear recurrence relation. The syntactic ideal of S is thus precisely the ideal of polynomials associated with the linear recurrence relations satisfied by S. We refer to the generator of the syntactic ideal of S having leading coefficient equal to 1 as the minimal polynomial of S. It is the polynomial associated with the shortest linear recurrence relation. The eigenvalues of S are the roots of its minimal polynomial, and their multiplicities are defined similarly. Proposition 1.2 Let X S= an xn = P (x)/Q(x) n≥0
be a rational series with an associated normalized rational function. Let k = sup(deg(P )−deg(Q)+1, 0) and let (λ, µ, γ) be a reduced linear representation of S. Then the characteristic polynomial of µx is equal to the minimal polynomial of S, and is also equal to xk Q(x), where Q is the reciprocal polynomial of Q. In particular, Q is equal to the reciprocal polynomial of the minimal polynomial of S.
51
1. Rational Functions Recall that the reciprocal polynomial of a polynomial α0 xp + α1 xp−1 + · · · + αq xp−q
with α0 αq 6= 0, p ≥ q is the polynomial αq xq + · · · + α1 x + α0 obtained by replacing x by 1/x and then by multiplying the resulting expression by xp . Proof. The rank r of S is equal to the degree of the characteristic polynomial R(x) of µx (because (λ, µ, γ) has dimension r), and it is also equal to the degree of the minimal polynomial, say R1 (x), of S, since dim(K[x]/R1 ) = deg(R1 ) (cf. Theorem II.1.5). Let R(x) = xr + α1 xr−1 + · · · + αr . Then R(µx) = 0 (Cayley-Hamilton Theorem). Consequently, by multiplying this equation on the left by λµxn for n ∈ N and on the right by γ, one obtains an+r + α1 an+r−1 + · · · + αr an = 0,
(n ≥ 0) .
(1.1)
In other words, and using the notations of Section II.1, S ◦R = 0. Thus R is in the syntactic ideal of S, and therefore is a multiple of R1 . Since they have the same degree and leading coefficient 1, they are equal. Let s be such that R(x) = xr + α1 xr−1 + · · · + αs xr−s ,
αs 6= 0, s ≤ r .
Then the reciprocal polynomial of R is R(x) = 1 + α1 x + · · · + αs xs . Let P1 = RS. Then for all n ≥ r (which implies n ≥ s), one has, in view of Eq. (1.1), (P1 , xn ) = an + α1 an−1 + · · · + αs an−s = 0 . Thus P1 is a polynomial of degree at most r − 1, and since P1 = RS, the polynomial R is a denominator of S. Consequently Q divides R. Let q = deg(Q) and Q = 1 + β1 x + · · · + βq xq ,
βq 6= 0 .
Then q ≤ s. Let p = deg(P ). Then k = sup(p − q + 1, 0). If k = 0, then p − q + 1 ≤ 0 and p + 1 ≤ q. If k > 0, then k = p − q + 1 and q + k = p + 1. In all cases, q + k > p. Since QS = P is a polynomial of degree deg(P ), one has, for all n ∈ N, 0 = (P, xn+q+k ) = an+q+k + β1 an+q+k−1 + · · · + βq an+k . Thus, since Q(x) = xq + β1 xq−1 + · · · + βq , S ◦ (xk Q) = 0 .
52
Chapter IV. Rational Series in One Variable
This shows that xk Q is in the syntactic ideal of S, and consequently R divides xk Q. Thus r ≤ q + k. If k = 0, then r ≤ q, q ≤ s and s ≤ r imply that all these numbers are equal, whence R = Q and Q = R. If k 6= 0, then k = deg P − deg Q + 1, and since P1 Q = P R, k = deg P1 − deg R + 1 ≤ r − deg R , whence k + q ≤ k + s ≤ r. Thus r = k + q and s = q, showing that R = xk Q and Q = R. P The Hankel matrix of S = an xn has a very special form, which is classical. It is the matrix (ai+j )i,j∈N . P Corollary 1.3 Let S = an xn be a rational series with associated irreducible fraction P (x)/Q(x). Its rank is equal to sup(deg Q, 1 + deg P ), to the degree of its minimal polynomial, to the length of the shortest linear recurrence relation satisfied by S, and to the rank of its Hankel matrix. Proof. We have only to verify the rank property. We take the notations of the previous proof. If k = 0, then p < q and the rank is deg(R) = q = sup(q, p + 1). If k > 0, then k = p − q + 1 and deg(R) = k + deg(Q) = k + deg(Q) = p + 1 = sup(q, p + 1), since p − q + 1 > 0. Observe that the set of eigenvalues 6= 0 of S is precisely the set of inverses of its poles, with the same multiplicities. Definition A rational series is regular if it admits a linear representation (λ, µ, γ) such that µx is an invertible matrix. Regular rational series can be defined inP several ways. Indeed, the following assertions concerning a rational series S = an xn are equivalent. (i) S is regular. (ii) Any reduced linear representation (λ, µ, γ) of S is regular , that is the matrix µx is invertible. (iii) The sequence (an ) satisfies a proper linear recurrence relation, that is an+k = α1 an+k−1 + · · · + αk an , (iv) (v) (vi) (vii)
n ≥ 0, αk 6= 0 .
The shortest linear recurrence relation satisfied by S is proper. There exists a polynomial P such that S ◦ P = 0 and P (0) 6= 0. The minimal polynomial of S has a non vanishing constant term. S = P (x)/Q(x) with deg P < deg Q.
The equivalence of these assertions is a consequence of the preceding propositions and of the following observation: if (an ) satisfies some proper linear recurrence relation and if m is the the companion matrix of this relation, then det(m) 6= 0 and there exist λ, γ such that an = λmn γ. Proposition 1.4 For every rational series S, there exist a unique couple (T, P ), where T is a regular series and P is a polynomial, such that S = P + T .
1. Rational Functions
53
This proposition is a direct consequence of the decomposition of the rational fraction associated with S into simple elements. Then P is just the integral part of the fraction. We give here a different proof. Observe that, as a consequence of this result, a regular rational series which is a polynomial is null. Proof. Let xq R(x), with R(0) 6= 0, be the minimal polynomial of S. Then (S ◦ R) ◦ xq = S ◦ (xq R) = 0
which shows that S ◦ R is a polynomial. Consider the function Q 7→ Q ◦ R K[x] → K[x] Since R(0) 6= 0, one has deg(Q ◦ R) = deg(Q), and this function is consequently a linear automorphism of K[x]. Thus there is some P in K[x] such that P ◦R = S ◦R. Let T = S − P . Then T ◦R = S ◦R −P ◦R = 0, showing that T is regular rational. If T + P = T ′ + P ′ , where T and T ′ are regular rational series and P, P ′ are polynomials, then T − T′ = P′ − P
In view of condition (vii) above, the series T − T ′ is regular. Thus it suffices to show P that if S is regular and is a polynomial, then S = 0. For this, set S = an xn . There exist coefficients αi in K such that for all n ≥ 0 an+k = α1 an+k−1 + · · · + αk an
(1.2)
with αk = 6 0. Assume S 6= 0, and let n be the greatest index such that an 6= 0. For this n, Eq. (1.2) gives αk an = 0, whence an = 0, a contradiction. In view of Proposition 1.4, it suffices for many purposes to study regular rational series. We will restrict ourselves to these series in the following. Proposition 1.5 The subset of regular rational series of K[[x]] is closed under linear combination, product, and Hadamard product. Observe that this set does not contain any non vanishing polynomials. Proof. Let S1 = P1 /Q1 and S2 = P2 /Q2 be regular series with deg(P1 ) < deg(Q1 ) and deg(P2 ) < deg(Q2 ). Then S1 + S2 = (P1 Q2 + P2 Q1 )/Q1 Q2 and S1 S2 = P1 P2 /Q1 Q2 . Since deg(P1 Q2 + P2 Q1 ) < deg(Q1 Q2 ) and deg(P1 P2 ) < deg(Q1 Q2 ), the series S1 + S2 and S1 S2 are regular. Moreover, if (S1 , xn ) = λ1 µ1 xn γ1 and (S2 , xn ) = λ2 µ2 xn γ2 , where µ1 x and µ2 x are invertible matrices, then (S1 ⊙ S2 , xn ) = (S1 , xn )(S2 , xn ) = (λ1 ⊗ λ2 )(µ1 ⊗ µ2 )(xn )(γ1 ⊗ γ2 ) , and since (µ1 ⊗ µ2 )(x) is invertible, this shows that S1 ⊙ S2 is regular. The set of regular rational series equipped with the structure of vector space and with the Hadamard product is thePHadamard algebra of regular rational series. Its neutral element is the series xn = 1/(1 − x).
54
2
Chapter IV. Rational Series in One Variable
The Exponential Polynomial
We assume from now on that K has characteristic zero. Let Λ be the multiplicative group K \ 0, and let t be an indeterminate. We consider the algebra K[t][Λ] of the group Λ over the ring K[t]. It is in particular an algebra over K. An element of K[t][Λ] is called an exponential polynomial . Theorem 2.1 Let K be algebraically closed. The function which associates to an exponential polynomial X Pλ (t)λ λ∈Λ
of K[t][Λ] the regular rational series X an xn n≥0
defined by an =
X
Pλ (n)λn
λ∈Λ
(with the sum computed in K) is an isomorphism of K-algebra from K[t][Λ] onto the Hadamard algebra of regular rational series. P Proof. Let φ be the function of the statement. Let E = PλP (t)λ and F = P Qλ (t)λ be two exponential polynomials, and let G = E +F = Rλ (t)λ, H = P EF = Sλ (t)λ ∈ K[t][Λ]. Then X Rλ = Pλ + Qλ , Sλ = Pµ Qν . µν=λ
Consequently (φ(G), xn ) =
X
X
Pλ (n)λn +
λ
µν=λ
X
Qν (n)ν n
Rλ (n)λn =
X
Qλ (n)λn
= (φ(E), xn ) + (φ(F ), xn ) , X X X (φ(H), xn ) = Sλ (n)λn = λn Pµ (n)Qν (n) =
X
Pµ (n)µn
µ
ν
n
= (φ(E), x )(φ(F ), xn ) .
Thus φ(E + F ) = φ(E) + φ(F ), φ(EF ) = φ(E)φ(F ) . Let us now verify that φ is a bijection. Let α1 , . . . , αk be elements of K with P αk 6= 0, and let V be the set of all (regular rational) series S = an xn satisfying the relation an+k = α1 an+k−1 + · · · + αk an ,
(n ≥ 0) .
2. The Exponential Polynomial
55
Clearly, V is a vector space of dimension k. Let λ1 , . . . , λp be the roots of the polynomial R(x) = xk − α1 xk−1 − · · · − αk with multiplicities n1 , . . . , np respectively. Consider the subspace V ′ of K[t][Λ] of dimension k o n X Pi (t)λi | deg(Pi ) ≤ ni − 1 V′ = 1≤i≤p
We show that φ induces a surjection V ′ → V (and consequently an injection) and this will P prove the theorem. Any S = an xn in V can be written as P (x)/Q(x), with deg(P ) < deg(Q) and Q being the reciprocal polynomial of R. Decomposing P/Q into simple elements shows that S is a linear combination of series 1 , (1 − λi x)j
1 ≤ i ≤ p, 1 ≤ j ≤ nj .
Next, it is well-known that X n + j − 1 1 = λn xn . (1 − λx)j j−1 n≥0
Since n+j−1 is a polynomial of degree j − 1 in the variable n, the surjectivity j−1 of φ : V ′ → V is proved. Observe that in the bijection described P in the theorem and its proof, the support of an exponential polynomial E = Pλ (t)λ (that is the set of λ ∈ Λ such that Pλ 6= 0) is exactly the set of eigenvalues (that is inverses of poles) of S, and that the multiplicity of a eigenvalue λ is equal to 1 + deg(Pλ ). Furthermore, if the coefficients and the eigenvalues of S are in some subfield K1 of K, then the corresponding exponential polynomial is in K1 [t][Λ1 ], with Λ1 = K1 \ 0. P Corollary 2.2 Let S = an xn be a rational series over an algebraically closed field K of characteristic 0. (i) The coefficients an are given, for large enough n, by X an = λni Pi (n) ,
(2.1)
1≤i≤p
where λ1 , . . . , λp ∈ K \ 0 and Pi (t) ∈ K[t]. (ii) The expression (2.1) is unique if the λi ’s are distinct; in particular, the nonzero eigenvalues of S are the λi ’s with Pi 6= 0. Proof. (i) By Proposition 1.4, S = P + T for some polynomial P and some rational regular series T . Thus, it suffices to use Theorem 2.1. (ii) Let X X T = λni Pi (n) xn n≥0 1≤i≤p
56
Chapter IV. Rational Series in One Variable
Then, in view of Theorem 2.1, T is rational regular. Moreover S = P + T for some polynomial P (because S and T have by assumption the same coefficients for large enough n). By Proposition 1.4, T depends only on S, and by Theorem 2.1, the exponential polynomial of T is unique. This proves the first assertion. By the remark following the proof of Theorem 2.1, the λi ’s with Pi 6= 0 are exactly the eigenvalues of T . Now, it is clear that T and S have the same poles, so they have the same nonzero eigenvalues. Definition Let S0 , . . . , Sp−1 be formal series in K[[x]]. The merge of these series is the formal series defined for m ∈ N and i ∈ {0, . . . , p − 1} by (S, xmp+i ) = (Si , xm ) . In other words, if n = mp + i (Euclidean division of n by p), then (S, xn ) = (Si , xm ). This can also be written as X
S(x) =
xi Si (xp )
0≤i
with self-evident notation. P P An example. If p = 2 P and S0 = an xn and S1 = bn xn , then the merge n of S0 and S1 is the series cn x where the sequence (cn ) is a0 , b 0 , a1 , b 1 , a2 , b 2 , a3 , . . .
P Observe that for any series S = an xn ∈ K[[x]] and any p, there is a unique p-tuple of series (S0 , . . . , Sp−1 ) whose merge is S. These series are indeed Si =
X
ai+np xn .
n≥0
Definition A series an = bcn .
P
an xn is geometric if there exist b, c in K such that
Theorem 2.3 (Benzaghou 1970) If a regular rational series is invertible in the Hadamard algebra of regular rational series, then it is a merge of geometric series. The conclusion can also be formulated as follows: there exist an integer p and elements a0 , . . . , ap−1 , b0 , . . . , bp−1 in K such that the series is X
0≤i≤p−1
ai xi . 1 − bi xp
Proof. (i) Let i and p be natural numbers and consider the K-linear function ψ : K[t][Λ] → K[t][Λ] defined on monomials by ψ(P (t)λ) = (λi P (i + pt))λp ,
2. The Exponential Polynomial
57
where P (t) ∈ K[t], λ ∈ Λ and where λi P (i + pt) is an element of K[t]. The function ψ is a morphism of K-algebra. To see this, it suffices to compute ψ on products of monomials, and indeed ψ(P (t)Q(t)λµ) = (λi µi P (i + pt)Q(i + pt))λp µp = ψ(P (t)λ)ψ(Q(t)µ) . (ii) Consider now two exponential polynomials E, F ∈ K[t][Λ] and let Λ1 be the subgroup of Λ generated by supp(E) ∪ supp(F ). The group Λ1 is a finitely generated Abelian group, thus is isomorphic to the product of a finite group (of p elements, say) and of a finitely generated free Abelian group. Consequently, the subgroup Λ2 of Λ1 generated by the λp , for λ ∈ Λ1 , is free. By construction, the supports of ψ(E) and ψ(F ) are in Λ2 (for any i, and for the fixed p), and ψ(E), ψ(F ) ∈ K[t][Λ2 ]. Assume now EF = 1. Then ψ(E)ψ(F ) = 1. Since Λ2 is free, the only invertible elements of K[t][Λ2 ] have the form aλ, with a ∈ K, λ ∈ Λ2 . Indeed, this is a consequence of the fact that the only invertible elements of an algebra of commutative polynomials are the constant polynomials. P (iii) nConsider now two regular rational series S and T such that S ⊙ T = n≥0 x (the neutral element of the Hadamard algebra). Let E, F ∈ K[t][Λ] be such that φ(E) = S, φ(F ) = T , where φ is the isomorphism of Theorem 2.1. Then EF =P 1. P P i Set S = an xn . If E = Pλ (t)λ and ψ(E) = λ Pλ (i + tp)λp , then X X φ(ψ(E)) = λi Pλ (i + pn)λpn xn = Si , n≥0
λ
where Si =
X
ai+pn xn .
n≥0
In view of the conclusion of (ii), ψ(E) = aλ for some a ∈ K, λ ∈ Λ. Consequently, X Si = aλn xn . n≥0
This proves the theorem because S is the merge of the Si ’s, i = 0, . . . p − 1. The proof of the theorem suggests the following definition and proposition which will be of use later. Definition A regular rational series is simple if the Abelian multiplicative subgroup of K \ 0 generated by its eigenvalues is simple. Similarly, a set of regular rational series is simple if the set of all its eigenvalues generates a free Abelian group. Proposition 2.4 Let S be a finite set of regular rational series. There exists an integer p ≥ 1 such that the set of series of the form X ai+pn xn n≥0
for i ∈ N and for
P
an xn ∈ S is simple.
58
Chapter IV. Rational Series in One Variable
Proof. Since S is finite, there exists an invertible matrix m ∈ K q×q such that each S ∈ S can be written as X S= φS (mn )xn n≥0
for some linear form φS on K q×q . Let Λ1 be the set of eigenvalues of m. The group generated by Λ1 in K \ 0 is finitely generated, and consequently there is an integer p ≥ 1 such that the group G generated by the λp , for λ ∈ Λ1 , is free Abelian. P be the characteristic of mp . For each i ∈ N and P Let P polynomial n n S = an x ∈ S, the series Si = ai+pn x has the form Si =
X
φS (mi (mp )n )xn ,
n
showing that Si ◦ P = 0. Consequently, the eigenvalues of Si are in G.
3
A Theorem of P´ olya
In this section, we consider series with coefficients in Q. Recall that for any prime number p, the p-adic valuation vp over Q is defined by vp (0) = ∞ and vp (pn a/b) = n for n, a, b ∈ Z, b 6= 0 and p dividing neither a nor b. Definition Let S = of prime numbers
P
an xn ∈ Q[[x]]. The set of prime factors of S is the set
P (S) = {p | ∃n ∈ N, vp (an ) 6= 0, ∞} . Theorem 3.1 (P´olya 1921) The set of prime factors of a rational series S is finite if and only if S is the sum of a polynomial and of a merge of geometric series. We start with a lemma of independent interest. P Lemma 3.2 (Benzaghou 1970) Let S = an xn be a rational series which is not a polynomial, and let p be a prime number. There exist integers n0 ≥ 0 and q ≥ 1 such that the function n 7→ vp (an0 +qn ) is affine. Proof. (i) We start by proving a preliminary result. Let K be a commutative field with a discrete valuation v : K → N ∪ {∞}. Let A be its valuation ring, A = {z ∈ K | v(z) ≥ 0}, let I be the maximal ideal of A, I = {z ∈ K | v(z) ≥ 1} and let U = A \ I = {z ∈ K | v(z) = 0} be the group of invertible elements of A. Suppose further that the residual field F = A/I is finite. Since v is discrete, I is a principal ideal, and consequently I = πA for some π ∈ A with v(π) = 1. [For a systematic exposition of these concepts, see e. g. Amice (1975), Koblitz (1984).] Let λ1 , . . . , λk be elements of A\0, let P1 , . . . , Pk ∈ K[t] be polynomials and let (an ) be a sequence of elements in A defined by an =
X
1≤i≤k
Pi (n)λni .
(3.1)
59
3. A Theorem of P´ olya
Then we claim that there exist integers n0 and q such that the function n 7→ v(an0 +qn ) is affine. The proof is in three steps. 1. One may assume that all the Pi are in A[t] (by multiplying the polynomials by a common denominator, if necessary). 2. Assuming that λi ∈ I for all i = 1, . . . , k, set r = inf{v(λi ) | i = 1, . . . , k} . Then r ≥ 1. Since each Pi has coefficients in A and v(λi ) ≥ r for all i, it follows that v(an ) ≥ rn. Consequently v(an /π rn ) ≥ 0 and the sequence (bn ) defined by bn = an /π rn has its elements in A. Further bn =
λ n i Pi (n) r . π
X
1≤i≤k
Thus we may assume in addition that λi ∈ U for at least one index i. 3. Let ℓ ≥ 1 be such that λ1 , . . . , λℓ ∈ U and λℓ+1 , . . . , λk ∈ I (possibly ℓ = k). Set bn =
ℓ X
Pi (n)λni , cn =
i=1
k X
Pi (n)λni
i=ℓ+1
(cn = 0 if ℓ = k). We prove that there is an arithmetic progression of integers n where v(bn ) P is constant. For this, observe that the minimal polynomial of the regular series bn xn is P (x) =
ℓ Y
i=1
(x − λi )deg(Pi )+1
(cf. Theorem 2.1 and the observation following its proof). By setting P (x) = xh − α1 xh−1 − · · · − αh , one has αh ∈ U . Let s = inf{v(b0 ), . . . , v(bh−1 )} . Since the sequence (bn ) satisfies the recurrence relation associated with P , and since the coefficients of P are in A, it follows that v(bn ) ≥ s for all n. Consequently, the sequence (b′n ) defined by b′n = bn /π s is also in A. It has the same minimal polynomial as (bn ) and there is an integer j such that v(b′j ) = 0 , that is b′j ∈ U . Next b′n = λmn γ ,
60
Chapter IV. Rational Series in One Variable
where
λ = (1, 0, . . . , 0),
1 0 ··· 0 1 ··· .. m= . 0 0 0 ··· αh · · · 0 0 .. .
0 0 , 1 α1
γ=
b′0 b′1 .. . b′h−1
Since the determinant of the matrix m is ±αh ∈ U , and since F = A/I is finite, there is an integer q such that mq ≡ 1 mod I (with I the identity matrix). This shows that the sequence (b′n ) is periodic modulo I and in particular for all n ≥ 0, b′j+qn ≡ b′j
mod I .
Thus, v(b′j+qn ) = v(b′j ) = 0, and consequently v(bj+qn ) = s for n ≥ 0 . Finally, observe that v(cn ) ≥ n. Thus if n is large (more precisely if j + qn > s), then v(aj+qn ) = v(bj+qn ) = s . Thus it suffices to set n0 = j + qn′ , where n′ is chosen so that n0 > r. This proves the preliminary claim. (ii) The series S is rational over Q. We may assume that it is regular by Proposition 1.4. By Exercise I.5.1.b, we may assume that it is rational over Z and has a linear representation (λ, µ, γ) with µx over Z and of nonzero determinant. Let P (x) = xr − α1 xr−1 − · · · − αr be its characteristic polynomial. Then (an ) satisfies the linear recurrence relation associated to P . The roots λ1 , . . . , λk of P are algebraic integers. Let K be the number field K = Q[λ1 , . . . , λk ]. By Theorem 2.1, the an admit the expression given by Eq. (3.1). Moreover, for any prime ideal p of K, the αi and an are in the valuation ring of K for the valuation vp and by our preliminary result (i), there exist integers j and ℓ such that n 7→ vp (aj+ℓn ) is an affine function. (iii) Let B be the ring of algebraic integers of K, and let p be a prime number. The ideal pB of B decomposes as ms 1 pB = pm 1 · · · ps ,
where p1 . . . , ps are distinct prime ideals of K. By applying the preceding argument for p = p1 one obtains integers j, ℓ such that the function n 7→ vp1 (aj+ℓn ) is affine. By iteration of this computation for p2 , . . . , ps , one gets successive subsequences and finally one obtains an arithmetic progression n′0 + q ′ N such that for each i = 1, . . . , s, the function n 7→ vpi (an′0 +q′ n )
61
3. A Theorem of P´ olya is affine. Thus there exist integers xi and yi such that vpi (an′0 +q′ n ) = xi + yi n .
Note that xi , yi are integers, since xi + yi n is an integer for n in N. Now observe that for all a ∈ Z, vpi (a) ; i = 1, . . . , s vp (a) = inf mi where ⌊z⌋ denotes the integral part of z. Since the functions n 7→
vpi (an′0 +q′ n ) xi + yi n = mi mi
also are affine, there exists an integer i0 such that for all i = 1, . . . , s and all sufficiently large n, 1 1 (xi + yi n) ≥ (xi + yi0 n) , mi mi0 0 showing that
xi0 + yi0 n vp (an′0 +q′ n ) = mi0
for sufficiently large n. Since the function xi0 + yi0 mi0 n xi0 n 7→ = + yi0 n mi0 mi0 also is affine, the lemma follows. Proof of Theorem 3.1. Let S be a rational series having a finite set of prime factors. Clearly we may assume that S is regular (Proposition 1.4). In view of Proposition 2.4, P we nmay even assume that S is simple. Let S = an x and let p1 , . . . , pℓ be the prime factors of S. Applying Lemma 3.2 successively to p1 , . . . , pℓ , one obtains integers n0 and q such that, for every i = 1, . . . , ℓ, the function n 7→ vpi (an0 +qn ) is affine. Set ǫk = −1, 0, 1 according to an < 0, an = 0, an > 0. Then for n ≥ 0, one has an0 +qn = θn bcn with θn = ǫn0 +qn . Now let λ1 , . . . , λk , with k ≥ 1 be the distinct eigenvalues of S. In view of Theorem 2.1, there are non vanishing polynomials P1 . . . , Pk such that an =
k X i=1
Pi (n)λni .
(3.2)
62
Chapter IV. Rational Series in One Variable
Thus, setting bn = an0 +qn , Qi (t) = Pi (n0 + qt)λni 0 , µi = λqi , one has bn = θn bcn =
k X
Qi (n)µni .
i=1
Since the group generated by the λi ’s is free, P all the µi are distinct. Moreover, the polynomials Qi (t) do not vanish, and thus bn xn is not a polynomial. Thus θn 6= 0 for infinitely many n, and we may suppose that θn = 1 for infinitely many n. The series X bn
cn
xn
has finite image. By Theorem III.2.8 (and Exercise III.1.1), there exists an arithmetic progression n1 + rN such that θn = 1 for n ∈ n1 + rN. Thus bn1 +rn = bcn1 (cr )n =
k X
Qi (n1 + rn)µn1 (µri )n .
i=1
As before, the µri are pairwise distinct. In view of the unicity of the exponential polynomial, one has k = 1 and Q1 (n1 + rt) = C, for some constant. Thus Q1 is a constant and also P1 . By Eq. (3.2), an = P1 λn1 . This completes the proof.
4
A Theorem of Skolem, Mahler, Lech
The following result describes completely the supports of rational series in one variable with coefficients in a field of characteristic zero. They are exactly the rational one-letter languages. This does not hold for more than one variable (see Example III.3.1). Theorem 4.1 (Skolem 1934, 1935, Lech 1953) Let K be a field of charP Mahler acteristic 0, and let S = an xn be a rational series with coefficients on K. The set {n ∈ N | an = 0} is the union of a finite set and of a finite number of arithmetic progressions. In fact, this result has been proved for K = Z by Skolem, it has been extended to algebraic number fields by Mahler and to fields of characteristic 0 by Lech. This author also gives the following example showing that the theorem does not hold in characteristic p 6= 0. P Indeed, let θ be transcendent over the field Fp with p elements. Then the series an xn with an = (θ + 1)n − θn − 1
is rational over Fp (θ) and, however, {n | an = 0} = {pr | r ∈ N} is not a rational subset of tne monoid N.
4. A Theorem of Skolem, Mahler, Lech
63
The proof given here is elementary and does not use p-adic analysis. It requires several definitions and lemmas, and goes through three steps. First, the result is proved for series with integral coefficients. Then it is extended to transcendental extensions and finally to the general case. Definitions A set A of nonnegative integers is called purely periodic if there exist an integer N ≥ 0 and integers k1 , k2 , . . . , kr ∈ {0, 1, . . . , N − 1} such that A = {ki + nN | n ∈ N, 1 ≤ i ≤ r} . The integer N is a period of A. A quasi-periodic set (of period N ) is a subset of N which is the union of a finite set and of a purely periodic set (of period N ). Lemma 4.2 The intersection of a family of quasi-periodic sets of period N is quasi-periodic of period N . Proof. Let (Ai )i∈I be a family of quasi-periodic sets, all having period N . Given a j ∈ {0, 1, . . . , N − 1}, for any i ∈ I, the set (j + N N) ∩ Ai is either finite or equal to j + N N. Thus the same holds for (j + N N) ∩ (∩Ai ). Definition Given a series S = annihilator of S is the set
P
an xn with coefficients in a semiring K, the
ann(S) = {n ∈ N | an = 0} . Thus the annihilator is the complement of the support. With these definitions, the first (and most difficult) step in the proof of Theorem 4.1 can be formulated as follows. P Proposition 4.3 Let S = an xn ∈ Q[[x]] be a regular rational series with rational coefficients. Then the annihilator of S is quasi-periodic. Let p be a fixed prime number. The p-adic valuation vp is defined at the beginning of Section 3. Observe that X vp (q1 · · · qn ) = vp (qi ) 1≤i≤n
vp (q1 + · · · + qn ) ≥ inf{vp (q1 ), . . . , vp (qn )} .
Observe also that for n ∈ N vp (n!) ≤ n/(p − 1) since indeed (Exercise!) vp (n!) = ⌊n/p⌋ + ⌊n/p2 ⌋ + · · · + ⌊n/pk ⌋ + · · · ≤ n/p + n/p2 + · · · + n/pk + · · · X 1 1/p =n = n/(p − 1) . ≤n pk 1 − 1/p k≥1
(4.1)
64
Chapter IV. Rational Series in One Variable
From Eq. (4.1), we deduce n p n vp = vp (pn ) − vp (n!) ≥ n − , n! p−1 thus vp
pn n!
≥n
p−2 . p−1
(4.2)
Next, consider an arbitrary polynomial P (x) = a0 + a1 x + · · · + an xn with integral coefficients. For any integer k ≥ 0, let ωk (P ) = inf{vp (aj ) | j ≥ k} . Clearly ω0 (P ) ≤ ω1 (P ) ≤ · · · ≤ ωk (P ) ≤ · · · and ωk (P ) = ∞ for k > n . Observe also that vp (P (t)) ≥ inf{a0 , a1 t, . . . , an tn } for any integer t ∈ Z, and consequently vp (P (t)) = inf{vp (a0 ), vp (a1 ), . . . , vp (an )} ≥ ω0 (P ) .
(4.3)
Lemma 4.4 Let P and Q be two polynomials with rational coefficients such that P (x) = (x − t)Q(x) for some t ∈ Z. Then for all k ∈ N ωk+1 (P ) ≤ ωk (Q) . Proof. Set Q(x) = a0 + a1 x + · · · + an xn ,
P (x) = b0 + b1 x + · · · + bn+1 xn+1 .
Then bj+1 = aj − taj+1 for 0 ≤ j ≤ n − 1, bn+1 = an , whence for j = 0, . . . , n, aj = bj+1 + tbj+2 + · · · + tn−j bn+1 . This shows that vp (aj ) ≥ ωj+1 (P ) for any j ∈ N. Thus, given any k ∈ N, one has for j ≥ k vp (aj ) ≥ ωj+1 (P ) ≥ ωk+1 (P ) and consequently ωk (Q) ≥ ωk+1 (P ) .
4. A Theorem of Skolem, Mahler, Lech
65
Corollary 4.5 Let Q be a polynomial with rational coefficients, let t1 , t2 , . . . , tk ∈ Z, and let P (x) = (x − t1 )(x − t2 ) · · · (x − tk )Q(x) . Then ωk (P ) ≤ ω0 (Q) . The main argument is the following lemma. Lemma 4.6 Let (dn )n∈N be any sequence of integers and let (bn )n∈N be the sequence defined by bn =
n X n i p di . i i=0
where p is an odd prime number. If bn = 0 for infinitely many indices n, then the sequence (bn )n∈N vanishes. Proof. For n ∈ N, let Rn (x) =
n X
di pi
i=0
x(x − 1) · · · (x − i + 1) . i!
Then for t ∈ N, n X t i p di Rn (t) = i i=0
t and since = 0 for i > t, it follows that i bt = Rt (t) = Rn (t) (n ≥ t) .
(4.4)
Next, we show that for all k, n ≥ 0, ωk (Rn ) ≥ k
p−2 . p−1
For this, let Rn (x) =
n X
(n)
ck xk .
k=0
pi (n) Each ck xk is a linear combination, with integral coefficients, of numbers di , i! for indices i with k ≤ i ≤ n. Consequently, i p (n) . vp (ck ) ≥ inf vp di k≤i≤n i!
66
Chapter IV. Rational Series in One Variable
In view of Eq. (4.2), this implies p−2 p−2 (n) vp (ck ) ≥ inf i ≥k p−1 p−1 which in turn shows that ωk (Rn ) ≥ k
p−2 . p−1
(4.5)
Consider now any coefficient bt of the sequence (bn )n∈N . We shall see that vp (bt ) ≥ k
p−2 p−1
for any integer k, which of course shows that bt = 0. For this, let t1 < t2 < · · · < tk be the first k indices with bt1 = · · · = btk = 0, and let n ≥ sup(t, tk ). By Eq. (4.4), Rn (ti ) = bti = 0 for i = 1, . . . , k. Thus Rn (x) = (x − t1 )(x − t2 ) · · · (x − tk )Q(x)
(4.6)
for some polynomial Q(x) with integral coefficients. By Corollary 4.5, one has ωk (Rn ) ≤ ω0 (Q) .
(4.7)
Next, by Eq. (4.4), vp (bt ) = vp (Rn (t)) and by Eqs. (4.6), (4.3) and (4.7), vp (Rn (t)) ≥ vp (Q(t)) ≥ ω0 (Q) ≥ ωk (Rn ) . Thus, in view of Eq. (4.5), vp (bt ) ≥ k
p−2 p−1
for all k ≥ 0. P Lemma 4.7 Let S = an xn ∈ Z[[x]] be a regular rational series and let (λ, µ, γ) be a linear representation of S of dimension k with integral coefficients. For any odd prime p not dividing det(µ(x)), the annihilator ann(S) is quasi2 periodic of period at most pk . Proof. Let p be an odd prime that does not divide det(µ(x)). Let n 7→ n be the canonical morphism from Z onto Z/pZ. Since det(µ(x)) = det(µ(x)) 6= 0, 2 the matrix µ(x) is invertible in Z/pZ, and there is an integer N ≤ pk with µ(xN ) = I . Reverting to the original matrix, this means that µ(xN ) = I + pM for some matrix M with integral coefficients.
67
4. A Theorem of Skolem, Mahler, Lech Consider now a fixed integer j ∈ {0, . . . , N − 1} and set for n ≥ 0 bn = aj+nN . Then j+nN
bn = λµ(x
n X n i p λµ(xj )M i γ . )γ = λµ(x )(I + pM ) γ = i i=0 j
n
Thus, setting di = λµ(xj )M i γ, one obtains n X n i p di . bn = i i=0
In view of Lemma 4.6, the sequence (bn )n≥0 either vanishes or contains only finitely many vanishing terms. Thus, the annihilator of S is quasi-periodic with 2 period less than pk . Proof of Proposition 4.3. Let (λ, µ, γ) be a regular linear representation of S, and let q be a common multiple of the denominators of the coefficients in λ, µ and γ. Then (qλ, qµ, qγ) is a linear representation of the regular series P n+2 S′ = q an xn . Clearly ann(S) = ann(S ′ ). By Lemma 4.7, the set ann(S ′ ) is quasi-periodic. Thus ann(S) is quasi-periodic. We now turn to the second part of the proof. For this, we consider the ring Z[y1 , . . . , ym ] of polynomials over Z in commutative variables y1 , . . . , ym and the quotient field Q(y1 , . . . , ym ) of rational functions. An element in either one of these sets will be denoted indistinctly without or with an enumeration of the variables. As usual, if P ∈ Q(y1 , . . . , ym ) and a1 , . . . , am ∈ Q, then P (a1 , . . . , am ) is the value of P at that point. The result to be proved is the following. P Proposition 4.8 Let S = an xn be a regular rational series with coefficients in the field Q(y1 , . . . , ym ). Then ann(S) is quasi-periodic. We start with the following well-known property of polynomials. Lemma 4.9 Let K be a (commutative) field, and let P ∈ K[y1 , . . . , ym ]. Let δi be the degree of P in the variable yi . Assume that there exist subsets A1 , . . . , Am of K with Card(Ai ) > δi for i = 1, . . . , m such that P (a1 , . . . , am ) = 0 for all (a1 , . . . , am ) ∈ A1 × · · · × Am . Then P = 0. P Corollary 4.10 Let S = an xn be any series with coefficients in K[y1 , . . . , ym ] and let H1 , . . . , Hm be arbitrary infinite subsets of K. For each (h1 , . . . , hm ) ∈ K m , let X Sh1 ,...,hm = an (h1 , . . . , hm )xn . Then
ann(S) =
\
(h1 ,...,hm )∈H1 ×···×Hm
ann(Sh1 ,...,hm ) .
68
Chapter IV. Rational Series in One Variable
Proof. It follows immediately from Lemma 4.9 that an = 0 iff an (h1 , . . . , hm ) = 0 for all (h1 , . . . , hm ) ∈ H1 × · · · × Hm . Lemma 4.11 Let P ∈ Z[y1 , . . . , ym ], P 6= 0. For all but a finite number of prime numbers p, there exists a subset H ⊂ Zm of the form H = (k1 , . . . , km ) + pZm
(4.8)
such that for all (h1 , . . . , hm ) ∈ H, P (h1 , . . . , hm ) 6≡ 0 mod p . Proof. Let P =
X
im ci1 ,i2 ,...,im y1i1 y2i2 · · · ym .
Let δi be the degree of P in the variable yi , and let p be any prime number strictly greater than the δi ’s and not dividing all the coefficients ci1 ,i2 ,...,im . Again let n 7→ n be the morphism from Z onto Z/pZ. The polynomial X im P = ci1 ,i2 ,...,im y1i1 y2i2 · · · ym
is a non vanishing polynomial with coefficients in Z/pZ. Since p > δi for i = 1, . . . , m, it follows from Lemma 4.9 that there exists (k1 , . . . , km ) ∈ Zm such that P (k 1 , . . . , km ) 6= 0. This proves the lemma. Proof of Proposition 4.8. Let (λ, µ, γ) be a linear representation of S of dimension k. As in the proof of Proposition 4.3, consider a common multiple q ∈ Z[y1 , . . . , ym ] of the denominators of the coefficients λ, µ and γ. Pofn+2 Then (qλ, qµ, qγ) is a linear representation of the series S ′ = q an xn and ′ ann(S ) = ann(S). Thus we may suppose that the coefficients of λ, µ and γ are in Z[y1 , . . . , ym ]. Let P = det(µ(x)) ∈ Z[y1 , . . . , ym ]. Since S is regular, P 6= 0 and by Lemma 4.11, there exists a prime number p and an infinite H ⊂ Zn of the form Eq. (4.8) such that det µ(x)(h1 , . . . , hm ) 6≡ 0 mod p for all (h1 , . . . , hm ) ∈ H. Setting X Sh1 ,...,hm = an (h1 , . . . , hm )xn n
this implies, in view of Lemma 4.7, that for all (h1 , . . . , hm ) ∈ H, the set 2 2 ann(Sh1 ,...,hm ) is quasi-periodic with a period at most pk . Thus r = (pk )! is a period for all these annihilators. In view of Lemma 4.2, the set \ ann(Sh1 ,...,hm ) (h1 ,...,hm )∈H
is quasi-periodic. By Corollary 4.10, this intersection is the set ann(S). Thus the proof is complete. It is convenient to introduce the following
4. A Theorem of Skolem, Mahler, Lech
69
Definition A (commutative) field K is a SML field (Skolem-Mahler-Lech field) if K satisfies Theorem 4.1. We have seen already that the field Q of rational numbers, and the field Q(y1 , . . . , ym ) are SML fields. Proposition 4.12 Let K and L be fields. If L is an SML field and K is a finite algebraic extension of L, then K is an SML field. P Proof. Let S = an xn be a rational series over K. Let k be the dimension of K over L, and let φ1 , . . . , φk be L-linear functions K → L such that, for any h∈K h = 0 ⇐⇒ φi (h) = 0, ∀ i = 1, . . . , k . Define Si =
X n
φi (an )xn ∈ L[[x] .
Then, by the choice of the function φi , one has \ ann(S) = ann(Si ) .
(4.9)
1≤i≤k
Thus, it suffices, by Lemma 4.2 to prove that the series Si are rational over L. By Proposition I.5.1, there exists a finite dimensional subvector space M of K[[x]], containing S and which is stable, that is closed for the operation T 7→ T ◦ x. Since K has finite dimension over L, the space M also has finite dimension over L. The functions φi , extended to series φi : K[[x]] → L[[x]] by φi
X n
X φi (bn )xn bn xn = n
are L-linear. Consequently, φi (M ) is a finite dimensional vector space over L. Since φi (T ◦ x) = φi (T ) ◦ x, the space φi (M ) is stable. Moreover, it contains the series Si = φi (S). Thus, again by Proposition I.5.1, each series Si is rational over L. Proof of Theorem 4.1. Let S be a rational series with coefficients in K. Then by Proposition 1.4, there is a polynomial P such that S − P is regular. Since ann(S − P ) and ann(S) differ only by a finite set, it suffices to prove the result for S − P . Thus we may assume that S is regular. Let (λ, µ, γ) be a linear representation of S, and let K ′ be the subfield of K over Q generated by the set Z of coefficients of λ, µ(x), γ. Then S has coefficients in K ′ and we may assume that K is a finite extension of Q, that is K = Q(Z) for a finite set Z. Let Y be a maximal subset of Z that is algebraically independent over Q. The field Q(Y ) is isomorphic to the field Q(y1 , . . . , ym ) with Y = {y1 , . . . , ym }. In view of Proposition 4.8, the field Q(Y ) is a SML field. Next, K is a finite algebraic extension of Q(Y ). By Proposition 4.12, the field K is a SML field. This concludes the proof.
70
Chapter IV. Rational Series in One Variable
Exercises for Chapter IV P P∞ n 4.1 Set B(x) = ∞ d xn with integers bn , dn related n=0 bn x , D(x) = n=0 P∞n pn xn as in Lemma 4.6. Show that B(x) = n=0 dn (1−x) n+1 .
Notes to Chapter IV The notion of an exponential polynomial is a classical one. The formalism we use here is from Reutenauer (1982). It allows to give an algebraic proof of Benzaghou’s theorem. His proof was based on analytic techniques. The algebraic method makes it possible to prove Benzaghou’s theorem in characteristic p. Some modifications are necessary, since in that case, the exponential polynomial may not exist nor be unique. P´ olya’s theorem is extended to general fields by B´ezivin (1984). There are a great number of arithmetic and combinatorial properties of linear recurrence sequences. The use of symmetric functions to derive divisibility properties is illustrated by Dubou´e (1983). Lascoux (1986) gives numerous applications of expressions of the exponential polynomial by means of symmetric functions. For a rich collection of formulas and results about symmetric functions, see Lascoux and Sch¨ utzenberger (1985). The proof of the Skolem-Mahler-Lech theorem given here is due to Hansel (1986). The original proofs, by Skolem (1934), Mahler (1935), and Lech (1953) depend on p-adic analysis. An openP problem, stated by C. Pisot, is the following. Is it decidable, for a rational series an xn , whether there exists an n such that an = 0? It is decidable whether there exist infinitely many n with an = 0 (Berstel and Mignotte 1976). An extension of P´ olya’s theorem to several noncommutative variables is studied in Reutenauer (1980b). For this, define as follows the unambiguous rational operations on languages (see Eilenberg and Sch¨ utzenberger 1969): The union L1 ∪ L2 is unambiguous if the sets are disjoint. The product L1 L2 is unambiguous if u, u′ ∈ L1 , v, v ′ ∈ L2 , and uv = u′ v ′ imply u = u′ , v = v ′ . The star operation L 7→ L∗ is unambiguous if L is the basis of a free submonoid of A∗ . It can be shown that any rational language is obtained from finite languages by unambiguous operations (Eilenberg 1974). The unambiguous rational operations on series are defined as follows. A rational operation (sum, product, star) on series is unambiguous if the corresponding operation on the support (union, product, star) is unambiguous. A rational series S ∈ QhhAii is unambiguous if it is obtained from polynomials using only unambiguous rational operations. It is not very difficult to see that any unambiguous rational series has only finitely many prime factors. The problem is the converse property, which we state as a conjecture. Conjecture If a rational series has only finitely many prime factors, then it is unambiguous. According to P´ olya’s theorem, this holds for one variable. Partial results are proved in Reutenauer (1980b).
Chapter V
Changing the Semiring If K is a subsemiring of L, each K-rational series is clearly L rational. The main problem considered in this chapter is the converse: how to determine which of the L-rational series are rational over K. This leads to the study of semirings of a special type, and also shows the existence of remarkable families of rational series. In the first section, we examine principal rings from this aspect. Fatou’s Lemma is proved and the rings satisfying this lemma are characterized. Section 2 contains several results on rational series with nonnegative coefficients. The main result (Theorem 2.10) is a characterization of R+ -rational series in one variable. In the last section, Fatou extensions are introduced. We show in particular that Q+ is a Fatou extension of N (Theorem 3.3).
1
Rational Series over a Principal Ring
Let K be a commutative principal ring and let F be its quotient field. Let S ∈ KhhAii be a formal series over A with coefficients in K. If S is a rational series over F , is it also rational over K? This question admits a positive answer, and there is even a stronger result, namely that S has a linear representation of minimal dimension (that is, equal to its rank) with coefficients in K. Theorem 1.1 (Fliess 1974a) Let S ∈ KhhAii be a series which is rational of rank n over F . Then S is rational over K and has a linear representation over K of dimension n. Proof. Let (λ, µ, γ) be a reduced linear representation of S over F . According to Corollary II.2.3, there exist polynomials P1 , . . . , Pn , Q1 , . . . , Qn ∈ F hAi such that for w ∈ A∗ µw = ((S, Pi wQj ))1≤i,j≤n . Let d be an element in K \ 0 such that dPi , dQj ∈ KhAi and dλ ∈ K 1×n . Then for any polynomial P ∈ KhAi d3 λµP = (dλ)((S, dPi P dQj ))i,j ∈ K 1×n , 71
72
Chapter V. Changing the Semiring
since (S, R) ∈ K whenever R ∈ KhAi. Consequently, λµ(KhAi) ⊂
1 1×n K . d3
This shows that λµ(KhAi), considered as a submodule of a free K-module of rank n, is also free and has rank ≤ n. It suffices now to apply Lemma II.1.2. This theorem admits the following corollary, known as Fatou’s Lemma. Corollary 1.2 (Fatou 1904) Let P (x)/Q(x) ∈ Q(x) be an irreducible rational function such that the constant term of Q is 1. If the coefficients of its series expansion are integers, then P and Q have integral coefficients. P Proof. We have Q(0) = 1. Then S = an xn = P (x)/Q(x) is a rational series. Let (λ, µ, γ) be a reduced linear representation of S. Since Z is principal, this representation is similar, by Theorem 1.1 and Theorem II.2.4, to a representation over Z. In particular, the characteristic polynomial of µ(x) has integral coefficients. Now, Q(x) is the reciprocal polynomial of this polynomial (Proposition IV.1.2). Thus Q(x) has integral coefficients, and so does P = SQ. The previous result holds for rings other than the ring Z of integers. We shall characterize these rings completely. Let K be a commutative integral domain and let F be its quotient field. Let M be an F -algebra. An element m ∈ M is quasi-integral over K if there exists an injection of the K-module K[m] into a finitely generated K-module. Proposition 1.3 If m ∈ M is quasi-integral over K, then there exists a finitely generated K-submodule of M containing K[m]. Proof. There exists a finitely generated K-module N and a K-linear injection K[m] → N . Since K[m] is contained in some F -algebra, it is torsion-free over K. Thus the injection extends to an F -linear injection i : F [m] → N ⊗K F . Consequently F [m] has finite dimension over K and m is algebraic over F . Let p : N ⊗ F → i(F [m]) be an F -linear projection. Then p(N ) = p(N ⊗ 1) is a finitely generated K-module containing i(K[m]) and contained in i(F [m]). Consequently, its inverse image by i, say N1 , is a finitely generated K-module and K[m] ⊂ N1 ⊂ F [m] ⊂ M . Corollary 1.4 An element m ∈ F is quasi-integral over K if and only if there exists d ∈ K \ 0 such that dmn ∈ K for all n ∈ N. P Proof. Indeed, K[m] is the set of all expressions ni=0 αi mi , with αi ∈ K.
Corollary 1.5 If M is a commutative algebra, then the set of elements of M which are quasi-integral over K is a subring of M .
1. Rational Series over a Principal Ring
73
Definition The domain K is called completely integrally closed if any m in F which is quasi-integral over K is already in K. Observe that an element in F which is integral over K is also quasi-integral over K. Thus, if K is completely integrally closed, it is integrally closed. Theorem 1.6 (Chabert 1972) The following conditions are equivalent. (i) The domain K is completely integrally closed. (ii) For any irreducible rational function P (x)/Q(x) ∈ F (x) whose series expansion has coefficients in K, and such that the constant term of Q is 1, both P and Q have coefficients in K. We use the following lemma Lemma 1.7 Let m be a matrix in F n×n which is quasi-integral over K. Then the coefficients of the characteristic polynomial of m are quasi-integral over K. Proof. Let P (t) = tn + a1 tn−1 + · · · + an ∈ F [t] be the characteristic polynomial of m. Since m is quasi-integral over K, there exists, by Proposition 1.3, a finitely generated K-submodule of F n×n containing all powers of m. Thus there exists some d ∈ K \ 0 such that dmk ∈ K n×n for all k ∈ N. Consequently, since ±ai is a sum of products of i entries of m, da1 , d2 a2 . . . , dn an ∈ K . Let λ be an eigenvalue of m. Then dλ is integral over K. Indeed, 0 = dn P (λ) = (dλ)n + da1 (dλ)n−1 + · · · + dn an . Consequently, the K-algebra L = K[dλ] is a finitely generated K-module. The element λ is in the quotient field E of L, and there exists q ∈ GLn (E) such that λ ∗ ··· ∗ 0 ∗ · · · ∗ m′ = q −1 mq = . .. .. . 0 ∗ ··· ∗
Let d′ be a common denominator of the coefficients of q and q −1 , that is such that d′ q and d′ q −1 have coefficients in L. Then for all k ∈ N (d′2 d)m′k = (d′ q −1 )dmk (d′ q) ∈ Ln×n .
Thus (d′2 d)λk ∈ L, whence K[λ] ⊂ (d′2 d)−1 L. This shows that λ is quasiintegral over K. Since all eigenvalues of m are quasi-integral, the same holds for the coefficients ai by Corollary 1.5. Proof of Theorem 1.6. Assume that K is completely integrally closed. Let P (x)/Q(x) be a function satisfying the hypotheses of (ii). We have Q(0) = 1. The series X S= an xn = P (x)/Q(x)
74
Chapter V. Changing the Semiring
is F -rational and has coefficients in K. Let (λ, µ, γ) be a reduced linear representation of S. By Corollary II.2.3, the matrix µ(x) is quasi-integral over K. In view of Lemma 1.7, the characteristic polynomial of µ(x) has coefficients in K, and since Q is its reciprocal polynomial (Proposition IV.1.2), the polynomial Q has coefficients in K, and the same holds for P = SQ. Assume conversely that (ii) holds. Let m ∈ F be quasi-integral over K. Then there exists d ∈ K \ 0 such that dmn ∈ K for all n ∈ N. Set P (x) = d, Q(x) = 1 − mx. Then X P (x)/Q(x) = d mn xn ∈ K[[x]] .
Thus by hypothesis Q(x) ∈ K[x], whence m ∈ K. This shows that K is completely integrally closed.
2
Positive Rational Series
In this section, we study series with nonnegative coefficients. We start with the following result. Theorem 2.1 Sch¨ utzenberger (1970) If S ∈ NhhAii is an N-rational series, then S − supp(S) ∈ NhhAii is N-rational. Recall that L is the characteristic series of the language L. Proof (Salomaa and Soittola 1978). In view of Proposition I.5.1, there exist rational series S1 , . . . , Sn such that the N-submodule of NhhAii they generate is stable and contains S. By Lemma III.1.4, the supports supp(S1 ), . . . , supp(Sn ) are rational languages. Let L be the family of languages obtained by taking all intersections of supp(S1 ), . . . , supp(Sn ). Then L is a finite set of rational languages. The set L′ = {u−1 L | u ∈ A∗ , L ∈ L} is also a finite set of rational languages (Corollary III.1.6). Let T be the set of characteristic series of the languages in L′ . Let M be the finitely generated N-submodule of NhhAii generated by T and by the series Si′ = Si − supp(Si ) P for i = 1, . . . , n. We claim that if aj ∈ N and T = aj Sj , then T − supp(T ) is in M . P P Indeed, Sj = Sj′ +supp(Sj ), thus T = aj Sj′ +U , where U = aj supp(Sj ). Note P that supp(Sj′ ) ⊂ supp(Sj ), hence supp(T ) = supp(U ). We may write U = bk Tk where each integer bk is ≥ 1 and the Tk ∈ T have disjoint supports. This is done by keeping only the j’s with aj ≥ 1 and P by making the necessary intersections of supports. Hence U − supp(U ) = (bk − 1)Tk ∈ M and T − P supp(T ) = aj Sj′ + U − supp(U ) ∈ M .
2. Positive Rational Series
75
Since S is an N-linear combination of the Sj , S − supp(S) is in M by the claim. We show that M is stable, which will end the proof by Proposition I.5.1. Indeed, let u ∈ A∗ . Then u−1 T ∈ T by construction, hence in M , for any T in T. Consider u−1 Si′ = u−1 Si − supp(u−1 Si ). Since u−1 Si is an N-linear combination of the Sj , we deduce that u−1 Sj′ is in M . We now consider series of the form X an xn
with all coefficients in R+ . If such a series is the expansion of a rational function, it does not imply in general that it is R+ -rational (see Exercise 2.2). We shall characterize those rational functions over R whose series expansion is R+ rational. We call them R+ -rational functions. Theorem 2.2 (Berstel 1971) Let f (x) be an R+ -rational function which is not a polynomial, and let ρ be the minimum of the moduli of its poles. Then ρ is a pole of f , and any pole of f of modulus ρ has the form ρθ, where θ is a root of unity. Observe that the minimum of the moduli of the poles of a rational function is just the radius of convergence of the associated series. We start with a lemma. Lemma 2.3 Let f (x) bePa rational function which is not a polynomial and with a series expansion an xn having nonnegative coefficients. Let ρ be the minimum of the moduli of the poles of f . Then ρ is a pole of f , and the multiplicity of any pole of f of modulus ρ is at most that of ρ. Proof. Let z ∈ C, |z| < ρ. Then X X |f (z)| = an z n ≤ an |z|n = f (|z|) .
(2.1)
Let z0 be a pole of modulus ρ, and let π be its multiplicity. Assume that the multiplicity of ρ as a pole of f is less than π. Then the function g(z) = (ρ − z)π f (z) is analytic in the neighborhood of ρ, and g(ρ) = 0, whence lim (ρ − ρr)π f (ρr) = 0 .
r→1,r<1
The function h(z) = (z0 − z)π f (z) is analytic at z0 and h(z0 ) 6= 0. Thus lim
z→z0 ,|z|
|(z0 − z)π f (z)| > 0 .
In particular, setting z = rz0 , with 0 ≤ r < 1, this implies lim
r→1,r<1
|z0π (1 − r)π f (rz0 )| > 0 .
(2.2)
76
Chapter V. Changing the Semiring
In view of Eq. (2.1), this shows that lim
r→1,r<1
ρπ (1 − r)π f (rρ) > 0
contradicting (2.2). Proof of Theorem 2.2. Let S be the set of polynomials with nonnegative coefficients and of rational functions with series expansions having nonnegative coefficients and satisfying the conclusions of the statement. It suffices to show that S is closed for sum, product, and star. Recall that the star operation is X f 7→ f ∗ = f n = (1 − f )−1 . n≥0
Let f = an xn and g be in S. P Letn ρf be the radius of convergence of f . Recall that ρf = sup{r ∈ R+ | an r < ∞}. Since the associated series has nonnegative coefficients, P
ρf +g = inf(ρf , ρg ) and, if f, g 6= 0 ρf g = inf(ρf , ρg ) . Thus, according to Lemma 2.3, f + g and f g are in S, since each pole of f + g and of f g is a pole ofP f or of g. Now, let f (x) = n≥1 an xn ∈ S, and assume f 6= 0. The poles of f ∗ = P (1 − f )−1 are the zeros of 1 − f . Observe that an ρnf = ∞ since otherwise limr→ρf f (r) would exist (Abel’s lemma) and this is impossible because a P f has pole in ρf . The coefficients an being nonnegative, the function r 7→ an rn is strictly increasing from 0 to ∞ when r ranges from 0 to ρf , and consequently there is a unique real number r with 0 < r < ρf such that f (r) = 1. Thus r is a pole of f ∗ . Let z be a pole of f ∗ of modulus ≤ r. We prove that z = rθ for some root of unity θ. Indeed, the relations X X X 1= an z n = Re an Re(z n ) an z n = X X ≤ an |z|n ≤ an r n = 1
show that equality holds everywhere, whence an Re(z n ) = an rn for all n ≥ 0. Let n be an integer with an 6= 0 (it exists because f 6= 0). Then Re(z n ) = rn and |z| ≤ r imply z n = rn whence z = rθ for θ some nth root of unity. Thus f ∗ is in S. The previous theorem gives a necessary condition for a rational function to be R+ -rational. We now give a sufficient condition. For this, we go back to the vocabulary of formal series. A rational series with complex coefficients is said to have a dominating eigenvalue if there is, among its eigenvalues (in the sense of Section IV.1) a unique eigenvalue having maximal modulus. It is equivalent to say that the associated rational function is either a polynomial or has a unique pole of minimal modulus.
77
2. Positive Rational Series
Theorem 2.4 (Katayama et al. 1978, Soittola 1976) Let S ∈ R[[x]] be a rational series with nonnegative coefficients and having a dominating eigenvalue. Then S is R+ -rational. We shall use several lemmas. Lemma 2.5 Let K be a semiring and let S =
P
an xn ∈ K[[x]] be a series.
(i) The seriesP S is rational if and only if there exists an integer k ≥ 0 such that T = an xn is rational. Moreover, if K = C, then S and T have n≥k
the same nonzero eigenvalues with the same multiplicities. (ii) The series S is rational if and only if there exist rational series S0 , . . . , Sp−1 ∈ K[[x]] whose merge is S. P an xn . Then S = P + T , where P is a polynomial. Thus if Proof. (i) Let T = n≥k
T is rational, then S is rational. Conversely, T is rational as Hadamard product of S and of xk x∗ . The last assertion follows because S − T is a polynomial and therefore S and TP have the same minimal denominator. (ii) Let Si = ai+np xn . If S is rational and has a linear representation (λ, µ, γ), then Si has a linear representation (λ(µx)i , µ′ , γ), with µ′ (x) = µ(x)p . Thus each Si is rational. To show the converse, observe first that X Si′ = ai+np xnp n
is rational as image under the morphism x 7→ xp (Proposition I.4.2). Thus the sum of products p−1 X
S=
xi Si′
i=0
is rational. Lemma 2.6 Let S be a rational series over R having the eigenvalue λ0 6= 0 with multiplicity d + 1 ≥ 2, let R be a polynomial having the simple root 1/λ0 and let T =P RS. Then T P has the eigenvalue λ0 with multiplicity d. More precisely, if S = an xn , T = bn xn with exponential polynomials X X an = Pi (n)λni , bn = Qi (n)λni (2.3) 0≤i≤r
0≤i≤r
(n large enough), where deg(P0 ) = d, deg(Q0 ) = d − 1 and α is the coefficient of td in P0 (t), then the coefficient β of td−1 in Q0 (t) is dα
α2 αk α1 + 2 2 + · · · + k k 6= 0 . λ0 λ0 λ0
with R = 1 − α1 x − · · · − αk xk .
Proof. Recall that the nonzero eigenvalues of a rational series are the inverses of its poles, with the same multiplicities (see Section IV.1). Thus the first assertion follows.
78
Chapter V. Changing the Semiring Now, for n large enough, one has bn = an − α1 an−1 − · · · − αk an−k =
r X i=0
Pi (n)λni −
k X
αj
j=1
r X i=0
Pi (n − j)λin−j
(2.4)
In order to compute β, note that nq λn − α1 (n − 1)q λn−1 − · · · − αk (n − k)q λn−k α1 α1 αk αk = nq λn 1 − − · · · − k + qnq−1 λn + · · · + k k + Q(n)λn , λ λ λ λ (2.5) where Q(n) is some polynomial of degree ≤ q − 2 in n. Note that if 1/λ is a root of R, then the first term on the right-hand side of (2.5) vanishes, which easily implies the lemma. Definition Let λ0 , . . . , λr be the eigenvalues of a rational series S. The eigenvalue λ0 is strictly dominating if λ0 ∈ R, λ0 > 1 and |λ0 λi | < 1 for i = 1, . . . , r. P Lemma 2.7 Let S = an xn be a rational series over R with nonnegative coefficients which is not a polynomial and which has a dominating eigenvalue. P an n Then for some real number s > 0, the series x has a strictly dominating sn eigenvalue. Proof. Since S has a dominating eigenvalue, one has for large enough n, an = P (n)λn +
r X
Pi (n)λni
i=1
with |λ| > |λi | for i = 1, . . . , r and polynomials P, P1 , . . . , Pr . If α is the leading coefficient of P , then for n → ∞ an ∼ αnd λn with d = deg(P ). Thus α > 0 and λ is real, λ > 0. Set λ0 = λ. Let s′ > 0 be a real number with λ > s′ > |λi | for i = 1, . . . , r, and set λ′i = λi /s′ . Then of course λ′0 > 1 > |λ′i |. Next, let s′′ be such that 1 < s′′ < λ′0 < s′′2 . Setting λ′′i = λ′i /s′′ , it follows that |λ′′0 λ′′i | =
λ′0 |λ′0 λ′i | < <1 s′′2 s′′2
and λ′′0 > 1. Thus the number s = s′ s′′ gives the desired result. P Lemma 2.8 Let S = an xn ∈ R+ [[x]] be a rational series having a strictly dominating eigenvalue. Then for large enough n an+1 > an
(2.6)
79
2. Positive Rational Series Proof. For large enough n, one has an = P (n)λn +
r X
Pi (n)λni
(2.7)
i=1
with λ ∈ R, λ > 1 and |λλi | < 1, hence also |λi | < 1/λ < 1 < λ. This shows that an ∼ αnd λn . n→∞
where d is the degree of P and α is the coefficient of td in P (t). This implies that an+1 ∼ λ. an n→∞ Since λ > 1 and an ∈ R+ , this shows that an+1 > an for large enough n. Lemma 2.9 Let U be a rational series over R having a strictly dominating eigenvalue with multiplicity m. Then there exists an integer p suchP that U is the merge of p rational series having the following property: if S = an xn is any one of them, then (i) S has a dominating eigenvalue λ with multiplicity m. (ii) There exists a polynomial R = 1 − α1 x − · · · − αk xk ∈ R[[x]] which has the simple root 1/λ and such that α1 > 1, α1 + α2 > 1, . . . , α1 + · · · + αk > 1 kαk 2α2 α1 + ···+ k > 0 + λ λ2 λ
(2.8)
Moreover, if m = 1, then R is a denominator for S. Proof. 1. Let λ0 , . . . , λr be the distinct nonzero eigenvalues of U . Then for large enough n X un = λni Pi (n) , 0≤i≤r
where U =
P
un xn , Pi (t) ∈ C[[t]] \ 0 and, by hypothesis
λ0 ∈ R, λ0 > 1,
λ0 |λi | < 1 for i = 1, . . . , r .
2. Define a polynomial R as follows: if λ0 is a simple eigenvalue, then R is the minimal denominator of U , otherwise let R be a divisor of the minimal denominator of U in R[x] such that R has the simple root 1/λ0 . (p) (p) 3. Let R(p) = 1 − α1 x − · · · − αk xk be the polynomial whose roots are the pth powers of those of R, with the same multiplicities. We show that for some (p) p, we have Eq. (2.8) (with αi replaced by αi ). Let λ0 = µ0 , . . . , µk be the inverses of the roots of R. Then, by assumption µ0 > 1 > µ0 |µi | for i = 1, . . . , k .
80
Chapter V. Changing the Semiring (p)
Let σi
be the ith elementary symmetric function of the µpi ’s. Then (p)
σ1
= µp0 + · · · + µpk > µp0 − k
(p)
since σ1
∈ R and |µi | < 1 for i = 1, . . . , k. Moreover, for i = 2, . . . , k X k+1 (p) (p) (p) (p) σi = µj1 µj2 · · · µji < i 0≤j1 <···<ji ≤k
(p)
since each monomial has modulus less than 1. Thus, for large enough p, σ1 > 1 (p) (p) (p) (p) and σ1 ± σi > 1 for i = 2, . . . , k. Now, observe that α1 = σ1 and (p) (p) αi = ±σi for i = 2, . . . , k. Hence (p)
(p)
(p)
(p)
(p)
α1 > 1, α1 + α2 > 1, . . . , α1 + · · · + αk > 1 . (p)
(p)
Moreover, since α1 = σ1 is equivalent to µp0 when p → ∞, µ0 > 1 and the (p) αi (i = 1, . . . , k) are bounded, we obtain for large enough p (p)
(p)
(p)
α α1 α + 2 22p + · · · + k kkp > 0 . µp0 µ0 µ0 4. Let p be as in step 3, and define Sj by X Sj = uj+np xn . n
Then U is the merge of S0 , . . . , Sp−1 . We have for large enough n X p uj+np = (λi )n λji Pi (j + np) .
(2.9)
0≤i≤r
This shows that the eigenvalues of Sj are among λp0 , . . . , λpr (by unicity of the exponential polynomial, see Corollary IV.2.2). Moreover, λp0 6= λpi for i = 1, . . . , k, hence λp0 is a eigenvalue of Sj , and hence Sj has the dominating eigenvalue λp0 . Since the polynomials P0 (t) and P0 (j + tp) have the same degree, the multiplicity of λp0 for Sj is m = deg(P0 ) + 1. Now R(p) has the simple root 1/µp0 = 1/λp0 with multiplicity 1 (because λp0 6= λpi if i 6= 0) and the relations (2.8) hold. If m = 1, then R was chosen to be the minimal denominator of U , hence R(p) is a multiple of the minimal denominator of Sj . Indeed, by Eq. (2.9) and Corollary IV.2.2, each of Sj is one of the λpi with multiplicity Q eigenvalue ≤ deg(Pi ). Now, R(x) = (1−λi x)deg(Pi )+1 (see Section IV.2), hence R(p) (x) = i Q (1 − λpi x)deg(Pi )+1 by definition. i
Proof of Theorem 2.4. 1. We argue by induction on the multiplicity m of the dominating eigenvalue λ of S. By Lemma 2.7 we may assume that S has a strictly dominating eigenvalue. Hence, by Lemma 2.8 and Lemma 2.5(i), we may assume that an+1 > an
2. Positive Rational Series
81
for any n. Now, by Lemma 2.9 and Lemma 2.5(ii), we may further assume that there exists a polynomial R = 1 − α1 x − · · · − αk xk having 1/λ as a simple root, satisfying Eq. (2.8), and which is a denominator of S if m = 1. P P T 2. Let S = = , with P, Q ∈ R[x], T ∈ R[[x]]. The series T = bn xn Q R is of course rational, and with the notations of Lemma 2.6, we obtain (λ0 being dominating) α1 αk bn ∼ dα + · · · + k k λn0 . n→∞ λ0 λ0 Since an ≥ 0, we must have α ≥ 0, hence bn ≥ 0 for large n. P n Let T [h] = n>h bn x . If m = 1, then R is a denominator of S, hence [h] T = 0 for large h. If m ≥ 2, then, as 1/λ0 is a simple root of R, λ0 is a eigenvalue of T and the other eigenvalues are among those of S. Hence T has a dominating eigenvalue of multiplicity m − 1. The same holds for T [h] by Lemma 2.5(i). Thus, in both cases, T [h] is R+ -rational by induction, for some h. 3. Before going into the technicalities of the proof, we consider, from a heuristic point of view, the case where h = 0, that is where bn ≥ 0 for any n. 1 Then T is R+ -rational by induction and S = T . Thus we only have to show R P 1 1 1 1 (where x∗ = xn = that is R+ -rational. Now = x∗ ) and ∗ R R Rx 1−x Rx∗ = (1 − α1 x − · · · − αk xk )(1 + x + x2 · · · + xn + · · · )
= 1 + (1 − α1 )x + (1 − α1 − α2 )x2 + · · · + (1 − α1 − · · · − αk−1 )xk−1 + (1 − α1 − · · · − αk )(xk + xk+1 + · · · + xn + · · · )
= 1 − γ1 x − · · · − γk−1 xk−1 − γk xk x∗ , where γ1 = α1 − 1
γ2 = α2 + α1 − 1 .. . γk = αk + · · · + α1 − 1 . 1 = (γ1 x + · · · + γk−1 xk−1 + By Eq. (2.8), the γ’s are nonnegative, so that Rx∗ 1 γk xk x∗ )∗ is R+ -rational. Hence is also R+ -rational, which concludes the R proof in this special case. 4. There remains the problem of treating the case h > 0. We may suppose h ≥ k. We show that for some polynomial Rh with nonnegative coefficients, one has γk ah xh+k xh+1 1 T [h] + (2.10) S [h] = ah + + Rh , 1−x R 1−x
82
Chapter V. Changing the Semiring
where the γ’s are defined in step 3. Note that T [h] and steps 2 and 3), so S [h] is R+ -rational. Thus
1 are R+ -rational (by R
S = a0 + · · · + ah xh + S [h] is also R+ -rational. P cn xn with 5. Let C = n≥0
c0 = a 0 ,
cn = an − an−1 (n ≥ 1) .
Of course C = (1 − x)S and by 1. C has nonnegative coefficients. We shall see that 1−x xh+k C [h] = T [h] + γk ah + Rh . R 1−x
(2.11)
Since, as is easily verified, S [h] =
1 (C [h] + ah xh+1 ) , 1−x
the formula (2.10) follows from Eq. (2.11). 6. Since T = SR,we have for n > h (hence for n ≥ k, because h ≥ k) bn = an − α1 an−1 − · · · − αk−1 an−k+1 − αk an−k
= an − (1 + γ1 )an−1 − · · · − (γk−1 − γk−2 )an−k+1 − (γk − γk−1 )an−k
= an − an−1 − γ1 (an−1 − an−2 ) − · · · −γk−1 (an−k+1 − an−k ) − γk an−k
= cn − γ1 cn−1 − · · · − γk−1 cn−k+1 − γk an−k .
This relation can be developed further, when n ≥ h + k, since ap = cp + ap−1 . We obtain bn = cn − γ1 cn−1 − · · · − γk−1 cn−k+1 − γk cn−k − γk cn−k−1 − · · · − γk ch+1 − γk ah . Summing up these relations, we get X X n n bn x = cn x (1 − γ1 x − · · · − γk xk − γk xk+1 − · · · ) n>h
n>h
−
γk ah xh+k − Rh , 1−x
with Rh = (γ1 ch + γ2 ch−1 + · · · + γk−1 ch−k+2 + γk ah−k+1 )xh+1 + (γ2 ch + · · · + γk−1 ch−k+3 + γk ah−k+2 )xh+2
+ · · · + (γk−1 ch + γk ah−1 )xh+k−1 .
83
3. Fatou Extensions This polynomial has nonnegative coefficients. We have T [h] = C [h] 1 − γ1 x − · · · − γk−1 xk−1 − Formula (2.11) follows because
γk xk ah xh+k − γk − Rh . 1−x 1−x
1−x xk −1 = 1 − γ1 x − · · · − γk−1 xk−1 − γk R 1−x
(see step 3). The proof of the theorem is complete.
The two preceding theorems give the following characterization of R+ -rational series. Theorem 2.10 A series S ∈ R+ [[x]] is R+ -rational if and only if it is the merge of rational series having a dominating eigenvalue. Proof. Let S be a R+ -rational series and let m be the number of its eigenvalues. We argue by induction on m, the case m = 1 being clear. If m > 1, then S is not a polynomial, since the only eigenvalue of a polynomial is 0. Since the nonvanishing eigenvalues of S are the inverses of the poles of the associated function (see Section IV.2), there exists, by Theorem 2.2, a real number ρ > 0 such that any eigenvalue of S with maximal modulus has the form ρθ, where θ is a root of unity. Let p be a common order of all these roots of unity and let S0 , S1 , . . . , Sp−1 be the series whose merge is S. By Lemma 2.5, each Si is R+ -rational and their eigenvalues are λp , with λ a eigenvalue of S. If p = 1, S has a dominating eigenvalue and we are done. Otherwise, each Si has strictly fewer eigenvalues than S, and is R+ -rational. This concludes the proof of the direct part. For the converse, it suffices to use Theorem 2.4 and Lemma 2.5.
3
Fatou Extensions
According to Fatou’s Lemma (Corollary 1.2) any rational series in Q[[x]] with integral coefficients is rational in Z[[x]]. The same result holds for an arbitrary alphabet A, by Theorem 1.1. This leads to the following definition. Definition Let K ⊂ L be two semirings. Then L is a Fatou extension of K if every L-rational series with coefficients in K is K-rational. Theorem 3.1 (Fliess 1974a) If K ⊂ L are commutative fields, then L is a Fatou extension of K. Proof. This follows immediately from the expression of rationality by means of the rank of the Hankel matrix (Theorem II.1.5). Theorem 3.2 Let K be a commutative Noetherian integral domain, and let F be its quotient field. Then F is a Fatou extension of K.
84
Chapter V. Changing the Semiring
Proof. Let S be an F -rational series with coefficients in K and let (λ, µ, γ) be a reduced linear representation of S. According to Corollary II.2.3, there exists d ∈ K \ 0 such that dµA∗ ⊂ K n×n . Consequently, the algebra µ(KhAi) is contained in d−1 K n×n which is a Noetherian K-module. It follows that µ(KhAi) is a finitely generated K-module and this in turn shows that the syntactic algebra of S over K (which is equal to µ(KhAi)) is a finitely generated K-module. By Theorem II.1.1, the series S is rational. Theorem 3.3 (Fliess 1975) The semiring Q+ is a Fatou extension of N. We need some preliminary lemmas. Lemma 3.4 (Eilenberg and Sch¨ utzenberger 1969) The intersection of two finitely generated submonoids of an Abelian group is still a finitely generated submonoid. Proof. Let M1 and M2 be two finitely generated submonoids of an Abelian group G, with law denoted by +. There exist integers k1 , k2 and surjective monoid morphisms φi : Nki → Mi , i = 1, 2. Let k = k1 + k2 and let S be the submonoid of Nk = Nk1 × Nk2 defined by S = {x = (x1 , x2 ) ∈ Nk | φ1 x1 = φ2 x2 } . Let p1 : Nk → Nk1 be the projection. Then M1 ∩ M2 = φ1 ◦ p1 (S) . Thus it suffices to prove that S is finitely generated. Observe that S satisfies the following condition x, x + y ∈ S =⇒ y ∈ S .
(3.1)
Indeed, since φ1 x1 = φ2 x2 and φ1 x1 + φ1 y1 = φ2 x2 + φ2 y2 and since all these elements are in G, it follows that φ1 y1 = φ2 y2 , whence y ∈ S. Let X be the set of minimal elements of S (for the natural ordering of Nk ). For all z ∈ S, there is x ∈ X such that x ≤ z. Thus z = x + y for some y ∈ Nk and by Eq. (3.1), y ∈ S. This shows by induction that X generates S. In view of the following well-known lemma, the set X is finite. Lemma 3.5 Every infinite sequence in Nk contains an infinite increasing subsequence. Proof. By induction on k. Let (un ) be a sequence of elements of Nk . If k = 1, either the sequence is bounded, and one can extract a constant sequence, or it is unbounded, and one can extract an strictly increasing subsequence. For k > 1, one first extracts a sequence that is increasing in the first coordinate, and then uses induction for this subsequence.
85
3. Fatou Extensions
Lemma 3.6 (Eilenberg and Sch¨ utzenberger 1969) Let I be a set and let M be a finitely generated submonoid of NI . Then the submonoid M ′ of NI given by M ′ = {x ∈ NI | ∃n ≥ 1, nx ∈ M } is finitely generated. Proof. Let x1 , . . . , xp be generators of M . Let C = {x ∈ NI | ∃λ1 , . . . , λp ∈ Q+ ∩ [0, 1] : x =
X
λi xi } .
Then C contains each xi and is a set of generators for M ′ . Indeed, if nx = P λi xi ∈ M for some n ≥ 1 and some λi ∈ N, then x=
Xj λi k n
xi +
X λi n
−
j λ k i
n
xi ,
where ⌊z⌋ is the integral part of z. Thus, it suffices to show that C is finite. Let E be the subvector space of RI generated by M ′ . Since E has finite dimension, there exists a finite subset J of I such that the R-linear function p J : E → RJ (pJ is the projection RI → RJ ) is injective. The image of C by pJ is contained in NJ , and it is also contained in the set X K = {y ∈ RJ | ∃λ1 , . . . , λp ∈ [0, 1] : y = λi yi } , where yi = pJ (xi ). Now K is compact and NJ is discrete and closed. Thus K ∩ NJ is finite. It follows that C is finite.
Proof of Theorem 3.3. Let S be a Q+ -rational series with coefficients in N. We use systematically Proposition I.5.1. There exists a finitely generated stable Q+ -submodule in Q+ hhAii that contains S. Denote it by MQ+ . Similarly, the series S is Q-rational with coefficients in Z, and therefore S is Z-rational. Thus, there is a finitely generated Z-submodule in ZhhAii that contains S, say MZ . Then M = MQ+ ∩ MZ is a stable N-submodule of NhhAii containing S, and it suffices to show that M is finitely generated. Let T1 , . . . , Tr be series in MQ+ generating it as a Q+ -module, and let MQ′ + =
X
NTi .
This is a finitely generated N-module. Since MZ is also a finitely generated N-module, the N-module M ′ = MZ ∩ MQ′ + ⊂ NhhAii is finitely generated (this follows from Lemma 3.4, noting that N-module = commutative monoid). Consequently, M = {T ∈ NhhAii | ∃n ≥ 1, nT ∈ M ′ }
86
Chapter V. Changing the Semiring
is, in view of Lemma 3.6, a finitely generated N-module. Finally, the N-module M ∩ MZ is finitely generated by Lemma 3.4. Since M = M ∩ MZ , this proves the theorem. We now give two examples of extensions which are not Fatou extensions. Example 3.1 The ring Z is not a Fatou extension of N. Consider the series X (|w|a − |w|b )2 w . S= w∈{a,b}∗
This series is Z-rational (it is the Hadamard square of the series considered in Example III.3.1) and has coefficients in N. However, it is not N-rational, since otherwise its support would be a rational language (Section III.1), and also the complement of its support. In Example III.3.1, it was shown that this set is not the support of any rational series. Example 3.2 The √ semiring R+ is not a Fatou extension of Q+ (Reutenauer 1977a). Let α = (1/ 5)/2 be the golden ration and let S be the series X (α2(|w|a −|w|b ) + α−2(|w|a −|w|b ) )w, . S= w∈{a,b}∗
Since S = (α2 a + α−2 b)∗ + (α−2 a + α2 b)∗ , the series S is R+ -rational. Moreover, since α is an algebraic integer over Z and 1/α is its conjugate, one has for all n∈N α2n + α−2n ∈ Z . Consequently, S has coefficients in N. Assume that S is Q+ -rational. Then by Theorem 3.3, it is N-rational. However, the language S −1 (2) = {w | (S, w) = 2} is S −1 (2) = {w ∈ {a, b}∗ | |w|a = |w|b } since x + 1/x > 2 for all x > 0, x 6= 1. Since the language S −1 (2) is not rational, the series S is not N-rational (Corollary III.2.6). Thus S is not Q+ -rational.
Exercises for Chapter V 2.1 a) Let P (x) = xk − α1 xk−1 − · · · − αk with αk 6= 0 be a polynomial with integral coefficients, and suppose that all its roots have modulus P ≤ 1. Show that these roots are roots of unity. (Consider the series S = an xn with an = λn1 + · · · + λnk
where the λi ’s are the roots of P with their multiplicities. The minimal polynomial Q of S has the same roots as P , but with multiplicity 1. The
87
3. Fatou Extensions
sequence (an ) takes only a finite number of distinct values, and thus Q has a multiple ofP the form xp − xq .) b) Let S = an xn be a rational series with coefficients in Z and with polynomial growth (that is there exist a constant C and an integer q such that |an | ≤ Cnq for all n ≥ 1). Show that there exist an integer p, and polynomials Pi ∈ Q[t] for 0 ≤ i ≤ p − 1, such that for all i ∈ {0, . . . , p − 1} ai+np = Pi (n) for all but a finite number of n. c) Show that if a polynomial P ∈ Q[t] satisfies P (n) ∈ Z
2.2
3.1 3.2
3.3 3.4
3.5
for n ∈ N ,
then P is a linear combination with integral coefficients, of polynomials t(t − 1) · · · t − q + 1 t = . Show that if moreover P (n) ∈ N for all q q! P n ∈ N, then the series P (n)xn is N-rational. d) Use the previous questions to show that any Z-rational series in one variable with nonnegative coefficients and with polynomial growth is Nrational. (See P´ olya and Szeg¨ o 1964, Exercise 200 in Chapter 4 and Exercise 85 in Chapter 2.) P a) Let θ be a real number. Show that the series S = n≥0 (cos2 nθ)xn is a C-rational series. (Give an expression for S as a rational function by using the formula cos nθ = 1/2(einθ + e−inθ ).) b) Let 0 < a < c be integers and let θ be a real number with 0 < θ < π/2, such that cos θ = a/c. that the numbers cn cos nθ are integers. Show P Show 2n that the series T = (c cos2 nθ)xn is Z-rational with coefficients in N. c) Show that if c 6= a, then z = eiθ is not a root of unity (use the fact that z is an algebraic number of degree ≤ 2, and that the assumption that z is a root of unity of order p implies that φ(p) ≤ p, where φ is Euler’s function). Show that T is not R+ -rational (use Theorem 2.4) (see Berstel 1971, and also Eilenberg 1974). Show that for any rational series S ∈ KhhAii, where K is a commutative field, the subfield generated by its coefficients is a finitely generated field. Show that if K is a subsemiring of L such that each element in L is a right-linear combination of fixed Pp elements ℓ1 , . . . , ℓp in L, then each Lrational series may be written i=1 ℓi Si for some K-rational series Si (see Lemma II.1.2 and Exercise II.1.3). Show that each Z-rational series is the difference of two N-rational series (use Exercise 3.2). Show that under the hypothesis of Exercise 3.2, if φ is a right K-linear mapping L → K, then fore ach L-rational series S, the series φ(S) = P φ((S, w))w is K-rational. w P Show that for any semiring K, ifS is K n×n -rational, then Si,j = S(w)i,j i,j
is K-rational for fixed i, j in {1, . . . , n} (use Exercise 3.4).
88
Chapter V. Changing the Semiring
Notes to Chapter V Fliess (Fliess 1974a) calls a strong Fatou ring a ring K satisfying Theorem 1.1. Sontag and Rouchaleau (1977) show that for a principal ring K, the ring K[t] is a strong Fatou ring. In the case of one variable, the class of strong Fatou rings is completely characterized by Theorem 1.6. (The formulation is different, but it is equivalent by the results of Section IV.1.) For several variables, a complete characterization of strong Fatou rings is still lacking. A slightly different P proof of Theorem 2.4 (which makes no use of Lemma 2.7) shows that if S = an xn is a rational series with nonnegative coefficients in a subfield K of R, then S is K+ -rational. Thus Theorem 2.10 and Theorem 3.3 give a complete characterization of N-rational series. The proof also shows that any R+ -rational series has star-height at most 2 (over R+ ). The star-height of a rational series S ∈ KhhAii is defined as follows. Consider the sequence R0 ⊂ R1 ⊂ · · · ⊂ Rn ⊂ · · · of sets of series, such that the union of the Rn is the set of all rational series. The set R0 is the set of polynomials, and for S, T ∈ Ri , both S + T and ST are in Ri ; if S ∈ Ri is proper, then S ∗ ∈ Ri+1 . The star-height of a series S is the least integer n with S ∈ Rn . A weak Fatou ring is a ring whose quotient field is a Fatou extensionof it. They are completely characterized in Reutenauer (1980a). These rings have a property which is analogous to the case of one variable, as shown by Cahen and Chabert (1975). Karhum¨ aki (1977) has characterized those polynomials P ∈ Z[x1 , . . . , xn ] such that the rational series over A = {x1 , . . . , xn } X P (|w|x1 , . . . , |w|xn )w w∈A∗
is N-rational.
Chapter VI
Decidability Basically, any operation or property of series can be examined from the aspect of effectivity. In this chapter, we present the positive results, that is those concerning decidable properties, and we limit ourselves to the most striking ones. The first section shows that the most fundamental properties concerning supports (emptiness, finiteness) are decidable. However, most of the other standard questions, such as equality of supports, are undecidable. In the second section, we show first that the size of a finite semigroup of matrices can be bounded (Theorem 2.1). This implies that the finiteness is decidable for a matrix semigroup. As a consequence, one can decide whether the image of a rational series is finite. To complete the chapter, series with polynomial growth are studied. A beautiful characterization (Corollary 2.11) is given, and this property is shown to be decidable.
1
Problems of Supports
We start by showing that several problems concerning the support of rational series are decidable. An instance of these problems is a language L = supp(S), where S is a rational series given by a linear representation (λ, µ, γ) or by a rational expression. The proof of Sch¨ utzenberger’s Theorem I.6.1 shows indeed that a linear representation of a rational series can be effectively computed from one of its rational expressions, and conversely, that a rational expression denoting a recognizable series can be derived from a linear representation. Here we assume that the semiring of coefficients is the field Q (any other “computable” field is also convenient). Proposition 1.1 It is decidable whether the support of a rational series is empty. Proof. This proposition is just a restatement of Corollary II.3.6. Indeed, if L = supp(S), with S a rational series given by a linear representation of dimension n, then the rank of S is less than or equal to n. By the corollary, the series S vanishes if and only if (S, w) = 0 for all words w of length at most n − 1. This condition is easy to test. Finally, it suffices to note the equivalence L = ∅ ⇐⇒ S = 0 . 89
90
Chapter VI. Decidability
Remark An analogous proof shows that the equality of two rational series is decidable. It suffices to test whether their difference vanishes. Proposition 1.2 It is decidable whether the support of a rational series is finite. Proof. Let L = supp(S). Then L is finite if and only if S is a polynomial. We prove that given a reduced linear representation (λ, µ, γ) of S of dimension n = rank(S), the series S is a polynomial if and only if µw = 0 for all words w of length n. This will prove the proposition since a reduced linear representation is effectively computable as shown in Section II.3. Assume first that µw = 0 for all words of length n. Then the same relation holds for any word w of length ≥ n. Consequently, (S, w) = λµwγ = 0 for all these words, showing that S is a polynomial of degree ≤ n − 1. Conversely, assume that S is a polynomial of degree d, and let w ∈ supp(S) be a word of length d. For any prefix u of w, of length i, the polynomial u−1 S (see Section II.1) has degree d − i. Indeed, w = uv for some word v with |v| = d − i, and (u−1 S, v) = (S, uv) = (S, w) 6= 0, showing that the degree of u−1 S is at least d − i. Moreover, if a word t has length > d − i, then (u−1 S, t) = (S, ut) = 0 since |ut| > d. This shows that u−1 S has degree d − i. It follows that the d+ 1 polynomials of the form u−1 S, where u runs through the prefixes of w, are linearly independent. Consequently dim(S ◦QhAi) ≥ d+1. This dimension is precisely the rank n of the series, by Theorem II.1.5 and Corollary II.1.4. Thus n ≥ d + 1. Consider now a word w of length n. Since |w| > d, one has (S, uwv) = 0 for all words u, v. Consequently the ideal of QhAi generated by w is contained in KerS, and therefore is contained in the syntactic ideal I of S. In particular, w ∈ I. Since I = Kerµ by Corollary II.2.2, it follows that µw = 0. We now consider an undecidable problem: It is undecidable whether the support of a rational series is the whole free submonoid A∗ . Indeed, setting L = supp(S), one has the equivalence L 6= A∗ ⇐⇒ (S, w) = 0 for some w ∈ A∗ . The claim then follows from the undecidability of the following problem (Tenth problem of Hilbert; theorem of Davis, Putman, Robinson, Matijacevic, Cudnowskii, see Manin 1977, Theorem VI.1.2 et seq.): given a (commutative) polynomial P ∈ Z[x1 , . . . , xn ], does there exist an n-tuple (α1 , . . . , αn ) ∈ Nn such that P (α1 , . . . , αn ) = 0? To such a polynomial P , we indeed associate the series S ∈ Qhhx1 , . . . , xn ii defined for a word w by (S, w) = P (|w|x1 , . . . , |w|xn ) . This series is rational (Example I.5.2). Clearly, there exists a word w with (S, w) = 0 if and only if P vanishes for some n-tuple of nonnegative integers. The undecidability of this problem also implies the undecidability of the following question: Given two supports of rational series, are they equal? See Exercise 1.2 for a proof of these undecidability results using the Post correspondence problem instead of Hilbert’s tenth problem.
2. Growth
2
91
Growth
We first give a result concerning finite monoids of matrices. Recall that for a given word w, we denote by w∗ the submonoid generated by w. Theorem 2.1 (Jacob 1978, Mandel and Simon 1977) Let µ : A∗ → Qn×n be a monoid morphism such that, for all w ∈ A∗ , the monoid µw∗ is finite. Then there exists an effectively computable integer N depending only on |A| and n such that |µ(A∗ )| ≤ N . As we shall see, the function (|A|, n) 7→ N grows extremely rapidly. There exists however one case where there is a reasonable bound (which moreover does not depend on |A|), namely the case described in the lemma below. A set E of matrices in Qn×n is called irreducible if there is no subspace of 1×n Q other than 0 and Q1×n stable for all matrices in E (the matrices act on the right on Q1×n ). Lemma 2.2 Let M ⊂ Qn×n be an irreducible monoid of matrices such that all nonvanishing eigenvalues of matrices in M are roots of unity. Then |M | ≤ 2 (2n + 1)n . Proof. Let m ∈ M . The eigenvalues 6= 0 of m are roots of unity, whence algebraic integers over Z. The same holds for tr(m). Since tr(m) ∈ Q and Z is integrally closed, this implies that tr(m) ∈ Z. The norm of each eigenvalue is 0 or 1. Thus | tr(m)| ≤ n. This shows that tr(m) takes at most 2n + 1 distinct values for m ∈ M . Let m1 , . . . , mk ∈ M be a basis of the subspace N of Qn×n generated by M . Clearly k ≤ n2 . Define an equivalence relation ∼ on M by m ∼ m′ ⇐⇒ tr(mmi ) = tr(m′ mi ) for i = 1, . . . , k . The number of equivalence classes of this relation is at most (2n + 1)k . In order to prove the lemma, it suffices to show that m ∼ m′ implies m = m′ . Let m, m′ ∈ M be such that m ∼ m′ . Set p = m − m′ , and assume p 6= 0. There exists a vector v ∈ Q1×n such that vp 6= 0. It follows that the subspace vpN of Q1×n is not the null space. Since it is stable under M and M is irreducible, one has vpN = Q1×n . Consequently, there exists some q ∈ N such that vpq = v. This shows that pq has the eigenvalue 1. Now, for all integers j ≥ 1, tr((pq)j ) = tr(pq(pq)j−1 ) = 0 because q(pq)j−1 is a linear combination of the matrices m1 , . . . , mk , and by assumption tr(pr) = 0 for r ∈ M . Newton’s formulas show that all eigenvalues of pq vanish. This yields a contradiction. For the proof of Theorem 2.1, we need another lemma.
92
Chapter VI. Decidability
Lemma 2.3 (Sch¨ utzenberger 1962c) Let µ : A∗ → Qn×n be a morphism into a monoid of matrices which are triangular by blocks ′ µ ν µ= . 0 µ′′ Assume that µ′ A∗ and µ′′ A∗ are finite, and that µw∗ is finite for any word w. Then X |A|i . |νA∗ | ≤ 0≤i<|µ′ A∗ |2 |µ′′ A∗ |2
Proof. Let µ1 = (µ′ , µ′′ ) and set T = {(x′ , z, x′′ ) ∈ A∗ × A+ × A∗ | µ1 (x′ z) = µ1 x′ , µ1 (zx′′ ) = µ1 x′′ } . (i) Let w be a word of length ≥ H 2 , with H = |µ′ A∗ ||µ′′ A∗ |. Then there exists a factorization w = x′ zx′′ such that (x′ , z, x′′ ) ∈ T . Indeed, the set {(x, y) | w = xy} has at least 1 + H 2 elements, and since µ1 A∗ × µ1 A∗ has H 2 elements, there are two distinct factorizations w = xy = x′ y ′ such that µ1 x = µ1 x′ and µ1 y = µ1 y ′ . Assume for instance that |x| < |x′ |. Then there is a word z 6= 1 such that x′ = xz, y = zy ′ . Thus w = xzy ′ with (x, z, y ′ ) ∈ T . (ii) For any t = (x′ , z, x′′ ) ∈ T , one has ν(x′ z n x′′ ) = ν(x′ x′′ ) + nνt . where νt = µ′ x′ νz µ′′ x′′ . Indeed, the identity n ab = 0c
an 0
P
ak bcℓ
k+ℓ=n−1 n
c
!
shows that νz n =
X
µ′ z k νzµ′′ z ℓ .
k+ℓ=n−1 0≤k,ℓ
Since (x′ , z, x′′ ) ∈ T , this equation implies X X µ′ x′ νz n µ′′ x′′ = µ′ x′ µ′ z k νzµ′′ z ℓ µ′′ x′′ = µ′ (x′ z k )νzµ′′ (z ℓ x′′ ) X = µ′ x′ νzµ′′ x′′ = nνt .
93
2. Growth Finally ν(x′ z n x′′ ) = νx′ µ′′ (z n x′′ ) + µ′ x′ ν(z n )µ′′ x′′ + µ′ (x′ z n )νx′′ = νx′ µ′′ x′′ + nνt + µ′ x′ νx′′ = ν(x′ x′′ ) + nνt .
(iii) Consider now a word w ∈ A∗ with |w| ≥ H 2 . In view of (i), the word w admits a factorization w = x′ zx′′ with t = (x′ , z, x′′ ) ∈ T . By (ii), for all n ≥ 0, ν(x′ z n x′′ ) = ν(x′ x′′ ) + nνt . Since µz ∗ is finite, µ(x′ z ∗ x′′ ) is also finite, and consequently also ν(x′ z ∗ x′′ ). This implies that νt = 0 and ν(w) = ν(x′ zx′′ ) = ν(x′ x′′ ). Since |x′ x′′ | < |w|, the lemma follows. Proof of Theorem 2.1. Assume first that the monoid µA∗ is irreducible, and consider any matrix µw ∈ µA∗ . Since µw∗ is finite, there are integers 0 ≤ i < j with µwi = µwj . But this implies that the eigenvalues of w are 0 or roots of unity. The theorem thus follows from Lemma 2.2. If µA∗ is not irreducible, there is some subspace V of Q1×n which is stable under µA∗ . Consider a supplementary space W of V . In a basis which is adapted to the decomposition Q1×n = W ⊕ V , the morphism µ admits the form described in Lemma 2.3. Arguing by induction on the dimension of the representation, the result follows from Lemma 2.3. Corollary 2.4 (McNaughton and Zalcstein 1975) If every matrix of a finitely generated monoid M of matrices over Q generates a finite monoid, then the monoid M is finite. Recall that a ray is a subset of A∗ of the form uv ∗ w, with u, v, w ∈ A∗ . Corollary 2.5 (Reutenauer 1977b) Let S ∈ QhhAii be a rational series such that for any ray R, the set {(S, w) | w ∈ R} is finite. Then the set of coefficients of S is finite. Proof. Let (λ, µ, γ) be a reduced linear representation of S. By Corollary II.2.3, there exist polynomials P1 , . . . , Pn , Q1 , . . . , Qn such that for all words w, µw = ((S, Pi wQj ))1≤i,j≤n . By assumption, the set {(S, uwm v) | m ∈ N} is finite for all words u, v, w. The same holds for the set {(S, P wm Q) | m ∈ N} where P, Q are polynomials. This shows that µw∗ is finite for any word w. By Corollary 2.4, the monoid µA∗ is finite, and in particular {(S, w) | w ∈ A∗ } is finite, since (S, w) = λµwγ. Corollary 2.6 (Jacob 1978) It is decidable whether a finite set of matrices over Q generates a finite monoid.
94
Chapter VI. Decidability
Proof. By Theorem 2.1, there is an upper bound on the size of such a monoid if it is finite. Let E be the finite set of matrices, M the monoid generated by E, and let N be the upper bound given in Theorem 2.1. Then M is finite if and only if every product of N matrices in E equals a product of at most N − 1 matrices in E. This last condition is clearly decidable. Corollary 2.7 (Jacob 1978) It is decidable whether a rational series has a finite image. Recall that the image of a series is the set of its coefficients. We now turn our attention to questions concerning growth of rational series over Z. A series S ∈ ZhhAii has polynomial growth if there exist an integer q ≥ 0 and a real number C such that |(S, w)| ≤ C|w|q for all nonempty words w. The smallest q satisfying the inequality, provided it exists, is called the degree of growth of S. Observe that series with degree of growth 0 are precisely the series with finite image. Theorem 2.8 Let S ∈ ZhhAii be a rational series and let (λ, µ, γ) be a reduced linear representation of S. Then S has polynomial growth if and only if the set {tr(µw) | w ∈ A∗ } is finite. Proof. Suppose first that S has polynomial growth. Then there exist, by Corollary II.2.3, real numbers C, q such that |µw| ≤ C|w|q for all words w. Consequently, for every eigenvalue ρ of µw one has |ρ|r ≤ C ′ rq for some constant C ′ . Thus |ρ| ≤ 1. This implies that −n ≤ tr(µw) ≤ n, where n is the dimension of µ. Since S is Z-rational, there exits a reduced linear representation with coefficients in Z (Theorem V.1.1). This representation is similar to (λ, µ, γ) by Theorem II.2.4 and consequently, the trace of any matrix µw is an integer. Thus each tr(µw) is in {−n, . . . , n}. Conversely, suppose that the set {tr(µw) | w ∈ A∗ } is finite. Let w be a word and let λ1 , . . . , λn be the eigenvalues of µw with their multiplicities. The sequence X p ap = λi = tr(µwp ) 1≤i≤n
takes only a finite number of distinct values. Since it satisfies a linear recurrence relation, it is ultimately periodic, and there is a relation ap+h = ap+k
p≥0
for some h, k ∈PN, h > k. The minimal polynomial (see Section IV.1) of the ap xp divides the polynomial xh − xk . Consequently, the roots rational series p∈N
of this series (in the sense defined in Section IV.1) are roots of unity or 0. In view of the unicity of the exponential polynomial (Section IV.2), the λi are roots of unity or 0.
95
2. Growth
Next, if the monoid µA∗ is not irreducible, then µ can be put, by changing the basis, into the form ′ µ ν µ= 0 µ′′ Arguing by induction, µ is equivalent to a morphism that is triangular by blocks µ1 ∗ ∗ · · · ∗ µ2 ∗ · · · ∗ µ= .. . ∗ 0 0
· · · µn
where each µi A∗ is irreducible. By Lemma 2.2, and according to our computations, all monoids µi A∗ are finite. To complete the proof, it suffices to apply the following lemma iteratively, observing that the product of series with polynomial growth also has polynomial growth. Lemma 2.9 Let ′ µ ν µ= 0 µ′′ be a morphism A∗ → K n×n , where K is a commutative semiring. Every series recognized by µ is a linear combination of series recognized by µ′ or by µ′′ and of series of the form S ′ aS ′′ , where S ′ is recognized by µ′ , a ∈ A and S ′′ is recognized by µ′′ .
Proof. A series recognized by µ is a linear combinations of series of the form X (µw)i,j w (2.1) w
with 0 ≤ i, j ≤ n. It suffices to show that when i, j are coordinates corresponding to ν, the series (2.1) is a linear combination of series of the form S ′ aS ′′ . This is a consequence of the formula X νw = µ′ x· νa· µ′′ y . w=xay
Corollary 2.10 It is decidable whether a rational series S ∈ ZhhAii has polynomial growth. Proof. A reduced linear representation (λ, µ, γ) of S can effectively be computed. Then according to Theorem 2.8, the series S has polynomial growth if and only if the series X tr(µw)w w
has a finite image. This series is rational (Lemma II.1.2) and it is decidable, by Corollary 2.7 whether a rational series has a finite image.
96
Chapter VI. Decidability
Corollary 2.11 (Sch¨ utzenberger 1962c) The set of Z-rational series of polynomial growth is equal to the Z-subalgebra of ZhhAii generated by the characteristic series of rational languages. Proof. Let S ∈ ZhhAii be a rational series having polynomial growth, and let (λ, µ, γ) be a reduced linear representation of S. We may assume, by Theorem V.1.1, that (λ, µ, γ) has integral coefficients. The second part of the proof of Theorem 2.8 shows that, after a change of the basis of Q1×n , µ has a decomposition of the form µ1 ∗ ∗ · · · ∗ µ2 ∗ · · · ∗ µ= .. . ∗ 0 0 · · · µn
In fact, since Z is a principal ring, the change of basis can be done in Z1×n . Each µi A∗ is finite. Observe now that any series recognized by a morphism µ′ : A∗ → Qp×p with ′ ∗ µ A finite, is a linear combination of characteristic series of rational languages. This is a consequence of Theorem III.2.8. To complete the proof, it suffices to apply Lemma 2.9. Conversely, each characteristic series of a rational language is rational (Proposition III.2.1), and the set of rational series with polynomial growth is indeed a subalgebra of ZhhAii.
Exercises for Chapter VI 1.1 Show that the following problem is undecidable. Given a rational series S ∈ QhhAii, are there infinitely many words w such that (S, w) = 0? 1.2 Use the undecidability of the Post Correspondence Problem and Exercise III.3.2 to give another proof of the undecidability property of Section 1 (recall that Post’s problem is whether or not a given equality set is empty.) 2.1 Let S ∈ QhhAii be a rational series such that, for every ray R, almost all coefficients (S, w), w ∈ R, vanish. Show that S is a polynomial. 2.2 Let S ∈ NhhAii be an N-rational series having a polynomial growth. Show that S is in the N-subalgebra of NhhAii generated by the characteristic series of rational languages (use a rational expression for S and the fact that if T ∈ NhhAii is not the characteristic series of the basis of a free submonoid of A∗ , then the growth of T ∗ is not polynomial).
Notes to Chapter VI Most of the results of Section 2 hold in arbitrary fields. Theorem 2.1 can be extended, but the bound N then also depends on the field considered. Lemma 2.3 and Corollaries 2.4, 2.5 hold in arbitrary fields, and Lemma 2.2 holds in fields 2 2 of characteristic 0, provided the bound (2n + 1)n is replaced by rn , where r is the size of the set {tr(m) | m ∈ M }. This set is always finite (under the assumptions of the lemma) for a finite monoid M . Corollaries 2.6, 2.7 extend to
97
2. Growth
“computable” fields. Corollary 2.11 is a special case of a result of Sch¨ utzenberger (1962c): for any integer q ≥ 0, the Z-module of rational series S satisfying a condition of the form |(S, w)| ≤ C|w|q
(w ∈ A∗ )
is equal to the Z-module generated by the products of at most q+1 characteristic series of rational languages. He shows also that the degree of growth of a rational series of ZhhAii is always an integer or ∞.
98
Chapter VI. Decidability
Chapter VII
Noncommutative polynomials This chapter deals with algebraic properties of noncommutative polynomials. They are of independent interest, but most of them will be of use in the next chapter. In contrast to commutative polynomials, the algebra of noncommutative polynomials is not Euclidean, and not even factorial. However, there are many interesting results concerning factorization of noncommutative polynomials: this is one of the major topics of the present chapter. The basic tool is Cohn’s weak algorithm (Theorem 1.1) which is the subject of Section 1. This operation constitutes a natural generalization of the classical Euclidean algorithm. Section 2 deals with continuant polynomials which describe the multiplicative relations between noncommutative polynomials (Theorem 2.2). We introduce in Section 3 cancellative modules over the ring of polynomials. We characterize these modules (Theorem 3.1) and obtain, as consequences, results on full matrices, factorization of polynomials, and inertia. The main result of Section 4 is the (easy) extension of Gauss’s lemma to noncommutative polynomials.
1
The Weak Algorithm
Let K be a commutative field and let A be an alphabet. Recall that the degree of a polynomial P in KhAi was defined in Section I.2: we will denote it by deg(P ). We recall the usual facts about the degree, that is deg(0) = −∞ deg(P + Q) ≤ max(deg(P ), deg(Q))
deg(P + Q) = deg(P ), if deg(Q) < deg(P ) deg(P Q) = deg(P ) + deg(Q) .
(1.1) (1.2)
Note that the last equality shows that KhAi is an integral domain, that is P Q = 0 implies
P = 0 or Q = 0 . 99
100
Chapter VII. Noncommutative polynomials
Definition A finite family P1 , . . . , Pn of polynomials in KhAi is (right) dependent if either some Pi = 0 or if there exist polynomials Q1 , . . . , Qn such that X Pi Qi < max(deg(Pi Qi )) . deg i
i
Definition A polynomial P is (right) dependent family!dependent – on the family P1 , . . . , Pn if either P = 0 or if there exist polynomials Q1 , . . . , Qn such that X Pi Qi < deg(P ) deg P − i
and if furthermore for any i = 1, . . . , n deg(Pi Qi ) ≤ deg(P ) . Note that if P is dependent on P1 , . . . , Pn then the family P, P1 , . . . , Pn is dependent. The converse is given by the following theorem. Theorem 1.1 (Cohn 1961) Let P1 , . . . , Pn be a dependent family of polynomials with deg(P1 ) ≤ · · · ≤ deg(Pn ) . Then some Pi is dependent on P1 , . . . , Pi−1 . Let P be a polynomial and let u be a word in A∗ . We define the polynomial P u as X P u−1 = (P, wu)w . −1
w∈A∗
The operator P 7→ P u−1 is symmetric to the operator P 7→ u−1 P which was introduced in Section I.5. It is easy to verify that this operator is linear, and that the following relations hold: deg(P u−1 ) ≤ deg(P ) − |u| −1
P (uv)
= (P v
−1
)u
−1
(1.3) (1.4)
Moreover, for any letter a, (P Q)a−1 = P (Qa−1 ) + (Q, 1)P a−1
(1.5)
where (Q, 1) denotes as usual the constant term of Q. The last equality is simply the symmetric equivalent of Lemma I.6.2. Lemma 1.2 If P, Q are polynomials and w is a word, then there exists a polynomial P ′ such that (P Q)w−1 = P (Qw−1 ) + P ′ with either P = P ′ = 0 or deg(P ′ ) < deg(P ).
101
1. The Weak Algorithm
Proof. We may assume P 6= 0. If w is the empty word, then (P Q)w−1 = P Q and Qw−1 = Q, so that (P Q)w−1 = P (Qw−1 ) and the proof is complete. Let w = au with a a letter. Then by induction one has (P Q)u−1 = P (Qu−1 ) + P ′ deg(P ′ ) < deg(P ) Now, by Eq. (1.4), one has (P Q)w−1 = ((P Q)u−1 )a−1 = (P (Qu−1 ))a−1 + P ′ a−1 . Thus, by Eqs.(1.5) and (1.4), we have (P Q)w−1 = P ((Qu−1 )a−1 ) + (Qu−1 , 1)P a−1 + P ′ a−1 = P (Qw−1 ) + P ′′ with P ′′ = (Qu−1 , 1)P a−1 + P ′ a−1 . Next, by Eq. (1.3), deg(P a−1 ) < deg(P ) and deg(P ′ a−1 ) ≤ deg(P ′ )−|a| < deg(P ). Hence deg(P ′′ ) < deg(P ), as desired. ProofP of Theorem 1.1. We may suppose that no Pi is equal to 0. Hence deg( Pi Qi ) < maxi (deg(Pi Qi )). Let r P = maxi (deg(Pi Qi )) and let I = {i | Pi Qi has degree deg(R) < r. Let deg(Pi Qi ) = r}. The polynomial R = i∈I
k = sup(I); then i ∈ I =⇒ deg(Pi ) ≤ deg(Pk ). Let w be a word such that |w| = deg(Qk ) and 0 6= (Qk , w) = α−1 ∈ K: such a word exists because Qk 6= 0 (otherwise deg(R) < r = deg(Pk Qk ) = −∞). By Lemma 1.2, we have X X Rw−1 = Pi (Qi w−1 ) + Pi′ i∈I
i∈I
for some polynomials Pi′ with deg(Pi′ ) < deg(Pi ). Since Qk w−1 = α−1 , X X Pi (Qi w−1 ) = αRw−1 − α Pi′ . Pk + α i∈I\k
(1.6)
i∈I
Now, by Eq. (1.3) deg(Rw−1 ) ≤ deg(R) − |w| < r − |w| = deg(Pk Qk ) − deg(Qk ) = deg(Pk ) . Furthermore, deg(Pi′ ) < deg(Pi ) ≤ deg(Pk ). Consequently, by Eq. (1.1), the degree of the right-hand side of Eq. (1.6) is < deg(Pk ). Moreover, deg(Pi (Qi w−1 )) = deg(Pi ) + deg(Qi w−1 ) ≤ deg(Pi ) + deg(Qi ) − deg(Qk ) by Eq. (1.3). So we have deg(Pi (Qi w−1 )) ≤ r − deg(Qk ) = deg(Pk ). This shows that Pk is dependent on Pi , i ∈ I \k; hence Pk also is dependent on P1 , . . . , Pk−1 . For two polynomials X, Y in KhAi, the (left) Euclidean division of X and Y (that is the problem of finding polynomials Q and R such that X = Y Q + R and deg(R) < deg(Y )) is not always possible. However, the next result gives a necessary and sufficient condition for this.
102
Chapter VII. Noncommutative polynomials
Corollary 1.3 Let X, Y, P, Q1 , Q2 , R1 be polynomials such that XP + Q1 = Y Q2 + R1 with P 6= 0, deg(Q1 ) ≤ deg(P ), deg(R1 ) < deg(Y ) . Then there exists polynomials Q and R such that X = Y Q + R with deg(R) < deg(Y ) (that is, Euclidean division of X by Y is possible). Proof. Note that Y 6= 0 (otherwise deg(R1 ) < −∞). If Y ∈ K, the corollary is immediate (take Q = Y −1 X and R = 0). Otherwise, we prove it by induction on deg(X). If deg(X) < deg(Y ), the proof is immediate (take Q = 0 and R = X). Suppose that deg(X) ≥ deg(Y ). Then deg(Q1 ) ≤ deg(P ) < deg(XP ) because 1 ≤ deg(Y ) ≤ deg(X) and deg(R1 ) < deg(Y ) ≤ deg(X) ≤ deg(XP ) because 0 ≤ deg(P ). Thus, deg(Q1 ) and deg(R1 ) are both < max(deg(XP ), deg(Y Q2 )) and by Eq. (1.1), deg(R1 − Q1 ) < max(deg(XP ), deg(Y Q2 )). In view of Theorem 1.1, X is dependent on Y , that is there exist two polynomials Q3 and X1 such that X = Y Q3 + X1 with deg(X1 ) < deg(X). Put this expression for X into the initial equality. This gives X1 P + Q1 = Y (Q2 − Q3 P ) + R1 . Since deg(X1 ) < deg(X), we have by induction X1 = Y Q4 + R with deg(R) < deg(Y ). Thus X = Y Q3 + Y Q4 + R, which proves the corollary. The next result is a particular case of the previous one. Corollary 1.4 If X, Y, X ′ , Y ′ are nonzero polynomials such that XY ′ = Y X ′ , then there exist polynomials Q, R such that X = Y Q + R and deg(R) < deg(Y ).
2
Continuant Polynomials
Definition Let a1 , . . . , an be a finite sequence of polynomials. We define the sequences p0 , . . . , pn of continuant polynomials (with respect to a1 , . . . , an ) in the following way: p0 = 1, p1 = a1 , and for 2 ≤ i ≤ n, pi = pi−1 ai + pi−2 .
2. Continuant Polynomials
103
Example 2.1 The first continuant polynomials are p 2 = a1 a2 + 1 p 3 = a1 a2 a3 + a1 + a3 p 4 = a1 a2 a3 a4 + a1 a2 + a1 a4 + a3 a4 + 1 Notation We shall write p(a1 , . . . , ai ) for pi . It is easy to see that the continuant polynomials may be obtained by the “leap-frog construction”: consider the “word” a1 · · · an and all words obtained by repetitively suppressing some factors of the form ai ai+1 in it. Then p(a1 , . . . , an ) is the sum of all these “words”. Now, we have by definition p(a1 , . . . , an ) = p(a1 , . . . , an−1 )an + p(a1 , . . . , an−2 ) .
(2.1)
The combinatorial construction sketched above shows that symmetrically p(a1 , . . . , an ) = a1 p(a2 , . . . , an ) + p(a3 , . . . , an ) .
(2.2)
An equivalent but useful relation is p(an , . . . , a1 ) = an p(an−1 , . . . , a1 ) + p(an−2 , . . . , a1 ) .
(2.3)
Proposition 2.1 (Wedderburn 1932) The continuant polynomials satisfy the relation p(a1 , . . . , an )p(an−1 , . . . , a1 ) = p(a1 , . . . , an−1 )p(an , . . . , a1 ) .
(2.4)
Proof. This is surely true for n = 1. Suppose n ≥ 2. Then by Eq. (2.1), p(a1 , . . . , an )p(an−1 , . . . , a1 ) = p(a1 , . . . , an−1 ) an p(an−1 , . . . , a1 ) + p(a1 , . . . , an−2 )p(an−1 , . . . , a1 ) which is equal by induction to p(a1 , . . . , an−1 ) an p(an−1 , . . . , a1 ) + p(a1 , . . . , an−1 )p(an−2 , . . . , a1 ) . This is equal, by Eq. (2.3), to p(a1 , . . . , an−1 )p(an , . . . , a1 ) as desired. Theorem 2.2 (Cohn 1969) Let X, Y, X ′ , Y ′ be nonzero polynomials such that XY ′ = Y X ′ . Then there exists polynomials U, V, a1 , . . . , an with n ≥ 1 such that X = U p(a1 , . . . , an ), Y ′ = p(an−1 , . . . , a1 )V Y = U p(a1 , . . . , an−1 ), X ′ = p(an , . . . , a1 )V . Moreover, one has deg(a1 ), . . . , deg(an−1 ) ≥ 1, and if deg(X) > deg(Y ), then deg(an ) ≥ 1.
104
Chapter VII. Noncommutative polynomials
Proof. (i) Suppose first that X is a right multiple of Y , that is X = Y Q. Then the theorem is obvious for U = Y , V = Y ′ , n = 1, a1 = Q; then indeed X = Y Q = U p(a1 ), Y ′ = 1 · V, Y = U · 1 and Y X ′ = XY ′ = Y QY ′ , whence X ′ = QY ′ = p(a1 )V . Furthermore, if deg(X) > deg(Y ), then deg(Q) ≥ 1. (ii) Next, we prove the theorem in the case where deg(X) > deg(Y ), by induction on deg(Y ). If deg(Y ) = 0, then X is a right multiple of Y and we may apply (i). Suppose deg(Y ) ≥ 1. By Corollary 1.4, X = Y Q + R for some polynomials Q and R such that deg(R) < deg(Y ). If R = 0, apply (i). Otherwise, we have Y X ′ = XY ′ = Y QY ′ + RY ′ , hence Y (X ′ − QY ′ ) = RY ′ ; note that Y, R, Y ′ 6= 0, hence X ′ − QY ′ 6= 0. Furthermore, deg(R) < deg(Y ), and we may apply the induction hypothesis: there exist polynomials U, V, a1 , . . . , an such that Y = U p(a1 , . . . , an ), X ′ − QY ′ = p(an−1 , . . . , a1 )V
R = U p(a1 , . . . , an−1 ), Y ′ = p(an , . . . , a1 )V deg(a1 ), . . . , deg(an ) ≥ 1 .
(2.5)
Hence X = Y Q + R = U p(a1 , . . . , an )Q + p(a1 , . . . , an−1 ) = U p(a1 , . . . , an , Q)
by Eq. (2.1). Similarly, X ′ = p(Q, an , . . . , a1 )V . Thus X, Y, X ′ , Y ′ admit the announced expression. Furthermore, deg(Q) ≥ 1; indeed, by Eq. (1.2), deg(X) = deg(Y Q) = deg(Y ) + deg(Q), and hence deg(Q) = deg(X) − deg(Y ) ≥ 1. This prove the theorem in the case where deg(X) > deg(Y ). (iii) In the general case, one has again X = Y Q + R with deg(R) < deg(Y ) (Corollary 1.4). If R = 0, the proof is completed by (i). Otherwise, as above, Y (X ′ − QY ′ ) = RY ′ with deg(Y ) > deg(R). Hence we may apply (ii): there exist U, V, a1 , . . . , an such that Eq. (2.5) holds. Then we obtain, as in (ii): X = U p(a1 , . . . , an , Q), Y ′ = p(an , . . . , a1 )V Y = U p(a1 , . . . , an ), X ′ = p(Q, an , . . . , a1 )V . This proves the theorem. Proposition 2.3 Let a1 , . . . , an be polynomials such that a1 , . . . , an−1 have positive degree, and let Y be a polynomial of degree 1 such that p(an−1 , . . . , a1 ) and p(an . . . , a1 ) are both congruent to a scalar modulo the right ideal Y KhAi. Then for i = 1, . . . , n p(ai , . . . , a1 ) ≡ p(a1 , . . . , ai )
mod Y KhAi .
We prove first a lemma. Lemma 2.4 Let a1 , . . . , an be polynomials such that a1 , . . . , an−1 have positive degree. Then the degrees of 1, p(a1 ), . . . , p(an−1 , . . . , a1 ) are strictly increasing.
105
2. Continuant Polynomials Proof. Obviously deg(1) < deg(a1 ). Suppose deg(p(ai−2 , . . . , a1 )) < deg(p(ai−1 , . . . , a1 )) for 2 ≤ i ≤ n − 1. From the relation p(ai , . . . , a1 ) = ai p(ai−1 , . . . , a1 ) + p(ai−2 , . . . , a1 ) ,
it follows that the degree of p(ai , . . . , a1 ) is equal to deg(ai p(ai−1 , . . . , a1 )), and deg(ai p(ai−1 . . . , a1 )) = deg(ai ) + deg(p(ai−1 , . . . , a1 )) > deg(p(ai−1 , . . . , a1 )) because deg(ai ) ≥ 1. This proves the lemma. Proof of Proposition 2.3 (Induction on n). When n = 1, the result is evident. Suppose n ≥ 2. Note that if the condition on the degrees is fulfilled for a1 , . . . , an , then a fortiori also a1 , . . . , an−2 have positive degree. By assumption, p(an , . . . , a1 ) is congruent to some scalar α and p(an−1 , . . . , a1 ) is congruent to some scalar β mod. Y KhAi. Suppose p(an−1 , . . . , a1 ) = 0. Then by Eq. (2.3), we have p(an−2 , . . . , a1 ) ≡ α = α − βγ for any γ, because β = 0 in this case. Suppose p(an−1 , . . . , a1 ) 6= 0. Then by Eq. (2.3), an p(an−1 , . . . , a1 ) + p(an−2 , . . . , a1 ) = Y Q + α for some polynomial Q. As deg(p(an−2 , . . . , a1 )) < deg(p(an−1 , . . . , a1 )) by Lemma 2.4, we obtain by Corollary 1.3 that an ≡ γ mod Y KhAi for some scalar γ. Using Eq. (2.3) again, and the fact that P ≡ γ, Q ≡ β =⇒ P Q ≡ γβ, we obtain p(an−2 , . . . , a1 ) ≡ α − γβ. In both cases, the induction hypothesis gives p(a1 , . . . , an−2 ) ≡ α − γβ and p(a1 , . . . , an−1 ) ≡ β. Hence, by Eq. (2.1), p(a1 , . . . , an ) ∈ (β + Y KhAi)(γ + Y KhAi) + α − βγ + Y KhAi, and consequently p(a1 , . . . , an ) ≡ βγ + α − γβ ≡ p(an . . . , a1 ), as desired. Lemma 2.5 Let a1 , . . . , an be polynomials. Then p(a1 , . . . , an ) = 0 ⇐⇒ p(an , . . . , a1 ) = 0 . Proof (Induction on n). The lemma is evidently true for n = 0, 1. Suppose n ≥ 2. It is enough to show that p(a1 , . . . , an ) = 0 implies p(an , . . . , a1 ) = 0. Now, by Eq. (2.4), p(a1 , . . . , an )p(an−1 , . . . , a1 ) = p(a1 , . . . , an−1 )p(an , . . . , a1 ) . Suppose p(a1 , . . . , an ) = 0. If p(a1 , . . . , an−1 ) 6= 0, then p(an , . . . , a1 ) = 0 because KhAi is an integral domain. If p(a1 , . . . , an−1 ) = 0, then p(an−1 , . . . , a1 ) = 0 by induction. Hence, by Eqs. (2.1) and (2.3) p(a1 , . . . , an ) = p(a1 , . . . , an−2 ) and p(an , . . . , a1 ) = p(an−2 , . . . , a1 ). By induction, p(a1 , . . . , an−2 ) and p(an−2 , . . . , a1 ) simultaneously vanish, which proves the lemma.
106
3
Chapter VII. Noncommutative polynomials
Inertia
Recall that KhAip×q denotes the set of p by q matrices over KhAi. In particular, KhAin×1 is the set of column vectors of order n over KhAi. This set has a natural structure of right KhAi-module. If V is in KhAin×1 , we denote by (V, 1) its constant term, that is, setting P1 .. V = . Pn
one has
(P1 , 1) (V, 1) = ... ∈ KhAin×1 . (Pn , 1)
Furthermore, if w is a word in A∗ , we denote by V w−1 the vector P1 w−1 V w−1 = ... . Pn w−1
We have the following relation X V = (V, 1) + (V a−1 )a .
(3.1)
a∈A
Definition A (right) submodule E of KhAin×1 is cancellative if, whenever V ∈ E and (V, 1) = 0, then V a−1 ∈ E for any letter a ∈ A. This property of vectors of polynomials is closely related to (but weaker than) the property of stability introduced in Section I.5. The next result characterizes cancellative submodules and will be the key to all the results of this section. Theorem 3.1 A submodule E of KhAin×1 is cancellative if and only if it may be generated, as a right KhAi-module, by p vectors V1 , . . . , Vp such that the matrix ((V1 , 1), . . . , (Vp , 1)) ∈ K n×p is of rank p. In this case, p ≤ n and V1 , . . . , Vp are linearly KhAi-independent. Proof. 1. We begin with the easy part: suppose that E is generated by V1 , . . . , Vp as indicated. Let V ∈ E with (V, 1) = 0. Then V =
X
1≤i≤p
Vi Pi
(Pi ∈ KhAi) .
Taking constant terms, we obtain X 0 = (V, 1) = (Vi , 1)(Pi , 1) .
107
3. Inertia
Because of the rank condition, we have (Pi , 1) = 0 for any i. Hence Pi = P (Pi a−1 )a, which shows that a∈A
V =
X
Vi (Pi a−1 )a .
i, a
By Eq. (3.1) we obtain X Vi (Pi a−1 ) . V a−1 = i
hence V a−1 ∈ E, as desired. 2. Let E be If V ∈ KhAin×1 , V may be P a cancellative submodule of KhAi.n×1 are almost all zero. Let written V = w∈A∗ (V, w)w where (V, w) ∈ KhAi deg(V ) be the maximal length of a word w such that (V, w) 6= 0. Claim. There are vectors V1 , . . . , Vp in E such that
(i) deg(V1 ) ≤ deg(V2 ) ≤ · · · ≤ deg(Vp ) . (ii) The vectors (Vi , 1) form a K-basis of the K-space (E, 1) = {(V, 1) | V ∈ E}. (iii) If V ∈ E and deg(V ) < deg(Vi ) then (V, 1) is a K-linear combination of (V1 , 1), . . . , (Vi−1 , 1). Suppose the claim is true. Then the matrix ((V1 , 1), . . . , (VP p , 1)) has rank p. We show by induction on deg(V ) that each V ∈ E is in E ′ = 1≤i≤p Vi KhAi. If deg(V ) = −∞, that is V = 0, it is obvious. Let deg(V ) ≥ 0 and let i be the smallest integer such that deg(V ) < deg(Vi ) (with i = p + 1 if such an integer does not exist). Then deg(V ) ≥ deg(V1 ), . . . , deg(Vi−1 ). Moreover, if i ≤ p then by (iii) (V, 1) is a linear combination of (V1 , 1), . . . , (Vi−1 , 1), and if i = p + 1 then byP(ii), (V, 1) is also a linear combination of (V1 , 1), . . . , (Vi−1 , 1). Let V ′ = V − 1≤j≤i−1 αj Vj (αj ∈ K) be such that (V ′ , 1) = 0. By the cancellative property of E, V ′ a−1 is in E for any letter a. Now, deg(V ′ ) ≤ max(deg(V ), deg(α1 V1 ), . . . , deg(αi−1 Vi−1 )) = deg(V ) hence deg(V ′ a−1 )P< deg(V ). Hence by induction, V ′ a−1 ∈ E ′ . PNow, by ′ −1 )a, and V ′ is in E ′ . Thus V = V ′ + αj Vj is Eq. (3.1), V ′ = a (V a j
in E ′ as well. 3. Proof of the claim. For d = −1, 0, 1, 2, . . ., let F (d) be the subspace of K n×1 defined by F (d) = {(V, 1) | V ∈ E, deg(V ) ≤ d} . Then 0 = F (−1) ⊂ F (0) ⊂ F (1) ⊂ · · · ⊂ F (d) ⊂ · · · Let 0 ≤ d1 < · · · < dq be such that for any i, F (di − 1) ( F (di ) and such that each F (d) is equal to some F (di ); in other words, one has 0 = F (−1) = · · · = F (d1 − 1) ( F (d1 ) = · · · = F (d2 − 1) ( F (d2 ) ( · · · ( F (dq ) = F (dq + 1) = · · ·
108
Chapter VII. Noncommutative polynomials
In particular, F (dq ) = (E, 1). Now, let B1 be a basis of F (d1 ), B2 be a basis of F (d2 ) mod F (d1 ), . . . , and let Bq be a basis of F (dq ) mod F (dq−1 ). By the definition of the F ’s we may find for each i in {1, . . . , q} vectors Wi,1 , . . . , Wi,ki in E of degree ≤ di such that {(Wi,1 , 1), . . . , (Wi,ki , 1)} = Bi ; in fact, the degree of each Wi,j is exactly di , otherwise (Wi,j , 1) ∈ F (di − 1) = F (di−1 ), which contradicts the fact that Bi is a basis mod F (di−1 ). Define V1 , . . . , Vp by (V1 , . . . , Vp ) = (W1,1 , . . . , W1,k1 , W2,1 , . . . , W2,k2 , . . . , Wq,kq ) . Then the condition (i) of the claim is clearly satisfied. Moreover, as F (dq ) = (E, 1), condition (ii) is also satisfied. Let V ∈ E with deg(V ) < deg(Vk ). Then Vk = Wi,j for some i, j, hence deg(V ) < di = deg(Wi,j ), which implies that (V, 1) ∈ F (di − 1) = F (di−1 ) and (V, 1) is a linear combination of W1,1 , . . . , Wi−1,ki−1 , hence of V1 , . . . , Vk−1 . This proves the claim. P 4. We show the last assertion of the theorem. Clearly, p ≤ n. Suppose Vi Pi = 0 where Pi ∈ KhAiPare not all zero; choose such a relation with sup(deg(Pi )) minimum. Then (Vi , 1)(Pi , 1) = 0 which shows as in (1) that (Pi , 1) = 0 for each i. Now Pj is 6= 0, hence Pj a−1 6= 0 for some letter a. P some −1 By Eq. (3.1) we obtain Vi (Pi a ) = 0, which is a new relation contradicting the above minimality. Thus the V ’s are KhAi-independent. Definition An n by n matrix M over KhAi is full if, whenever M = M1 M2 for some matrices M1 ∈ KhAin×p and M2 ∈ KhAip×n , then p ≥ n. Remark Taking in the above definition a field instead of KhAi, one obtains exactly the definition of an invertible matrix over this field. Corollary 3.2 (Cohn 1961) Let M be an n by n matrix over KhAi. If S1 , . . . , Sn in KhhAii are formal series, not all zero, such that (S1 , . . . , Sn )M = (0, . . . , 0), then M is not full. Proof. Let E be the set of vectors V ∈ KhAin×1 such that (S1 , . . . , Sn )V = 0. Then E is a right submodule of KhAin×1 . Let V = t (P P1 , . . . , Pn ) ∈ E be such that (V, 1) = 0. Then (Pi , 1) = 0Pfor any i. Moreover i Si Pi = 0, so that if a is a letter, by Eq. (3.1), one has i Si (Pi a−1 ) = 0. This means that V a−1 ∈ E; thus E is cancellative. By Theorem 3.1, the right KhAi-module E admits a basis consisting of p vectors V1 , . . . , Vp such that rank((V1 , 1), . . . , (Vp , 1)) = p and p ≤ n. Now suppose that p = n. Then the matrix N = ((V1 , 1), . . . , (Vn , 1)) ∈ K n×n is invertible. But N is the constant matrix of H = (V1 , . . . , Vn ) ∈ KhAin×n , that is N = (H, 1); this implies that H is invertible in KhhAiin×n . Now we have (S1 , . . . , Sn )H = 0 (because (S1 , . . . , Sn )Vi = 0 for all i), hence (S1 , . . . , Sn ) = 0 (multiply by H −1 ), a contradiction. So p < n. Let M = (C1 , . . . , Cn ), where Ck is the k-th column of M . Then, p P Vj Pj,k for some polynomials by hypothesis, Ck belongs to E, hence Ck = j=1
Pj,k . Thus
M = (V1 , . . . , Vp )(Pj,k )1≤j≤p, 1≤k≤n and M is not full.
109
3. Inertia
Corollary 3.3 (Cohn 1982) Let P1 , P2 , P3 , P4 be polynomials such that P2 is invertible as a formal series, that is (P2 , 1) 6= 0, and such that P1 P2−1 P3 = P4 holds in KhhAii. Then there exist polynomials Q1 , Q2 , Q3 , Q4 such that P1 = Q1 Q2 , P2 = Q3 Q2 , P3 = Q3 Q4 , P4 = Q1 Q4 . Proof. Consider the 2 by 2 matrix over KhAi: P1 P4 M= P2 P3 By assumption, we have (1, −P1 P2−1 )M = 0 . Hence M is not full by Corollary 3.2, and M may be written as Q1 (Q2 , Q4 ) M= Q3 for some polynomials Qi . This proves the corollary. The next result is the Inertia Theorem. It will not be used in Chapter VIII. Let S1 , . . . , Sn , T1 , . . . , Tn be formal series. We say that X Sj T j j
is trivially a polynomial if, for each j, either Sj = 0, or Tj = 0, or both Sj and Tj are polynomials. Note that one has T1 X Sj Tj = (S1 , . . . , Sn ) ... . j Tn
Corollary 3.4 (Inertia Theorem, Bergmann 1967, Cohn 1961) Let (Si,h )i∈I, 1≤h≤n and (Th,j )1≤h≤n, P j∈J be two families of formal series such that for each i ∈ I and j ∈ J, h Si,h Th,j is a polynomial. Then there exists an invertible matrix M over KhhAii such that for any i and j T1,j −1 . (Si,1 , . . . , Si,n )M M .. Tn,j
is trivially a polynomial.
Proof. 1. We prove the theorem first in the case where each Th,j is a polynomial. Let E = {V ∈ KhAin×1 | ∀i ∈ I, (Si,1 , . . . , Si,n )V ∈ KhAi}. Then E is a cancellative right submodule of KhAin×1 as may be easily verified (cf. the proof of Corollary 3.2). By Theorem 3.1 there exist p vectors V1 , . . . , Vp in E which form a basis of E (as a right KhAi-module) and such that the constant matrix of (V1 , . . . , Vp ) is of rank p ≤ n. By performing a permutation of coordinates, we may assume that X (V1 , . . . , Vp ) = , Y
110
Chapter VII. Noncommutative polynomials
where (X, 1) ∈ K p×p is invertible. Let X 0 , M= Y In−p where In−p is the identity matrix of order n − p. Then (M, 1) ∈ K n×n is invertible, hence M is invertible in KhhAiin×n . Note that the first p columns of M (that is the Vi ’s) are in E: this implies, by definition of E, that for any i ∈ I the first p components of (Si,1 , . . . , Si,n )M are polynomials. Moreover, let 1 ≤ h ≤ p: then M −1 Vh is equal to the hth column of M −1 M , that is to the hth canonical vector Eh ∈ K n×1 . PNow let j ∈ J. Then by assumption V = t (T1,j , . . . , Tn,j ) is in E. Hence V = 1≤h≤p Vh Ph for P some polynomials Ph . Thus M −1 V = h M −1 Vh Ph is equal, by the previous P remark, to h Eh Ph = t (P1 , . . . , Pp , 0, . . . , 0). This shows that the product T1,j −1 . (Si,1 , . . . , Si,n )M M .. Tn,j
is trivially a polynomial. 2. We come to the general case. Let
H = {h ∈ {1, . . . , n} | ∀j ∈ J, Th,j ∈ KhAi} . If H = {1, . . . , n}, then we are in case 1. Suppose |H| < n: we may suppose that H = {1, . . . , p} with 0 ≤ p < n (including the case H = ∅). Suppose that ∀i ∈ I, ∀h ∈ / H, Si,h = 0. Then n X
Si,h Th,j =
h=1
p X
Si,h Th,j
h=1
is a polynomial, so we are also in case 1 (with p instead of n). Otherwise, there is some i0 ∈ I such that for some h0 ∈ / H, Si0 ,h0 6= 0. Choose h0 ∈ / H such that ω(Si0 ,h0 ) ≤ ω(Si0 ,h ) for any h ∈ / H (for the definition of ω, see Section I.3). Choose polynomials R1 , . . . , Rp such that for 1 ≤ h ≤ p, ω(Si0 ,h + Rh ) ≥ ω(Si0 ,h0 ). Define Sh′ by Sh′ = Si0 ,h + Rh if 1 ≤ h ≤ p and Sh′ = Si0 ,h if p < h ≤ n. Then ω(Sh′ 0 ) ≤ ω(Sh′ ), Sh′ 0 = Si0 ,h0 6= 0 and X
Sh′ Th,j =
1≤h≤n
X
(Si0 ,h + Rh )Th,j +
h≤p
=
X
1≤h≤n
X
Si0 ,h Th,j
h>p
Si0 ,h Th,j +
X
Rh Th,j
h≤p
is a polynomial, by definition of H = {1, . . . , p}. Let w be a word of minimal length in the support of Sh′ 0 ; then w−1 Sh′ 0 is an invertible formal series, and for any h, since ω(Sh′ ) ≥ |w|, one has w−1 (Sh′ Th,j ) = (w−1 Sh′ )Th,j . Hence P −1 ′ Sh )Th,j is a polynomial. Define the matrix N ∈ KhhAiin×n which coinh (w cides with the n × n identity matrix except in the h0 th row, where it is equal to (w−1 S1′ , . . . , w−1 Sn′ ); in particular the entry of the coordinate (h0 , h0 ) of N is the invertible series w−1 Sh′ 0 , so N is invertible in KhhAiin×n . Let M = N −1 . Then
4. Gauss’s Lemma
111
for any j, M −1 t (T1,j , . . . , Tn,j ) = N t (T1,j , . . . , Tn,j ) is to t (T1,j , . . . , Tn,j ) Pequal −1 ′ except in the h0 th component, where it is equal to (w Sh )Th,j : hence the first p and the h0 th components of M −1 t (T1,j , . . . , Tn,j ) are polynomials and we may conclude the proof by induction on n − p because we have increased |H|.
4
Gauss’s Lemma
We consider in this section polynomials with integer or rational coefficients. Everything would work, however, with any factorial ring instead of Z. Definition A polynomial P ∈ QhAi is primitive if P 6= 0, P ∈ ZhAi and if its coefficients have no nontrivial common divisors in Z. Definition The content of a nonzero polynomial P ∈ QhAi is the unique positive rational number c(P ) such that P/c(P ) is primitive. Notation P/c(P ) will be denoted by P . Example 4.1 c(4/3+6a−2ab) = 2/3 because 3/2(4/3+6a−2ab) = 2+9a−3ab is primitive. Note that for P 6= 0 P primitive ⇐⇒ c(P ) = 1 P ∈ ZhAi ⇐⇒ c(P ) ∈ N .
(4.1) (4.2)
Theorem 4.1 (Gauss’s Lemma) (i) If P, Q are primitive, then so is P Q. (ii) If P, Q are nonzero polynomials, then c(P Q) = c(P )c(Q) and P Q = P Q. Proof (i) Suppose P Q is not primitive. Then there is some prime number n which divides each coefficient of P Q. This means that the canonical image φ(P Q) of P Q in (Z/nZ)hAi vanishes. But Z/nZ is a field, so (Z/nZ)hAi is an integral domain (Section I.1); moreover 0 = φ(P Q) = φ(P )φ(Q), so φ(P ) = 0 or φ(Q) = 0. This means that n divides all coefficients of P or of Q, and contradicts the fact that P and Q are primitive. (ii) By (i), P Q/c(P )c(Q) = (P/c(P ))(Q/c(Q)) is primitive. So, by definition of the content of P Q, c(P Q) = c(P )c(Q). Now, P Q = P Q/c(P Q) so that P Q = P Q/c(P )c(Q) = P Q. Corollary 4.2 Let a1 , . . . , an be polynomials. Then the continuant polynomials p(a1 , . . . , an ) and p(an , . . . , a1 ) are both zero or have the same content. Proof (Induction on n). The result is obvious for n = 0, 1. Let n ≥ 2. By Lemma 2.5, we may suppose that both polynomials are 6= 0. Now we have, by Proposition 2.1 p(a1 , . . . , an )p(an−1 , . . . , a1 ) = p(a1 , . . . , an−1 )p(an , . . . , a1 ) .
112
Chapter VII. Noncommutative polynomials
By induction, either p(a1 , . . . , an−1 ) = p(an−1 , . . . , a1 ) = 0, in which case p(a1 , . . . , an ) = p(a1 , . . . , an−2 ) by Eq. (2.1) and p(an , . . . , a1 ) = p(an−2 , . . . , a1 ) and we conclude by induction; or c(p(an−1 , . . . , a1 )) = c(p(a1 , . . . , an−1 )), which implies by Eq. (2.4) and Theorem 4.1 that c(p(a1 , . . . , an )) = c(p(an , . . . , a1 )). Corollary 4.3 Let P1 , P2 , P3 , P4 be nonzero polynomials in ZhAi such that P2 is invertible in QhhAii and such that P1 P2−1 P3 = P4 . Then there exist polynomials R1 , R2 , R3 , R4 ∈ ZhAi such that P1 = R1 R2 , P2 = R3 R2 , P3 = R3 R4 , P4 = R1 R4 . Proof. By Corollary 3.3 we have P1 = Q1 Q2 , P2 = Q3 Q2 , P3 = Q3 Q4 , P4 = Q1 Q4 for some polynomials Q1 , Q2 , Q3 , Q4 ∈ QhAi. Let ci = c(Qi ), i = 1, 2, 3, 4. By Theorem 4.1 we have c(P1 ) = c1 c2 , c(P2 ) = c3 c2 , c(P3 ) = c3 c4 , c(P4 ) = c1 c4 . Thus c(P4 ) = c(P1 )c(P3 )/c(P2 ). As by hypothesis and Eq. (4.2) c(Pi ) ∈ N, there exist positive integers d1 , d2 , d3 , d4 such that c(P1 ) = d1 d2 , c(P2 ) = d3 d2 , c(P3 ) = d3 d4 , c(P4 ) = d1 d4 . Moreover, by Theorem 4.1, P 1 = Q1 Q2 , P 2 = Q3 Q2 , P 3 = Q3 Q4 , P 4 = Q1 Q4 . Put Ri = di Qi , i = 1, 2, 3, 4. Then Ri ∈ ZhAi. Moreover P1 = c(P1 )P 1 = d1 d2 Q1 Q2 = R1 R2 . Similarly P2 = R3 R2 , P3 = R3 R4 and P4 = R1 R4 . Proposition 4.4 Let Y be a primitive polynomial of degree 1 which vanishes for some integer values of the variables. Let P, Q ∈ ZhAi and let α ∈ Z, α 6= 0 be such that P Q ≡ α mod Y ZhAi. Then P ≡ β, Q ≡ γ mod Y ZhAi for some β, γ ∈ Z such that α = βγ. Proof. We have P Q = Y Q2 + α for some polynomial Q2 . As α 6= 0, we have Q 6= 0 and we may apply Corollary 1.3. This shows that P = β + Y T for some β ∈ Q and T ∈ QhAi. Hence Y Q2 + α = βQ + Y T Q. Since α 6= 0 and deg(Y ) > 0, we obtain β 6= 0: indeed, otherwise P = Y T and Y T Q = Y Q2 + α, implying that Y divides α. This shows that Q = γ + Y S for some γ ∈ Q such that α = βγ. Now the assumption on Y and the fact that P, Q have integer coefficients imply that β, γ ∈ Z. Since Y T = P − β ∈ ZhAi, we obtain that c(Y )c(T ) ∈ N by Eq. (4.2) and Theorem 4.1 (ii). But Y is primitive, so c(Y ) = 1, which shows that c(T ) ∈ N and T ∈ ZhAi by (4.2). Similarly, S ∈ ZhAi.
113
4. Gauss’s Lemma
Exercises for Chapter VII n P
Pi Qi = 0 is P called trivial if for each i, either Pi = 0 or Qi = 0. Note that Pi Qi may be written Q1 .. (P1 , . . . , Pn ) . .
1.1 Let P1 , . . . , Pn , Q1 , . . . , Qn be polynomials. A relation
i=1
Qn
Show that if
n P
Pi Qi = 0, then there exists an invertible n by n matrix
i=1
M with coefficients in KhAi such that the relation Q1 −1 . (P1 , . . . , Pn )M M .. = 0 Qn
is trivial (cf. Cohn 1961). 1.2 a) Let X, Y X ′ , Y ′ be nonzero formal series such that XY ′ = Y X ′ , with ω(X) ≥ ω(Y ) (cf Chapter I). Show that there exists a formal series U such that X = Y U , X ′ = U Y ′ . b) Let S be a formal series and let C be its centralizer, that is C = {T ∈ KhhAii | ST = T S}. Show that if T1 , T2 ∈ C and ω(T2 ) ≥ ω(T1 ), then there exists T ∈ C such that T2 = T1 T . (Hint : one may suppose ω(S) ≥ 1; let n be such that ω(S n ) ≥ ω(T1 ), ω(T2 ): use a) three times.) Let T ∈ C such that ω(T ) ≥ 1 is minimum. Show that C = K[[T ]], that is o nX an T n | an ∈ K C= n∈N
( (see Cohn 1961). 2.1 Show that for n ≥ k ≥ 1 the continuant polynomials satisfy the identities p(a1 , . . . , an )p(an−1 , . . . , ak ) − p(a1 , . . . , an−1 )p(an , . . . , ak ) = (−1)n+k p(a1 , . . . , ak−2 )
with the conventions: p(a1 , . . . , ak−2 ) = 0 if k = 1, = 1 if k = 2, and p(an−1 , . . . , ak ) = 1 if k = n. Show that the number of words in the support of p(a1 , . . . , an ) is the nth Fibonacci number Fn (F0 = F1 = 1, Fn+2 = Fn+1 + Fn ). 2.2 Show that if a1 , . . . , an are commutative polynomials, then 1
a1 +
=
1
a2 + a3 +
1 ···+
1 an
p(a1 , . . . , an ) . p(a2 , . . . , an )
114
Chapter VII. Noncommutative polynomials
2.3 Show that the entries of the matrix a1 1 a2 1 a 1 ··· n 1 0 1 0 1 0 may be expressed by means of continuant polynomials. 3.1 Let M be an n by n polynomial matrix such that M = M1 M2 with M1 ∈ KhhAiin×p and M2 ∈ KhhAiip×n . Show that then one may choose M1 , M2 to be polynomial matrices (use the inertia theorem; see Cohn 1985).
Notes to Chapter VII Most of the results of this chapter are due to P. M. Cohn. We have already seen a result concerning noncommutative polynomials in Chapter II (Corollary II.3.3): in P. M. Cohn’s terminology, it means that KhAi is a fir (“free ideal ring”). The terminology “continuant” stems from its relation to continuous fractions (see Exercises 2.2 and 2.3). Corollary 3.2 is a special case of a more general result, stating that every polynomial matrix which is singular over the free field is not full (see Cohn 1961).
Chapter VIII
Codes and Formal Series The aim of this chapter is to present an application of formal series to the theory of (variable-length) codes. The main result (Theorem 4.1) states that every finite complete code admits a factorization into three polynomials which reflect its combinatorial structure. The first section contains some basic facts on codes and prefix codes. These are easily expressed by means of formal power series. Section 2 is devoted to complete codes and their relations to Bernoulli morphisms (Theorem 2.4). Concerning the degree of a code, we give in Section 3 only the very basic results needed in Section 4. This last section is devoted to the proof of the main result. It uses the material of the previous section and from Chapter VII.
1
Codes
Definition A code is a subset C of A∗ such that whenever u1 , . . . , un , v1 , . . . , vp in C satisfy u1 · · · un = v1 · · · vp ,
(1.1)
then n = p and ui = vi for i = 1, . . . , n. In this case, any word in C ∗ (= the submonoid generated by C) is called a message. Note that if C is a code, then C ⊂ X + (= X ∗ \ 1). Example 1.1 The set {a, ab, ba} is not a code, because the word aba has two factorizations in it: aba = a(ba) = (ab)a . Example 1.2 The set {a, ab, bb} is a code; indeed, no word in it is a prefix of another, so in each relation of the form (1.1), either u1 is a prefix of v1 or vice versa, so one has u1 = v1 and one concludes by induction on n. Example 1.3 The set {b, ab, a2 b, a3 b, . . . , an b, . . .} = a∗ b is a code, for the same reason as in Example 1.2. 115
116
Chapter VIII. Codes and Formal Series
Example 1.4 The set {a3 , a2 ba, a2 b2 , ab, ba2 , baba, bab2, b2 a, b3 } is a code, for the same reason; note that in this case, moreover no word is a suffix of another. Example 1.5 The set C = {a2 , ab, a2 b, ab2 , b2 } is a code. Indeed, let C denote its characteristic polynomial; then we have 1 − C = 1 − a2 − ab − a2 b − ab2 − b2
= (1 − b − a2 − ab) + (b − b2 − a2 b − ab2 )
= (1 − b − a2 − ab)(1 + b)
= ((1 − a − b) + (a − a2 − ab))(1 + b) = (1 + a)(1 − a − b)(1 + b) . Thus, in ZhhAii, we have (1 − C)−1 = (1 + b)−1 (1 − a − b)−1 (1 + a)−1 . By the results of Section I.4, for any proper formal series S, (1 − S)−1 = P n ∗ −1 = A∗ = A∗ is the sum of all words on A n≥0 S = S and (1 − a − b) (and hence, its nonzero coefficients are all equal to 1). Hence X C n (1 + a) . A∗ = (1 + b) n≥0
P This shows that the series n≥0 C n has no coefficient ≥ 2, since otherwise A∗ would have such a coefficient. From X X X u1 · · · un Cn = n≥0
n≥0 u1 ,...,un ∈C
we obtain that no word has two distinct factorizations of the form u1 · · · un (ui ∈ C), so C is a code. Recall that for any language X, X denotes its characteristic series (considered as an element of QhhAii in the present chapter). One of the arguments of the last example may be generalized as follows. Proposition 1.1 Let C be a subset of A+ and let C be its characteristic series. Then C is a code if and only if one has in ZhhAii (1 − C)−1 = C ∗ = C ∗ .
(1.2)
Proof. The first equality is always true, as shown in Section I.4. We have X X X u1 · · · un = Cn = C∗ . n≥0 u1 ,...,un ∈C
n≥0
If C is a code, then the words u1 · · · un
(n ≥ 0, ui ∈ C)
are all distinct, so the left-hand side is equal to C ∗ . If C is not a code, then two of these words are equal, so the left-hand side is a series with at least one
117
1. Codes
coefficient ≥ 2: it cannot be equal to C ∗ , because the latter has only 0, 1 as coefficients. The previous result provides an effective algorithm for testing whether a given rational subset of C of A+ is a code. Indeed, one has merely to check if the rational power series C ∗ − C ∗ is equal to 0; for this, apply Corollary II.3.4. However, there is a more direct algorithm. We give below, without proof, the algorithm of Sardinas and Patterson (see Lallement 1979, Berstel and Perrin 1985). Recall that for any language X and any word w, we denote by w−1 X the language w−1 X = {u ∈ A∗ | wu ∈ X} . More generally, if Y is a language, we denote by Y −1 X the language [ Y −1 X = w−1 X . w∈Y
Now let C be a subset of A+ . Define a sequence of languages Cn by C0 = C −1 C \ 1
Cn+1 = Cn−1 C ∪ C −1 Cn
(n ≥ 0) .
Then C is a code if and only if no Cn contains the empty word. If C is finite, the sequence (Cn ) is periodic (because each word in Cn is a factor of some word in C). The same is true if C is rational (see Berstel and Perrin 1985, Prop. I.3.3). Hence in these cases, we obtain an effective algorithm. Another way to express the fact that a set of words is a code is by means of the so-called unambiguous operations. Let X, Y be languages. We say that their union is unambiguous if they are disjoint languages. We say that their product is unambiguous if x, x′ ∈ X, y, y ′ ∈ Y , and xy = x′ y ′ implies x = x′ , y = y ′ . We say that the star X ∗ is unambiguous if X is a code. Proposition 1.2 Let X, Y be languages. (i) The union of X and Y is unambiguous if and only if X ∪ Y = X + Y . (ii) The product XY is unambiguous if and only if XY = X Y . (ii) If 1 ∈ / X, then the star X ∗ is unambiguous if and only if X ∗ = X ∗ . Proof. The first two assertions are a direct consequence of their definition. The last one is merely a reformulation of Proposition 1.1. We have already met a family of codes in Section II.3: the prefix codes. A code is a prefix code if no word in it is a prefix of another word in it (this condition is sufficient to ensure codicity). Symmetrically, one defines suffix codes. A code is called bifix if it is both prefix and suffix. Proposition 1.3 Let C be a code such that for any word v in C ∗ , one has v −1 C ∗ ⊂ C ∗ . Then C is a prefix code. v
Note the converse: for any set C and for any word v in C ∗ , one has C ∗ ⊂ C ∗.
−1
118
Chapter VIII. Codes and Formal Series
Proof. Suppose u = vw, with u, v in C and w ∈ A∗ . We have to show that w = 1. Now w = v −1 u ∈ v −1 C ∗ ⊂ C ∗ , hence w ∈ C ∗ . Therefore w = c1 · · · cn (ci ∈ C) and u = vc1 · · · cn ∈ C. The only possibility for C to be a code is n = 0, that is w = 1, and C is a prefix code. Proposition 1.4 Let C be a prefix code such that CA∗ ∩ wA∗ is nonempty for any word w. Let P be the set of proper prefixes of the words in C. Then one has in ZhhAii C − 1 = P (A − 1) . Proof. Let P ′ = A∗ \ CA∗ . Then, by Proposition II.3.1, we have A∗ = C ∗ P ′ . But, because C is a prefix code, the conditions u1 · · · un q = v1 · · · vp r, ui , vj ∈ C, q, r ∈ P ′ imply n = p, ui = vi for i = 1, . . . , n, hence also q = r. This shows that the product C ∗ P ′ is unambiguous, hence by Proposition 1.2, we have A∗ = C ∗ P ′ . Now, by Proposition 1.1, A∗ = (1 − A)−1 and C ∗ = (1 − C)−1 . Moreover, the empty word is in P ′ , so P ′ is invertible in ZhhAii. Hence 1−A = P ′−1 (1−C), which implies C − 1 = P ′ (A − 1). It remains to show that P = P ′ . Let w be in P ; then w is a proper prefix of some word in C and so has no prefix in C, C being a prefix code; hence w∈ / CA∗ =⇒ w ∈ P ′ . Let w be in P ′ . By assumption, there are words c ∈ C, u, v ∈ A∗ such that cu = wv; as w ∈ / CA∗ , w must be a proper prefix of c, so w ∈ P . Let C be a code. Define, for any word u, the series Su inductively by S1 = 1 Su = a−1 Sv + (Sv , 1)a−1 C ,
for u = va (a ∈ A)
Note that, obviously, Su has nonnegative coefficients. The reader may verify that the support of Su consists of proper suffixes of C (cf. Exercise 1.3). Lemma 1.5 Let C be a code. Then for any word u, u−1 (C ∗ ) = Su C ∗ . In particular, Su is a characteristic series. If C is finite, then Su is a polynomial. Proof. We shall use the formulas of Lemma I.6.2. We prove u−1 (C ∗ ) = Su C ∗ by induction on |u|. If u = 1, it is clearly true. Let u = va, (a ∈ A). Then by induction v −1 (C ∗ ) = Sv C ∗ . Thus, by Lemma I.6.2, u−1 (C ∗ ) = a−1 v −1 (C ∗ ) = (a−1 Sv )C ∗ + (Sv , 1)(a−1 C ∗ ) = (a−1 Sv )C ∗ + (Sv , 1)(a−1 C)C ∗ = Su C ∗ . Now, since u−1 (C ∗ ) is obviously a characteristic series, the same holds for Su . It is easily verified by induction that Su is a polynomial if C is finite. One defines symmetrically the series Pu ∈ ZhhAii by P1 = 1 Pav = Pv a−1 + (Pv , 1)Ca−1 ,
for a ∈ A and v ∈ A∗
119
2. Completeness
Now we define, for a couple (u, v) of words another series in the following way: Fu,1 = 0 Fu,av = (Pv , 1)Su a−1 + Fu,v a−1 . As above, the series Fu,v clearly has nonnegative coefficients. Proposition 1.6 Let C be a code. Then for any words u and v, u−1 (C ∗ )v −1 = Su C ∗ Pv + Fu,v . In particular, Fu,v is a characteristic series. If C is finite, then Fu,v is a polynomial. Proof (Induction on |v|). The result is obvious if v = 1 by Lemma 1.5. Let a ∈ A. Then u−1 (C ∗ )(av)−1 = [u−1 (C ∗ )v −1 ]a−1 is equal, by induction and Lemma I.6.2, to (Su C ∗ Pv )a−1 + Fu,v a−1 = Su C ∗ (Pv a−1 ) + (Pv , 1)Su (C ∗ a−1 ) + (Pv , 1)Su a−1 + Fu,v a−1 = Su C ∗ (Pv a−1 ) + (Pv , 1)Su C ∗ (Ca−1 ) + Fu,av = Su C ∗ Pav + Fu,av . This proves the formula. Now, since Su C ∗ Pv has nonnegative coefficients and since u−1 (C ∗ )v −1 is a characteristic series, the same holds for Fu,v . If C is finite, it is easily seen by induction on the definition that Fu,v is a polynomial.
2
Completeness
Definition A language C ⊂ A∗ is complete if, for any word w, the set C ∗ ∩ A∗ wA∗ is nonempty. Lemma 2.1 If C is complete, then any word w is either a factor of a word in C or may be written as w = smp , with m ∈ C ∗ and where s (p) is a suffix (prefix) of a word of C. Proof. We have xwy ∈ C ∗ for some words x, y. Let us represent a word in C ∗ schematically by
Then we have two cases: 1)
w
120
Chapter VIII. Codes and Formal Series
2)
w In the first case, w is a factor of a word in C. In the second case, w = smp as in the lemma. Definition A Bernoulli morphism is a mapping π : A∗ → R such that (i) (ii) (iii) (iv)
π(w) > 0 for any word w, π(1) = 1, π(uv) P = π(u)π(v) for any words u, v, π(a) = 1. a∈A
It is called uniform if π(a) = 1/|A| for any letter a. We define for any language X the measure of X by X π(X) = π(w) w∈X
(it may be infinite). We shall frequently use the following inequalities: X π(∪Xi ) ≤ π(Xi ) π(XY ) ≤ π(X)π(Y ) .
Note that, for any n, one has π(An ) = 1. Lemma 2.2 Let C be a code. Then π(C) ≤ 1. Proof. Since C is the limit of its finite subsets, it is enough to show the lemma in the case where C is finite. Let p be the maximal length of words in C. Then C n ⊂ A ∪ A2 ∪ · · · ∪ Apn . Thus π(C n ) ≤ pn. Now, as C is a code, each word in C n has only one factorization of the form u1 · · · un (ui ∈ C). As π is multiplicative, we obtain π(C n ) = π(C)n . Hence π(C)n ≤ pn . This shows that π(C) ≤ 1. Lemma 2.3 Let C be a finite complete language. Then π(C) ≥ 1. Proof. By Lemma 2.1, we may write A∗ = SC ∗ P ∪ F , where S, P, F are finite languages. Thus ∞ = π(A∗ ) ≤ π(S)π(C ∗ )π(P ) + π(F ) .
121
2. Completeness This shows that π(C ∗ ) = ∞. Now [ C∗ = Cn n≥0
P so that π(C ∗ )P ≤ n≥0 π(C n ). Moreover, π(C n ) ≤ π(C)n , π being multiplicative. So ∞ ≤ n≥0 π(C)n , which shows that π(C) ≥ 1.
Theorem 2.4 (Sch¨ utzenberger and Marcus 1959, Bo¨e et al. 1980) Let C be a finite subset of A∗ and let π be a Bernoulli morphism. Then any two of the following assertions imply the third one: (i) C is a code, (ii) C is complete, (iii) π(C) = 1 . Note that this gives an algorithm for testing whether a given finite code is complete (see Exercise 2.3). We need another lemma. Lemma 2.5 Let X be a language and let w be a word such that X ∩ A∗ wA∗ is empty. Then π(X) < ∞. Proof. Let ℓ = |w| and for i = 0, . . . , ℓ − 1 Xi = {v ∈ X | |v| ≡ i mod ℓ} . Then Xi ⊂ Ai (Aℓ \ w)∗ . Indeed v ∈ Xi implies v = uv1 · · · vn with |u| = i and for any j, |vj | = ℓ; by assumption, w is not factor of v, hence w is none of the vj ’s: thus vj ∈ Aℓ \ w, which proves the claim. Now π(Aℓ \ w) = π(Aℓ ) − π(w) = 1 − π(w) < 1 and π[(Aℓ \ w)∗ ] = π ≤
X
ℓ
[
X (Aℓ \ w)n ≤ π[(Aℓ \ w)n ]
n≥0
n
n≥0
[π(A \ w)] < ∞ .
n≥0
Thus π(Xi ) = π[Ai (Aℓ \ w)∗ ] ≤ π(Ai )π[(Aℓ \ w)∗ ] < ∞ and since X = ∪0≤i≤ℓ−1 Xi , we obtain π(X) < ∞. Proof of Theorem 2.4. Lemma 2.2 and 2.3 show that (i) and (ii) imply (iii). Let C be a code with π(C) = 1. Suppose C is not complete. Then for some word w, C ∗ ∩ A∗ wA∗ is empty. P Hence, by Lemma 2.5, π(C ∗ ) < ∞. As C is a ∗ code, π(C ) is equal to the sum n≥0 π(C)n . The latter being finite, we deduce that π(C) < 1, a contradiction. Let C be complete and π(C) = 1. Then C n is complete for any n; indeed, for any word w, there are words u, v, c1 , . . . , cp (ci ∈ C) such that uwv = c1 · · · cp (C being complete). Let r be such that p + r is a multiple of n; then uwvcr1 = c1 · · · cp cr1 ∈ (C n )∗ , which shows that (C n )∗ ∩ A∗ wA∗ is not empty. Hence
122
Chapter VIII. Codes and Formal Series
C n is complete. Thus, by Lemma 2.3, π(C n ) ≥ 1 for any n. But as usually π(C n ) ≤ π(C)n = 1, thus π(C n ) = π(C)n for any n. Suppose C is not a code. Then for some words u1 , . . . , un , v1 . . . , vp in C we have u1 · · · un = v1 · · · vp and u1 6= v1 . Hence u1 · · · un v1 · · · vp = v1 · · · vp u1 · · · un , and we have obtained a word in C n+p which has two distinct factorizations. Hence π(C n+p ) = π {w1 · · · wn+p | wi ∈ C} X < π(w1 · · · wn+p ) = π(C n+p ) w1 ,...,wn+p ∈C
which is a contradiction. Let π be a Bernoulli morphism. Since π is multiplicative, it may be extended to an algebra morphism, still denoted by π, π : ZhAi → R by the formula X X π (P, w)w = (P, w)π(w) . w
w
Note that, because the measure of A is 1, one has π(A − 1) = 0 . Theorem 2.6 (Sch¨ utzenberger 1965) Let C be a finite code such that for any word w, the set C ∗ ∩ wA∗ is nonempty. Then C is a prefix code. Proof. Let C ′ be the set of words in C having no proper prefix in C, that is C ′ = C \ CA+ . Clearly C ′ is a prefix code. Moreover, if w is a word, then for some words c1 , . . . , cn ∈ C, u ∈ A∗ , one has by assumption c1 · · · cn = wu . Then either c1 ∈ C ′ , or c1 has a prefix in C ′ . Thus C ′ A∗ ∩ wA∗ is nonempty. Let P be the set of proper prefixes of the words in C ′ . Then by Proposition 1.4, C ′ − 1 = P (A − 1). Apply the morphism π : ZhAi → R, obtaining π(C ′ − 1) = 0 because π(A − 1) = 0. Thus π(C ′ ) = 1. As C is a code, we have by Lemma 2.2, π(C) ≤ 1. But C ′ ⊂ C and π is positive. Hence C = C ′ is prefix. Theorem 2.7 (Reutenauer 1985) Let P in NhAi be without constant term such that P − 1 = X(A − 1)Y for some polynomials X, Y in RhhAii. Then P = C for some finite complete code C. Furthermore, if Y ∈ R (X ∈ R), then C is a prefix (suffix) code. Proof. 1. Note that if S, T are formal series, then supp(ST ) ⊂ supp(S) supp(T ) .
3. The Degree of a Code
123
Moreover, if S is proper, then supp(S ∗ ) ⊂ supp(S)∗ . 2. We have 1 − P = X(1 − A)Y . By assumption, 1 − P is invertible in RhhAii. The same holds for 1 − A since its inverse is A∗ = A∗ . This shows that X and Y are also invertible. So we obtain (1 − P )−1 = Y −1 (1 − A)−1 X −1 which implies (1 − A)−1 = Y (1 − P )−1 X . Thus A∗ = Y P ∗ X .
(2.1)
By 1, this implies that each word w may be written as w = ymx, with y ∈ supp(Y ), m ∈ supp(P )∗ and x ∈ supp(X). Let C = supp(P ) and let u be a word such that |u| > deg(X), deg(Y ). Let v be any word. Then w = uvu may be written uvu = ymx as above, which shows, by the choice of u, that m = v1 vv2 . Hence C ∗ ∩ A∗ vA∗ is nonempty: we have shown that C is complete. Thus, by Lemma 2.3, π(C) ≥ 1 (where π is some Bernoulli morphism). Now, as P − 1 = X(A − 1)Y , we obtain π(P ) = 1. Hence 1 ≤ π(C) ≤ π(P ) = 1 because P has nonnegative integer coefficients. This shows, π being positive, that P = C and that π(C) = 1. Hence, by Theorem 2.4, C is a code, and thus a finite complete code. Suppose now that Y ∈ R. Then, as above, Eq. (2.1) shows that for any word v, one has vu = mx for some words m ∈ C ∗ , x ∈ supp(X) (u being chosen as before). Then, as |u| > |x|, we obtain m = vv1 which shows that C ∗ ∩ vA∗ is nonempty. We conclude by Theorem 2.6.
3
The Degree of a Code
Given a monoid M , recall that an ideal in M is a nonempty subset J which is closed for left and right multiplication by elements of M . Moreover, an idempotent is an element e which is equal to its square, that is e2 = e. Theorem 3.1 (Suschkewitsch 1928) Let M be a finite monoid. There exists in M an ideal J which is contained in any ideal of M . Let e be an idempotent in J. Then eM e is a finite group whose neutral element is e. This ideal will be called the minimal ideal of M Proof. 1. Let J be the intersection of all ideals in M . Clearly J is closed for multiplication by elements of M . We have only to verify that it is not empty. But let m be the product of all elements of M , in some order. Then m is in each ideal of M , and hence in J.
124
Chapter VIII. Codes and Formal Series
2. We use the following classical fact: if a ∈ M , then some positive power of a is an idempotent. Indeed, chose i, j ≥ 1 such that j ≥ i and that ai = ai+j (this is possible because the set {a, a2 , . . . , an , . . .} is finite). Let k = j − i. Then ai+k is idempotent because ai+k ai+k = ak ai+i+k = ak ai+j = ak ai = ak+i . 3. Clearly, eeme = eme = emee and emeem′ e = e(mem′ )e, hence eM e is a (finite) monoid whose neutral element is M . 4. Let a = eme be in eM e. We show the existence of b ∈ eM e such that ab = e. We have a = et for some t ∈ M . Now M aM is an ideal of M contained in J (because M aM = M etM , e ∈ J and J is an ideal), hence M aM = J (J being minimal). Thus e = uav for some elements u, v of M . Next, e = uetv = uuetvtv = un e(tv)n for any n ≥ 1. Choose n such that (tv)n is idempotent. Then e = un e(tv)n = un e(tv)n (tv)n = e(tv)n = etv(tv)n−1 = aw (recall that et = a). But a = eme implies ae = eme2 = eme = a, whence e = aw = aew and e = e2 = aewe. Let b = ewe ∈ eM e. Then e = ab. 5. Symmetrically, we have e = ca for some c in eM e. Then, classically c = ce = cab = eb = b. This shows that each element of eM e has an inverse in eM e, that is, eM e is a group. Theorem 3.2 Let C be a finite complete code. There exist a finite monoid M and a surjective morphism φ : A∗ → M such that C ∗ = φ−1 φ(C ∗ ). Let J be the minimal ideal of M . There exists an idempotent e in J ∩ φ(C ∗ ); further φ(C ∗ ) ∩ eM e is a subgroup of the group eM e. It will not be shown here that the index of φ(C ∗ ) ∩ eM e in eM e depends only on C; for this, we refer the reader to the book by Berstel and Perrin (1985). This being admitted, we introduce the following definition. Definition With the notation of Theorem 3.2, the index of eM e ∩ φ(C ∗ ) in eM e is called the degree of C. Proof of Theorem 3.2. Clearly, C ∗ is a rational subset of A∗ (cf. Section III.1). Hence, by Kleene’s theorem (Theorem III.1.1), it is recognizable. This shows that there exist a finite monoid M , a monoid morphism φ : A∗ → M , and a subset N of M such that C ∗ = φ−1 (N ). Clearly, we may assume that φ is surjective; then N = φ(C ∗ ) and C ∗ = φ−1 φ(C ∗ ). Let J be the minimal ideal of M and w a word in φ−1 (J). Then C ∗ ∩A∗ wA∗ is nonempty (because C is complete), hence there exist words u, v such that uwv is in C ∗ . Now m = φ(uwv) is in φ(C ∗ ) and also in J (because m = φ(u)φ(w)φ(v), φ(w) ∈ J, and J is an ideal). Some power e = mn with n ≥ 1 of m is idempotent and still lies in φ(C ∗ ) ∩ J. Now, φ(C ∗ ) is clearly a submonoid of M . Hence, the product of any two elements of eM e ∩ φ(C ∗ ) lies in eM e ∩ φ(C ∗ ). Take a ∈ eM e ∩ φ(C ∗ ). Then for some n ≥ 2, an = e (eM e being a finite group). Then an−1 is the inverse of a in eM e, and belongs to eM e ∩ φ(C ∗ ). Thus, the latter is a subgroup of eM e.
4
Factorization
Theorem 4.1 (Reutenauer 1985) Let C be a finite complete code. Then there exist polynomials X, Y, Z in ZhAi such that C − 1 = X(d(A − 1) + (A − 1)Z(A − 1))Y
(4.1)
4. Factorization
125
and (i) d is the degree of C, (ii) C is prefix (suffix) if and only if Y = 1 (X = 1). Example 4.1 We have a2 + a2 b + ab + ab2 + b2 − 1 = (1 + a)(a + b − 1)(1 + b) . The corresponding code is neither prefix nor suffix, but synchronizing (that is of degree 1). Example 4.2 Let C be the square of the code of Example 4.1. Then C is of degree 2 and C − 1 = (1 + a)(2(a + b − 1) + (a + b − 1)(1 + b)(1 + a)(a + b − 1))(1 + b) . Example 4.3 We have a3 + a2 ba + a2 b2 + ab + ba2 + baba + bab2 + b2 a + b3 − 1
= 3(a + b − 1) + (a + b − 1)(2 + a + b + ab)(a + b − 1) . The corresponding code is a bifix code and has degree 3.
The following corollary (which also uses Theorem 2.7) characterizes completely finite complete codes. Corollary 4.2 (Reutenauer 1985) Let C be a language not containing the empty word. Then the following conditions are equivalent: (i) C is a complete finite code. (ii) There exist polynomials P, S in ZhAi such that C − 1 = P (A − 1)S .
In order to prove Theorem 4.1, we need the following lemma. Lemma 4.3 Let C be a finite complete code of degree d. Then there exist words u1 , . . . ud , v1 , . . . , vd , with u1 , v1 ∈ C ∗ , such that for any i, 1 ≤ i ≤ d: X ∗ −1 A∗ = u−1 i (C )vj 1≤j≤d
and for any j, 1 ≤ j ≤ d: X ∗ −1 u−1 A∗ = i (C )vj . 1≤i≤d
Proof. By Theorem 3.2 there exist a finite monoid M and a surjective morphism φ : A∗ → M such that C ∗ = φ−1 φ(C ∗ ); moreover, there exists an idempotent e in J ∩ φ(C ∗ ), where J is the minimal ideal of M , G = eM e is a finite group and H = eM e ∩ φ(C ∗ ) is a subgroup of G of index d.
126
Chapter VIII. Codes and Formal Series
Let u1 , . . . ud , v1 , . . . , vd be words in φ−1 (G) such that [ G= φ(vi )H
(4.2)
1≤i≤d
and G=
[
Hφ(uj )
1≤j≤d
(disjoint unions). By elementary group theory, we may assume that φ(u1 ) = φ(v1 ) = e (hence u1 , v1 ∈ φ−1 (e) ⊂ φ−1 φ(C ∗ ) = C ∗ ) and that φ(ui ) is the inverse of φ(vi ) in G. Let 1 ≤ j ≤ d and w be a word. Then there exists one and only one i, ∗ −1 ∗ 1 ≤ i ≤ d, such that w ∈ u−1 i (C )vj , that is ui wvj ∈ C . Indeed, the element eφ(wvj ) of G is in some φ(vi )H by Eq. (4.2). Hence, φ(ui wvj ) = φ(ui )eφ(wvj ) ∈ φ(ui )φ(vi )H = eH = H, which implies that ui wvj ∈ φ−1 (H) ⊂ φ−1 φ(C ∗ ) = C ∗ . Conversely, ui wvj ∈ C ∗ implies φ(ui wvj ) ∈ eM e ∩ φ(C ∗ ) = H, because φ(ui wvj ) = eφ(ui wvj )e is already in eM e. Hence eφ(wvj ) = φ(vi )φ(ui wvj ) ∈ φ(vi )H, and i is completely determined by j and w. We have shown that one has the disjoint union, for any j, 1 ≤ j ≤ d: [ ∗ −1 A∗ = u−1 i (C )vj . 1≤i≤d
But this is equivalent to the last relation of the lemma. By symmetry, we have also the first. We easily derive the following lemma Lemma 4.4 Let C be a finite complete code of degree d. Then there exist polynomials P, P1 , S, S1 , Q, G1 , D1 with coefficients 0, 1 such that (i) (ii) (iii) (iv) (v) (vi)
dA∗ − Q = SC ∗ P . A∗ − G1 = SC ∗ P1 . A ∗ − D 1 = S1 C ∗ P . P1 , S1 have constant term 1. G1 , D1 have constant term 0. If C is a prefix (suffix) code, then S1 = 1 (P1 = 1).
Proof. We use Lemma 4.3 and the notation of Section 1. We have, by Proposi∗ −1 tion 1.6, u−1 = Sui C ∗ Pvj + Fui ,vj ; moreover, by Lemma 1.5 and Propoi (C )vj sition 1.6, Sui , Pvj and Fui ,vj are polynomials with nonnegative coefficients. Now, by Lemma 4.3, for any i X X Sui C ∗ Pvj + Fui ,vj A∗ = 1≤j≤d
1≤j≤d
and for any j A∗ =
X
1≤i≤d
Sui C ∗ Pvj +
X
1≤i≤d
Fui ,vj
127
4. Factorization Let P =
X
Pvj ,
S=
1≤j≤d
G1 =
X
X
Sui ,
P1 = Pv1 ,
S1 = Su1
1≤i≤d
Fui ,v1 ,
D1 =
X
Fu1 ,vj
Q=
j
i
X
Fui ,vj .
i,j
Then we obtain dA∗ = SC ∗ P + Q,
A∗ = SC ∗ P1 + G1 ,
A∗ = S1 C ∗ P + D1 ,
(4.3)
which proves (i), (ii) and (iii). ∗ −1 ∗ As u1 ∈ C ∗ by Lemma 4.3, u−1 1 (C ) contains 1, hence u1 (C ) has constant ∗ ∗ −1 term 1. As u1 (C ) = Su1 C by Lemma 1.5, S1 = Su1 must have constant term 1. TheP same holds for P1 by symmetry, and proves (iv). As S = Sui , the Sui ’s are nonnegative and as Su1 has constant term 1, i
S has nonnegative constant term. Moreover, P1 has constant term 1. Hence, because A∗ has constant term 1 and by Eq. (4.3), G1 has constant term 0. Similarly, D1 has constant term 0. This proves (v). ∗ ∗ Suppose now that C is prefix. Then, by Proposition 1.3, u−1 1 (C ) = C ∗ ∗ ∗ ∗ −1 −1 ∗ (because u1 ∈ C ). Hence u1 (C ) = C . As by Lemma 1.5, u1 (C ) = Su1 C , we obtain S1 = Su1 = 1. Similarly, if C is suffix, then P1 = 1. This proves (vi). Given a Bernoulli morphism π, define a mapping λ for each word w by λ(w) = π(w) |w| . For each language X, define λ(X) by X λ(X) = λ(w) ∈ R+ ∪ ∞ . w∈X
This is called the average length of X. On the other hand λ extends to a linear mapping ZhAi → R by X λ(P ) = (P, w)λ(w) . w
Lemma 4.5 Let P1 , . . . , Pn be polynomials. Then X λ(P1 · · · Pn ) = π(P1 ) · · · π(Pi−1 )λ(Pi )π(Pi+1 ) · · · π(Pn ) . 1≤i≤n
Proof. For n = 2, it is enough, by linearity, to prove the lemma when P1 = u, P2 = v are words. But in this case λ(uv) = π(uv) |uv| = π(u)π(v)(|u| + |v|) = π(u)|u|π(v) + π(u)π(v)|v| = λ(u)π(v) + π(u)λ(v) . The general case is easily proved by induction.
128
Chapter VIII. Codes and Formal Series
Proof of Theorem 4.1. 1. First, note that the “if” part of (ii) is a consequence of Theorem 2.7. We use the notation of Lemma 4.4. We have A∗ − G1 = (1 − A)−1 − G1 = (1 − A)−1 (1 − (1 − A)G1 ). As A∗ − G1 = SC ∗ P1 and P1 has constant term 1 (Lemma 4.4), P1 is invertible in ZhAi and we obtain from SC ∗ P1 = (1 − A)−1 (1 − (1 − A)G1 ) , by multiplying by 1 − A on the left and by P1−1 on the right, (1 − A)SC ∗ = (1 − (1 − A)G1 )P1−1 .
(4.4)
Multiply the relation (i) of Lemma 4.4 by 1 − A on the left. This yields d − (1 − A)Q = (1 − A)SC ∗ P . Hence, by Eq. (4.4), d − (1 − A)Q = (1 − (1 − A)G1 )P1−1 P . Note that, because G1 has no constant term, 1−(1−A)G1 is invertible in ZhhAii, so that we obtain, by multiplying the previous relation by P1 (1 − (1 − A)G1 )−1 on the left P = P1 (1 − (1 − A)G1 )−1 (d − (1 − A)Q) . 2. We apply Corollary VII.4.3 to the last equality: there exist E, F, G, H in ZhAi such that P1 = EF,
1 − (1 − A)G1 = GF
d − (1 − A)Q = GH,
P = EH
(4.5)
By Proposition VII.4.4 applied to the second equality (with 1 − A instead of Y ), we obtain G ≡ ±1 mod (1 − A)ZhAi . Replacing if necessary E, F, G, H by their opposites, we may suppose that G ≡ +1, and hence we obtain, again by Proposition VII.4.4, and by the third equality in Eq. (4.5), that H ≡ d mod (1 − A)ZhAi, which implies P = E(d + (A − 1)R) ,
R ∈ ZhAi .
(4.6)
3. We have A∗ − D1 = (1 − A)−1 (1 − (1 − A)D1 ) so that by Lemma 4.4 (iii), S1 C ∗ P = (1 − A)−1 (1 − (1 − A)D1 ) . As D1 has constant term 0, (1 − (1 − A)D1 ) is invertible in ZhhAii; moreover S1 is also invertible because it has constant term 1. So we obtain, by multiplying by (1 − C)S1−1 on the left and by (1 − (1 − A)D1 )−1 (1 − A) on the right, (1 − C)S1−1 = P (1 − (1 − A)D1 )−1 (1 − A) . Now we use Eq. (4.6) and multiply by −S1 on the right, thus obtaining C − 1 = E(d + (A − 1)R)(1 − (1 − A)D1 )−1 (A − 1)S1 .
129
4. Factorization 4. By Corollary VII.4.3, there exist E ′ , F ′ , G′ , H ′ ∈ ZhAi such that E(d + (A − 1)R) = E ′ F ′ , 1 − (1 − A)D1 = G′ F ′ (A − 1)S1 = G′ H ′ , C − 1 = E ′ H ′ .
(4.7)
Let π be any Bernoulli morphism. Replacing if necessary E ′ , F ′ , G′ , H ′ by their opposites, we may assume that π(F ′ ) ≥ 0 . So, by Eq. (4.7) and Proposition VII.4.4, we obtain (since π(A − 1) = 0) G′ = 1 + (A − 1)G′′ ,
F ′ = 1 + (A − 1)F ′′
(4.8)
for some G′′ , F ′′ ∈ ZhAi. This and Eq. (4.7) imply that (A − 1)S1 = (1 + (A − 1)G′′ )H ′ = H ′ + (A − 1)G′′ H ′ . Thus, we have H ′ = (A − 1)H ′′ ,
H ′′ ∈ ZhAi .
(4.9)
Now, Eqs. (4.7) and (4.8) imply also E(d + (A − 1)R) = E ′ (1 + (A − 1)F ′′ ) . 5. We now apply Theorem VII.2.2 to this equality and denote by pi the continuant polynomial p(a1 , . . . , ai ) and p˜i = p(ai , . . . , a1 ). Thus there exist polynomials U, V ∈ QhAi such that E = U pn , ′
d + (A − 1)R = p˜n−1 V,
E = U pn−1 ,
1 + (A − 1)F ′′ = p˜n V .
(4.10)
Applying Corollary VII.1.3 to the second and the last equalities (with X → p˜n−1 or p˜n , Y → A − 1, Q1 → 0, P → V , R → d or 1), we obtain that the left Euclidean division of p˜n−1 and p˜n by A− 1 is possible, that is p˜n−1 and p˜n are both congruent to a scalar mod(A − 1)QhAi. This implies, by Proposition VII.2.3, that pn−1 and p˜n−1 (pn and p˜n ) are congruent to the same scalar mod (A − 1)QhAi. lary VII.4.2, they have the same content c(pn−1 ) = c(˜ pn−1 ),
c(pn ) = c(˜ pn ) .
(4.11) Moreover, by Corol(4.12)
6. As D1 has coefficients 0, 1, the polynomial 1 − (A − 1)D1 is primitive. Hence, by Eq. (4.7) and by Gauss’s Lemma, G′ and F ′ are primitive. As by Eqs. (4.10) and (4.8) p˜n V = 1 + (A − 1)F ′′ = F ′ , we obtain by Gauss’s Lemma c(˜ pn )c(V ) = 1
130
Chapter VIII. Codes and Formal Series
and p¯ ˜n V = F ′ . Hence, by Proposition VII.4.4 and Eq. (4.8), V = ε + (A − 1)V ′ ,
ε = ±1, V ′ ∈ ZhAi .
(4.13)
Furthermore, C − 1 is primitive, hence so is E ′ by Eq. (4.7). As E ′ F ′ = E(d + (A − 1)R) by Eq. (4.7) and E ′ , F ′ are primitive, we obtain by Gauss’s Lemma that d + (A − 1)R is primitive. Thus by Eq. (4.10) and Gauss’s Lemma again d + (A − 1)R = p¯˜n−1 V . This implies, by Proposition VII.4.4 and Eq. (4.13), p¯ ˜n−1 = εd + (A − 1)L,
L ∈ ZhAi .
By Eqs. (4.11) and (4.12), we obtain that p¯n−1 and p¯˜n−1 are congruent to the same scalar mod(A − 1)QhAi. Hence p¯n−1 = εd + (A − 1)M with M ∈ QhAi. But p¯n−1 − εd = (A − 1)M and A − 1 is primitive, so that c(M ) = c(¯ pn−1 − εd) ∈ N and M ∈ ZhAi, by Eq. (4.2) in Chapter VII. We have seen that E ′ is primitive, so that by Gauss’s Lemma and Eq. (4.10), we have E ′ = U p¯n−1 which implies E ′ = U (εd + (A − 1)M ) . Hence, by Eqs. (4.7) and (4.9), C − 1 = U (εd + (A − 1)M )(A − 1)H ′′ , where all polynomials are in ZhAi and where ε = ±1. This shows that we have a relation of the form C − 1 = X(ε′ d + (A − 1)D)(A − 1)Y , where X = ±U , Y = ±H ′′ , ε′ d + (A − 1)D = ±(εd + (A − 1)M ) are chosen in such a way that, for some Bernoulli morphism π, one has π(X) ≥ 0, π(Y ) ≥ 0 . 7. Apply Lemma 4.5 to this relation, using the fact that π(A −1) = 0; we obtain λ(C − 1) = π(X)ε′ dλ(A − 1)π(Y ) .
4. Factorization
131
Now λ(1) = 0, λ(C) > 0, λ(A) > 0, and we obtain ε′ dπ(X)π(Y ) > 0 . This shows that ε′ = 1 and proves Eq. (4.1) and (i). 8. Now, if C is a prefix code, we have by Lemma 4.4 (vi) that S1 = 1. Hence, by Eq. (4.7), A − 1 = G′ H ′ , which implies by Eq. (4.9) A − 1 = G′ (A − 1)H ′′ . Hence H ′′ = ∓1, and we obtain Y = ±1. But π(Y ) ≥ 0, so Y = 1. On the other hand, if C is suffix, then P1 = 1 by Lemma 4.4 (vi). Then, by Eq. (4.5), E = ±1 which implies by Eq. (4.10) and Gauss’s Lemma that U = ±1. Thus X = ±1. As π(X) ≥ 0, we obtain X = 1. This proves the theorem.
Exercises for Chapter VIII 1.1 Show that a submonoid of A∗ is of the form C ∗ , C a code, if and only if it is free (that is isomorphic to some free monoid). Show that a submonoid M of A∗ is free if and only if for any words u, v, w u, uv, vw, w ∈ M =⇒ v ∈ M . 1.2 Show that, given rational languages K, L, it is decidable whether their union (their product, the star of K) is unambiguous. 1.3 Show that Su (Pu , Fu,v ) as defined in Section 1 is a sum of proper suffixes (prefixes, factors) of words of C. 2.1 Show that for a finite code C the three following conditions are equivalent: (i) C is a complete and prefix code. (ii) For any word w, wA∗ ∩ CA∗ is not empty. (iii) For any word w, wA∗ ∩ C ∗ is not empty. 2.2 Let C be a finite complete language. Show that for any word w, there exists some power of a conjugate of w which is in C ∗ (two words w, w′ are conjugate if w = uv, w′ = vu for some words u, v). 2.3 Deduce from Theorem 2.4 an algorithm to show that a finite set C is a code (hint: it is decidable whether C is complete, since the set of factors of a rational language is rational). 3.1 Show that if e, e′ are idempotents in the minimal ideal J of a finite monoid M , then there exists an idempotent e1 in J which is a right multiple of E and a left multiple of e′ . Show that the mapping a 7→ ae1 defines a group isomorphism eM e → e1 M e1 . Deduce that all the maximal groups in J are isomorphic. 3.2 Let C be a finite complete code. Show that C is synchronizing (that is of degree 1) if and only if for some word w, one has wA∗ w ⊂ C ∗ . 4.1 Let C be a finite complete code which is bifix. Let n be such that an ∈ C for some letter a. a) Show that for any i, 1 ≤ i ≤ n, Ci = a−i C is a prefix set such that Ci A∗ ∩ wA∗ is not empty for any word w.
132
Chapter VIII. Codes and Formal Series b) Show that the set of proper suffixes of C is the disjoint union of the Ci ’s. c) Deduce that C i − 1 = Pi (A − 1) and that n X Pi (A − 1) . C − 1 = n(A − 1) + (A − 1) i=1
Show that n is the degree of C. Show that it is also equal to the average length of C (cf. Perrin 1977).
Notes to Chapter VIII Theorem 4.1 is a non commutative generalization of a theorem due Sch¨ utzenberger (1965). Corollary 4.2 is a partial answer to the main conjecture in the theory of finite codes, the factorization conjecture which states that P and S may be chosen to have nonnegative coefficients (or equivalently coefficients 0 and 1). Finite complete codes are maximal codes, and conversely, every maximal code is complete. Most of the general results on codes are stated here in the finite case. However, they hold for rational and even for thin codes. For a general exposition of the theory of codes, see the book by Berstel and Perrin (1985). Another illustration of the close relation between codes and formal series is the following result (roughly): a thin code is bifix if and only if its syntactic algebra is semisimple (Reutenauer 1981, Berstel and Perrin 1985).
References Amice, Y. Les nombres p-adiques. Presses Universitaires de France, Paris, 1975. Pr´eface de Ch. Pisot, Collection SUP: Le Math´ematicien, No. 14. Benzaghou, B. Alg`ebres de Hadamard. Bull. Soc. Math. France, 98:209–252, 1970. Bergmann, G. M. Commuting elements in free algebras and related topics in ring theory. Thesis, Harvard University, 1967. Berstel, J. Sur les pˆ oles et le quotient de Hadamard de s´eries N-rationnelles. C. R. Acad. Sci. Paris S´er. A-B, 272:A1079–A1081, 1971. Berstel, J., Mignotte, M. Deux propri´et´es d´ecidables des suites r´ecurrentes lin´eaires. Bull. Soc. Math. France, 104(2):175–184, 1976. Berstel, J., Perrin, D. Theory of codes, volume 117 of Pure and Applied Mathematics. Academic Press Inc., Orlando, FL, 1985. Berstel, J., Reutenauer, C. Recognizable formal power series on trees. Theoret. Comput. Sci., 18:115–148, 1982. B´ezivin, J.-P. Factorisation de suites r´ecurrentes lin´eaires et applications. Bull. Soc. Math. France, 112(3):365–376, 1984. Bo¨e, J. M., de Luca, A.,, Restivo, A. Minimal complete sets of words. Theoret. Comput. Sci., 12(3):325–332, 1980. ´ ements quasi-entiers et extensions de Fatou. J. Cahen, P.-J., Chabert, J.-L. El´ Algebra, 36(2):185–192, 1975. Carlyle, J. W., Paz, A. Realizations by stochastic finite automata. J. Comput. System Sci., 5:26–40, 1971. Chabert, J. L. Anneaux de Fatou. Enseignement Math., 18:141–144, 1972. Cohn, P. M. On a generalization of the Euclidean algorithm. Proc. Cambridge Philos. Soc., 57:18–30, 1961. Cohn, P. M. Free associative algebras. Bull. London Math. Soc., 1:1–39, 1969. Cohn, P. M. The universal field of fractions of a semifir. I. Numerators and denominators. Proc. London Math. Soc. (3), 44(1):1–32, 1982. 133
134
References
Cohn, P. M. Free rings and their relations, volume 19 of London Mathematical Society Monographs. Academic Press Inc. [Harcourt Brace Jovanovich Publishers], London, 1985. Conway, J. H. Regular algebra and finite machines. Chapman and Hall, 1971. Cori, R. Un code pour les graphes planaires et ses applications. Soci´et´e Math´ematique de France, Paris, 1975. With an English abstract, Ast´erisque, No. 27. Dubou´e, M. Une suite r´ecurrente remarquable. European J. Combin., 4(3): 205–214, 1983. Ehrenfeucht, A., Parikh, R.,, Rozenberg, G. Pumping lemmas for regular sets. SIAM J. Comput., 10(3):536–541, 1981. Eilenberg, S. Automata, languages, and machines. Vol. A. Academic Press [A subsidiary of Harcourt Brace Jovanovich, Publishers], New York, 1974. Eilenberg, S., Sch¨ utzenberger, M.-P. Rational sets in commutative monoids. J. Algebra, 13:173–191, 1969. Fatou, P. Sur les s´eries enti`eres `a coefficients entiers. Comptes Rendus Acad. Sci. Paris, 138:342–344, 1904. Fliess, M. Formal languages and formal power series. In IRIA., editor, S´eminaire Logique et Automates, pages 77–85, Le Chesnay, 1971. Fliess, M. Matrices de Hankel. J. Math. Pures Appl. (9), 53:197–222, 1974a. Fliess, M. Sur divers produits de s´eries formelles. Bull. Soc. Math. France, 102: 181–191, 1974b. Fliess, M. S´eries rationnelles positives et processus stochastiques. Ann. Inst. H. Poincar´e Sect. B (N.S.), 11:1–21, 1975. Fliess, M. Fonctionnelles causales non lin´eaires et ind´etermin´ees non commutatives. Bull. Soc. Math. France, 109(1):3–40, 1981. Hansel, G. Une d´emonstration simple du th´eor`eme de Skolem-Mahler-Lech. Theoret. Comput. Sci., 43(1):91–98, 1986. Harrison, M. A. Introduction to formal language theory. Addison-Wesley Publishing Co., Reading, Mass., 1978. Herstein, I. N. Noncommutative rings. The Carus Mathematical Monographs, No. 15. Published by The Mathematical Association of America, 1968. Isidori, A. Nonlinear control systems: an introduction, volume 72 of Lecture Notes in Control and Information Sciences. Springer-Verlag, Berlin, 1985. Jacob, G. Repr´esentations et substitutions matricielles dans la th´eorie alg´ebrique des transductions. Thesis, University of Paris, 1975. Jacob, G. La finitude des repr´esentations lin´eaires des semi-groupes est d´ecidable. J. Algebra, 52(2):437–459, 1978.
References
135
Jacob, G. Un th´eor`eme de factorisation des produits d’endomorphismes de k n . J. Algebra, 63:389–412, 1980. Karhum¨ aki, J. Remarks on commutative N -rational series. Theoret. Comput. Sci., 5(2):211–217, 1977. ISSN 0304-3975. Katayama, T., Okamoto, M.,, Enomoto, H. Characterization of the structuregenerating functions of regular sets and the DOL growth functions. Information and Control, 36(1):85–101, 1978. ISSN 0890-5401. Kleene, S. C. Representation of events in nerve nets and finite automata. In Shannon, C. E., McCarthy, J., editors, Automata Studies, Annals of mathematics studies, no. 34, pages 3–41. Princeton University Press, Princeton, N. J., 1956. Koblitz, N. p-adic numbers, p-adic analysis, and zeta-functions, volume 58 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1984. Kuich, W., Salomaa, A. Semirings, automata, languages, volume 5 of EATCS Monographs on Theoretical Computer Science. Springer-Verlag, Berlin, 1986. Lallement, G. Semigroups and combinatorial applications. John Wiley & Sons, New York-Chichester-Brisbane, 1979. Lang, S. Algebra. Addison-Wesley Publishing Company Advanced Book Program, Reading, MA, second edition, 1984. first edition in 1965. Lascoux, A. Suites r´ecurrentes lin´eaires. Adv. in Appl. Math., 7(2):228–235, 1986. ISSN 0196-8858. Lascoux, A., Sch¨ utzenberger, M.-P. Formulaire raisonn´e de fonctions sym´etriques. Technical report, LITP, Universit´e Paris VII, 1985. Lech, C. A note on recurring series. Ark. Mat., 2:417–421, 1953. ISSN 0004-2080. Lewin, J. Free modules over free algebras and free group algebras: The Schreier technique. Trans. Amer. Math. Soc., 145:455–465, 1969. Lyndon, R. C., Schupp, P. E. Combinatorial group theory. Springer-Verlag, Berlin, 1977. Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 89. Mahler, K. Eine arithmetische Eigenschaft der Taylor-Koeffizienten rationaler Funktionen. Akad. Wetensch. Amsterdam Proc., 38:50–60, 1935. Mandel, A., Simon, J. On finite semi-groups of matrices. Theoret. Comput. Sci., 5:101–111, 1977. Manin, Y. I. A course in mathematical logic. Springer-Verlag, New York, 1977. McNaughton, R., Zalcstein, Y. The Burnside problem for semigroups. J. Algebra, 34:292–299, 1975. Perrin, D. Codes asynchrones. Bull. Soc. Math. France, 105(4):385–404, 1977. ISSN 0037-9484.
136
References
P´ olya, G. Arithmetische Eigenschaften der Reihenentwicklungen rationaler Funktionen. J. reine angew. Math., 151:1–31, 1921. P´ olya, G., Szeg¨ o, G. Aufgaben und Lehrs¨ atze aus der Analysis. Band II: Funktionentheorie. Nullstellen. Polynome. Determinanten. Zahlentheorie. Dritte berichtigte Auflage. Die Grundlehren der Mathematischen Wissenschaften, Band 20. Springer-Verlag, Berlin, 1964. Restivo, A., Reutenauer, C. On cancellation properties of languages which are supports of rational power series. J. Comput. System Sci., 29(2):153–159, 1984. Reutenauer, C. Une caract´erisation de la finitude de l’ensemble des coefficients d’une s´erie rationnelle en plusieurs variables non commutatives. C. R. Acad. Sci. Paris S´er. A-B, 284(18):A1159–A1162, 1977a. Reutenauer, C. On a question of S. Eilenberg (automata, languages, and machines, vol. a, Academic Press, New York, 1974). Theoret. Comput. Sci., 5 (2):219, 1977b. Reutenauer, C. Vari´et´es d’alg`ebres et de s´eries rationnelles. In 1er Congr`es Math. Appl. AFCET-SMF, volume 2, pages 93–102. AFCET, 1978. Reutenauer, C. S´eries formelles et alg`ebres syntactiques. J. Algebra, 66(2): 448–483, 1980a. Reutenauer, C. S´eries rationnelles et alg`ebres syntactiques. Thesis, University of Paris, 1980b. Reutenauer, C. An Ogden-like iteration lemma for rational power series. Acta Inform., 13(2):189–197, 1980c. Reutenauer, C. Semisimplicity of the algebra associated to a biprefix code. Semigroup Forum, 23(4):327–342, 1981. Reutenauer, C. Sur les ´el´ements inversibles de l’alg`ebre de Hadamard des s´eries rationnelles. Bull. Soc. Math. France, 110(3):225–232, 1982. Reutenauer, C. Noncommutative factorization of variable-length codes. J. Pure Appl. Algebra, 36(2):167–186, 1985. Ryser, H. J. Combinatorial mathematics. The Carus Mathematical Monographs, No. 14. Published by The Mathematical Association of America, 1963. Salomaa, A., Soittola, M. Automata-theoretic aspects of formal power series. Springer-Verlag, New York, 1978. Sch¨ utzenberger, M.-P. Sur certains sous-mono¨ıdes libres. Bull. Soc. Math. France, 93:209–223, 1965. Sch¨ utzenberger, M. P. On the definition of a family of automata. Information and Control, 4:245–270, 1961a. Sch¨ utzenberger, M. P. On a special class of recurrent events. Ann. Math. Statist., 32:1201–1213, 1961b.
References
137
Sch¨ utzenberger, M.-P. On a theorem of R. Jungen. Proc. Amer. Math. Soc., 13:885–890, 1962a. ISSN 0002-9939. Sch¨ utzenberger, M.-P. Finite counting automata. Information and Control, 5: 91–107, 1962b. ISSN 0890-5401. Sch¨ utzenberger, M.-P. On a theorem of R. Jungen. Proc. Amer. Math. Soc., 13:885–890, 1962c. ISSN 0002-9939. Sch¨ utzenberger, M.-P. Parties rationnelles d’un mono¨ıde libre. In Proc. Intern. Math. Conf., volume 3, pages 281–282, 1970. Sch¨ utzenberger, M.-P., Marcus, R. S. Full decodable code-word sets. IRE Trans., IT-5:12–15, 1959. Skolem, T. Ein Verfahren zur Behandlung gewisser exponentialer Gleichungen und diophantischer Gleichungen. C. R. 8e Congr. Scand. Stockholm, pages 163–188, 1934. Soittola, M. Positive rational sequences. Theoret. Comput. Sci., 2(3):317–322, 1976. ISSN 0304-3975. Sontag, E. D. On some questions of rationality and decidability. J. Comput. System Sci., 11(3):375–381, 1975. ISSN 0022-0000. Sontag, E. D., Rouchaleau, Y. Sur les anneaux de Fatou forts. C. R. Acad. Sci. Paris S´er. A-B, 284(5):A331–A333, 1977. Steyaert, J.-M., Flajolet, P. Patterns and pattern-matching in trees: an analysis. Inform. and Control, 58(1-3):19–58, 1983. ISSN 0019-9958. ¨ Suschkewitsch, A. K. Uber die endlichen Gruppen ohne das Gesetz der eindeutigen Umkehrbarkeit. Math. Ann., 99:30–50, 1928. Turakainen, P. A note on test sets for ⋉-rational languages. Bull Europ. Assoc. Theor. Comput. Sci., 25:40–42, 1985. Wedderburn, H. M. Noncommutative domains of integrity. J. reine angew. Math., 167:129–141, 1932.
138
Index
Index adjoint morphism, 24 algebra group –, 54 Hadamard –, 47, 53 monoid –, 47 syntactic –, 22, 34 algorithm – of Sardinas and Patterson, 117 reduction –, 33 alphabet, 2 annihilator, 63 automaton, 24 average length, 127 Bernoulli morphism, 120 bifix code, 117 Boolean semiring, 2 cancellative – right module, 106 characteristic – series, 5 – zero, 54 code, 115 codimension, 25 coefficient, 2 commutative semiring, 1 complete – language, 119 – topological space, 4 completely integrally closed, 73 congruence monoid –, 35 semiring, 15 syntactic –, 47 conjecture, 46, 70, 132 conjugate, 131 constant term, 4, 106 content of a polynomial, 111 continuant polynomial, 102 continuous fraction, 114 degree – of a code, 124
– of a polynomial, 3, 99 – of growth, 94 denominator minimal –, 50 dense, 4 dependent, 100 dimension of a linear representation, 8 discrete topology, 3 distance ultrametric –, 3 dominating eigenvalue, 76 strictly – , 78 eigenvalue dominating –, 76 strictly dominating –, 78 eigenvalues of a rational series, 50 Eisenstein’s criterion, 18 equality set, 96 equality set, 47 Euclidean, 99 – algorithm, 99 – division, 101 exponential polynomial, 54 extension Fatou –, 83 factorization conjecture, 132 family locally finite –, 4 summable –, 4 Fatou – extension, 83 – lemma, 72 strong – ring, 88 weak – ring, 88 finitely generated – Abelian group, 57 – module, 8 fir, 114 formal series, 2 free – ideal ring, 114 – monoid, 2
139
Index full matrix, 108 Gauss’s lemma, 111 generating function, 37 geometric series, 26, 56 group algebra, 54 growth degree of –, 94 polynomial –, 87, 94 Hadamard – algebra, 53 – product, 11 Hankel – -like property, 19 – matrix, 25, 52 Hilbert’s tenth problem, 90 ideal – in a monoid, 123 minimal –, 123 syntactic –, 22 syntactic right –, 23 idempotent, 47, 123 image of a series, 39 inertia theorem, 109 integral domain, 99 integral part of a rational fraction, 53 invertible series, 5 irreducible set of matrices, 91 language, 2, 35 proper –, 37 rational –, 35 recognizable –, 35 leap-frog construction, 103 length – of a word, 2 average –, 127 letter, 2 linear recurrence relation, 32, 50 linear representation, 8 locally finite family, 4 matrix proper –, 13 Hankel –, 25 star of a –, 13 measure, 120 merge, 56 message, 115 minimal – automaton, 24 – denominator, 50
– polynomial, 50 module, 8 finitely generated –, 8 monoid – algebra, 47 free –, 2 morphism – of formal series, 16 – of semiring, 2 multiplicity of an eigenvalue, 50 normalized, 50 open peoblem, 70 open problem, 48, 88 p-adic valuation, 58 palindrome, 34, 40 periodic purely –, 63 quasi –, 63 poles, 50 polynomial, 3 – growth, 87, 94 exponential –, 54 minimal –, 50 support of an exponential –, 55 Post correspondence problem, 90, 96 prefix – -closed, 29 – code, 117 – set, 29 prime factors of a series, 58 prime subsemiring, 15 primitive polynomial, 111 product – of languages, 35 – of series, 3 Hadamard –, 11 proper – language, 37 – linear recurrence relation, 52 – matrix, 13 – series, 4 purely periodic, 63 quasi-integral, 72 quasi-periodic, 63 quasi-power, 43 quasi-regular, 17 quotient of a semiring, 15 rank
140 – of a series, 24 of a matrix, 25 rational – closure, 5 – language, 35 – operations, 5 – series, 5 R+ - – function, 75 unambiguous – operations, 70 rationally – closed, 5 – separated, 46 ray, 93 reciprocal polynomial, 51 recognizable – language, 35 – series, 7 reduced linear representation, 26 reduction algorithm, 33 regular – linear representation, 52 – rational series, 52 – semiring, 16 representation dimension of a linear –, 8 linear –, 8 reduced linear –, 26 regular – of a monoid, 36 tree –, 30 reversal, 34 right complete, 29 Schreier’s formula, 21 semiring, 1 – morphism, 2 Boolean –, 2 prime –, 15 regular –, 16 simplifiable –, 16 topological –, 4 semisimple, 132 separated rationally –, 46 series characteristic – of a language, 5 formal –, 2 morphism of formal –, 16 proper –, 4 rational –, 5 recognizable –, 7 shuffle product, 18 similar linear representations, 27 simple
Index – elements, 55 – set of recognizable series, 57 simplifiable semiring, 16 stable, 9 star – -height, 88 – of a matrix, 13 – of a series, 4 submodule, 8 subsemiring, 1 suffix – -closed, 32 – set, 33 suffix code, 117 summable family, 4 support – of a series, 2 – of an exponential polynomial, 55 synchronizing, 125 syntactic – algebra, 22, 34 – congruence, 47 – ideal, 22 – monoid, 47 – right ideal, 23 thin, 132 topological semiring, 4 torsion-free, 72 tree representation, 30 trivial relation, 113 trivially a polynomial, 109 ultrametric distance, 3 unambiguous rational operations, 70, 117 undecidable problem, 90 uniform Bernoulli morphism, 120 weak algorithm, 99 word, 2 empty –, 2