SEMIGROUPS AND AUTOMATA SELECTA UNO KALJULAID (1941–1999)
Edited by
Jaak Peetre Lund, Sweden
and
Jaan Penjam Tallinn, Estonia
Amsterdam • Berlin • Oxford • Tokyo • Washington, DC
© 2006 The authors. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher.
ISBN 1-58603-582-7
Library of Congress Control Number: 2005938840
Publisher: IOS Press, Nieuwe Hemweg 6B, 1013 BG Amsterdam, Netherlands; fax: +31 20 687 0019
Distributor in the UK and Ireland: Gazelle Books, Falcon House, Queen Square, Lancaster LA1 1RN, United Kingdom; fax: +44 1524 63232
Distributor in the USA and Canada: IOS Press, Inc., 4502 Rachael Manor Drive, Fairfax, VA 22032, USA; fax: +1 703 323 3668
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information.
PRINTED IN THE NETHERLANDS
Contents
Preface
Biography of Uno Kaljulaid. J. Peetre
Bibliography of Uno Kaljulaid
Chapter I. Representations of semigroups and algebras
1. [K69a] On the cohomological dimension of some quasiprojective varieties.
2. [K77a] Triangular products of representations of semigroups and associative algebras.
3. [K79a] Triangular products and stability of representations. Candidate dissertation.
4. [K79b] Triangular products and stability of representations. (Author review of Candidate thesis in Physico-Mathematical Sciences).
5. [K87a] Some remarks on Shevrin’s problem.
6. [K90] Transferable elements in group rings.
7. [K00] Ω-rings and their flat representations. Coauthor O. Sokratova
Chapter II. Automata theory
1. Preamble. Editors
2. Automata and their decomposition.
3. [K97] On two algebraic constructions for automata. Coauthor J. Penjam
4. [K98c] Revisiting wreath products, with applications to representations and invariants.
Chapter III. Majorization
1. Generalized majorization. Coauthor J. Peetre
2. Van der Waerden’s conjecture and hyperbolicity. J. Peetre
3. On generalized majorization. J. Peetre
Chapter IV. Combinatorics
1. [K88a] On Stirling and Lah numbers.
2. Letter (or draft of letter) c. 1991 from Uno Kaljulaid to Torbjörn Tambour.
3. On Fibonacci numbers of graphs.
Chapter V. History of Mathematics
1. Th. Molien, an innovator of algebra.
2. [K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics.
3. Theodor Molien, about his life and mathematical work as seen a century later. (A biographical sketch and a glimpse of his work).
4. Notes on five 19th century Tartu mathematicians (Backlund, Kneser, Lindstedt, Molien, Weihrauch).
Chapter VI. Popularization of Mathematics
1. [K68a] and [K69b] On the geometric methods of Diophantine Analysis.
2. [K68b] Lenin prize for work in Diophantine geometry.
3. [K69c] The history of solving equations.
4. [K70] Additional remarks on groups.
5. [K73a] Polynomials and formal series.
6. [K75a] On Galois theory.
7. [K75b] Theory of automata. Coauthor E. Tamme
8. [K93c] Mordell’s problem.
9. [K96] On two discrete models in connection with structures of mathematics and language.
Index of Names
Subject Index
Preface
We have the pleasure of offering to the mathematical public the Selecta of the eminent late Estonian algebraist Uno Kaljulaid. It contains mainly papers published in Kaljulaid’s lifetime. Many of them were originally written in Russian, a few also in Estonian, and have now been translated into English, mainly by one of us, J. Peetre¹.
Heritage. In addition to this published material, Kaljulaid left a large number of manuscripts in various states of completion. They are currently in the custody of the Senior Editor. For instance, there is an almost complete paper on right-ordered groups, surveying the subject in its historical development, starting with D. Hilbert, and some material on Petri nets and other topics that apparently occupied Kaljulaid in his last years. Hopefully, part of it can also be made public at a later stage, perhaps in the form of Selecta II.
Let us now highlight some of the main items of the present Volume.
Contents. We offer here the English translation of Kaljulaid’s 1979 Tartu/Minsk Candidate thesis [K79a], which was originally typewritten in Russian and produced in only a few copies. The thesis was devoted to representation theory in the spirit of his thesis advisor B. I. Plotkin: representations of semigroups and algebras, and especially the extension to this setting, and the application, of the notion of triangular product of representations, which Plotkin had introduced for groups. We also include two summaries of the thesis, [K77a] and [K79b]. Through representation theory, Kaljulaid also became interested in automata theory, which at a later phase became his main area of interest. Another field of his research is combinatorics.
Besides being an outstanding and most dedicated mathematician, Uno Kaljulaid was also very much interested in the history of mathematics. In particular, he took a keen interest in the life and work of the great 19th century Dorpat-Tartu algebraist Th. Molien (see Chapter V). Perhaps he saw in Molien a kindred soul, as neither of the two got quite the recognition from their Alma Mater that they surely deserved; Molien, for his part, had to go into voluntary exile in Tomsk, Siberia. Kaljulaid was also very interested in the teaching and exposition, or popularization, of mathematics; he had several outstanding research students.
Some of his more popular-scientific papers were published in the Estonian-language journal Matemaatika ja Kaasaeg (Mathematics and Our Age). Amongst these there is a whole series of papers about algebraic matters, culminating in a brilliant, elementary – although partly rather philosophical – essay devoted to Galois theory [K75a]. Another such series is his excellent essay on Diophantine geometry [K68a,69b], in several installments, followed by his éloge [K68b] to another of his teachers, Yu. I. Manin. We believe that the inclusion of these papers here will make the Volume more interesting for beginners, and perhaps even contribute to attracting young people to mathematics, in Estonia and elsewhere.
¹ Later on referred to as Senior Editor.
Presentation. The papers in the Volume are assembled in chapters according to theme. Important matters or notions have often, fairly consistently, been set in italics, either upon their first appearance or where they are defined. The rather rare quotes in languages other than English are usually followed by a translation within parentheses. References to items by Uno Kaljulaid come in the form [Kx], where x is the year of publication with the century dropped (for instance, [K79a] for 1979 and [K00] for 2000), and refer to the bibliography. References to works of other mathematicians come in the form [y], where y runs through 1, 2, 3 . . . , independently in each separate paper. For books translated into Russian, the Russian translation is often indicated along with the original, for the benefit of Readers who read Russian or have access to the Russian book. In transliterating Cyrillic into English we use, fairly consistently, the system of Mathematical Reviews, as set forth on p. 1–2 of the book [1].
Some facts about Estonia and Estonian mathematics. It should perhaps also be recalled here that Estonia is the northernmost of the three Baltic Republics, facing the Gulf of Finland in the north and bordering on Latvia in the south and on Russia in the east. Its population is about 1.3 million, most of them Estonians, many living in the capital Tallinn; there is also a large Russian-speaking minority. The Estonians speak a language akin to Finnish and not at all related to the languages of their southern neighbors, the Latvians and the Lithuanians. Estonians were mentioned already by the Roman writer Tacitus (c. 55–117), who spoke of them as the Aestiorum gentes. However, around the beginning of the 13th century the Estonians were still among those few peoples in Europe who had not accepted Christianity. In a devastating war (1208–1227) against German, Swedish and Danish Crusaders, the new religion was forced upon them.
The last stronghold of the Estonians, the Castle of Valjala on the island of Saaremaa, was conquered in February 1227 by a Crusaders’ army coming from Pärnu and marching over the frozen archipelago. The Estonians were then united, together with the Latvians, in a state ruled by the Order of the Brethren of the Sword, later known as the Teutonic Order, while the native population came to live, for centuries, in serfdom. The rule of the Order lasted until the mid-16th century. In later times, Estonia was governed, alternately, by Swedes, Poles, and Russians. The situation of the indigenous population deteriorated ever more and was particularly low towards the end of the 18th century, when farmers were freely sold to the highest-bidding landowner; one could even draw a parallel to the Belgian Congo at a much later epoch. However, in the middle of the 19th century a national awakening took place. After hard struggles, the Estonians managed to form an independent country of their own in 1918–20, in the aftermath of World War I, when all empires collapsed, the Russian one included. In the wake of the Molotov-Ribbentrop treaty of August 1939, the country was annexed by the Soviet Union in June 1940; it regained its independence in 1991, during the fall of the Soviet empire.
For more details about the above, and also some information about mathematics in Estonia until 1940, with a tradition going back to the Academia Gustaviana in Tartu (founded by the Swedish King Gustavus Adolphus in 1632 and closed down in 1656, when the city was captured by the Russians) and its successor the Academia Carolina (1690–1710)², we refer to an article by Ülo Lumiste, the nestor of Estonian mathematicians, in the book [2]. After a long interregnum the university was reopened in 1802, under the auspices of czar Alexander I; Estonia was now a part of the Russian Empire, and the university’s official name was Kaiserliche Universität zu Dorpat, as the language of teaching was German.
Acknowledgements. The appearance of the present compilation would not have been possible without the generous assistance of a large number of friends and colleagues, students, secretaries, librarians, family members, etc. – from Novosibirsk in the East to Iowa in the West. To all of them we express here our sincere thanks. The following list of names (in alphabetic order) probably comprises only a fraction of them all: Gert Almkvist, Marianne Blauert, Leonid Bokut, Kerstin Brandt, Michael Cwikel, Martina Eicheldinger, Miroslav Engliš, Jan Gustavsson, László Filep, Eila Ritva Jansson, Margreth Johnsson, Kalle Kaarli, Dan and Christer Kiselman, Andi Kivinukk, Richard Koch, Petr Krylov, Ruvim Lipyanskiĭ, Indrek Martinson, Caroline Myrberg, Aleksandr Nikolskii, Inga-Britt Peetre, Jakob-Sebastian Peetre, Monika Perkmann, Ann-Christin Persson, Ulf Persson, Professor Pater Anders Piltz O.P., Boris Plotkin, Olga Sokratova, Sven Spanne, Gunnar Sparr, Michael David Spivak, Annika Tallinn, Hellis Tamm, Marje Tamm, Enn Tamme, Erki Tammiksaar, Gunnar Traustason, Michael Tsfasman, Victor Ufnarovski, Aleksandr Zubkov.
Amongst institutions, we mention in particular the following: Eesti Loodusuurijate Selts (Estonian Naturalists’ Society, Tartu, Estonia); Verlag Heyn (Klagenfurt, Austria). We have had invaluable aid from many libraries, amongst others: the mathematical libraries of Lund and of Uppsala (the latter named the Beurling library); the libraries of Lund University, Giessen, and Heidelberg; the library of the Mittag-Leffler Institute; the library of the Institute of Cybernetics at Tallinn University of Technology.
Finally, we express our great esteem for the generosity of our sponsors: the Royal Physiographic Society of Lund, which took over all costs of publication, and the European Union’s Fifth Framework Programme project IST-2001-37592 (eVikings II), which partially supported the editing of this book and the related visits of Jaan Penjam to Lund.
The Editors
References
[1] A. J. Lohwater. Russian-English Dictionary of the mathematical sciences. American Mathematical Society, Providence, Rhode Island, 1961.
[2] Ü. Lumiste and J. Peetre. Edgar Krahn, A centenary volume 1894–1961. IOS Press, Amsterdam, 1994.
[3] Ü. Lumiste and H. Piirimäe. Newton’s Principia in the curricula of the University of Tartu (Dorpat) in the early 1690’s. In: R. Vihalemm (ed.), Estonian studies in the history and philosophy of science. Kluwer Academic Publishers, Dordrecht, Boston, New York and London, 2001, 1–18. Swedish translation, based on the enlarged 1981 Estonian version: J. Peetre – S. Rodhe, Normat (to appear).
² Probably few mathematicians are aware that the first ever to teach Newton’s cosmology was the Swede Sven Dimberg, in Tartu [3].
Biography of Uno Kaljulaid by J. Peetre
The following is mainly drawn from Uno Kaljulaid’s own curriculum vitae along with my personal recollections, as well as information obtained from his daughter Mrs. Annika Tallinn and some other persons. Uno Kaljulaid was born on October 21, 1941 in Kõpu³ in the district of Viljandi in south-western Estonia.
Primary education. In primary school in Kõpu Uno was supposedly a naughty boy, but he never had any problems with learning. Once the question was even raised of sending him to a special school. After he finished primary school, his father, Elmar Kaljulaid, wanted him to become a tractor driver, but a relative (the husband of Uno’s sister) took care of him, and so Uno moved to Pärnu, a nearby famous seaside resort on the eastern side of the Gulf of Riga.
Secondary education. Young Uno thus received his secondary education in Pärnu. He graduated from the Pärnu First High School in 1959. But even afterwards Uno still returned to Kõpu. In summer time he used to help his mother with haymaking. But his great hobby was to go and pick cranberries in the swamps and morasses – a great part of Estonia consists of morasses. Early in the morning off he went on his moped, returning only by midnight, when everybody at home was already worried about him. But each time his rucksack was crammed with berries. In Kõpu he also wrote many of his mathematical papers, a special room having been prepared as an office for him. After the death of his parents, however, the farm was sold. Then Uno began to spend his summers in Pärnu, where he rented a room in a house in Toominga (Wild Cherry) Street in the beach area. He liked the arrangement very much and spent at least five years there.
Academic career. Uno Kaljulaid studied at Tartu University 1959–1963. But already in 1959, prior to his entering the university, he attracted general attention by participating in the All-Estonian Mathematical Olympiad, finishing in an honorable fourth place. This was a turbulent time in Estonian mathematics, as the old professors (Jaakson, Rägo, Sarv) were all about to retire. The leading mathematician at the mathematics department of Tartu was then Gunnar Kangro (1913–1975), who opened up a new direction, summation theory, and attracted many good students⁴ there. After four years of study Uno was transferred to the Mechanical and Mathematical Faculty of Moscow University. He got his diploma in algebraic geometry under Yuri Manin in 1966, but he was never formally Manin’s “aspirant”, several applications by him being turned down (cf. below). His post-graduate studies were again done at Tartu University, in 1968–1972. As follows from the comments by J.-E. Roos on his diploma work [K69a], some problems that were then open have now been settled. The advisor of his Candidate thesis was Professor Boris Plotkin (then at Riga, now in Jerusalem). The defence took place on March 11, 1979, at the Mathematical Institute of the Belorussian Academy of Sciences in that country’s capital Minsk, with Zenon I. Borevich and Alex E. Zaleskiĭ as official opponents.
Uno Kaljulaid taught at Tartu University from 1972 on, first 1972–1974 as an Assistant Professor and then 1974–1983 as an Associate Professor. He was made a Docent in 1983. From 1993 on he did scientific work and provided consultative service at the Computer Science Institute of the Department of Mathematics of Tartu University. Simultaneously, Kaljulaid was a part-time senior research fellow at the Institute of Cybernetics in Tallinn, where he carried out studies on the compositional theory of abstract state machines with memory.
Fig. 1: Uno Kaljulaid – a student in Tartu
³ Kõpu, a small village (population: 372 in 2000) situated on the highway connecting Viljandi and Kilingi-Nõmme, first mentioned in 1481. [1], 12, p. 264.
⁴ In 1940/41, Kangro wrote a long paper on summation theory (100 p.). It appeared in the Acta in 1942; the author had, in 1941, been drafted into the Red Army and then deported to Russia. [2], p. 16.
Fig. 2: Boris Isakovich Plotkin, supervisor of Uno Kaljulaid
Scientific work. Teaching. Students. Uno Kaljulaid’s scientific output is, nominally, not large. Much is in the form of short papers, often merely research announcements. The bibliography below sets the number of items printed during Kaljulaid’s lifetime at some 40. According to MathSciNet he has 27 reviewed papers in Mathematical Reviews. Searching there for Anywhere: Kalju∗ gave, somewhat surprisingly, 124 hits, indicating that Uno Kaljulaid, after all, was quite influential. To some extent this high figure can be accounted for by the fact that it also comprises reviews written by Kaljulaid. On the other hand, MATH Database lists 14 items covered by Zentralblatt.
The first printed paper by Uno Kaljulaid seems to be [K69a], and it visibly represents, although we are not explicitly told this, his diploma work at Moscow. It is about algebraic geometry in a rather abstract style (Serre, Grothendieck), to wit about the cohomological dimension of algebraic varieties. This is what Professor Manin wrote to me when he learnt about the untimely death of Uno:
He was a student at the Algebra Chair of Moscow University. For some time, I was nominally his advisor, however, he always had his own scientific interests. I remember his mild smile and gentle speech. He was deeply interested in mathematics and enthusiastic about it. During the last decade or so I received a couple of letters and postcards from him. He was explaining what he was doing mathematically and usually added just a few words about life, which so drastically changed for many of us. I will miss him.
Although Uno Kaljulaid was a dedicated mathematician and wholly absorbed by his subject, he also had wide interests outside mathematics. We have already recorded his passion for the lovely Estonian cranberries. During his Moscow days he also fell in love with ballet.
After his sojourn at Moscow University, Uno Kaljulaid did one year of military service in the same city. Having returned to Estonia in 1967, he worked for some time with Professor Jaak Hion⁵ as supervisor. However, he was soon attracted to the theory of representations, especially of semigroups and algebras, and so his thesis advisor became, at least unofficially, Professor Plotkin, at this time one of the leaders in this area. His Candidate thesis [K79a] (in typescript and written in Russian) has been translated into English and is printed here for the first time. An “author’s review” of the thesis [K79b] is likewise included here. For a very brief overview we also refer to a note in Uspekhi Matematicheskikh Nauk [K77a]. Furthermore, some preliminary results later covered in [K79a] were presented in separate publications, prior et posterior (see e.g. [K71b,71c,73c,76,77b,78a,78b,81,82a,82b,83a,83e,85a], not reproduced here).
Fig. 3: Military service 1967
The following lines were written, on my request, by Professor Plotkin about his contacts with Uno:
Uno Kaljulaid was not only my pupil but also a very close friend. Our contacts started in the end of 60-ties, when I used to come to Tartu from Riga with talks and lectures. That time the mathematical life in Tartu was rather active. One of the most popular activities was Summer Mathematical Schools in Kääriku. In Kääriku there was a base of Tartu University and mathematicians enjoyed this place where mathematical discussions could be combined with rest, beautiful nature and conversations. I remember that these conferences were made possible due to [Jaak] Hion, Mati Kilp, [Ülo] Lumiste and other mathematicians from Tartu. At the beginning of 70-ties my interest was focused on the varieties of group representations. This topic attracted attention of Uno. Soon after he asked me to give him a problem for his [Candidate] thesis.
I recommended him to build a similar theory for representations of semigroups. In this case I took into account that Uno [had] already studied semigroups for a long time. Simultaneously I proposed him some problems about group representations. Uno managed to prove a series of significant results and in the end of the 70-ties he brilliantly defended his [Candidate] thesis at the Institute of Mathematics in Minsk. His work was highly appreciated by the reviewers and the Council members. Along with great results achieved by Uno, I should mention that he had deep and wide mathematical background. Uno has graduated from Moscow State University, where he got his education from outstanding teachers. For example, I know that during his university years he collaborated with Yu. I. Manin. I think Uno took great advantages from his education in Moscow University and the wide style of mathematical thinking can be traced in all his works during his mathematical career. During the period of preparation of the thesis Uno frequently visited me in Riga. Also later he used to come to discuss various problems. Methods, elaborated in the thesis, were extended and used in the automata theory. We considered automaton as a three-sorted mathematical system which possesses algebraic operations converting states to states and states to output signals. The system of input signals naturally constitutes a semigroup with the representation on the set (space) of states. This algebraic point of view on automata turns out to be very fruitful. Last years he collaborated with his pupil Olga Sokratova and other pupils in automata theory. I think that they could give useful information about his last works. I am sure that your activity in commemorating the memory of Uno Kaljulaid will be appreciated by mathematicians.
⁵ Born in 1929, Hion got his Candidate degree in Moscow in 1955 under A. G. Kurosh, an outstanding algebraist mainly known for his work in group theory.
Fig. 4: Participants of Summer School in Kääriku 1966: V. Vagner, J. Hion, E. Lyapin, L. Shevrin, L. Gluskin and B. Plotkin
Fig. 5: Mati Kilp and Uno Kaljulaid on their way to Moscow 1964
I find it curious that two men thus, independently, first declare that Uno Kaljulaid was not their pupil, but otherwise give him all the praise that they can! This shows that Uno was an independent mind already early on. There is, however, one person in Tartu who influenced him quite a lot. This is Hion, who should also be considered the founder of the Estonian school of algebra. So, maybe he should after all be viewed as the true teacher of Uno Kaljulaid! Later he became, undoubtedly inspired by this, interested in automata theory. Already in [K69a] there is a brief treatment of at least linear automata. Indeed, automata theory became his main occupation in the last decade of his life.
With his unusually broad mathematical education, Kaljulaid also took a keen interest in the history of mathematics. In particular, he wrote several papers (see this Volume, Chapter V, in particular the last one) about Theodor Molien (1861–1941; born in Riga of Swedish descent, he studied in Dorpat/Tartu and was a docent there, later in Siberia), who was a pioneer in the field of algebras but is relatively little known to the general mathematical public, despite the fact that he influenced, for instance, Emmy Noether, who also duly quoted him. Kaljulaid was also interested early on in combinatorics, which is treated here in Chapter IV. It is my guess that it was through teaching that he was led to this subject.
Among the research students of Uno Kaljulaid I mention Annela Kelly (née Rämmer), Peeter Laud, Riina Miljan, Jaan Penjam, Tiit Pikkmaa, Varmo Vene, Tiina Zingel (née Nirk).
My recollections of Uno Kaljulaid. I first met Uno Kaljulaid during a trip to then still Soviet-occupied Estonia in the spring of 1989. This was at a meeting of the Estonian Mathematical Society, which took place at Klooga-Ranna, a seaside resort a few miles west of Tallinn and not far from Paldiski, which at the time was a base for Soviet submarines. (The conflict about submarines with Sweden was going on. “There they are, the submarines, which you cannot catch”, I was told, and people pointed across the bay.) At that meeting, Kaljulaid gave a talk on combinatorics. After the talk I had a discussion with him and told him about my own experience of this subject. It ended with my inviting him to Sweden.
Kaljulaid came to Stockholm in the spring of 1990. I had rented a room for him in the apartment of Bertil Eneroth, Civil Engineer, at Sibyllegatan 38 in the district of Östermalm, where I housed many of my guests during my Stockholm years⁶. He gave a talk at the algebra seminar run by Jan-Erik Roos at Stockholm University. This was the year when I directed, jointly with Svante Janson, a program at the Mittag-Leffler Institute devoted to Hankel theory. So I also invited him there one day. He had supper with me and my betrothed Eila, in the company also of Marcel Grossman from Marseille. At the same time Uno also went to Lund, where he met Lars Hörmander and his team of bright young Russian students.
Our contacts continued later by correspondence. Uno Kaljulaid wrote me numerous letters, to which I responded less frequently. Much of this correspondence is preserved, but some has, regretfully, been lost, especially most electronic messages. Corresponding with him was not easy. He told me about his ideas, gave bibliographical information⁷, often quite useful, and wrote about his travels, the meetings he had been to, and people that he had met . . . Often he wrote several letters, one on top of the other. Despite my reprimands, they were often undated, so it was not always clear in which order they ought to be read; now, afterwards, this makes identification quite complicated. Sometimes, apparently short of paper, he wrote numerous postscripts and supplements on odd postcards. He admired me very much and never stopped thanking me for having invited him and for other support.
Nevertheless, I think that this collection – I have it all stored in a special, rather thick binder – gives a vivid picture of his thoughts and scientific activities. However, Kaljulaid often sent me odd items, such as excerpts from local newspapers, which I found of no interest. He also sent me a number of gifts at various times. Among these I value especially highly a copy of the Estonian translation of Johann Renner’s chronicle [6], which covers the highly dramatic period 1555–1561 in the country’s history, when the Swedes under Erik XIV established themselves in the turbulent country⁸.
As a person Uno Kaljulaid was rather complex. He was always very friendly, and utterly polite, at least to me. Many mathematicians, at least among my Swedish colleagues, took a liking to him, and so the news of his untimely death came as a shock to everybody. In a way he was a maniac. He belonged to the category of mathematicians for whom there was no life outside mathematics. I am not a psychiatrist, but my diagnosis is that he suffered from a kind of persecution mania. I once called, in desperation, Vainikko (then at Helsinki) about this, but he showed little understanding; some of the things that Kaljulaid had told him also turned out not to be true (that obstructions were made to him when he left Tartu etc.). Already at the very beginning of our acquaintance Kaljulaid began to worry that some of his letters could have been intercepted. This was still in Soviet times, but such allegations continued throughout the period of our relation. Let me relate only one such episode, which is supposed to have taken place during one of his stays at Lund (cf. infra). Namely, Kaljulaid claimed that, in our Department’s coffee room, some Swedes, speaking in Swedish, had slandered him in his presence. With my knowledge of Swedes and Swedish mentality, I find this highly improbable, especially as I have doubts about Kaljulaid’s ability to understand spoken Swedish. Also, many people here liked him; among them was Anders Melin – I am not sure if he was supposed to have been present on the occasion referred to above; it was also Melin who first suggested to us to make an application to the Crafoord Foundation (see again infra). Kaljulaid also told me of several other incidents, about various acts of persecution against him, which I found more or less credible. On these occasions his whole attitude suddenly changed and his voice dropped almost to a whisper, although there could be nobody nearby who could overhear our conversation in Estonian; to me he then looked more like an old woman telling gossip. Once I wrote to him and advised him to go to the Rector of the University and complain; afterwards I realized that, although this could have been a logical step in Sweden, it could hardly have been a good idea in post-communist Estonia. I doubt that Kaljulaid followed my advice.
⁶ Others who stayed at the same apartment, at various times, include Fernando and Luz Cobos, Genkai Zhang; the last was probably Gennadi Vainikko.
⁷ For instance, he gave me precious references of vital importance for my work on trilinear forms, by pointing to work by V. V. Dolotin, I. M. Gel’fand et al.
⁸ Johannes Renner, German man of law (c. 1525–1583), lived 1556–1561 in Estonia and was in the service of the Teutonic Order. He witnessed at close quarters the early phases of the devastating Livonian War (1558–1583). The chronicle was completed in 1582, a year before its author’s death, but the ms. of Renner’s book was lost for about two centuries, and so the book appeared in print only in 1876. Nowadays it is regarded as a classic in Low German, which was the official language of Livonia (Estonia + Latvia) for centuries, until the beginning of the 18th century.
After my return to Lund in 1992, I arranged Uno Kaljulaid a second visit to Sweden with money coming from the Swedish National Council for Scientific Research (NFR); again, he visited both Stockholm and Lund. To Lund he came in September 1994. It was on this occasion that we set up a plan to study majorization from a very general point of view. However, only a tiny portion of our rather ambitious plan was ever materialized (see Chapter III); it is clear that I wrote the first version of that paper already then. We made also plans for future cooperation; to this end we applied, in 1995, for a grant from the Crafoord Foundation, and, indeed, we were given a rather handsome sum of money, which allowed Kaljulaid to come to Lund several times. So, Kaljulaid arrived again in Lund at the end of September 1994. By the irony of fate, he came the week before the Estonia catastrophe9, so, had he come only slightly later, he could well have been one of the victims. I recall that Eila and I heard about it by 6 o’clock in the morning by early, Finnish language broadcast on the Swedish radio. I immediately phoned Uno, who was staying in one of our Department’s guestrooms. We were both, of course, utterly shocked, and I reminded him about all the Estonian refugees, often in tiny vessels, who had drowned in similar weather conditions in the same month in September, 1944. Anyhow, soon we went on with mathematics. Kaljulaid gave several colloquium talks on automata theory; they were based on material which he had prepared on previous occasions. So I volunteered to write them up for him (see Chapter II, Section 2). Having learnt about his inability in practical matters, I saw it
⁹ The passenger vessel Estonia, owned by a joint-venture Swedish-Estonian company on the line Tallinn–Stockholm, perished near the Finnish coast, on September 28, 1994, in one of the fierce autumn storms in the Baltic. On this occasion, 869 persons were killed.
BIOGRAPHY OF UNO KALJULAID
as my duty to try to help him publish at least part of his ideas. Probably, I prepared a TeX version of Lecture 1 already while he was in Lund. The next time that Uno Kaljulaid came to Sweden was the year after, in October, 1996. We then made plans for another visit. This time we made an application to the Swedish Institute (SI), which also included a visit for Kaljulaid’s bright student Peeter Laud; I was supposed to have become his advisor. Unfortunately, the application was turned down. Later Laud showed interest in more applied things and defended his PhD thesis [3] on information security matters in 2002. An even shorter, last trip was in May 1997. After that time (during the last two years of the life of Uno Kaljulaid), my contacts with him were even more sporadic. I wrote Lecture 2. Uno sent me corrections and additions, and also some material for Lecture 3. Rereading our correspondence from this period, I find it striking that he showed relatively little interest in the whole project. On my side, I also took almost no initiative, as I was busy with teaching and other activities . . .
Marriage. Uno Kaljulaid married in 1973. His future wife Helle was a technical assistant at the mathematics department. From this marriage two daughters were born, Annika in 1974 and Kristina in 1979. In the mid 1980’s the parents divorced, but they never separated.
Illness and death. Uno Kaljulaid became ill already at the end of 1987 and had surgery for stomach cancer. At the time doctors gave him at most five years to live. However, he was practically rather healthy until the middle of July, 1999. He worked and went jogging every morning. Until the middle of July he rested in his beloved Pärnu, but then he began to cough and gradually felt less and less at ease. Nevertheless, at the end of July he participated in a conference in Poland, and, probably, also gave a talk there. Upon his return he, finally, went to see a doctor, because his health had seriously begun to deteriorate. In mid-September he underwent another surgery, but its purpose was only to set a diagnosis: a cancer in the stomach with remote metastases in the lungs and in the liver. After the operation Uno said that he would not surrender so easily and that he hoped to be able to finish at least the ongoing work. A few days before his death, however, he asked that all should be finished. Luckily he did not suffer from heavy pain, but still it was very hard. Uno Kaljulaid passed away at the age of 57 on September 26, 1999 in the pulmonary clinic at Tartu. Annika wrote me that it was a sunny autumn day. He died in the arms of his half-sister Laine. He was buried on October 1st at his native village Kõpu in the district of Viljandi. His death was that of a true hero . . . In the meantime, I was quite unaware of everything. Early in June I received two postcards from Uno, dated in Tartu on June 4, 1999, and, apparently, sent in the same enclosure, of whose text I hereby offer a translation:
Dear Mr. Peetre,
Thank you for sending me the thesis of Mr. Rosengren¹⁰, and likewise for your lines. This time everything arrived in unhurt shape, although with some delay. I have now finished my courses, and very soon I shall also finish the exams. But this occupation gives me steadily less and less satisfaction. Probably I’ll have a chance to participate in the CS conference in Uppsala. But I have not yet made up my mind whether to go there or not, because its scope covers only a few of my interests. But it would be an opportunity to see Stockholm once more. Spring here was chilly, frost took the flowering of the currant bushes. Probably things were not so bad where you are – for Lund is on the latitude of Latvia or even further south. I presume that you are already by the sea, I wish a pleasant summer.
Uno Kaljulaid
I was notified about Kaljulaid’s death, three days after, in an email message from his daughter Annika. She also gave me the above details of his illness and death. Furthermore, she told me that at his sickbed her father had said that he wanted me to take care of his “Nachlass”, which I eventually did . . . So all this is just my tribute to him . . .
References (including two articles, [2] and [5], in Estonian, commemorating Uno Kaljulaid)
[1] Eesti Entsüklopeedia 1–14 + Supplementary Volume. (Estonian Encyclopedia.) Tallinn, 1985.
[2] Mati Kilp. Uno Kaljulaid 21.10.1941–26.09.1999. In: Annual, 1999. Eesti Matemaatika Selts (Estonian Mathematical Society), Tartu, 2001, 111–114.
[3] Peeter Laud. Computationally Secure Information Flow. Ph.D. Thesis. Universität des Saarlandes, Saarbrücken, April 2002.
[4] Ü. Lumiste and J. Peetre. Edgar Krahn, A centenary volume 1894–1961. IOS Press, Providence, Rhode Island, 1994.
[5] Rein Prank. Remembering Uno Kaljulaid. In: Annual, 1999. Eesti Matemaatika Selts (Estonian Mathematical Society), Tartu, 2001, 119–123.
[6] J. Renner. Liivimaa ajalugu 1556–1561 (History of Livonia). Translated by Ivar Leimus and Enn Tarvel. Olion, Tallinn, 1995.
¹⁰ Hjalmar Rosengren defended his PhD thesis Multivariable orthogonal polynomials as coupling coefficients for Lie and quantum representations on May 6, 1999.
Bibliography of Uno Kaljulaid
Many works of Uno Kaljulaid have been published in the Estonian journals:
1. Matemaatika ja Kaasaeg is a now extinct, popular-scientific Estonian language journal, whose name is here translated as Math. and Our Age.
2. Eesti Teaduste Akadeemia Toimetised, Füüsika-Matemaatika = Proceedings of the Estonian Academy of Sciences, Physics–Mathematics (Proc. Estonian Acad. Sci. Phys. Math.), founded in 1951/52 by Jüri Nuut (1894–1952).
3. Tartu Ülikooli Toimetised = Acta et commentationes Universitatis Tartuensis (Acta Comm. Univ. Tartuensis).
As a rule, papers in the last two journals were published in Russian and supplied with a short abstract in English and in Estonian. Below, the rare exceptions, when the article is in English and the abstracts are in other languages (or missing), are pointed out.
N.B. – A star * in front of a paper means that the item in question has not been reprinted in this Volume. A double star ** indicates that it will be available on the Senior Editor’s web page: http://www.maths.lth.se/matematiklu/personal/jaak/engJP.html
[K68a] On the geometric methods of Diophantine Analysis, I and II. Math. and Our Age, 14; 15 (1968), 22–30; 3–13.
[K68b] Lenin prize for work in Diophantine geometry. Math. and Our Age, 14 (1968), 108–110.
[K69a] On the cohomological dimension of some quasiprojective varieties. Proc. Estonian Acad. Sci. Phys. Math., 18 (1969), 261–272 (incl. loose errata).
[K69b] On the geometric method of Diophantine Analysis, III. Mathematics and Our Age, 16 (1969), 20–26.
[K69c] The history of solving equations. Mathematics and Our Age, 16 (1969), 122–140.
[K70] Additional remarks on groups. Mathematics and Our Age, 17 (1970), 7–22.
*[K71a] On the absence of zero divisors in certain semigroup rings. Acta Comm. Univ. Tartuensis, 281 (1971), 49–57.
*[K71b] On the powers of the augmentation ring of the integral group ring for finite groups. Acta Comm. Univ. Tartuensis, 281 (1971), 58–62.
*[K71c] On the absence of zero divisors in some semigroup rings. In: Abstracts of the All Union Colloquium of Algebra, Kishinev, 1971, pp. 138–139 (Russian).
[K73a] Polynomials and formal series. Mathematics and Our Age, 19 (1973), 39–47.
*[K73b] 80 years from the birth of Villem Nano. Math. and Our Age, 19 (1973), 118–122 (coauthors: E. Tamme, R. Kruus).
*[K73c] On the powers of the augmentation ideal. Proc. Estonian Acad. Sci. Phys. Math., 22 (1973), 3–21.
[K75a] On Galois theory. Mathematics and Our Age, 20 (1975), 17–31.
[K75b] Theory of automata. Mathematics and Our Age, 20 (1975), 32–47 (coauthor: E. Tamme).
*[K76] On wreath type constructions for algebras. In: Abstracts of the Third All Union Symposium of Rings, Algebras and Modules, Tartu, 1976, pp. 49–50 (Russian).
[K77a] Triangular products of representations of semigroups and associative algebras. Uspehi Mat. Nauk 32 (1977), no. 4/196, 253–254 (Russian).
*[K77b] Remarks on the varieties of semigroup representations and automata. Acta Comm. Univ. Tartuensis, 431 (1977), 47–67.
*[K77c] Remarks on the course on discrete mathematics. Proc. of the III Regional Conference-Seminar of Leading Departments and Leading Lecturers of Mathematics, Minsk, 1977, p. 50 (Russian).
*[K78a] A remark on the basis of identities of an algebra of upper triangular matrices. In: Materials of Conf. “Methods of Algebra and Functional Analysis”, Tartu, 1978, pp. 105–107 (Russian).
*[K78b] Triangular products and group rings. Vestn. Moskov. Univ. Mat., no. 6, 1978, p. 81 (Russian).
[K79a] Triangular products and stability of representations. Candidate dissertation. Tartu University, 1979, 150 pp. (Russian, typescript).
[K79b] Triangular products and stability of representations. Author review of Candidate thesis in Physico-Mathematical Sciences [K79a]. Minsk, 1979, 13 pp. (Russian).
*[K79c] The arithmetics of varieties of representations of semigroups and algebras. Manuscript, deposited at VINITI, no. 344–78; “Matematika” 2AI36 DEP, 1979, 42 pp. (Russian).
*[K81] About semigroup actions. Acta Comm. Univ. Tartuensis, 556 (1981), 27–32.
*[K82a] Terminals of groups and stability of representations. Acta Comm. Univ. Tartuensis, 610 (1982), 15–25.
**[K82b] A lower bound for the terminal of certain groups. Acta Comm. Univ. Tartuensis, 610 (1982), 26–37.
[K83a] Triangular products representations of linear semigroups actions. Acta Comm. Univ. Tartuensis, 640 (1983), 13–28.
*[K83b] A remark on Stirling numbers. In: Sb. “Komb. Analiz”, 6 (1983), p. 98 (Russian).
*[K83c] Elements of discrete mathematics. Tartu University Press, Tartu, 1983, 100 pp. (Estonian).
*[K83d] Lattices and combinatorics – a problem book. Tartu University Press, Tartu, 1983, 27 pp. (Estonian).
*[K83e] On the freedom of the semigroup of special ideals. In: Abstracts of the conference “Methods of algebra and analysis”, Tartu, 1983, pp. 10–12.
**[K85a] Unique factorization of varieties of semigroup representations. Acta Comm. Univ. Tartuensis, 700 (1985), 17–31.
[K85b] Remarks on subcommutant rings. In: XVIII All Union Algebraic Conference, Abstracts of talks. Kishinev, 1985, p. 227.
[K85c] On two results on strongly regular rings. In: Proc. of the Conference “Theoretical and applied questions of mathematics”, Abstracts of talks, Tartu, 1985, pp. 67–69.
[K87a] Some remarks on Shevrin’s problem. Acta Comm. Univ. Tartuensis, 764 (1987), 30–38 (English).
*[K87b] On the theory of vacuum deposition of layer on the rotating cylindrical substrate. Acta Comm. Univ. Tartuensis, 779 (1987), 127–136 (coauthor: J. Lembra).
*[K87d] Theodor Molien and group algebras. In: Development of schools, ideas and theories in natural sciences at Tartu University, Tartu, 1987, pp. 16–24 (Estonian).
[K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics. In: Development of schools, ideas and theories in Natural Sciences at Tartu University, Tartu, 1987, pp. 111–119 (Russian).
[K88a] On Stirling and Lah numbers. In: Methods of algebra and analysis. Tartu, 1988, pp. 11–14 (Russian).
[K88b] Fibonacci numbers of outer planar graphs. In: Methods of algebra and analysis, Tartu, 1988, pp. 15–17 (Russian, coauthor: T. Pikkmaa).
*[K88c] On the theory of vacuum deposition of layer on a rotating cylindrical substrate from an asymmetrically located source. Acta Comm. Univ. Tartuensis, 830 (1988), 127–136 (coauthor: J. Lembra).
[K90] Transferable elements in group rings. Acta Comm. Univ. Tartuensis, 878 (1990), 39–52.
*[K93a] Algebraic theory of tape-controlled attributed automata. Research Report CS59/93, Institute of Cybernetics, Tallinn, 1993, 28 pp. (coauthors: M. Meriste, J. Penjam).
*[K93b] Analytical and algebraic methods in combinatorics. Tartu University Press, Tartu, 1993, 159 pp. (Estonian, coauthor: Ü. Kaasik).
*[K93c] Mordell’s problem. Estonian Mathematical Society. Annual 1988, Tartu University Press, Tartu, 1993, pp. 128–151, 178, 182 (Estonian, summary in English and Russian).
*[K93d] Languages, tools and methods of conceptual modelling. Research Report CS61/93, Institute of Cybernetics, Tallinn, 1993, 49 pp. (coauthors: M. Meriste, J. Penjam et al.).
[K96] On two discrete models in connection with structures of mathematics and language (the languages of life). Schola Biotheoretica XII, Tartu, 1996, pp. 84–95 (Estonian).
[K97a] On two algebraic constructions for automata. Research Report CS92/97, Institute of Cybernetics, Tallinn, 1997, 27 pp. (coauthor: J. Penjam).
*[K97b] Categories, automata and splicing systems. In: Proc. of 9th Nordic Workshop on Programming Theory, Tallinn, 1997, p. 47 (coauthor: J. Penjam).
*[K98a] Flatness and localizations of Ω-semigroups. Research Report CS96/98, Institute of Cybernetics, Tallinn, 1998, 49 pp. (coauthor: O. Sokratova).
*[K98b] Does there exist a (non-abelian simple) linearly right-orderable group all of whose proper subgroups are cyclic? In: Kourovka Notebook, 14th augmented edition, problem 14.45, Novosibirsk, 1999, p. 110.
[K98c] Revisiting wreath products, with applications to representations and invariants. In: Yu. A. Bahturin, A. I. Kostrikin, A. Yu. Ol’shanskiĭ (eds.), Kurosh Algebraic Conference, Abstract of talks, Moscow University Press, Moscow, 1998, pp. 64–65.
[K98d] Right order groups; their representations, structure and combinatorics. Manuscript, 37 pp.; 2nd version (1998) (to be submitted).
[K00] Ω-rings and their flat representations. In: Contributions to General Algebra 12, Verlag Joh. Heyn, Klagenfurt 2000, 377–390 (coauthor: Olga Sokratova).
CHAPTER I Representations of semigroups and algebras
1. [K69a] On the cohomological dimension of some quasiprojective varieties
Comments by J.-E. Roos
Abstract. We prove that the cohomological dimension of the complement of an arbitrary finite set of points in an $r$-dimensional Cohen-Macaulay projective variety equals $r - 1$.
The problem of the computation of the cohomology of quasiprojective varieties with coefficients in coherent sheaves leads, in particular, to the interesting question of the cohomological dimension of such varieties. This characteristic of a variety interests us, in the first place, in connection with a result by Nagata [7] to the effect that every algebraic variety can be embedded in a complete algebraic variety. As simple examples show, this embedding $V \to V^*$ far from always satisfies the requirement of the minimality of the number $\dim(V^* \setminus V)$. It is an interesting problem to exhibit all the cases when this number can be described in terms of the cohomological dimension of the complement $V^* \setminus V$. In this paper one such case is described in Theorem 1.2. Section 1.1 contains a brief survey of some known, but not readily available results of the theory of local cohomology of A. Grothendieck in a form suitable to us. In Section 1.2 we state some general properties of cohomological dimension. In Section 1.3 it is shown that the cohomological dimension of the complement of a finite non-empty set of points in an $n$-dimensional projective space equals $n - 1$, and in Section 1.4 we give some auxiliary computations.
1.1. The local cohomology of Grothendieck
1. We give some basic definitions. The space $X$ has cohomological dimension $n$ if, for an arbitrary algebraic sheaf $F$ on $X$ and for $i > n$, the group $H^i(X, F)$ is zero, but there exists a sheaf $F$ such that $H^n(X, F) \neq 0$. According to Grothendieck ([3], Theorem 4.15.2) a space of combinatorial Zariski dimension $\leq n$ has cohomological dimension $\leq n$. On the other hand, there exists a space of infinite combinatorial dimension but having zero cohomological dimension [3]. For algebraic varieties $X$ we change the definition of cohomological dimension, considering instead of Abelian sheaves on $X$ the category of coherent sheaves. Then the affine varieties give us examples of Zariski spaces of arbitrarily large combinatorial dimension which nevertheless have zero cohomological dimension.
2. If $Z \subset X$ is locally closed, then by definition one can find an open set $V \subset X$ such that $Z$ is closed in $V$. In the group $F(V)$ of sections of $F$ on $V$ we distinguish the subgroup $\Gamma_Z(X, F)$ of all those sections whose supports are contained in $Z$. The group $\Gamma_Z(X, F)$ is independent of the choice of $V$, and the functor $F \Rightarrow \Gamma_Z(X, F)$
maps an exact sequence of sheaves $0 \to F \to G \to H$ into an exact sequence of Abelian groups $0 \to \Gamma_Z(F) \to \Gamma_Z(G) \to \Gamma_Z(H)$. This means that the functor $F \Rightarrow \Gamma_Z(X, F)$ is left exact from the category of Abelian sheaves on $X$ into the category of Abelian groups. If $U \subset X$ is open, then the natural restriction homomorphism $F(V) \to F(V \cap U)$ induces a homomorphism $\Gamma_Z(X, F) \to \Gamma_{Z \cap U}(U, F|_U)$; the resulting presheaf $U \mapsto \Gamma_{Z \cap U}(U, F|_U)$ is indeed a sheaf, denoted $\underline{\Gamma}_Z(F)$. The functor $F \Rightarrow \underline{\Gamma}_Z(F)$ is left exact from the category of Abelian sheaves into itself; we define the right derived functors $\mathcal{H}^i_Z(X, F)$ of this functor, which are called the sheaves of local cohomology of $X$. The groups $H^i_Z(X, F)$ are defined likewise as the right derived functors of $F \Rightarrow \Gamma_Z(X, F)$. Let $X$ be an $r$-dimensional Zariski space, $F$ an Abelian sheaf on it and $Z \subset X$ locally closed. Grothendieck’s theorem ([5], Proposition 1.12) says that for $i > r$ the groups $H^i_Z(X, F)$ and the sheaves $\mathcal{H}^i_Z(X, F)$ are zero.
3. Let $X = \operatorname{Spec} A$ be an affine scheme, $Y$ one of its subschemes, given by an ideal $I \subset A$, and $F$ the sheaf of coefficients associated with the $A$-module $N$. Then one has for all $i > 0$ the isomorphism ([5], Theorem 2.8)
$$H^i_Y(X, F) \approx \varinjlim_n \operatorname{Ext}^i_A(A/I^n, N).$$
For each closed $Y \subset X$ and a coherent sheaf $F$ on $X$ one has the exact sequence
$$0 \to \Gamma_Y(X, F) \to \Gamma(X, F) \to \Gamma(X \setminus Y, F) \to H^1_Y(X, F) \to \dots \to H^i_Y(X, F) \to H^i(X, F) \to H^i(X \setminus Y, F) \to H^{i+1}_Y(X, F) \to \dots .$$
As in the case at hand $H^i(X, F) = 0$ for all $i > 0$, we have the isomorphism $H^i(X \setminus Y, F) \approx H^{i+1}_Y(X, F)$. Next, let $X$ be an $r$-dimensional projective space and $S = k[t_0, \dots, t_r]$ the algebra of polynomials over the field $k$. We take for $F$ the sheaf $O(n)$. Then, by Serre [11], for $0 < i < r$ the groups $H^i(X, O(n))$ are zero, while the group $H^r(X, O(n))$ is a vector space over $k$ of dimension $\binom{-n-1}{r}$ and has a basis of skew symmetric cocycles of the cover $\mathcal{U} = (t_i \neq 0)$ of the form
$$f_{01 \dots r} = \frac{1}{t_0^{\alpha_0} \cdots t_r^{\alpha_r}},$$
where $\alpha_i > 0$ and $\sum \alpha_i = -n$. Therefore we have for $0 < i < r - 1$ the isomorphism $H^i(X \setminus Y, O(n)) \approx H^{i+1}_Y(X, O(n))$, while, by definition, the groups $H^r(X \setminus Y, O(n))$ are given by the exact sequence
$$H^r_Y(X, O(n)) \to H^r(X, O(n)) \to H^r(X \setminus Y, O(n)) \to 0.$$
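The dimension count quoted from Serre can be checked numerically for small cases. The following is a minimal sketch, not part of the original paper: it enumerates the exponent vectors $(\alpha_0, \dots, \alpha_r)$ with $\alpha_i > 0$ and $\sum \alpha_i = -n$ and compares their number with $\binom{-n-1}{r}$. The function names `count_basis_cocycles` and `serre_dimension` are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import comb


def count_basis_cocycles(r: int, n: int) -> int:
    """Count exponent vectors (a_0, ..., a_r) with every a_i >= 1 and sum equal to -n.

    Each such vector labels one basis cocycle 1 / (t_0^a_0 ... t_r^a_r).
    Brute-force enumeration, intended for small r and |n| only.
    """
    total = -n
    if total < r + 1:  # r + 1 positive integers sum to at least r + 1
        return 0
    return sum(
        1
        for alpha in product(range(1, total + 1), repeat=r + 1)
        if sum(alpha) == total
    )


def serre_dimension(r: int, n: int) -> int:
    """Binomial coefficient C(-n-1, r), taken to be 0 when -n-1 is negative."""
    m = -n - 1
    return comb(m, r) if m >= 0 else 0


if __name__ == "__main__":
    for r in (1, 2, 3):
        for n in range(-7, 1):
            assert count_basis_cocycles(r, n) == serre_dimension(r, n)
    print("enumeration agrees with the binomial formula on the tested range")
```

The agreement is the usual count of compositions of $-n$ into $r + 1$ positive parts; the enumeration is deliberately naive so that it can be read directly against the description of the basis.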
4. Let $M$ and $N$ be graded $S$-modules. Then the derived functors Ext of the functor $\operatorname{Hom}_S(M, N) = \bigoplus_n \operatorname{Hom}^n_S(M, N)$, defined, on the one hand, by Serre in [11] and, on the other hand, by Cartan and Eilenberg in [1], need not coincide. However, it is easy to see that they do coincide in the case needed by us of $\operatorname{Ext}^i_A(A/I^n, A)$, where $A = k[t_1, \dots, t_r]$ and $I$ is the ideal in $A$ given by $Y \subset X$. Indeed, the ring $A/I^n$, as a module over itself, is also an $A$-module. As a ring $A/I^n$ is Noetherian. The submodules of $A/I^n$ are ideals in it; therefore, it follows from Hilbert’s theorem ([1], p. 32) that this module is Noetherian. But a Noetherian module over a Noetherian ring is of finite type. In this case ([11], p. 434) both definitions coincide.
Let there be given $R$-modules $A, B, A', B'$ and $R$-homomorphisms $\alpha : A' \to A$ and $\beta : B \to B'$. Introduce an $R$-homomorphism $\operatorname{Hom}(\alpha, \beta) : \operatorname{Hom}(A, B) \to \operatorname{Hom}(A', B')$ which for each $\varphi \in \operatorname{Hom}(A, B)$ is defined by $\operatorname{Hom}(\alpha, \beta)\varphi = \beta \circ \varphi \circ \alpha$. The objects $\operatorname{Hom}(A, \beta)$ and $\operatorname{Hom}(\alpha, B)$ are obtained from $\operatorname{Hom}(\alpha, \beta)$ for $A = A'$ and $B = B'$ respectively. The following theorem from homological algebra may be useful in the calculation of local groups of cohomology. Let us consider the exact sequences of modules
$$0 \to I^n \xrightarrow{\ \alpha\ } A \to A/I^n \to 0$$
and
$$0 \to A \to K \xrightarrow{\ \beta\ } K/A \to 0,$$
where $A$ is a projective and $K$ an injective module. The following isomorphisms hold true (cf. [1], p. 141):
$$\operatorname{Ext}^i_A(A/I^n, A) \approx \operatorname{Ext}^{i-2}_A(I^n, K/A);$$
$$\operatorname{Ext}^2_A(A/I^n, A) \approx \operatorname{Coker}(\operatorname{Hom}_A(\alpha, \beta));$$
$$\operatorname{Ext}^1_A(A/I^n, A) \approx \operatorname{Ker}(\operatorname{Hom}(\alpha, \beta)) / [\operatorname{Ker}(\operatorname{Hom}_A(\alpha, K/A)) + \operatorname{Ker}(\operatorname{Hom}_A(A, \beta))].$$
As by the first main theorem of Grothendieck one has the isomorphism
$$H^i_Y(X, O) \approx \varinjlim_n \operatorname{Ext}^i_A(A/I^n, A),$$
the three isomorphisms just given suffice for the calculation in a 3-dimensional space.
1.2. Some general properties of cohomological dimension
1. Let $X$ and $Y$ be algebraic varieties, $\varphi : Y \to X$ a morphism and $F$ an algebraic sheaf on $X$. Then there is defined on $Y$ an algebraic sheaf $F^\varphi$, called the inverse image of the sheaf $F$ under the morphism¹ $\varphi$. If $F$ is a coherent sheaf on $X$, then $F^\varphi$ too is coherent on $Y$. Indeed, in view of the coherence of $F$ there exists an open $U \subset X$ for which one has an exact sequence $O^p \to O^q \to F \to 0$. The homomorphism $O_x \to O_y$ induces the identity map on the base field $k$; therefore we have the canonical isomorphism $O_y \otimes_{O_x} O_x \approx O_y$.
¹ Regarding the construction of the sheaf $F^\varphi$, cf. [9].
This gives us $O_y^n \approx (O_x^n)^\varphi$, $n = 1, 2, \dots$, and so in $\varphi^{-1}(U)$ we have for $F^\varphi$ the exact sequence $O^p \to O^q \to F^\varphi \to 0$, proving the coherence of $F^\varphi$.
2. THEOREM 1.1. For arbitrary algebraic varieties $X$ and $Y$ we have the inequality
$$\operatorname{dimh}(X \times Y) \geq \operatorname{dimh} X + \operatorname{dimh} Y. \tag{1}$$
If $\dim X = \operatorname{dimh} X$, $\dim Y = \operatorname{dimh} Y$, then both sides of (1) coincide.
PROOF. Let $p_1 : X \times Y \to X$ and $p_2 : X \times Y \to Y$ be the natural projections. Furthermore, set $\operatorname{dimh} X = r$, $\operatorname{dimh} Y = s$. Then there exist coherent sheaves $F$ and $G$, on $X$ and $Y$ respectively, such that the $k$-vector spaces $H^r(X, F)$ and $H^s(Y, G)$ are not zero; therefore $H^r(X, F) \otimes_k H^s(Y, G) \neq 0$. Let us use the Künneth formula for sheaves [10]:
$$H^n(X \times Y, F^{p_1} \otimes_{O_{X \times Y}} G^{p_2}) \approx \bigoplus_{i+j=n} H^i(X, F) \otimes_k H^j(Y, G).$$
It follows from it that $H^{r+s}(X \times Y, F^{p_1} \otimes_{O_{X \times Y}} G^{p_2}) \neq 0$, whence $\operatorname{dimh} X \times Y \geq r + s$. Let us remark that for $t > r + s$ the relation
$$H^t(O^n_{X \times Y}) \neq 0$$
cannot hold true. This follows from Künneth’s formula in view of
$$O_T^n = O_T^n \otimes_{O_T} O_T = (O_X^n)^{p_1} \otimes_{O_T} (O_Y^n)^{p_2},$$
where $T = X \times Y$. In the case $\dim X = \operatorname{dimh} X$, $\dim Y = \operatorname{dimh} Y$, we obtain in view of Grothendieck’s theorem ([3], Theorem 4.15.2) that
$$\operatorname{dimh} X + \operatorname{dimh} Y \geq \dim X \times Y \geq \operatorname{dimh} X \times Y \geq \operatorname{dimh} X + \operatorname{dimh} Y,$$
which again gives $\operatorname{dimh} X \times Y = \operatorname{dimh} X + \operatorname{dimh} Y$.
3. Let $i : V \to W$ be a closed embedding of algebraic varieties. Then the relation
$$\operatorname{dimh} V \leq \operatorname{dimh} W$$
holds. Indeed, if we set $r = \operatorname{dimh} V$, then the group $H^r(V, F)$ is non-zero for some coherent sheaf $F$ over $V$. On the variety $W$ we consider the sheaf $F^W$, defined by the process of extending $F$ off the variety $V$. The required relation follows from the isomorphism $H^r(W, F^W) \approx H^r(V, F)$. We remark that for an open embedding this relation is not true. Indeed, let $V = A^2 \setminus (0)$, $W = A^2$, where $A^2$ denotes the affine plane. Then $\operatorname{dimh} W = 0$, but $\operatorname{dimh} V = 1$ (cf. Paragraph 1, Section 1.4).
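As an illustration of the equality case of Theorem 1.1, one may take two projective spaces. The following worked example is not in the original paper; it uses only the Künneth formula quoted above together with the standard fact that $H^r(P^r, O(-r-1))$ is one-dimensional over $k$.

```latex
% Worked example (added for illustration): X = P^r, Y = P^s,
% F = O(-r-1) on X, G = O(-s-1) on Y.
\begin{align*}
  H^{r+s}\bigl(P^r \times P^s,\; F^{p_1} \otimes G^{p_2}\bigr)
    &\approx \bigoplus_{i+j=r+s} H^i\bigl(P^r, O(-r-1)\bigr) \otimes_k H^j\bigl(P^s, O(-s-1)\bigr) \\
    &= H^r\bigl(P^r, O(-r-1)\bigr) \otimes_k H^s\bigl(P^s, O(-s-1)\bigr)
     \approx k \otimes_k k \neq 0 .
\end{align*}
% Only the term (i, j) = (r, s) survives, since H^i(P^r, -) = 0 for i > r
% and H^j(P^s, -) = 0 for j > s. Hence
%   dimh(P^r x P^s) >= r + s = dim(P^r x P^s) >= dimh(P^r x P^s),
% so both sides of (1) coincide, as the theorem asserts.
```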
4. We make the following conjecture: for an arbitrary fiber bundle $(E, \pi, B)$ whose fiber is the projective space $P^r$, the formula $\operatorname{dimh} E = \operatorname{dimh} B + r$ holds. If this is true, it follows from it in a trivial way that under the σ-process at a point the cohomological dimension can only increase. Let $X^*$ be a variety obtained by a monoidal transformation from a nonsingular, irreducible algebraic variety $X$ of dimension $r$. Let the center of this σ-process be a nonsingular $d$-dimensional variety $i : V \to X$. Furthermore, let $f : X^* \to X$ be the projection. The inverse image $V^*$ of $V$ under this projection is a projective fibering of rank $r - d - 1$ with base $V$. In view of the fact that the embedding $i^* : V^* \to X^*$ is closed, the hypothesis made and the monotonicity property, we obtain $\operatorname{dimh} X^* \geq \operatorname{dimh} V + r - d - 1$. In particular, for the σ-process at a point we obtain $\operatorname{dimh} X^* \geq r - 1$. As $\dim X^* = \dim X = r$, in view of known theorems (cf. Paragraph 1, Section 1.1) we obtain either $\operatorname{dimh} X^* = r$ or $\operatorname{dimh} X^* = r - 1$. Let us now take for $X$ an affine variety of dimension $r$, and for $V$ a point. It is clear that $\operatorname{dimh} X = 0$. Clearly, as $V^*$ is a projective space, $\operatorname{dimh} X^* = r - 1$. Thus for $r > 1$ we have $\operatorname{dimh} X^* > \operatorname{dimh} X$.
1.3. The cohomological dimension of a certain variety
1. Let us consider the projective space $P^r$ and an arbitrary subvariety $Y$ of codimension $\geq 2$ in it. In Section 1.1 we saw that the group $H^r(P^r \setminus Y, O(n))$ can be found from the exact sequence
$$H^r_Y(P^r, O(n)) \to H^r(P^r, O(n)) \to H^r(P^r \setminus Y, O(n)) \to 0.$$
Is the group $H^r(P^r \setminus Y, O(n))$ always different from zero? The answer to this question is negative and follows at once from the following theorem ([5], Theorem 6.8): For any quasiprojective variety $X$ of dimension $r$ the following three conditions are equivalent:
(1) all irreducible components of $X$ of dimension $r$ are non-proper;
(2) $H^r(X, F) = 0$ for any quasi-coherent sheaf $F$ on $X$;
(3) $H^r(X, O_X(-n)) = 0$ for all $n \geq 0$, where $O_X(1)$ is the very ample sheaf of Serre, induced by some projective embedding of $X$.
As $X = P^r \setminus Y$ is a quasiprojective variety of dimension $r$ (open in $P^r$), evidently irreducible and non-complete, condition (1) is fulfilled. Therefore condition (2) gives $H^r(P^r \setminus Y, F) = 0$ for every coherent sheaf $F$ on $X$.
2. THEOREM 1.2. The cohomological dimension of the quasi-projective variety $P^r \setminus Y$ obtained by removing a non-empty finite set of points $Y = \{Q_1, \dots, Q_s\}$ from the projective space $P^r$ equals $r - 1$.
PROOF. In view of the result of the previous subsection it is sufficient to find a coherent sheaf $F$ on $P^r \setminus Y$ such that the group $H^{r-1}(P^r \setminus Y, F)$ is non-zero. It turns out that one can take $F = O(n)$. We prove the relation $H^{r-1}(P^r \setminus Y, O(n)) \neq 0$ by contradiction. Assume that for each coherent sheaf $F$ the group $H^{r-1}(P^r \setminus Y, F)$ is zero. Then in the exact sequence
$$\dots \to H^{r-1}(P^r \setminus Y, F) \to H^r_Y(P^r, F) \to H^r(P^r, F) \to H^r(P^r \setminus Y, F) \to \dots$$
the boundary groups are zero, and we obtain, in particular, the isomorphism $H^r_Y(P^r, O(n)) \approx H^r(P^r, O(n))$. We use Proposition 1.9 in [5], which we reformulate in a form suitable for us. Let $Y' \subset Y \subset P^r$ be closed subspaces and $Y'' = Y \setminus Y'$. Then for the coherent sheaf $O(n)$ on $P^r$ we have the exact sequence
$$H^r_{Y'}(P^r, O(n)) \to H^r(P^r, O(n)) \to H^r_{Y''}(P^r, O(n)) \to 0.$$
By the excision formula ([5], Proposition 1.3) for a topological space $X$, a locally closed $Y \subset X$ and an open $V \subset X$ such that $Y \subset V \subset X$, one has, for each Abelian sheaf $F$ and for all $i$, the isomorphism $H^i_Y(X, F) \approx H^i_Y(V, F|_V)$. Take any point $Q$ in the set $Y = \{Q_1, \dots, Q_s\}$ and consider the sequence above with $Y'' = \{Q\}$. The point $Q$ lies in some component $A$ of the standard affine covering of the space $P^r$. We now apply the excision formula to the penultimate term of our exact sequence for the sheaf $O(n)$. Taking into account that $A$ is affine and the isomorphism $O(n)|_A = O|_A$, we get the isomorphisms
$$H^r_{Y''}(P^r, O(n)) \approx H^r_{Y''}(A, O(n)) \approx H^{r-1}(A \setminus \{Q\}, O).$$
Therefore we have the following exact sequence:
$$H^r_{Y'}(P^r, O(n)) \to H^r(P^r, O(n)) \xrightarrow{\ \alpha\ } H^{r-1}(A^r \setminus \{Q\}, O) \to 0,$$
where $\alpha$ is an epimorphism of $k$-vector spaces. Thanks to a result of Serre [11] one knows that $H^r(P^r, O(n))$ is a finite dimensional $k$-vector space. On the other hand, the computations in Paragraph 2 of Section 1.4 show that the $k$-space $H^{r-1}(A^r \setminus Q, O)$ is infinite dimensional. Therefore the exact sequence of vector spaces obtained yields the contradiction. The Theorem is proved.
In the question of the dimension of the $k$-space $H^r_Y(A, O)$, where $Y = \{Q_1, \dots, Q_s\}$, one can limit oneself to the case where $Y$ is a single point. In fact, the following proposition holds true.
PROPOSITION 1.3. Let $A$ be an $r$-dimensional variety and $F$ a coherent sheaf on $A$. If the space $H^r_Q(A, O)$ is infinite dimensional over $k$ for an arbitrary point $Q \in A$, then the relation
$$\dim_k H^r_{\{Q_1, \dots, Q_s\}}(A, F) = \infty$$
holds for an arbitrary finite family of points $\{Q_1, \dots, Q_s\} \subset A$.
PROOF. By Grothendieck [3] for $Q_1 \subset \{Q_1, \dots, Q_s\} \subset A$ one has the exact sequence
$$H^r_{Q_1}(A, F) \xrightarrow{\ \alpha\ } H^r_{\{Q_1, \dots, Q_s\}}(A, F) \xrightarrow{\ \beta\ } H^r_{\{Q_2, \dots, Q_s\}}(A, F) \to 0,$$
which we, for the sake of simplicity, rewrite in the form
$$A(1) \xrightarrow{\ \alpha\ } B(s) \xrightarrow{\ \beta\ } C(s-1) \to 0.$$
This makes it possible to carry out an induction over the number of points $s$. Let us assume that the statement is proved for $s < n$. Then in the exact sequence $A(1) \to B(n) \to C(n-1) \to 0$ the term $C(n-1)$ has infinite dimension; since $\beta$ is an epimorphism of $k$-spaces, the assumption that $B(n)$ is finite dimensional gives a contradiction. As the computation in 1.4.2 shows that
$$\dim_k H^r_Q(k^r, O) = \dim_k H^{r-1}(k^r \setminus Q, O) = \infty,$$
it follows from what has been proved that for each finite collection of points $S$ in $k^r$ the $k$-space $H^{r-1}(k^r \setminus S, O)$ is infinite dimensional.
3. The character of the facts, from [5] and [11], used in the proof of Theorem 1.2 is such that the statement of the theorem, apparently, can be carried over to the case of an arbitrary variety $V$ of dimension $\geq 2$, if it were possible to prove, for each affine variety $X = \operatorname{Spec} A$, $\dim A = r$, that the $k$-space $H^r_Q(X, O_X)$ is infinite dimensional. Clearly, $A$ may be taken to be a local ring; then everything reduces to the proof that $H^r_M(A)$ is infinite dimensional, where $M \subset A$ is a maximal ideal. As S. I. Dolgaev has observed, when all local rings of a variety $V$ are Cohen-Macaulay rings (for example, when $V$ is nonsingular or is locally a complete intersection), this easily follows from the following criterion of Grothendieck for the coherence of sheaves of local cohomology: Let $X$ be a locally Noetherian pre-scheme locally embedded in a regular pre-scheme, $Y$ a closed subvariety of $X$, $F$ a coherent $O_X$-module, $c(x) = \dim \overline{\{x\}}$, $n \in \mathbb{Z}$. The following two conditions are equivalent [4]:
(i) for all $x \in X \setminus Y$ such that $c(x) = 1$, $\operatorname{depth} F_x \geq n$;
(ii) for $i \leq n$ the sheaves $\mathcal{H}^i_Y(F)$ are coherent.
Indeed, take $X = \operatorname{Spec} O_{V,Q} = \operatorname{Spec} A$. As by assumption $A$ is a Cohen-Macaulay ring and $c(x) = \dim \overline{\{x\}} = 1$, we have $\operatorname{depth} A_x = \dim A_x = r - 1$. If the space $H^r_M(A)$ were finite dimensional, then it would follow from condition (ii) that $n = r$, from which by (i) $\operatorname{depth} A_x \geq r$, which is contradictory.
1.4. Some computations and remarks
1. Let us consider the algebraic variety $X$, obtained from the affine plane [with coordinates $(x, y)$] by exclusion of the origin; it is not affine but admits an affine cover $\mathcal{U} = (U_1, U_2)$, where $U_1 = X \setminus (x = 0)$ and $U_2 = X \setminus (y = 0)$. If $X'$ is an arbitrary variety in which the subvariety $Y$ has codimension $\geq 2$, we obtain, in view of the fact that the singularity of every rational function on $X'$ has codimension 1, that $H^0(X' \setminus Y, O) \approx H^0(X', O)$. Therefore, in this case $H^0(X, O)$ consists of all polynomials $P(x, y)$.
Let us compute the group $H^1(\mathcal{U}, \mathcal{O})$. It is clear that all cochains $f_{12} \in C^1(\mathcal{U})$ have the form
\[ \frac{P(x, y)}{x^k y^l}, \]
where $k, l$ are integers. In view of $C^2(\mathcal{U}) = 0$, all one-dimensional cochains are cocycles. The question of which cochains are coboundaries is equivalent to asking when $\frac{P(x, y)}{x^k y^l}$ can be written in the form
\[ \frac{x^{k'} P_1(x, y) - y^{l'} P_2(x, y)}{x^{k'} y^{l'}}. \]
Thus, we have
\[ H^1(X, \mathcal{O}) \approx \left\{ \frac{P(x, y)}{x^k y^l} \;\middle|\; k, l \ge 0 \right\} \Big/ \left\{ \frac{x^{k'} P_1(x, y) - y^{l'} P_2(x, y)}{x^{k'} y^{l'}} \right\}, \]
where $P, P_1, P_2$ are arbitrary polynomials, while $k', l', k, l \ge 0$. It is easy to see that this factor space is infinite dimensional. To this end we remark that all expressions $\frac{1}{x^k y^l}$ with $k, l > 0$ give different cosets:
\[ \frac{1}{x^k y^l} - \frac{1}{x^m y^n} = \frac{x^{k'} y^{l'} - x^{m'} y^{n'}}{x^p y^q}, \]
where $p = \max(k, m)$, $q = \max(l, n)$, $k' = p - k$, $m' = p - m$, $l' = q - l$, $n' = q - n$. It is sufficient to show that there exist no polynomials $P$ and $Q$ such that $x^{k'} y^{l'} - x^{m'} y^{n'} = x^p P - y^q Q$. To this end we have to consider two cases: 1) $p = k$, $q = l$ and 2) $p = k$, $q = n$. Assuming that such $P$ and $Q$ exist in the first case, we obtain $1 - x^{p-m} y^{q-n} = x^p P - y^q Q$, which is a contradiction, as the left hand side of the equation has unity among its terms, whereas every term of the right hand side is divisible by $x^p$ or by $y^q$. Analogously, in the second case the equation $y^{q-l} - x^{p-m} = x^p P - y^q Q$, where $p - m < p$, $q - l < q$, leads us to a contradiction. Thus we have proved that $\dim_k H^1(X, \mathcal{O}) = \infty$.

2. THEOREM 1.4. Let $X$ be an $r$-dimensional affine space with a point removed, defined over an algebraically closed field $k$. Then the cohomology group $H^{r-1}(X, \mathcal{O})$ is an infinite dimensional vector space over $k$.

PROOF. Consider the affine covering $\mathcal{U} = (U_i)$ of $X$, where $U_i = (x_i \ne 0)$, $i = 1, \dots, r$. As $\dim \mathcal{U} = r - 1$, all $(r-1)$-dimensional cochains are cocycles. The elements $f_{1,\dots,r} \in C^{r-1}(\mathcal{U})$ have the form
\[ \frac{P(x_1, \dots, x_r)}{x_1^{i_1} \cdots x_r^{i_r}}. \]
Let $\rho_i$ be the restriction homomorphisms, i.e.
\[ \rho_i : \Gamma\Bigl(\bigcap_{j \ne i} U_j, \mathcal{O}\Bigr) \to \Gamma\Bigl(\bigcap_{j} U_j, \mathcal{O}\Bigr). \]
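Since, by definition of the differential $d$,
\[ df = \sum_{j=1}^{r} (-1)^{j+1} \rho_j\!\left( \frac{P_j(x_1, \dots, x_r)}{x_1^{i_1(j)} \cdots \widehat{x_j} \cdots x_r^{i_r(j)}} \right) \]
(the hat indicates that the factor $x_j$ is omitted), for the computation of the group $H^{r-1}(X, \mathcal{O})$ we must clarify which expressions $\frac{P(x_1, \dots, x_r)}{x_1^{i_1} \cdots x_r^{i_r}}$ are expressible in the form
\[ \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( \sum_{j=1}^{r} (-1)^{j+1} x_1^{\alpha_1 - i_1(j)} \cdots x_j^{\alpha_j} \cdots x_r^{\alpha_r - i_r(j)} \cdot P_j(x_1, \dots, x_r) \right) = \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r) \right), \]
where
\[ \alpha_k = \max_{1 \le j \le r} i_k(j), \quad k = 1, \dots, r, \]
and where, in the last expression, the remaining monomial factors have been absorbed into the polynomials $P_j$.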
Let us denote this equivalence by $E$. We show that the factor space
\[ \left\{ \frac{P(x_1, \dots, x_r)}{x_1^{i_1} \cdots x_r^{i_r}} \right\} \Big/ E \]
is infinite dimensional over the field $k$. To this end it is sufficient to remark that, whenever there exists an index $j$ such that $i_j \ne k_j$, the expressions
\[ I_1 = \frac{1}{x_1^{i_1} \cdots x_r^{i_r}} \quad \text{and} \quad I_2 = \frac{1}{x_1^{k_1} \cdots x_r^{k_r}}, \qquad i_j > 0,\ k_j > 0,\ j = 1, \dots, r, \]
must lie in different cosets. Let us set
\[ \alpha_j = \max(i_j, k_j), \quad j = 1, \dots, r. \]
Then
\[ I_1 - I_2 = \frac{1}{x_1^{\alpha_1} \cdots x_r^{\alpha_r}} \left( x_1^{\alpha_1 - i_1} \cdots x_r^{\alpha_r - i_r} - x_1^{\alpha_1 - k_1} \cdots x_r^{\alpha_r - k_r} \right). \]
We must show that the expression within parentheses cannot be written in the form
\[ \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r). \]
Without loss of generality we can assume that there exists an index $s$ such that $\alpha_1 = i_1, \dots, \alpha_s = i_s$, $\alpha_{s+1} = k_{s+1}, \dots, \alpha_r = k_r$. The expression within parentheses takes the form
\[ x_{s+1}^{\alpha_{s+1} - i_{s+1}} \cdots x_r^{\alpha_r - i_r} - x_1^{\alpha_1 - k_1} \cdots x_s^{\alpha_s - k_s}, \tag{2} \]
where $\alpha_{s+1} - i_{s+1} < \alpha_{s+1}, \dots, \alpha_r - i_r < \alpha_r$; $\alpha_1 - k_1 < \alpha_1, \dots, \alpha_s - k_s < \alpha_s$. But these inequalities show that (2) cannot be expressed in the form
\[ \sum_{j=1}^{r} (-1)^{j+1} x_j^{\alpha_j} P_j(x_1, \dots, x_r). \]
Our statement is proven.
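The criterion used in this proof can be made concrete on Laurent monomials. The following is only a minimal sketch in Python, under the assumption (standard for the covering by the sets $x_i \ne 0$, but not spelled out in the text) that a monomial $x_1^{a_1} \cdots x_r^{a_r}$ with integer exponents is a coboundary exactly when some exponent $a_j$ is non-negative. The function names are hypothetical and chosen for illustration.

```python
from itertools import product


def is_coboundary(exponents):
    """Return True if the Laurent monomial with these exponents is a coboundary.

    Assumption of this sketch: if a_j >= 0, the monomial is regular on the
    intersection of all U_i with i != j, hence lies in the image of rho_j.
    """
    return any(a >= 0 for a in exponents)


def surviving_monomials(r, bound):
    """Exponent vectors in [-bound, bound]^r that are not coboundaries."""
    box = range(-bound, bound + 1)
    return [a for a in product(box, repeat=r) if not is_coboundary(a)]


if __name__ == "__main__":
    # For r = 2 the survivors are the classes 1/(x^k y^l), 1 <= k, l <= bound.
    for bound in (1, 2, 3):
        print(bound, len(surviving_monomials(2, bound)))
```

The number of surviving monomials grows without bound as the box is enlarged, which is in agreement with the infinite dimensionality asserted by the theorem.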
3. As has been proved by M. Kneser, in a 3-dimensional space $X$ an irreducible curve $E$ can be expressed by three algebraic surfaces, which we denote by $V_0, V_1, V_2$. In view of $E = \bigcap_i V_i$ we have for $X' = X \setminus E$ the open affine covering $\mathcal{U} = (U_i = X \setminus V_i)$ and we can apply the following theorem of Serre [11]. Let $X$ be an algebraic variety, $F$ a coherent sheaf on $X$ and $\mathcal{U} = (U_i)$ a finite affine covering of $X$. Then for each $i \ge 0$ the homomorphism $\sigma(\mathcal{U}) : H^i(\mathcal{U}, F) \to H^i(X, F)$ is an isomorphism. As $\dim \mathcal{U} = 2$, we have by this theorem $H^3(X', F) = 0$ for all coherent sheaves on $X'$.

There arises an interesting question: For an algebraic curve $E$ and a coherent sheaf $F$ on $X'$, can the group $H^2(X', F)$ be different from zero? This is connected with the conjecture on the impossibility of expressing an arbitrary curve in a 3-dimensional space by two surfaces. Indeed, we would have a proof of this negative statement if for some curve $E$ the answer to the question posed were positive. The question of the non-triviality of $H^2(X', F)$ arises also in connection with the conjecture that each vector bundle of rank 2 on a 3-dimensional affine space is trivial. Indeed, Serre proved in [12] that if this problem has a positive solution then each non-singular rational or elliptic curve in a 3-dimensional affine space would be a complete intersection. Therefore this conjecture would be refuted if in a 3-dimensional affine space one could find a rational or elliptic curve $E$ such that $H^2(CE, F) \ne 0$ for some coherent sheaf $F$. This shows that the question perhaps could be solved in terms of homological algebra.

In [6] Hartshorne introduced the notion of connectedness of a variety in codimension 1, which refers to the situation when the removal of a subvariety of codimension greater than one does not disturb the connectedness of the variety. He obtains a necessary condition for a variety to be a complete intersection, which amounts to local connectedness of this variety in codimension 1. It turns out that the non-triviality of the groups $H^i(X \setminus V, \mathcal{O})$, $i \ge 2$, is not necessary for the variety $V$ to fail to be a complete intersection. This is shown by the following example.
Consider in the complex affine space $X = \mathbb{C}^4$ with the Zariski topology the variety $V$ which is the union of two planes: $x_1 = x_2 = 0$ and $x_3 = x_4 = 0$. It is clear that at the origin this variety is not connected in codimension 1 and so it cannot be a complete intersection. However, a computation reveals that $H^2(CV, \mathcal{O}) = H^3(CV, \mathcal{O}) = 0$, where [as before] $CM$ denotes the complement in $\mathbb{C}^4$ of a set $M$. We have
\[ X' = CV = C[(x_1 = x_2 = 0) \cup (x_3 = x_4 = 0)] = C(x_1 = x_2 = 0) \cap C(x_3 = x_4 = 0) = \bigcup_{i=0}^{3} U_i, \]
where $U_0 = (x_1 \ne 0) \cap (x_3 \ne 0)$, $U_1 = (x_1 \ne 0) \cap (x_4 \ne 0)$, $U_2 = (x_2 \ne 0) \cap (x_3 \ne 0)$, $U_3 = (x_2 \ne 0) \cap (x_4 \ne 0)$. Take in $X'$ the covering $\mathcal{U} = (U_i)$. By Serre's theorem $H^i(X', \mathcal{O}) \approx H^i(\mathcal{U}, \mathcal{O})$, $i = 2, 3$. Let us compute $H^3(\mathcal{U}, \mathcal{O})$. To this end we remark that all 3-dimensional cochains are of the form
\[ \frac{P(x, y, z, t)}{x^k y^l z^m t^n}. \]
It is clear that all 3-dimensional cochains are cocycles. For all $j = 0, 1, 2, 3$ we have $\bigcap_{i \ne j} U_i = \bigcap_i U_i$. Therefore, all restriction homomorphisms
\[ \rho_i : \Gamma\Bigl(\bigcap_{j \ne i} U_j\Bigr) \to \Gamma\Bigl(\bigcap_{j} U_j\Bigr), \quad i = 0, 1, 2, 3, \]
are surjective. It is now easy to see that all 3-dimensional cochains are coboundaries, hence $H^3(\mathcal{U}, \mathcal{O}) = 0$. An analogous reasoning reveals that also the group $H^2(\mathcal{U}, \mathcal{O})$ is trivial.

Note added in proof. R. Hartshorne, Cohomological dimension of algebraic varieties (Ann. Math. 3, 444–450 (1968)), has shown that $H^2(P^3 \setminus E, F) = 0$ for all $F$.
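The two set-theoretic facts used in the last example (the four open sets cover the complement of $V$, and every triple intersection already equals the total intersection) depend only on which coordinates of a point are non-zero, so they can be checked over the 16 possible patterns. The following Python sketch does this. It assumes that the open set with index 2 is $(x_2 \ne 0) \cap (x_3 \ne 0)$; all names in the code are hypothetical.

```python
from itertools import product

# Each open set is described by the coordinates (1..4) required to be non-zero.
# The entry for index 2 is an assumption of this sketch.
COVER = {0: (1, 3), 1: (1, 4), 2: (2, 3), 3: (2, 4)}


def in_open_set(nonzero, i):
    """nonzero: set of coordinate indices that are non-zero at the point."""
    return all(c in nonzero for c in COVER[i])


def in_complement_of_v(nonzero):
    """The point lies on neither plane x1 = x2 = 0 nor x3 = x4 = 0."""
    off_first_plane = 1 in nonzero or 2 in nonzero
    off_second_plane = 3 in nonzero or 4 in nonzero
    return off_first_plane and off_second_plane


def all_patterns():
    """All 16 possibilities for the set of non-zero coordinates."""
    for flags in product((False, True), repeat=4):
        yield {i + 1 for i, flag in enumerate(flags) if flag}


def covering_is_correct():
    """Check that the union of the U_i equals the complement of V."""
    return all(
        in_complement_of_v(p) == any(in_open_set(p, i) for i in COVER)
        for p in all_patterns()
    )


def triple_intersections_equal_total():
    """Check that omitting any one U_j does not enlarge the intersection."""
    for j in COVER:
        for p in all_patterns():
            triple = all(in_open_set(p, i) for i in COVER if i != j)
            total = all(in_open_set(p, i) for i in COVER)
            if triple != total:
                return False
    return True
```

Both checks are purely combinatorial; they say nothing about the cohomology groups themselves, only about the covering on which the computation rests.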
Comments. The reference [5] has now appeared in Springer Lecture Notes in Mathematics 41 (1967). The reference [4] has been published by North-Holland/Masson 1968 as Volume 2 in the series Advanced Studies in Pure Mathematics. The problem of [12], mentioned at the end of Kaljulaid's paper, has now been solved: the conjecture of Serre that all projective modules over a polynomial ring are free (i.e. that algebraic vector bundles over $k^n$ are trivial) has been proved independently by Quillen [8] and Suslin [13] (cf. also [2]).

Jan-Erik Roos
References

[1] H. Cartan and S. Eilenberg. Homological Algebra. Princeton Landmarks in Mathematics. Princeton University Press, Princeton, 1999. Reprint of the 1956 original.
[2] D. Ferrand. Les modules projectifs de type fini sur un anneau de polynômes sur un corps sont libres. In: Séminaire Bourbaki, Vol. 1975/76. Springer-Verlag, Berlin, 1977, 202–221.
[3] R. Godement. Topologie algébrique et théorie des faisceaux. Technical Report 13. Actualités Sci. Ind., no. 1252, Publ. Math. Univ. Strasbourg, Hermann, Paris, 1958. Russian translation: Moscow, 1961.
[4] A. Grothendieck. Cohomologie locale des faisceaux cohérents et théorèmes de Lefschetz locaux et globaux. Technical Report exposé 8, 8-2-4, I.H.E.S. Séminaire de Géométrie Algébrique, 1962.
[5] A. Grothendieck. Local Cohomology. Technical Report. Lecture notes by R. Hartshorne. Harvard University, 1961.
[6] R. Hartshorne. Complete intersections and connectedness. Amer. J. Math. 84, 1962, 497–508.
[7] M. Nagata. Imbedding of an abstract variety in a complete variety. J. Math. Kyoto Univ. 2, 1962, 1–10.
[8] D. Quillen. Projective modules over polynomial rings. Invent. Math. 36, 1976, 167–171.
[9] J. Sampson and G. Washnitzer. A Vietoris mapping theorem for algebraic projective fibre bundles. Ann. Math. 68, 1958, 348–371.
[10] J. Sampson and G. Washnitzer. A Künneth formula for coherent algebraic sheaves. Illinois J. Math. 3, 1959, 389–402.
[11] J. P. Serre. Faisceaux algébriques cohérents. Ann. Math. 61, 1955, 191–278.
[12] J. P. Serre. Sur les modules projectifs. Technical Report 14-e année, no. 2. Séminaire Dubreil-Pisot, Algèbre et Théorie des nombres, 1960.
[13] A. A. Suslin. Projective modules over polynomial rings are free. Dokl. Akad. Nauk SSSR 229, 1976, 1063–1066.
2. [K77a] Triangular products of representations of semigroups and associative algebras

Revised by J. Peetre, comments by R. Lipyanskiĭ
The triangular product in the theory of varieties of representations of groups plays a role analogous to the role of the wreath product for group varieties. In this note we study the triangular product of representations of semigroups and associative algebras. We assume that $K$ is a field. This is required in the main results of the paper, although the principal constructions and notions can be introduced for any associative and commutative ring $K$ with unit.

For pairs $(G, \Gamma)$ such that the semigroup (algebra) $\Gamma$ acts by semigroup (algebra) endomorphisms on the $K$-module $G$, one can introduce, exactly as in the case of groups, a net of notions and constructions. A variety of representations of semigroups and algebras is a saturated Birkhoff class of the corresponding pairs. By definition, a class $\mathcal{K}$ of pairs is termed saturated if for all right epimorphisms of pairs $(G, \Gamma) \to (G', \Gamma')$ with $(G', \Gamma') \in \mathcal{K}$ it holds that $(G, \Gamma) \in \mathcal{K}$. The variety generated by the class $\mathcal{K}$ will be denoted $\operatorname{Var} \mathcal{K}$. Multiplication of two varieties $\Theta_1$ and $\Theta_2$ is defined by the rule: a pair $(G, \Gamma)$ is contained in $\Theta_1 \cdot \Theta_2$ if $G$ has a $\Gamma$-invariant submodule $H$ such that $(H, \Gamma) \in \Theta_1$ and $(G/H, \Gamma) \in \Theta_2$. There arises the semigroup $M(K)$ (the semigroup $L(K)$) of varieties of representations of semigroups (algebras). The semigroup $M(K)$ is anti-isomorphic to the semigroup of those ideals of the semigroup ring $F = K\Psi$ of the free monoid $\Psi$ with a countable set of free generators which are invariant with respect to all endomorphisms of $F$ induced by endomorphisms of the monoid $\Psi$. The semigroup $L(K)$ is anti-isomorphic to the semigroup $T(K)$ of non-zero ideals of the free associative $K$-algebra $F$ of countable rank (with respect to the usual multiplication of ideals of $F$).

1. For pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ we set $\Phi = \operatorname{Hom}^+_K(B, A) \subset \operatorname{End}_K(A \oplus B)$. The natural action of the semigroups $\Sigma_1$ and $\Sigma_2$ on the (additive) semigroup $\Phi$ allows us to introduce a multiplication in $\Phi \times \Sigma_1 \times \Sigma_2$,
\[ (\varphi, \sigma_1, \sigma_2) \cdot (\varphi', \sigma_1', \sigma_2') = (\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1', \ \sigma_1 \sigma_1', \ \sigma_2 \sigma_2'). \]
There arises the semigroup $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$; its action on $G = A \oplus B$, which goes according to the formula
\[ (a + b) \circ (\varphi, \sigma_1, \sigma_2) = a \circ \sigma_1 + b\varphi \circ \sigma_1 + b \circ \sigma_2, \]
yields the pair $(G, \Gamma)$, which will be denoted $(A, \Sigma_1) \triangledown (B, \Sigma_2)$ and called the triangular product of the given pairs. The properties of this construction are in many respects parallel to the properties of the triangular product of group pairs (B.I. Plotkin, 1971, [3]). Let us remark that $\Gamma$ is a group if and only if $\Sigma_1$ and $\Sigma_2$ are groups and $\Phi$ is treated as the additive closure to a group of the semigroup $\operatorname{Hom}^+_K(B, A)$.

THEOREM 2.1. The following formula holds true: $\operatorname{Var}(\mathcal{K}_1 \triangledown \mathcal{K}_2) = \operatorname{Var} \mathcal{K}_1 \cdot \operatorname{Var} \mathcal{K}_2$.
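The multiplication rule on triples described above can be tried out on small matrices. The following Python sketch is an illustration only. It assumes that the semigroups are realized by integer matrices acting on row vectors, that an element of $\Phi$ is a matrix sending $B$ into $A$, and that the action on $G = A \oplus B$ is the block-triangular one, $(a, b) \mapsto (a\sigma_1 + b\varphi,\ b\sigma_2)$. This last choice is an assumption made here so that the action is compatible with the multiplication of triples; it is not taken verbatim from the text. All function names are hypothetical.

```python
def mat_mul(x, y):
    """Multiply two integer matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]


def mat_add(x, y):
    """Add two integer matrices of the same shape."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(x, y)]


def multiply(g, h):
    """Product of triples: (s2*phi' + phi*s1', s1*s1', s2*s2')."""
    phi, s1, s2 = g
    phi2, t1, t2 = h
    new_phi = mat_add(mat_mul(s2, phi2), mat_mul(phi, t1))
    return (new_phi, mat_mul(s1, t1), mat_mul(s2, t2))


def act(a, b, g):
    """Assumed action on row vectors: (a, b) -> (a*s1 + b*phi, b*s2)."""
    phi, s1, s2 = g
    new_a = mat_add(mat_mul([a], s1), mat_mul([b], phi))[0]
    new_b = mat_mul([b], s2)[0]
    return new_a, new_b


if __name__ == "__main__":
    # A has rank 2, B has rank 1; phi is a 1 x 2 matrix (B -> A).
    g = ([[1, 2]], [[1, 1], [0, 1]], [[2]])
    h = ([[0, 1]], [[0, 1], [1, 0]], [[3]])
    a, b = [1, -1], [4]
    # Acting by g and then by h agrees with acting by the product g * h.
    print(act(*act(a, b, g), h) == act(a, b, multiply(g, h)))
```

Under these assumptions a triple corresponds to a block-triangular matrix with diagonal blocks $\sigma_1$, $\sigma_2$ and off-diagonal block $\varphi$, which is one way to see why the product carries the name triangular.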
From Theorem 2.1 one deduces that the varieties of linear representations (over a field) form a semigroup with a unique decomposition as a product of a finite number of indecomposable varieties.

2. The questions under study are also connected with automata theory. A linear semigroup automaton $\mathfrak{A} = (A, \Gamma, B)$ is a system, where $A$ (the states) and $B$ (the outputs) are $K$-modules, while $\Gamma$ (the input signals) is a semigroup and there are given $K$-linear operations $A \circ \Gamma \to A$ and $A * \Gamma \to B$ such that $(A, \Gamma)$ is a linear pair with respect to the action $\circ$, and $a * \gamma_1 \gamma_2 = (a \circ \gamma_1) * \gamma_2$ for all $a \in A$, $\gamma_1, \gamma_2 \in \Gamma$. The automaton $\mathfrak{A}' = (A', \Gamma, B')$ is called an invariant subautomaton of $\mathfrak{A}$ if $A'$ and $B'$ are $K$-submodules in $A$ and $B$ respectively and $A' \circ \Gamma \subset A'$, $A' * \Gamma \subset B'$. By definition an automaton $\mathfrak{A}$ belongs to the product of two varieties of linear automata $\Theta_1$ and $\Theta_2$ if there exists an invariant subautomaton $\mathfrak{A}' \subset \mathfrak{A}$ such that $\mathfrak{A}' \in \Theta_1$ and $\mathfrak{A}/\mathfrak{A}' \in \Theta_2$.

THEOREM 2.2. The varieties of linear automata with the multiplication indicated form a semigroup which is not free but contains a free subsemigroup isomorphic to $M(K)$.

3. Let there be given $K$-algebras $\Phi$ and $\bar{\Sigma}$, where $\bar{\Sigma}$ acts from the left and the right on $\Phi$, and suppose that this is a bimultiplication in the sense of Hochschild. On $\bar{\Gamma} = \Phi \oplus \bar{\Sigma}$ we retain the definition of addition and multiplication by scalars, while multiplication is defined anew by putting
\[ (\varphi, \sigma) \cdot (\varphi', \sigma') = (\varphi \cdot \sigma' + \sigma \cdot \varphi' + \varphi\varphi', \ \sigma\sigma'). \]
There arises the $K$-algebra $\Phi \leftthreetimes \bar{\Sigma}$, which is the semidirect product of the algebras $\Phi$ and $\bar{\Sigma}$.

1) For given pairs $(G_1, \Sigma_1)$ and $(G_2, \Sigma_2)$, where $\Sigma_i$ are $K$-algebras, we let $(G_1, \bar{\Sigma}_1)$ and $(G_2, \bar{\Sigma}_2)$ be the corresponding faithful pairs and set $G = G_1 \oplus G_2$. We treat $\bar{\Sigma} = \bar{\Sigma}_1 \oplus \bar{\Sigma}_2$ and $\Phi = \operatorname{Hom}_K(G_2, G_1)$ as subalgebras of $\operatorname{End}_K G$. Multiplication in $\operatorname{End}_K G$ defines a left and a right action of $\bar{\Sigma}$ on $\Phi$. Setting $\Sigma = \Sigma_1 \oplus \Sigma_2$ we obtain a natural epimorphism $f : \Sigma \to \bar{\Sigma}$, and then we can extend the action of $\bar{\Sigma}$ on $\Phi$ to an action of $\Sigma$ on $\Phi$. We arrive at the algebra $\Phi \leftthreetimes \Sigma = \Gamma$ whose action on $G = G_1 \oplus G_2$ is given by the formula
\[ (g_1 + g_2) \circ (\varphi, \sigma) = g_2\varphi + (g_1 + g_2) \circ \sigma. \]
This action agrees with the operations on $\Gamma$. There arises the pair $(G, \Gamma)$, which is the triangular product of the representations $(G_1, \Sigma_1)$ and $(G_2, \Sigma_2)$, and which we denote by $(G_1, \Sigma_1) \triangledown (G_2, \Sigma_2)$.

THEOREM 2.3 (Main theorem). For any two classes $\mathcal{K}_1$ and $\mathcal{K}_2$ of representations there holds the formula $\operatorname{Var}(\mathcal{K}_1 \triangledown \mathcal{K}_2) = \operatorname{Var} \mathcal{K}_1 \cdot \operatorname{Var} \mathcal{K}_2$.

It follows from this that each non-trivial variety of representations of algebras decomposes uniquely as a finite product of indecomposable varieties. Thus, the semigroup $L(K)$ is free. This opens a new door to the result of Bergman and Lewin [1] on the freeness of the semigroup $T(K)$. Here we have supplementary possibilities. It is known that the set $A(K)$ of proper varieties of (associative) $K$-algebras is in a bijective correspondence with the set $T(K)$. Multiplication in $T(K)$ now induces on $A(K)$ an associative multiplication, which we denote by $*$. We are led to the following results.

For a $K$-algebra $A$ let $A^*$ be the result of an outer adjunction of a unit to it, and let $\operatorname{Var} A$ be the variety of $K$-algebras generated by $A$. Let us introduce for any $K$-algebras $A$ and $B$ the operation of wreath product by the formula
\[ A \operatorname{wr} B = \operatorname{Hom}_K(B^*, A^*) \leftthreetimes (A \oplus B), \]
where $A^*$ and $B^*$ are regarded as $K$-modules. The justification of this name is given by the functional role of this operation, which is disclosed by the formula
\[ (\operatorname{Var} A^*) * (\operatorname{Var} B^*) = \operatorname{Var}(B \operatorname{wr} A). \]
By definition a $T$-ideal is finitary if the variety of $K$-algebras defined by it is generated by a finite dimensional $K$-algebra. It turns out that a finite product of $T$-ideals is finitary when all factors are finitary. If a variety of $K$-algebras is given with the aid of identities in $n$ variables then it cannot be decomposed into more than $n$ factors. In particular, the semisimplicity (in the sense of Jacobson) of a $K$-algebra $A$ forces $\operatorname{Var} A$ to be indecomposable.

The author is obliged to Professor B.I. Plotkin for supervising this work, and for his valuable advice and interesting discussion, and, furthermore, to G. Bergman for sending him his pre-print. [3], [2]
Comments. It is known that if $\mathcal{K}_1$ and $\mathcal{K}_2$ are two classes of group representations over a field and $\mathcal{K}_1 \triangledown \mathcal{K}_2$ is their triangular product, then $\operatorname{Var}(\mathcal{K}_1 \triangledown \mathcal{K}_2) = \operatorname{Var} \mathcal{K}_1 \cdot \operatorname{Var} \mathcal{K}_2$ [4]. Uno Kaljulaid extends this result to representations of semigroups, associative algebras and linear automata. In this way he obtains another proof of the Bergman-Lewin theorem that the semigroup of $T$-ideals (verbal ideals of the absolutely free associative algebra over a field) is free [1]. He introduces a new operation on associative algebras, the wreath product of algebras, and proves some interesting properties of this operation: $(\operatorname{Var} A^*) * (\operatorname{Var} B^*) = \operatorname{Var}(B \operatorname{wr} A)$, where $A^*$ (and $B^*$) is obtained from the algebra $A$ (and $B$) by adjoining a unit to it. The author also investigates the decomposition of a finitary $T$-ideal into indecomposable factors. He gives a sufficient condition for indecomposability: if a $K$-algebra $A$ is semisimple (in the sense of Jacobson), then $\operatorname{Var} A$ is indecomposable. There are also other sufficient conditions for the above-mentioned properties. I think that this paper of Uno Kaljulaid was a pioneering work in the theory of varieties of semigroup representations and varieties of linear automata. His results also extend the above-mentioned Bergman-Lewin theorem.

Ruvim Lipyanskiĭ
References

[1] G. Bergman and J. Lewin. The semigroup of ideals of a fir is (usually) free. J. London Math. Soc. 11 (2), 1975, 21–31.
[2] G. Birkhoff. The role of algebra in computing. Computers in algebra and number theory, SIAM-AMS Proc., Amer. Math. Soc. IV, 1971, 1–47.
[3] B.I. Plotkin and A.S. Grinberg. On semigroups of varieties, connected with group representations. Siberian Math. Journal 13, 1972, 841–858.
[4] B.I. Plotkin. Multiplicative systems of varieties of pairs – group representations. Latvian Mathematical Yearbook 18, 1976, 143–169, 223.
3. [K79a] Triangular products and stability of representations. Candidate dissertation

Translation by J. Peetre, revised by K. Kaarli
Contents of the dissertation

Introduction
1. The triangular product
1.1. Triangular products of group representations
1.2. Triangular products of semigroup representations
1.3. Triangular products of representations of algebras
1.4. Connections between $\triangledown$-constructions
1.5. Comments
2. The arithmetics of varieties of representations of semigroups and algebras
2.1. Varieties of linear pairs and automata
2.2. Technical results
2.3. The fundamental lemma
2.4. The theorem on generating representations of semigroups
2.5. Consequences. Connections with linear automata
2.6. The theorem on generating representations of algebras
2.7. Comments
3. Powers of the fundamental ideal and stability of representations of groups and semigroups
3.1. Preliminary topics; on the terminal of nilpotent groups
3.2. Construction of stable representations of groups with the aid of the triangular product
3.3. Generalized measure subgroups of finite groups
3.4. Mal'cev nilpotency and stability of semigroups
3.5. Comments and remarks
References
Introduction

In various branches of mathematics and its applications there arises a need to use representations, and so problems of their classification become urgent; cf. [18, 20, 32, 46, 47, 50, 54, 65, 66, 86]. If one takes into account that a representation is a two-sorted algebraic system (a pair) then the systematics of representations is facilitated. The book [35] is
written from this point of view, and, furthermore, there is visible evidence of this in the note [57] and in the survey [41]. The naturality and usefulness of the study of classes of algebraic systems has often been emphasized by A. I. Mal'cev; for example, in [30]. The reduction of classes to “simpler” ones is one of the fundamental problems of this direction; as an example, let us mention the result of A. L. Shmel'kin and the Neumanns² on the freeness of the semigroup of varieties of groups ([34, Theorem 23.4]). The problem of decomposition has always been an essential ingredient of every theory of representations: the classical theory of reduction to irreducible linear representations of a fixed group (§14 in the book [48] by D. A. Suprunenko) or the reduction to indecomposable varieties of representations with a variable group (the paper [43] by B. I. Plotkin and A. S. Grinberg, as an example). Indecomposable classes, as the “simplest blocks” in a given theory, cannot be reduced to simpler classes and have to be studied separately. On the other hand, for the reduction to indecomposable classes one needs tools for doing the decomposition. In the theory of varieties of groups this role is played by the wreath product of groups, and in the case of representations with a variable group by the construction of the triangular product ($\triangledown$-product) of group representations. According to B. I. Plotkin [36], the pair $(G, \Gamma)$ is the triangular product of its subpairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ if the following conditions are fulfilled:
(1) for the subgroup $\Sigma = \{\Sigma_1, \Sigma_2\} \le \Gamma$, generated by the two subgroups $\Sigma_1$ and $\Sigma_2$, the subpair $(G, \Sigma)$ decomposes into the direct product of its subpairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$;
(2) in the group $\Gamma$ there exists a normal subgroup $\Phi$ such that the subrepresentation $(G, \Phi)$ is faithful, and the image of $\Phi$ in $\operatorname{Aut} G$ coincides with the centralizer of the series $0 \subset A \subset G$, that is, it acts as identity on each factor of this series;
(3) the group $\Gamma$ coincides with the semi-direct product $\Phi \leftthreetimes \Sigma$.

The object of this thesis is the study of the triangular product and its applications. It consists of two parts. The goal of the first part (Sections 3.1 and 3.2) is to find the $\triangledown$-construction for representations of semigroups and algebras, to study the properties of these tools and their application to the decomposition of the varieties of the corresponding representations. In the second part (Section 3.3) the triangular product is applied to the study of the powers of the fundamental ideal of group rings.

Representations by endomorphisms of modules, semigroups and algebras have been the subject of many studies by A. V. Mihalev [33] and L. M. Gluskin [10]. On the other hand, the representation of rings by endomorphisms of $\mathbb{Z}$-modules is a classical research topic. The tendency towards a category-theoretic formulation of the classification of representations makes urgent the problem of the decomposition of classes of representations of various algebraic objects (groups, semigroups, algebras etc.), as the elaboration of a general theory requires an understanding of the possible deviations and leads to a coherent series of notions, constructions, and results. This is one of the reasons why the introduction and the study of tools of decomposition of representations of semigroups and algebras deserve attention. The known difficulties in carrying over results for groups and their representations to semigroups increase the interest in cases and ways where this is possible; concerning that see [24, Chapter 7].

² Editors' note. Three well-known group-theorists: Bernhard and Hanna Neumann and their son Peter M. Neumann.
The essential results of the first part of this thesis concern the search for suitable constructions for representations of semigroups and algebras, the study of several of their properties, and further the establishing of connections between these new constructions and the triangular products of representations of groups. There exists a cryptomorphism (in the sense of G. Birkhoff [58]) of the three constructions mentioned, although even their definitions are quite different, and, as a result, sometimes there are considerable differences in the proofs of properties. The varieties of representations of semigroups admit an associative multiplication and the corresponding semigroup is factorial. This follows from the main result of Section 3.2, Theorem 3.33 about the generating representations. Such results are also obtained for algebras (Theorem 3.43 and Theorem 3.44). The role of the $\triangledown$-constructions, introduced in Section 3.1, in the proof of these facts is analogous to the role of the wreath product in the proof of the above mentioned group-theoretic theorem of Shmel'kin and the Neumanns. For the theorem on freeness of the semigroup of varieties of linear representations of semigroups there is also a proof in terms of the semigroup ring of the free countably generated monoid; the extract of the reasoning needed is well-known from [56]. Among the consequences of the theorem on generating representations of algebras (Theorem 3.43) let us mention the theorem of Bergman and Lewin on the freeness of the semigroup of $T$-ideals, which in [56] is proved by means of the theory of FI-rings of Cohn [5]. Here the corresponding fact is interpreted as a statement about the freeness of the semigroup of varieties of representations of algebras, and in this form it readily follows from Theorem 3.43. The given approach, however, allows one to penetrate more deeply into the essence of the matter. For example, for given finite dimensional (over a fixed field $K$) pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ the identities for the variety $\operatorname{Var}(A, \Sigma_1) \cdot \operatorname{Var}(B, \Sigma_2)$ are readily found: these are exactly the identities for $\operatorname{Var}(G, \Gamma)$, where $(G, \Gamma) = (A, \Sigma_1) \triangledown (B, \Sigma_2)$. It might be rather difficult to obtain such a result by means of multiplication of $T$-ideals. There are also other applications of the material in Section 3.2 concerning representations of algebras in the theory of associative algebras themselves. Let us mention a necessary condition for the indecomposability of a variety of algebras (Theorem 3.49).

The technique developed in the first part of this thesis is tightly connected with automata theory [31]. After the interaction of this discipline with the theory of algebras (V. M. Glushkov [9]) an essential result was established, that is, the theory of decomposition of finite automata. Nowadays algebraic methods in automata theory are developing rather intensively; cf. [18], [42], [65], [45] etc., and furthermore in Eilenberg's book [66] there is given a detailed analysis of the corresponding methods in a modern presentation. The present author has introduced the semigroup of varieties of linear automata and has given a description of it in the language of pairs of “consistent” ideals in the free countably generated associative algebra (over the given field), which gives the possibility to establish interesting properties of this semigroup (Theorem 3.37).

Since the very beginning of the theory of group representations a major role has been played by the group and semigroup algebras. In this connection it was observed that the application of the ideas and methods of the (general) theory of algebras and their representations to group algebras was fertile and even the group algebras themselves turned out to be a subtle tool of calculation in the study of the structure of groups. The papers [81,98–100] convince one of the great heuristic value of group and semigroup algebras
in Combinatorics. The situation here is reminiscent of the one in Number Theory, where for the achievement of many deep facts on integers one applies algebraic and analytic methods.

The main goal of Section 3.3 of this dissertation is the study of the stabilization of the powers of the fundamental ideal of an integral group ring, an issue which has been deeply investigated for more than a decade, cf., for example, the survey by A. V. Mihalev and A. E. Zaleskiĭ [51], the lectures by A. A. Bovdi [3] or the book by D. Passman [96]. Our choice of subject was stimulated by the deep and beautiful work of A. I. Mal'cev [27], K. Gruenberg [69] and B. Hartley [77], where the special role of nilpotence in this circle of ideas is likewise clearly set forth. In Sections 3.3.1–3.3.3 the possible values of the terminal are found for Artinian groups and the limit of finite groups is calculated. These results of the papers [13, 14] have been obtained independently and by other methods, and were in part generalized by Gruenberg and Roseblade [71], Sandling [102] and Hartley [80]. In Section 3.3 the methods of [13, 14] are developed, using moreover systematically the language and technique of the general theory of group representations, and, furthermore, a circle of ideas connected with the well-known theorem of L. A. Kaluzhnin [84] on nilpotence of a group acting faithfully and stably on a finite invariant series of another group, and some applications of it. The elements of such an approach were set forth by Hartley [76], but he uses it only for the interpretation of some results. Due to the mentioned approach, a self-contained presentation, and in several cases a generalization and a considerable simplification of the proofs in [14, 71, 80], are achieved.

The paper [26] of A. I. Mal'cev on the possibility of embedding semigroups into a group gave rise to a well-known cycle of developments; in particular, there appeared results that are at first glance not connected with stability. With the goal of finding “good” classes of semigroups with cancellation embeddable in a group, A. I. Mal'cev found in [28] a notion of nilpotence for semigroups such that each such semigroup with cancellation is embeddable in a nilpotent group. Up to now the interest in this notion has not been considerable. The present author has made an attempt to unify the results of [28] with the above mentioned theorem of Kaluzhnin. This leads to the necessity of reconsidering the notion of stability for semigroups of endomorphisms. This question, and further some properties of semigroup rings of locally nilpotent (in the sense of Mal'cev) semigroups, are treated in Section 3.3.4.

The papers [11–17] have been published on the theme of this dissertation. The main results have been communicated at the XI All Union Algebraic Colloquium (Kishinev, 1971); the All Union Symposium on Ring Theory, Algebras and Modules (Kääriku, 1976); at the Algebra Seminar of Tartu State University; at the Riga Algebra Seminar; at the Seminars of Higher Algebra and Rings and Modules at Moscow State University; and at the Minsk Algebra Seminar. Twice the material of the two first sections was used in a special lecture course in automata theory presented by the author himself at Tartu State University; the main contents of this course were set forth at the III Regional Conference-Seminar of leading lecturers of mathematics of the Belorussian, Latvian, Lithuanian, Estonian Soviet Republics and the Kaliningrad Oblast of the Soviet Union (Minsk, 1977).

Acknowledgement. The author is thankful to Prof. B. I. Plotkin for supervising this work, for his valuable advice, and generous support.
3.1. The triangular products

This section has a preparatory character. First of all, we treat representations as two-sorted entities (pairs) and carry over the corresponding definitions to the case when the acting object of a pair is a semigroup or an associative algebra. The main object of the section is the introduction of the triangular product of representations of semigroups and algebras and the study of their properties and connections. Applications of the notions considered are given in Sections 3.2–3.3. We underline that although the main constructions and notions of this section can be introduced for an arbitrary associative and commutative unitary ring $K$, we prefer, for reasons of organization, to restrict ourselves in the first two sections to the case when $K$ is a field.

3.1.1. Triangular products of group representations

1. The purpose of this first subsection is preparatory: to acquaint the Reader with the notion of triangular product for group pairs. This construction turns out to be useful for us also in our study of the fundamental ideal of group rings in Section 3.3, but, in the first place, it serves as a model for the analogous constructions of the triangular product of representations of semigroups and algebras.

2. Let $A$ and $B$ be any two groups. The set $A^B$ of all functions $B \to A$ forms a group on which $B$ acts according to the formula
\[ \forall\, x, b \in B,\ f \in A^B, \qquad (f \circ b)(x) = f(xb^{-1}). \]
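As a sanity check on the formula above, the following sketch verifies by brute force that $(f \circ b)(x) = f(xb^{-1})$ defines a right action of $B$ on $A^B$, and that the same formula without the inverse fails to do so when $B$ is non-commutative. The choices $B = S_3$ and $A = \mathbb{Z}/2$, as well as all function names, are illustrative assumptions and are not taken from the text.

```python
from itertools import permutations, product

# Illustrative assumption: B = S_3 as permutations of (0, 1, 2), A = Z/2.
B = list(permutations(range(3)))

def mul(p, q):
    """Group product in B: apply p first, then q."""
    return tuple(q[p[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0] * 3
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def act(f, b, use_inverse=True):
    """(f o b)(x) = f(x b^{-1}); with use_inverse=False the inverse is dropped."""
    shift = inv(b) if use_inverse else b
    return {x: f[mul(x, shift)] for x in B}

def is_right_action(use_inverse=True):
    """Check (f o b) o c == f o (bc) for every f in A^B and all b, c in B."""
    for values in product(range(2), repeat=len(B)):
        f = dict(zip(B, values))
        for b, c in product(B, repeat=2):
            lhs = act(act(f, b, use_inverse), c, use_inverse)
            if lhs != act(f, mul(b, c), use_inverse):
                return False
    return True
```

Calling `is_right_action()` runs through all 64 functions $S_3 \to \mathbb{Z}/2$; the variant `is_right_action(use_inverse=False)` shows why the inverse appears in the formula.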
There arises the pair $(A^B, B)$. We accompany this pair with the semidirect product $A^B \leftthreetimes B$, which will be called the (complete) wreath product of $A$ and $B$ and denoted $A \operatorname{wr} B$.

Let us fix an associative-commutative ring $K$, for example $K = \mathbb{Z}$, and let $\Gamma$ be an arbitrary group. If there is given a representation of $\Gamma$ by automorphisms of a certain $K$-module $G$, then one speaks of the (group) pair $(G, \Gamma)$. Let $(A, \Sigma_1)$ and $(B, \Sigma_2)$ be any two group pairs, and let $\Phi = \mathrm{Hom}_K(B, A)$ be the module of all $K$-homomorphisms of $B$ into $A$. Defining an action of the groups $\Sigma_1$ and $\Sigma_2$ on $\Phi$ respectively by the formulae
\[ \forall\, x \in B,\ \sigma_1 \in \Sigma_1,\ \varphi \in \Phi, \qquad (\varphi \circ \sigma_1)(x) = \varphi(x) \circ \sigma_1 \]
and
\[ \forall\, x \in B,\ \sigma_2 \in \Sigma_2,\ \varphi \in \Phi, \qquad (\varphi \circ \sigma_2)(x) = \varphi(x \circ \sigma_2^{-1}), \]
we arrive at the pairs $(\Phi, \Sigma_1)$ and $(\Phi, \Sigma_2)$. Moreover, as the actions of $\Sigma_1$ and $\Sigma_2$ on $\Phi$ commute, we can now define the pair $(\Phi, \Sigma_1 \times \Sigma_2)$, to which corresponds the group $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$. The initial group $\Phi$ is embedded in $\Gamma$ via
\[ \Phi \to \bar\Phi = \{\bar\varphi = (\varphi, 1) \mid \varphi \in \Phi\} \subset \Gamma, \]
while the group $\Sigma_1 \times \Sigma_2$ can be identified with its image in $\Gamma$ via the map $(\sigma_1, \sigma_2) \mapsto (\varepsilon, \sigma_1\sigma_2)$. To the semidirect product $\bar\Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$ there corresponds the pair $(\bar\Phi, \Sigma_1 \times \Sigma_2)$, the representation of the group $\Sigma_1 \times \Sigma_2$ by inner automorphisms of the group $\bar\Phi$. It is easy to convince oneself that $(\bar\Phi, \Sigma_1 \times \Sigma_2) \cong (\Phi, \Sigma_1 \times \Sigma_2)$.

Let $G = A \oplus B$ and let us define the pair $(G, \Phi)$. To this end we consider in $G$ the series of submodules $0 \subset A \subset G$. In the group $\mathrm{Aut}\, G$ we introduce the centralizer $Z$
of this series, that is, the group of all automorphisms that act identically on $A$ and on $G/A$. The map $Z \to \Phi$, $\sigma \mapsto \bar\sigma$, given for each $b \in B$, $\sigma \in Z$ by the formula $b^{\bar\sigma} = b \circ \sigma - b$, is, as is readily seen, an isomorphism between the groups $Z$ and $\Phi$. Hence we have a right isomorphism of the pairs $(G, \Phi)$ and $(G, Z)$.

Next, let us turn to the following question. Let there be given the pairs $(G, \Phi)$ and $(G, \Sigma)$, and set $\Gamma = \Phi \leftthreetimes \Sigma$. What are the necessary and sufficient conditions for the existence of the pair $(G, \Gamma)$? It turns out that if the condition
\[ \forall\, g \in G,\ \varphi \in \Phi,\ \sigma \in \Sigma, \qquad (g \circ \sigma) \circ \varphi = (g \circ \varphi^{\sigma^{-1}}) \circ \sigma \]
is fulfilled, then the actions of the groups $\Phi$ and $\Sigma$ can be extended to an action of the group $\Gamma$ on $G$. Applying this to the situation under consideration, we arrive at the pair
\[ (A \oplus B,\ \mathrm{Hom}_K(B, A) \leftthreetimes (\Sigma_1 \times \Sigma_2)) = (G, \Gamma), \]
in which the action is given by the formula
\[ \forall\, a \in A,\ b \in B,\ \varphi \in \Phi,\ \sigma_1 \in \Sigma_1,\ \sigma_2 \in \Sigma_2, \qquad (a + b) \circ \varphi\sigma_1\sigma_2 = a \circ \sigma_1 + b^{\varphi} \circ \sigma_1 + b \circ \sigma_2 = (a + b^{\varphi}) \circ \sigma_1 + b \circ \sigma_2. \]
This pair $(G, \Gamma)$ is called the triangular product of the pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ and will be denoted by $(A, \Sigma_1) \triangledown (B, \Sigma_2)$. Let us add that the pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ need not be faithful, and therefore the following formula is of interest:
\[ \mathrm{Ker}\,[(A, \Sigma_1) \triangledown (B, \Sigma_2)] = \mathrm{Ker}\,(A, \Sigma_1) \times \mathrm{Ker}\,(B, \Sigma_2). \]
The operation of triangular product is a covariant functor in the first argument in the category of linear group actions, which preserves exactness from the left and from the right. But if we consider as morphisms only right homomorphisms of pairs, the triangular product becomes a covariant functor in both arguments preserving exactness from the left and from the right. These and many other properties of the $\triangledown$-operation on group pairs are proved in [36], to which we refer the interested Reader.

3. Let $A$ be any Abelian group, and $B_1$ and $B_2$ arbitrary groups. Let us consider the group $A^{B_1} \leftthreetimes B_1$ corresponding to the pair $(A^{B_1}, B_1)$, where the action of $B_1$ on $A^{B_1}$ is defined by the formula
\[ \forall\, x, b \in B_1,\ f \in A^{B_1}, \qquad (f \circ b)(x) = f(xb^{-1}). \]
For the regular pair $(\mathbb{Z}B_2, B_2)$ and the triangular product $(G, \Gamma) = (A^{B_1}, B_1) \triangledown (\mathbb{Z}B_2, B_2)$, B. I. Plotkin established the formula
\[ \Gamma = \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}B_2, A^{B_1}) \leftthreetimes (B_1 \times B_2) \cong A \operatorname{wr} (B_1 \times B_2). \tag{3} \]
One consequence of this fact deserves special attention because of an application in the last section.

THEOREM 3.1. Let $A$ be an arbitrary (additively written) Abelian group, $B$ an arbitrary group, and $E$ the unit group. The acting group of the pair $(A, E) \triangledown (\mathbb{Z}B, B)$ is isomorphic to $A \operatorname{wr} B$.

PROOF. The proof amounts to applying the preceding formula to the pairs $(A, E)$ and $(\mathbb{Z}B, B)$. Let us also give a sketch of the proof of formula (3), because of the lack of a suitable reference.
Let there be given an arbitrary pair $(A_1, B_1)$. It induces pairs $(A_1^{B_2}, B_2)$ and $(A_1^{B_2}, B_1)$, where the actions are given as follows: in the pair $(A_1^{B_2}, B_2)$ by the formula
\[ \forall\, f \in A_1^{B_2},\ b_2, x \in B_2, \qquad (f \circ b_2)(x) = f(xb_2^{-1}), \]
and in the pair $(A_1^{B_2}, B_1)$ by the formula
\[ \forall\, f \in A_1^{B_2},\ b_1 \in B_1,\ x \in B_2, \qquad (f \circ b_1)(x) = f(x) \circ b_1. \]
These actions on $A_1^{B_2}$ commute, and so there arises the pair $(A_1^{B_2}, B_1 \times B_2)$, in which the action is the following:
\[ \forall\, f \in A_1^{B_2},\ b_1 \in B_1,\ x, b_2 \in B_2, \qquad (f \circ b_1b_2)(x) = ((f \circ b_1) \circ b_2)(x) = ((f \circ b_2) \circ b_1)(x). \]
Setting now $A_1 = A^{B_1}$ in the constructed pair, we arrive at the pair $((A^{B_1})^{B_2}, B_1 \times B_2)$. Let us first show that $\Gamma \cong (A^{B_1})^{B_2} \leftthreetimes (B_1 \times B_2)$ or, what amounts to the same, that there exists an isomorphism of pairs $((A^{B_1})^{B_2}, B_1 \times B_2) \cong (\Phi, B_1 \times B_2)$, where $\Phi$ denotes the Abelian group $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}B_2, A^{B_1})$. To this end we associate to each function $f : B_2 \to A^{B_1}$ its $\mathbb{Z}$-linear extension $f^{*} : \mathbb{Z}B_2 \to A^{B_1}$; this gives an isomorphism of Abelian groups ${}^{*} : (A^{B_1})^{B_2} \to \Phi$. Moreover, the isomorphism ${}^{*}$ agrees with the actions of the two pairs in question and thus yields what is required.

As, by definition, $A \operatorname{wr} (B_1 \times B_2) = A^{B_1 \times B_2} \leftthreetimes (B_1 \times B_2)$, it suffices to establish the isomorphism
\[ (A^{B_1})^{B_2} \leftthreetimes (B_1 \times B_2) \cong A^{B_1 \times B_2} \leftthreetimes (B_1 \times B_2). \]
To this end we define, with the help of the formula
\[ \forall\, f \in (A^{B_1})^{B_2},\ x \in B_1,\ y \in B_2, \qquad f^{\mu}(x, y) = (f(y))(x), \]
the map $\mu : (A^{B_1})^{B_2} \to A^{B_1 \times B_2}$, which, as is readily seen, is an isomorphism of Abelian groups. The map $\mu$ can, moreover, be extended to an isomorphism of the semidirect products under consideration, since for each $b_1 \in B_1$ and $b_2 \in B_2$ we have $(f \circ b_1b_2)^{\mu} = f^{\mu} \circ b_1b_2$. We omit the remaining details.

3.1.2. Triangular products of semigroup representations

1. Let $K$ be an arbitrary associative and commutative ring with unit. We say that we have a pair $(G, \Gamma)$ if the semigroup $\Gamma$ acts by $K$-endomorphisms on the $K$-module $G$. In other words, there is defined an operation $G \times \Gamma \to G$, which we denote by $g \circ \gamma$, possessing the following properties:
(1) for fixed $\gamma \in \Gamma$ the map $g \mapsto g \circ \gamma$ is a $K$-endomorphism of the module $G$;
(2) for any $g \in G$, $\gamma_1, \gamma_2 \in \Gamma$, one has $g \circ (\gamma_1\gamma_2) = (g \circ \gamma_1) \circ \gamma_2$.
In the special case when $\Gamma$ is a monoid with unit $\varepsilon$, one requires the supplementary condition:
(3) for any $g \in G$, $g \circ \varepsilon = g$.

Let us give a list of definitions connected with the notion of pair. By a homomorphism of pairs $\mu : (G, \Gamma) \to (G', \Gamma')$ we mean a pair of homomorphisms, a $K$-homomorphism $\mu : G \to G'$ and a homomorphism $\mu : \Gamma \to \Gamma'$, connected by the condition
\[ \forall\, g \in G,\ \gamma \in \Gamma, \qquad (g \circ \gamma)^{\mu} = g^{\mu} \circ \gamma^{\mu}. \]
In this way we get the category of pairs, in which one can define all the usual algebraic notions.
The pair $(H, \Sigma)$ is a subpair of $(G, \Gamma)$ if $H$ is a submodule of $G$, $\Sigma$ a subsemigroup of $\Gamma$, the submodule $H$ is invariant with respect to the action of $\Sigma$, and the representation of $\Sigma$ on $H$ is induced by the given representation of the semigroup $\Gamma$. The kernel of a pair $(G, \Gamma)$ is, by definition, the congruence $\mathrm{Ker}\,(G, \Gamma)$ of the semigroup $\Gamma$ whose classes consist of the elements of $\Gamma$ that act identically on $G$. If $\mathrm{Ker}\,(G, \Gamma)$ is the equality relation on $\Gamma$, then we say that $(G, \Gamma)$ is a faithful pair. A congruence of the pair $(G, \Gamma)$ is a pair $(H, \sigma)$, where $H$ is a $\Gamma$-invariant submodule of $G$ and $\sigma$ a congruence on the semigroup $\Gamma$ such that $\sigma \le \mathrm{Ker}\,(G/H, \Gamma)$. In a natural way one likewise defines the notion of factor pair, and formulates and proves the homomorphism theorem and, furthermore, Remak’s theorem.³

Besides the usual homomorphisms of pairs one distinguishes also their one-sided homomorphisms. A left homomorphism is a homomorphism of the $\Gamma$-modules corresponding to these pairs. In the case of a right homomorphism, the pairs have one and the same domain of action, on which the homomorphism acts identically.

A variety of representations of semigroups is a saturated Birkhoff class of corresponding pairs. By definition, the class $\mathfrak{K}$ is saturated if for any right epimorphism of pairs $(G, \Gamma) \to (G, \Gamma')$ it follows from $(G, \Gamma') \in \mathfrak{K}$ that $(G, \Gamma) \in \mathfrak{K}$. To each variety $\Theta$ there corresponds a verbal function ${}^{*}\Theta$, which to each pair $(G, \Gamma)$ associates the intersection ${}^{*}\Theta(G, \Gamma)$ of all $\Gamma$-submodules $H \subset G$ such that $(G/H, \Gamma) \in \Theta$. It is clear that $(G/{}^{*}\Theta(G, \Gamma), \Gamma) \in \Theta$. This verbal function has the following property. Let $\Theta_1$ and $\Theta_2$ be varieties of pairs. The relation $(G, \Gamma) \in \Theta_1 \cdot \Theta_2$ is fulfilled if and only if $({}^{*}\Theta_2(G, \Gamma), \Gamma) \in \Theta_1$. On the other hand, to each variety of pairs $\Theta$ there corresponds a radical function $\Theta$, which associates to each pair $(G, \Gamma)$ the sum of all $\Gamma$-submodules $H$ in $G$ for which $(H, \Gamma) \in \Theta$. Moreover, let $\Theta_1$ and $\Theta_2$ be varieties of pairs. The relation $(G, \Gamma) \in \Theta_1 \cdot \Theta_2$ is fulfilled if and only if $(G/\Theta_1(G, \Gamma), \Gamma) \in \Theta_2$. We limit ourselves to these remarks in order not to overburden the exposition with details that only slightly modify the notions and arguments of the group case [37, 40].

2. Let there be given the semigroups $\Phi$, $\Sigma_1$ and $\Sigma_2$; we agree to write the operation on $\Phi$ additively. We assume that $\Sigma_1$ acts on $\Phi$ from the right and that $\Sigma_2$ acts on $\Phi$ from the left; moreover, we require that these two actions intertwine element-wise.⁴ On the set of triples⁵
\[ \Gamma = \{(\varphi, \sigma_1, \sigma_2) \mid \varphi \in \Phi,\ \sigma_1 \in \Sigma_1,\ \sigma_2 \in \Sigma_2\} \]
we define an operation by setting
\[ (\varphi, \sigma_1, \sigma_2) \cdot (\varphi', \sigma_1', \sigma_2') = (\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2'). \]

³ Translators’ note. The theorem of Krull-Remak-Schmidt was proved around 1925. This important result states that any finite dimensional $A$-module $M$, where $A$ is an associative $F$-algebra over a field $F$, can be written in an essentially unique way as a direct sum of submodules which cannot themselves be written as direct sums of proper submodules. This reduces the problem of classification of $A$-modules to the determination of these so-called indecomposable modules.
⁴ We denote these actions by $\sigma_2 \cdot \varphi$ and $\varphi \cdot \sigma_1$, using the sign ‘$\cdot$’ as distinct from the sign ‘$\circ$’, which denotes the action of $\Gamma$ on $G$.
⁵ The triple $(\varphi, \sigma_1, \sigma_2)$ will in the sequel also be denoted $\varphi\sigma_1\sigma_2$.
One can check that this operation is associative, so that the set of triples $\Gamma$ has a semigroup structure, which we call the triple product of semigroups⁶ and denote by $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$. For given pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$, where $\Sigma_1$ and $\Sigma_2$ are semigroups acting on the $K$-modules $A$ and $B$ respectively, we set $\Phi = \mathrm{Hom}^{+}_K(B, A) \subset \mathrm{End}_K(A \oplus B)$.⁷ The natural actions of the semigroups $\Sigma_1$ and $\Sigma_2$ on $\Phi$ define a semigroup structure on $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$. The action of $\Gamma$ on $G = A \oplus B$, defined by the rule
\[ (a + b) \circ (\varphi, \sigma_1, \sigma_2) = b^{\varphi} + a \circ \sigma_1 + b \circ \sigma_2, \]
agrees with the multiplication of the semigroup $\Gamma$, and we arrive at the pair $(G, \Gamma)$, which we denote by $(A, \Sigma_1) \triangledown (B, \Sigma_2)$ and call the triangular product of the two given pairs.
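The two claims just made, associativity of the triple product and compatibility of the action with it, can be tested numerically. The sketch below is only an illustration under stated assumptions: it takes $K = \mathrm{GF}(5)$, $A = B = K^2$ with row vectors, full matrix semigroups for $\Sigma_1$ and $\Sigma_2$, and random sampling with a fixed seed; none of these choices or function names comes from the text.

```python
import random

P = 5  # illustrative assumption: K = GF(5)

def matmul(x, y):
    """Product of matrices given as tuples of row tuples, entries mod P."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(len(y))) % P
              for j in range(len(y[0])))
        for i in range(len(x)))

def matadd(x, y):
    """Entrywise sum mod P."""
    return tuple(tuple((a + b) % P for a, b in zip(r, s)) for r, s in zip(x, y))

def triple_mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (s2.phi' + phi.s1', s1 s1', s2 s2')."""
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (matadd(matmul(s2, phi_), matmul(phi, s1_)),
            matmul(s1, s1_), matmul(s2, s2_))

def act(a, b, g):
    """(a + b) o (phi, s1, s2) = b^phi + a o s1 + b o s2, as (A-part, B-part)."""
    phi, s1, s2 = g
    return matadd(matmul(b, phi), matmul(a, s1)), matmul(b, s2)

def random_matrix(rng, rows, cols):
    return tuple(tuple(rng.randrange(P) for _ in range(cols)) for _ in range(rows))

def random_triple(rng):
    # phi : B -> A, s1 acting on A, s2 acting on B; all 2 x 2 here.
    return tuple(random_matrix(rng, 2, 2) for _ in range(3))

def check(trials=200, seed=0):
    """Sampled test of associativity and of (g o x) o y == g o (xy)."""
    rng = random.Random(seed)
    for _ in range(trials):
        g, h, k = (random_triple(rng) for _ in range(3))
        if triple_mul(triple_mul(g, h), k) != triple_mul(g, triple_mul(h, k)):
            return False
        a, b = random_matrix(rng, 1, 2), random_matrix(rng, 1, 2)
        a1, b1 = act(a, b, g)
        if act(a1, b1, h) != act(a, b, triple_mul(g, h)):
            return False
    return True
```

A sampled check of this kind is of course no proof; it merely guards against a misplaced factor in the multiplication rule.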
3. The following remark hints at the usefulness of this construction in the study of varieties of representations of semigroups. If the pair $(A, \Sigma_1)$ is contained in the variety $\Theta_1$ and $(B, \Sigma_2)$ in the variety $\Theta_2$, then the triangular product $(G, \Gamma) = (A, \Sigma_1) \triangledown (B, \Sigma_2)$ is contained in the variety $\Theta_1 \cdot \Theta_2$. For the proof we remark that $A$ is a $\Gamma$-submodule of $G$, and so we have the pairs $(A, \Gamma)$ and $(G/A, \Gamma)$. Let us consider the diagram
\[ \Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2) \xrightarrow{\ \mu\ } \Sigma_1 \times \Sigma_2 \xrightarrow{\ \mathrm{pr}_i\ } \Sigma_i, \qquad \mu_i = \mu \cdot \mathrm{pr}_i : \Gamma \to \Sigma_i, \quad i = 1, 2, \]
where the “erasing” homomorphism $\mu$ is given by the formula $(\varphi\sigma_1\sigma_2)^{\mu} = \sigma_1\sigma_2$ and the map $\mathrm{pr}_i : \Sigma_1 \times \Sigma_2 \to \Sigma_i$ is the natural projection. It is easy to see that $\mathrm{Ker}\,\mu_1$ and $\mathrm{Ker}\,\mu_2$ act trivially on $A$ and $G/A \cong B$, respectively. For all $a \in A$, $\gamma \in \Gamma$ we have $a \circ \gamma = a \circ \gamma^{\mu_1}$, from which follows the existence of a right epimorphism $(A, \Gamma) \to (A, \Sigma_1)$, which implies that $(A, \Gamma) \in \Theta_1$.

⁶ Cf. also [66, p. 142].
⁷ Translators’ note. In the notation $\mathrm{Hom}^{+}$ the symbol $+$ refers to a “forgetful” functor: while $\mathrm{Hom}_K(B, A)$ denotes the Abelian group of homomorphisms from $B$ to $A$, as $K$-modules, writing $\mathrm{Hom}^{+}_K(B, A)$ we regard them as Abelian groups, thus “forgetting” the $K$-module structure.
Furthermore, defining $\mu_2$ on $G$ as the natural projection, we obtain an epimorphism of pairs $\mu_2 : (G, \Gamma) \to (B, \Sigma_2)$. Moreover, the kernel of $\mu_2$ is $A$. Consequently, there arises the commutative diagram
\[
\begin{array}{ccc}
(G, \Gamma) & \xrightarrow{\ \mu_2\ } & (B, \Sigma_2) \\
\downarrow & & \uparrow \\
(G/A, \Gamma) & \xrightarrow{\ \approx\ } & (B, \Gamma)
\end{array}
\]
the existence of which gives $(G/A, \Gamma) \in \Theta_2$. Hence we have $(A, \Gamma) \in \Theta_1$ and $(G/A, \Gamma) \in \Theta_2$, and so, by definition, $(G, \Gamma) \in \Theta_1 \cdot \Theta_2$.

4. The final part of this section will be devoted to a deduction of the properties of the triangular product of pairs of representations of semigroups.

PROPOSITION 3.2. If the pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$ are faithful, then the pair $(G, \Gamma) = (A, \Sigma_1) \triangledown (B, \Sigma_2)$ is faithful too.

PROOF. Let us assume the contrary. Then there exist distinct elements $\gamma = (\varphi, \sigma_1, \sigma_2)$ and $\gamma' = (\varphi', \sigma_1', \sigma_2')$ in $\Gamma$ which act identically on $G = A \oplus B$: we have $g \circ \gamma = g \circ \gamma'$ for all $g \in G$. Taking $g = a \in A$ gives $a \circ \sigma_1 = a \circ \sigma_1'$, and taking $g = b \in B$ and comparing the components in $A$ and in $B$ gives $b^{\varphi} = b^{\varphi'}$ and $b \circ \sigma_2 = b \circ \sigma_2'$. In view of the faithfulness of $(A, \Sigma_1)$ and $(B, \Sigma_2)$ it follows readily that $\gamma = \gamma'$, which contradicts our assumption.

PROPOSITION 3.3. Let $(A, \Sigma_1)$ and $(B, \Sigma_2)$ be two pairs⁸ and $(G, \Gamma) = (A, \Sigma_1) \triangledown (B, \Sigma_2)$ their triangular product. For each $\Gamma$-submodule $H$ in $G$ one has either $H \subset A$ or $A \subset H$.

PROOF. If $H \subset A$, everything is proved. So assume that $H \not\subset A$. Then there exists an element $h \in H$ such that $h \notin A$. This implies the existence of $a_1 \in A$ and $b \in B$, $b \ne 0$, such that $h = b + a_1$. Let $a \in A$ be arbitrary. Let us pick a basis in $B$ containing the element $b$ and let us consider an arbitrary map $\varphi$ of this basis into $A$ with $b^{\varphi} = a$. We extend $\varphi$ to an element of $\Phi = \mathrm{Hom}(B, A)$, which we likewise denote by $\varphi$.

Moreover, we require the following remark. The pair $(A, \Sigma_1)$ can be “completed” to a pair $(A, \Sigma_1^{*})$, where the semigroup $\Sigma_1^{*}$ is obtained by adjoining to $\Sigma_1$ a unit element $\varepsilon$, whose action on $A$ is defined by the formula $a \circ \varepsilon = a$ for all $a \in A$. In an analogous manner we obtain $(B, \Sigma_2^{*})$, and we end up with the pair $(G, \Gamma^{*}) = (A, \Sigma_1^{*}) \triangledown (B, \Sigma_2^{*})$. It is easy to see that if the submodule $H \subset G$ is $\Gamma$-invariant, then $H$ is $\Gamma^{*}$-invariant, and vice versa.

⁸ As earlier in this section, here $\Sigma_1$ and $\Sigma_2$ are semigroups acting on the $K$-modules $A$ and $B$ respectively. Let us also emphasize that $\Sigma_1$ and $\Sigma_2$ need not be monoids. Starting from this moment and everywhere in this paper, we assume that $K$ is a field.
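Let us now take $\gamma \in \Gamma^{*}$, where $\gamma = \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \in \mathrm{End}\, G$, and apply it to the element $h$. We have
\[ h \circ \gamma = (b, a_1) \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} = (b \circ \varepsilon + a_1 \circ 0,\ b \circ \varphi + a_1 \circ \varepsilon) = (b,\ b^{\varphi} + a_1) = (b, a_1) + (0, a) = h + a. \]
We have shown that for any $a \in A$ one can find $\gamma \in \Gamma^{*}$ such that $h \circ \gamma = h + a$. Hence it follows that $a = h \circ \gamma - h \in H$, in view of the remark made above concerning the module $H$. Therefore we deduce that $A \subset H$.

5. The triangular product of semigroup pairs enjoys good functorial properties, which are collected in the following two propositions.

PROPOSITION 3.4. Let there be given a homomorphism $\nu : (A, \Sigma_1) \to (A', \Sigma_1')$ and let $(B, \Sigma_2)$ be an arbitrary pair. Then there exists a homomorphism of pairs
\[ \mu : (A, \Sigma_1) \triangledown (B, \Sigma_2) \to (A', \Sigma_1') \triangledown (B, \Sigma_2) \]
coinciding with $\nu$ on $(A, \Sigma_1)$ and with the identity on $(B, \Sigma_2)$. Moreover, if $\nu$ is a monomorphism (epimorphism), then $\mu$ is likewise a monomorphism (epimorphism).

PROOF. Let us introduce the notation: $(G, \Gamma) = (A, \Sigma_1) \triangledown (B, \Sigma_2)$, $(G', \Gamma') = (A', \Sigma_1') \triangledown (B, \Sigma_2)$, $\Phi = \mathrm{Hom}^{+}_K(B, A)$ and $\Phi' = \mathrm{Hom}^{+}_K(B, A')$. We define a morphism of semigroups $\mu : \Phi \to \Phi'$ by the formula
\[ \forall\, \varphi \in \Phi,\ b \in B, \qquad b^{\varphi^{\mu}} = (b^{\varphi})^{\nu}, \]
and, furthermore, “lift” it to a morphism of semigroups $\mu : \Gamma \to \Gamma'$ by setting $(\varphi, \sigma_1, \sigma_2)^{\mu} = (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2)$. Moreover, we define a morphism of $K$-modules $\mu : G \to G'$ by the formula
\[ \forall\, a \in A,\ b \in B, \qquad (a + b)^{\mu} = a^{\nu} + b. \]
For any $a + b \in A \oplus B = G$, $\varphi \in \Phi$, $\sigma_1 \in \Sigma_1$ and $\sigma_2 \in \Sigma_2$ we then have
\begin{align*}
((a + b) \circ (\varphi, \sigma_1, \sigma_2))^{\mu} &= (a \circ \sigma_1 + b^{\varphi} + b \circ \sigma_2)^{\mu} \\
&= (a \circ \sigma_1 + b^{\varphi})^{\nu} + b \circ \sigma_2 \\
&= (a \circ \sigma_1)^{\nu} + (b^{\varphi})^{\nu} + b \circ \sigma_2 \\
&= a^{\nu} \circ \sigma_1^{\nu} + b^{\varphi^{\mu}} + b \circ \sigma_2 \\
&= (a^{\nu} + b) \circ (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2) \\
&= (a + b)^{\mu} \circ (\varphi, \sigma_1, \sigma_2)^{\mu}.
\end{align*}
We see that the given morphism $\nu$ can be extended to a morphism of pairs $\mu : (G, \Gamma) \to (G', \Gamma')$. It is clear that $\mu$ is the identity on $(B, \Sigma_2)$. One verifies immediately that if $\nu$ is a monomorphism (epimorphism), then $\mu$ is defined by a pair of monomorphisms (epimorphisms) $\mu : G \to G'$ and $\mu : \Gamma \to \Gamma'$ and so is also a monomorphism (epimorphism).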
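Let us again consider the triangular product $(A, \Sigma_1) \triangledown (B, \Sigma_2)$. For a fixed left pair the product can still be viewed as a functor, but now in the category of changes of semigroups.⁹ Before formulating the result, we recall the definition of this category. The objects of the category are still pairs, but a morphism $\mu : (G, \Gamma) \to (G', \Gamma')$ in the category of changes consists of two morphisms $\mu : G \to G'$ and $\mu : \Gamma' \to \Gamma$ connected by the following “compatibility condition”:
\[ \forall\, g \in G,\ \gamma' \in \Gamma', \qquad g^{\mu} \circ \gamma' = (g \circ \gamma'^{\mu})^{\mu}. \]
In order to distinguish the morphisms in the category of changes of semigroups, we denote them by $\mu : (G, \Gamma) \rightleftarrows (G', \Gamma')$.

PROPOSITION 3.5. An arbitrary object $(A, \Sigma_1)$ and a morphism $\nu : (B, \Sigma_2) \rightleftarrows (B', \Sigma_2')$ in the category of changes of semigroups induce a morphism
\[ \mu : (A, \Sigma_1) \triangledown (B, \Sigma_2) \rightleftarrows (A, \Sigma_1) \triangledown (B', \Sigma_2') \]
in this category.

PROOF. 1) Let $(G, \Gamma)$, $(G', \Gamma')$, $\Phi$ and $\Phi'$ have the meaning analogous to that in the proof of Proposition 3.4. Let us define the map $\mu : \Phi' \to \Phi$ in the following way:
\[ \forall\, b \in B, \qquad b^{\varphi'^{\mu}} = (b^{\nu})^{\varphi'}. \]
Moreover, we extend the homomorphism $\nu : \Sigma_2' \to \Sigma_2$ to a morphism of direct products $\mu : \Sigma_1 \times \Sigma_2' \to \Sigma_1 \times \Sigma_2$, defining $\mu$ as the identity on $\Sigma_1$. By the definition of the triangular product we have the pairs $(\Phi', \Sigma_1 \times \Sigma_2')$ and $(\Phi, \Sigma_1 \times \Sigma_2)$. Let us show that $\mu : \Phi' \to \Phi$ and $\mu : \Sigma_1 \times \Sigma_2' \to \Sigma_1 \times \Sigma_2$ induce a morphism of the pairs indicated. Indeed, for each $b \in B$ we have
\begin{align*}
b^{(\sigma_2' \cdot \varphi' \cdot \sigma_1)^{\mu}} &= (b^{\nu})^{\sigma_2' \cdot \varphi' \cdot \sigma_1} = (b^{\nu} \circ \sigma_2')^{\varphi'} \circ \sigma_1 = [(b \circ \sigma_2'^{\nu})^{\nu}]^{\varphi'} \circ \sigma_1 \\
&= (b \circ \sigma_2'^{\nu})^{\varphi'^{\mu}} \circ \sigma_1 = (b \circ \sigma_2'^{\mu})^{\varphi'^{\mu}} \circ \sigma_1 = b^{\sigma_2'^{\mu} \cdot \varphi'^{\mu} \cdot \sigma_1}.
\end{align*}
In an analogous manner one can show that $(\sigma_2' \cdot \varphi')^{\mu} = \sigma_2'^{\mu} \cdot \varphi'^{\mu}$ and $(\varphi' \cdot \sigma_1)^{\mu} = \varphi'^{\mu} \cdot \sigma_1$.

2) Let us give the map $\mu : \Gamma' \to \Gamma$ by the formula
\[ (\varphi', \sigma_1, \sigma_2')^{\mu} = (\varphi'^{\mu}, \sigma_1, \sigma_2'^{\mu}). \]
It turns out that $\mu$ is a morphism of triple products, $\mu : \Gamma' \to \Gamma$. This follows from the computation
\begin{align*}
[(\varphi', \sigma_1, \sigma_2')(\psi', \tau_1, \tau_2')]^{\mu} &= ((\sigma_2' \cdot \psi' + \varphi' \cdot \tau_1)^{\mu},\ \sigma_1\tau_1,\ (\sigma_2'\tau_2')^{\mu}) \\
&= (\sigma_2'^{\mu} \cdot \psi'^{\mu} + \varphi'^{\mu} \cdot \tau_1,\ \sigma_1\tau_1,\ \sigma_2'^{\mu}\tau_2'^{\mu}) \\
&= (\varphi'^{\mu}, \sigma_1, \sigma_2'^{\mu}) \cdot (\psi'^{\mu}, \tau_1, \tau_2'^{\mu}) \\
&= (\varphi', \sigma_1, \sigma_2')^{\mu} \cdot (\psi', \tau_1, \tau_2')^{\mu}.
\end{align*}

3) Moreover, from the formula $(a + b)^{\mu} = a + b^{\nu}$, $b^{\nu} \in B'$, we obtain the morphism $\mu : A \oplus B \to A \oplus B'$. Next, let us show that the mapping $\mu$ just defined gives a morphism
\[ \mu : (A, \Sigma_1) \triangledown (B, \Sigma_2) \rightleftarrows (A, \Sigma_1) \triangledown (B', \Sigma_2') \]
in the category of changes of semigroups. Indeed, for any $a \in A$, $b \in B$, $\varphi' \in \Phi'$, $\sigma_1 \in \Sigma_1$ and $\sigma_2' \in \Sigma_2'$ we have, on the one hand,
\begin{align*}
(a + b)^{\mu} \circ (\varphi', \sigma_1, \sigma_2') &= (a + b^{\nu}) \circ (\varphi', \sigma_1, \sigma_2') \\
&= a \circ \sigma_1 + (b^{\nu})^{\varphi'} + b^{\nu} \circ \sigma_2' = a \circ \sigma_1 + b^{\varphi'^{\mu}} + (b \circ \sigma_2'^{\nu})^{\nu};
\end{align*}
on the other hand, we have
\begin{align*}
[(a + b) \circ (\varphi', \sigma_1, \sigma_2')^{\mu}]^{\mu} &= [(a + b) \circ (\varphi'^{\mu}, \sigma_1, \sigma_2'^{\mu})]^{\mu} \\
&= [a \circ \sigma_1 + b^{\varphi'^{\mu}} + b \circ \sigma_2'^{\mu}]^{\mu} = (a \circ \sigma_1 + b^{\varphi'^{\mu}}) + (b \circ \sigma_2'^{\nu})^{\nu};
\end{align*}
we use here the fact that the map $\mu$ coincides with $\nu$ on $\Sigma_2'$. As a result we have
\[ (a + b)^{\mu} \circ (\varphi', \sigma_1, \sigma_2') = ((a + b) \circ (\varphi', \sigma_1, \sigma_2')^{\mu})^{\mu}. \]
This proves the statement, and with it Proposition 3.5.

⁹ Translators’ note. This translation of the Russian “kategoriya zamen”, used by the author, was kindly suggested to us by B. I. Plotkin.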
PROPOSITION 3.6. Let there be given two pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$, and let $(G, \Gamma)$ be their triangular product. For each radical $F$ satisfying the condition $F(A, \Sigma_1) < A$, we have the identity $F(G, \Gamma) = F(A, \Sigma_1)$. If $F$ is a verbal for which $F(B, \Sigma_2) > 0$, then $F(G, \Gamma) = A + F(G, \Sigma_2)$.

The proof, which is a repetition of the arguments with the help of which the corresponding fact was established in the case of groups (cf. [36, Lemma 2]), will be omitted.

PROPOSITION 3.7. Let there be given two pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$, and let $(G, \Gamma)$ be their triangular product. For each $\Gamma$-submodule $H$ in $G$ containing $A$, there exists a right epimorphism $(H, \Gamma) \to (A, \Sigma_1) \triangledown (B \cap H, \Sigma_2)$.

PROOF. Let us denote $B' = B \cap H$, $\Phi = \mathrm{Hom}^{+}(B, A)$ and $\Phi' = \mathrm{Hom}^{+}(B', A)$. We have $G = A + B$, and from $A \subset H$ it obviously follows that $H = A + B'$. Moreover, we have $\Gamma = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$. Let $\Gamma' = \Phi' \leftthreetimes (\Sigma_1 \times \Sigma_2)$. Hence $(A, \Sigma_1) \triangledown (B', \Sigma_2) = (H, \Gamma')$. Each element $\varphi$ of the semigroup $\Phi$ acts also from $B'$ into $A$; the corresponding element of $\Phi'$ will be denoted $\varphi^{\pi}$. Thus there arises a map $\pi : \Phi \to \Phi'$. We remark also that each $f \in \Phi'$ may be considered as the restriction to $B'$ of a homomorphism $\varphi \in \Phi$; this follows from well-known facts about vector spaces. Therefore $\pi : \Phi \to \Phi'$ is an epimorphism. We claim that it induces a right epimorphism of pairs $(H, \Gamma) \to (H, \Gamma')$. In order to prove this we define a map $\pi : \Gamma \to \Gamma'$ by the following formula:
\[ \forall\, \varphi \in \Phi,\ \sigma_1 \in \Sigma_1,\ \sigma_2 \in \Sigma_2, \qquad (\varphi, \sigma_1, \sigma_2)^{\pi} = (\varphi^{\pi}, \sigma_1, \sigma_2). \]
The map $\pi$ is surjective. Moreover, $\pi$ is a homomorphism of semigroups. Indeed, let $\gamma = (\varphi, \sigma_1, \sigma_2)$ and $\gamma' = (\varphi', \sigma_1', \sigma_2')$ be arbitrary elements of $\Gamma$. It is easy to see that for the identity $(\gamma\gamma')^{\pi} = \gamma^{\pi} \cdot \gamma'^{\pi}$ it is sufficient that the elements $\delta = (\varphi \cdot \sigma_1' + \sigma_2 \cdot \varphi')^{\pi}$ and $\lambda = \varphi^{\pi} \cdot \sigma_1' + \sigma_2 \cdot \varphi'^{\pi}$ of $\mathrm{Hom}(B', A)$ are equal. However, this follows from the obvious fact that for all $b \in B'$ the equality $b^{\delta} = b^{\lambda}$ holds. Setting $h^{\pi} = h$ for all $h \in H$, we obtain a pair of homomorphisms $\pi : H \to H$, $\Gamma \to \Gamma'$. That the map $\pi$ commutes with the actions in the corresponding pairs is readily verified by expanding the definitions, and is therefore omitted.
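6. Further information about a pair is obtained by embedding it into triangular products of pairs that are more simply arranged than the given pair. The first result of this kind, which is obtained by passing to a simpler domain of action, is an analogue of the well-known theorem of Kaluzhnin and Krasner in group theory; in semigroup theory the corresponding fact is not known.

PROPOSITION 3.8. Let there be given an arbitrary faithful pair $(G, \Gamma)$ and a $\Gamma$-submodule $A$ of $G$, and let $\Sigma_1$ and $\Sigma_2$ be the semigroups of endomorphisms induced by the semigroup $\Gamma$ in $A$ and in $G/A$. Then the pair $(G, \Gamma)$ can be embedded as a subpair in the triangular product $(A, \Sigma_1) \triangledown (G/A, \Sigma_2)$.

PROOF. Exploiting the faithfulness of $(G, \Gamma)$ and replacing $\Gamma$ by the corresponding subsemigroup of $\mathrm{End}\, G$, we arrive at a pair isomorphic (from the right) to the given pair $(G, \Gamma)$. Therefore we can in what follows assume that $\Gamma$ is contained in $\mathrm{End}\, G$. For any element $\gamma \in \Gamma$ we denote by $\gamma^{\mu}$ and $\gamma^{\nu}$ respectively the endomorphisms of the spaces $A$ and $G/A$ induced by $\gamma$. Moreover, we set $\Gamma^{\mu} = \Sigma_1$ and $\Gamma^{\nu} = \Sigma_2$. Also, in $G$ we can find a $K$-subspace $B$ complementary to $A$. This yields for $G$ a direct decomposition $G = A + B$, which provides a natural epimorphism $\alpha : G \to G/A$ and a projection $\beta : G \to B$. The map $\alpha$ can also be viewed as an isomorphism $\alpha : B \to G/A$, and this gives a unique sense to the notation $\alpha^{-1}$; in particular, for each $g \in G$ we have $(g^{\alpha})^{\alpha^{-1}} = g^{\beta}$. The pair $(G/A, \Sigma_2)$ and the map $\alpha$ induce the pair $(B, \Sigma_2)$: for $b \in B$ and $\gamma \in \Gamma$, $\gamma^{\nu} = \sigma_2 \in \Sigma_2$, we have
\[ b \circ \sigma_2 = (b^{\alpha} \circ \sigma_2)^{\alpha^{-1}} = ((b \circ \gamma)^{\alpha})^{\alpha^{-1}} = (b \circ \gamma)^{\beta}. \]
From the decomposition $G = A + B$ and the elements $\sigma_1 \in \Sigma_1$ and $\sigma_2 \in \Sigma_2$ we form, respectively, the elements
\[ \tilde\sigma_1 = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \tilde\sigma_2 = \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} \]
of $\mathrm{End}\, G$, in this way establishing the embeddings $\Sigma_i \to \tilde\Sigma_i \subset \mathrm{End}\, G$, $i = 1, 2$. In addition, for each element $g \in G$, $g = a + b$, we have
\[ g^{\tilde\sigma_1} = (b + a)^{\tilde\sigma_1} = b + a \circ \sigma_1 \quad\text{and}\quad g^{\tilde\sigma_2} = (b + a)^{\tilde\sigma_2} = b \circ \sigma_2 + a. \]
Moreover, remarking that it follows from $b^{\tilde\sigma_2} = b \circ \sigma_2 = (b \circ \gamma)^{\beta}$ that $b \circ \gamma - b \circ \sigma_2 \in A$, we define a map $\varphi : B \to A$ according to the formula $b^{\varphi} = b \circ \gamma - b \circ \sigma_2$; it is easy to check that $\varphi \in \mathrm{Hom}(B, A)$. The semigroup $\mathrm{Hom}(B, A)$ can also be viewed as a subsemigroup $\tilde\Phi$ of $\mathrm{End}\, G$, by associating to each element $\varphi \in \mathrm{Hom}(B, A)$ the endomorphism $\tilde\varphi = \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix}$ of the space $G$. In addition, we have
\[ g^{\tilde\varphi} = (a + b)^{\tilde\varphi} = a + b^{\varphi} + b. \]
Let $\Gamma' = \Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$. By our construction, $(G, \Gamma') = (A, \Sigma_1) \triangledown (B, \Sigma_2)$. The map $\tilde\sigma_1\tilde\varphi\tilde\sigma_2 \mapsto (\varphi, \sigma_1, \sigma_2)$ induces an isomorphism of the subsemigroup $\tilde\Sigma_1 \cdot \tilde\Phi \cdot \tilde\Sigma_2$ of $\mathrm{End}\, G$ onto $\Phi \leftthreetimes (\Sigma_1 \times \Sigma_2)$, which isomorphism will be denoted $\pi$. Indeed, as in $\mathrm{End}\, G$ the equation
\[ \tilde\sigma_1\tilde\varphi\tilde\sigma_2 = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} = \begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix} \]
holds, we have for a second element of the same form, built from $\bar\varphi$, $\bar\sigma_1$, $\bar\sigma_2$,
\begin{align*}
\left[ \begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix} \begin{pmatrix} \bar\sigma_2 & \bar\varphi \\ 0 & \bar\sigma_1 \end{pmatrix} \right]^{\pi} &= \begin{pmatrix} \sigma_2\bar\sigma_2 & \sigma_2 \cdot \bar\varphi + \varphi \cdot \bar\sigma_1 \\ 0 & \sigma_1\bar\sigma_1 \end{pmatrix}^{\pi} \\
&= (\sigma_2 \cdot \bar\varphi + \varphi \cdot \bar\sigma_1,\ \sigma_1\bar\sigma_1,\ \sigma_2\bar\sigma_2) \\
&= (\varphi, \sigma_1, \sigma_2)(\bar\varphi, \bar\sigma_1, \bar\sigma_2) = \begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix}^{\pi} \begin{pmatrix} \bar\sigma_2 & \bar\varphi \\ 0 & \bar\sigma_1 \end{pmatrix}^{\pi}.
\end{align*}
It is evident that $\pi$ is bijective. It turns out that the semigroup $\Gamma$, considered as a subsemigroup of $\mathrm{End}\, G$, can be embedded in $\tilde\Sigma_1\tilde\Phi\tilde\Sigma_2$. For the proof pick an arbitrary element $\gamma \in \Gamma$ and set $\gamma^{\mu} = \sigma_1$ and $\gamma^{\nu} = \sigma_2$. Furthermore, let $\varphi$, $\tilde\varphi$, $\tilde\sigma_1$ and $\tilde\sigma_2$ be obtained by the procedure above. Let us show that $\gamma = \tilde\sigma_1\tilde\varphi\tilde\sigma_2$. The left hand side and the right hand side of this equation are elements of $\mathrm{End}\, G$, and therefore for its verification it suffices to show that $g^{\gamma} = g^{\tilde\sigma_1\tilde\varphi\tilde\sigma_2}$ holds for all $g = a + b \in G$. We have, on the one hand,
\[ g^{\gamma} = (a + b)^{\gamma} = a^{\gamma} + b^{\gamma} = a \circ \sigma_1 + b^{\varphi} + b \circ \sigma_2. \]
On the other hand, we obtain
\begin{align*}
g^{\tilde\sigma_1\tilde\varphi\tilde\sigma_2} = (b + a)^{\tilde\sigma_1\tilde\varphi\tilde\sigma_2} &= (b, a) \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma_1 \end{pmatrix} \tilde\varphi\tilde\sigma_2 = (b,\ a \circ \sigma_1) \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \tilde\sigma_2 \\
&= (b,\ b^{\varphi} + a \circ \sigma_1) \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon \end{pmatrix} = (b \circ \sigma_2,\ b^{\varphi} + a \circ \sigma_1),
\end{align*}
that is, $g^{\tilde\sigma_1\tilde\varphi\tilde\sigma_2}$ equals $a \circ \sigma_1 + b^{\varphi} + b \circ \sigma_2$, an expression coinciding with the expression previously obtained for $g^{\gamma}$. The required equation is thus established. As a consequence we have constructed an embedding $(G, \Gamma) \to (G, \Gamma')$, which together with the isomorphism $(G, \Gamma') = (A, \Sigma_1) \triangledown (B, \Sigma_2) \cong (A, \Sigma_1) \triangledown (G/A, \Sigma_2)$ yields the embedding required. This proves Proposition 3.8.

The final results of this section, given below, somewhat clarify the connection between the $\triangledown$-operation and the Cartesian multiplication of pairs, in particular, the operation of raising pairs to a Cartesian power.

PROPOSITION 3.9. Let there be given a family of pairs $(A_i, \Sigma_i)$, $i \in I$, and a pair $(B, \Sigma')$. Then the pair $\bigl(\prod_{i \in I} (A_i, \Sigma_i)\bigr) \triangledown (B, \Sigma')$ can be embedded in the pair $\prod_{i \in I} \bigl((A_i, \Sigma_i) \triangledown (B, \Sigma')\bigr)$.

If in this result one takes all pairs $(A_i, \Sigma_i)$ equal to $(A, \Sigma)$, we obtain from it the following.

COROLLARY 3.10. Let there be given arbitrary pairs $(A, \Sigma)$ and $(B, \Sigma')$. Then for an arbitrary set (of indices) $I$ one can embed the pair $(A, \Sigma)^I \triangledown (B, \Sigma')$ into the pair $((A, \Sigma) \triangledown (B, \Sigma'))^I$.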
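We limit ourselves here to the proof of the following Proposition 3.11; let us just add that Proposition 3.9 is proved by similar arguments.

PROPOSITION 3.11. Let there be given arbitrary pairs $(A, \Sigma_1)$ and $(B, \Sigma_2)$. Then for an arbitrary set (of indices) $I$ one can embed the pair $(A, \Sigma_1) \triangledown (B, \Sigma_2)^I$ into the pair $((A, \Sigma_1) \triangledown (B, \Sigma_2))^I$.

PROOF. We introduce the three maps
(1) $\omega : A + B^I \to (A + B)^I$;
(2) $\tau : \Sigma_1 \times \Sigma_2^I \to (\Sigma_1 \times \Sigma_2)^I$;
(3) $\nu : \mathrm{Hom}(B^I, A) \to (\mathrm{Hom}(B, A))^I$.
They are defined as follows. First, for all $a \in A$ and $\bar b \in B^I$ we define $(a + \bar b)^{\omega} = \bar a + \bar b$, where $\bar a(i) = a$ for all $i \in I$. Clearly, $\bar a$ is the constant function sending the entire domain $I$ onto one and the same value $a \in A$; in addition, $\bar a + \bar b$ is the function $I \to A + B$ required. Second, pick arbitrary $\sigma_1 \in \Sigma_1$ and $\bar\sigma_2 \in \Sigma_2^I$. Let us set $(\sigma_1\bar\sigma_2)^{\tau} = \bar\sigma_1\bar\sigma_2$, where $\bar\sigma_1(i) = \sigma_1$ for all $i \in I$. Third, for each $\varphi \in \mathrm{Hom}(B^I, A)$ we define $\varphi^{\nu} \in (\mathrm{Hom}(B, A))^I$ by the following condition:
\[ \forall\, i \in I,\ \bar b \in B^I, \qquad [\bar b(i)]^{\varphi^{\nu}(i)} = \bar b^{\varphi}. \]
It is easy to see that $\nu$ is an isomorphism, and that $\omega$ and $\tau$ are monomorphisms. The pair of maps $\tau$ and $\nu$ can be joined to a homomorphism of semigroups
\[ \mu : \mathrm{Hom}(B^I, A) \leftthreetimes (\Sigma_1 \times \Sigma_2^I) \to (\mathrm{Hom}(B, A) \leftthreetimes (\Sigma_1 \times \Sigma_2))^I, \]
if we give the map $\mu$ by the formula $(\varphi, \sigma_1, \bar\sigma_2)^{\mu} = (\varphi^{\nu}, \bar\sigma_1, \bar\sigma_2)$. It suffices to verify that the map $\mu$ is compatible with multiplication in the triangular product. As $\overline{\sigma_1\sigma_1'} = \bar\sigma_1 \cdot \bar\sigma_1'$, it is clear that the comparison of the elements $[(\varphi, \sigma_1, \bar\sigma_2)(\varphi', \sigma_1', \bar\sigma_2')]^{\mu}$ and $(\varphi, \sigma_1, \bar\sigma_2)^{\mu} \cdot (\varphi', \sigma_1', \bar\sigma_2')^{\mu}$ reduces to the verification that the expressions $(\bar\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1')^{\nu}$ and $\bar\sigma_2 \cdot \varphi'^{\nu} + \varphi^{\nu} \cdot \bar\sigma_1'$ represent one and the same element of $(\mathrm{Hom}(B, A))^I$. To this end, take any $\bar b \in B^I$, $i \in I$ and compute
\begin{align*}
[\bar b(i)]^{(\bar\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1')^{\nu}(i)} &= \bar b^{(\bar\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1')} = (\bar b \circ \bar\sigma_2)^{\varphi'} + (\bar b^{\varphi}) \circ \sigma_1' \\
&= [(\bar b \circ \bar\sigma_2)(i)]^{\varphi'^{\nu}(i)} + [\bar b(i)]^{\varphi^{\nu}(i)} \circ \sigma_1' \\
&= [\bar b(i) \circ \bar\sigma_2(i)]^{\varphi'^{\nu}(i)} + [\bar b(i)]^{\varphi^{\nu}(i)} \circ \sigma_1' \\
&= [\bar b(i)]^{(\bar\sigma_2 \cdot \varphi'^{\nu} + \varphi^{\nu} \cdot \bar\sigma_1')(i)}.
\end{align*}
Thus we are led to the condition
\[ \forall\, i \in I, \qquad (\bar\sigma_2 \cdot \varphi' + \varphi \cdot \sigma_1')^{\nu}(i) = (\bar\sigma_2 \cdot \varphi'^{\nu} + \varphi^{\nu} \cdot \bar\sigma_1')(i), \]
which, evidently, is equivalent to the required equation. It remains to check that the morphisms $\mu$ and $\omega$ define a monomorphism of pairs
\[ \mu^{*} : (A + B^I,\ \mathrm{Hom}(B^I, A) \leftthreetimes (\Sigma_1 \times \Sigma_2^I)) \to ((A + B)^I,\ (\mathrm{Hom}(B, A) \leftthreetimes (\Sigma_1 \times \Sigma_2))^I). \]
The only not completely immediate part of the proof is the verification of the compatibility of the map $\mu^{*}$, defined with the help of $\omega$ and $\mu$, with the actions in the pairs. Indeed, for arbitrary $a \in A$, $\bar b \in B^I$, $\varphi \in \mathrm{Hom}(B^I, A)$, $\sigma_1 \in \Sigma_1$, $\bar\sigma_2 \in \Sigma_2^I$ we have
\begin{align*}
(a + \bar b)^{\omega} \circ (\varphi, \sigma_1, \bar\sigma_2)^{\mu} &= (\bar a + \bar b) \circ (\varphi^{\nu}, \bar\sigma_1, \bar\sigma_2) = \bar b^{\varphi^{\nu}} + \bar a \circ \bar\sigma_1 + \bar b \circ \bar\sigma_2 \\
&= (\bar b^{\varphi^{\nu}} + \bar a \circ \bar\sigma_1) + \bar b \circ \bar\sigma_2 \\
&= [(\bar b^{\varphi} + a \circ \sigma_1) + \bar b \circ \bar\sigma_2]^{\omega} = [(a + \bar b) \circ (\varphi, \sigma_1, \bar\sigma_2)]^{\omega}.
\end{align*}
It remains only to check the equality, used here, of the elements $\bar b^{\varphi^{\nu}} + \bar a \circ \bar\sigma_1$ and $\overline{\bar b^{\varphi} + a \circ \sigma_1}$, which follows from the relation
\[ \forall\, i \in I, \qquad \overline{(\bar b^{\varphi} + a \circ \sigma_1)}(i) = \bar b^{\varphi} + a \circ \sigma_1 = [\bar b(i)]^{\varphi^{\nu}(i)} + \bar a(i) \circ \bar\sigma_1(i) = (\bar b^{\varphi^{\nu}} + \bar a \circ \bar\sigma_1)(i). \]
By this the proof of the proposition is complete.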
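3.1.3. Triangular products of representations of algebras

1. The construction indicated in the heading of this section will be achieved in two steps. First, let there be given two $K$-algebras $\Phi$ and $\bar\Sigma$. We assume that $\bar\Sigma$ acts from the right and from the left on $\Phi$ and denote these actions by $\varphi \cdot \sigma$ and $\sigma \cdot \varphi$, respectively. We require that these two actions satisfy the following conditions:¹⁰
a) $\sigma \cdot (\varphi + \varphi') = \sigma \cdot \varphi + \sigma \cdot \varphi'$; $(\varphi + \varphi') \cdot \sigma = \varphi \cdot \sigma + \varphi' \cdot \sigma$;
b) $\sigma \cdot (\varphi\varphi') = (\sigma \cdot \varphi)\varphi'$; $(\varphi\varphi') \cdot \sigma = \varphi(\varphi' \cdot \sigma)$;
c) $(\sigma \cdot \varphi) \cdot \sigma' = \sigma \cdot (\varphi \cdot \sigma')$; $(\varphi \cdot \sigma)\varphi' = \varphi(\sigma \cdot \varphi')$;
d) $(\sigma + \sigma') \cdot \varphi = \sigma \cdot \varphi + \sigma' \cdot \varphi$; $\varphi \cdot (\sigma + \sigma') = \varphi \cdot \sigma + \varphi \cdot \sigma'$;
e) $(\sigma\sigma') \cdot \varphi = \sigma \cdot (\sigma' \cdot \varphi)$; $\varphi \cdot (\sigma\sigma') = (\varphi \cdot \sigma) \cdot \sigma'$;
f) $\sigma \cdot (\kappa\varphi) = \kappa(\sigma \cdot \varphi)$; $(\kappa\varphi) \cdot \sigma = \kappa(\varphi \cdot \sigma)$;
g) $(\kappa\sigma) \cdot \varphi = \kappa(\sigma \cdot \varphi)$; $\varphi \cdot (\kappa\sigma) = \kappa(\varphi \cdot \sigma)$.
In the direct sum of algebras $\bar\Gamma = \Phi + \bar\Sigma$ we define addition and multiplication by scalars componentwise, but multiplication will be defined anew by setting
\[ (\varphi, \sigma)(\varphi', \sigma') = (\varphi \cdot \sigma' + \sigma \cdot \varphi' + \varphi\varphi',\ \sigma\sigma'). \]
On the set $\bar\Gamma$ there arises the structure of a new $K$-algebra, which we denote by $\bar\Gamma = \Phi \leftthreetimes \bar\Sigma$ and call the semidirect product of the algebras $\Phi$ and $\bar\Sigma$.

Remarks. 1) Let $\Sigma$ be any $K$-algebra. Considering the field $K$ as a $K$-algebra, we form the semidirect product $\Sigma^{*} = \Sigma \leftthreetimes K$. One gets a $K$-algebra having the element $(0, 1)$ as unit; moreover, the pairs of the form $(\sigma, 0)$, $\sigma \in \Sigma$, give in $\Sigma^{*}$ a subalgebra isomorphic to $\Sigma$. This contains the essential part of a result mentioned in [21, p. 54–55].

¹⁰ Following MacLane [88] we speak of commuting bimultiplications (of Hochschild) on the algebra $\Phi$.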
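Remark 1 can be illustrated on a small algebra without unit. The following sketch is an assumption-laden toy model, not the author's construction in general: it takes $K = \mathrm{GF}(2)$ and lets $\Sigma$ be the algebra of strictly upper triangular $3 \times 3$ matrices, encoded by coefficient triples, with $K$ acting on $\Sigma$ by scalar multiplication. All names in the code are hypothetical.

```python
from itertools import product

P = 2  # illustrative assumption: K = GF(2), small enough for exhaustive checks

def sigma_mul(s, t):
    """Product in Sigma, the strictly upper triangular 3x3 matrices.
    (a, b, c) stands for a*e12 + b*e23 + c*e13; only e12*e23 = e13 survives."""
    return (0, 0, (s[0] * t[1]) % P)

def star_mul(x, y):
    """(s, k)(s', k') = (k'*s + k*s' + s s', k k') in the semidirect product Sigma*."""
    (s, k), (t, l) = x, y
    st = sigma_mul(s, t)
    first = tuple((l * si + k * ti + pi) % P for si, ti, pi in zip(s, t, st))
    return (first, (k * l) % P)

SIGMA = list(product(range(P), repeat=3))
SIGMA_STAR = [(s, k) for s in SIGMA for k in range(P)]
UNIT = ((0, 0, 0), 1)  # the element (0, 1)

def is_unit(e):
    return all(star_mul(e, x) == x == star_mul(x, e) for x in SIGMA_STAR)

def is_associative():
    return all(star_mul(star_mul(x, y), z) == star_mul(x, star_mul(y, z))
               for x in SIGMA_STAR for y in SIGMA_STAR for z in SIGMA_STAR)

def sigma_embeds():
    """The pairs (s, 0) multiply exactly as the elements s of Sigma."""
    return all(star_mul((s, 0), (t, 0)) == (sigma_mul(s, t), 0)
               for s in SIGMA for t in SIGMA)

def sigma_has_unit():
    return any(all(sigma_mul(e, s) == s == sigma_mul(s, e) for s in SIGMA)
               for e in SIGMA)
```

In this model `sigma_has_unit()` is false while `is_unit(UNIT)` holds, which is the point of passing from $\Sigma$ to $\Sigma^{*}$.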
2) Here we consider pairs in which the acting elements form a K-algebra. Let A be a K-module and Σ a K-algebra. We shall speak of a pair (A, Σ) if there is given an operation A × Σ → A, an action of Σ on A, denoted by ◦ and satisfying, for all a ∈ A, σ, σ′ ∈ Σ and κ ∈ K, the conditions:

(a) a ◦ (σσ′) = (a ◦ σ) ◦ σ′;
(b) a ◦ (σ + σ′) = a ◦ σ + a ◦ σ′;
(c) a ◦ (κσ) = κ(a ◦ σ);
(d) for each fixed σ ∈ Σ, the map a → a ◦ σ is a K-endomorphism of the module A.

Every pair (A, Σ) can be "lifted" to the pair (A, Σ∗) if the action on A of the algebra Σ∗ constructed in the previous remark is defined as follows: for all a ∈ A, σ ∈ Σ, κ ∈ K,

$$a \circ (\sigma, \kappa) = a \circ \sigma + \kappa a.$$

The proof of the fact that there arises a pair (A, Σ∗) containing (A, Σ) as a subpair is left to the Reader.

Second step. Let there be given pairs¹¹ (A, Σ1) and (B, Σ2), and let (A, Σ̄1) and (B, Σ̄2) be the corresponding faithful pairs. Then Σ̄ = Σ̄1 ⊕ Σ̄2 can in a natural way be interpreted as a subalgebra of EndK G, where G = A + B, and the same is true for Φ = HomK(B, A). Therefore there is defined a left and a right action of Σ̄ on Φ; this is just multiplication in EndK G:

$$\bar\sigma\cdot\varphi \overset{\mathrm{def}}{=} \bar\sigma\varphi \qquad\text{and}\qquad \varphi\cdot\bar\sigma \overset{\mathrm{def}}{=} \varphi\bar\sigma.$$
It is clear that these actions are bimultiplications on Φ. Setting Σ = Σ1 ⊕ Σ2 we have a natural epimorphism f : Σ → Σ̄, which allows us to lift the action of Σ̄ on Φ to an action of Σ on Φ:

$$\sigma\cdot\varphi \overset{\mathrm{def}}{=} \sigma^{f}\cdot\varphi \qquad\text{and}\qquad \varphi\cdot\sigma \overset{\mathrm{def}}{=} \varphi\cdot\sigma^{f}.$$

We arrive at the algebra Φ ⋋ Σ = Γ with an action on G = A ⊕ B defined by the rule

$$(a + b)\circ(\varphi, \sigma) = b\varphi + (a + b)\circ\sigma.$$

Let us remark that the elements (ϕ, σ) of the algebra Φ ⋋ Σ act on G as the endomorphisms ϕ + σ^f. More precisely, the multiplication in Γ and its action on G are given by the formulae

$$(\varphi, \sigma_1, \sigma_2)(\varphi', \sigma_1', \sigma_2') = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2')$$

and

$$(a + b)\circ(\varphi, \sigma_1, \sigma_2) = b\varphi + a\circ\sigma_1 + b\circ\sigma_2.$$

As this action of Γ on G agrees with the operations in the algebra Γ, there arises a pair (G, Γ), which we call the triangular product of (A, Σ1) and (B, Σ2); we denote it by (A, Σ1) ▽ (B, Σ2).

¹¹ We emphasize, in particular, that the acting objects in the pairs given in this section are K-algebras.
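The multiplication and action formulae of the triangular product can be checked on a small matrix model. The following is only a sketch under assumptions that are not in the text: A = K² and B = K¹ are spaces of row vectors with integer entries, Σ1 and Σ2 are full matrix algebras, and Φ = Hom(B, A) is the set of 1×2 matrices. All function names (`mul`, `act`, and the helpers) are hypothetical.

```python
def matmul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def matadd(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return [[p + q for p, q in zip(r, s)] for r, s in zip(x, y)]

def vecmat(v, m):
    """Row vector v times matrix m (right action)."""
    return [sum(v[k] * m[k][j] for k in range(len(v)))
            for j in range(len(m[0]))]

def vecadd(u, v):
    return [p + q for p, q in zip(u, v)]

def mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (phi.s1' + s2.phi' + phi phi', s1 s1', s2 s2').

    In this model phi phi' = 0, since phi' vanishes on A (assumption of the sketch).
    """
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (matadd(matmul(phi, s1_), matmul(s2, phi_)),
            matmul(s1, s1_), matmul(s2, s2_))

def act(vec, gamma):
    """(a + b) o (phi, s1, s2) = b phi + a o s1 + b o s2, with vec = (a, b)."""
    a, b = vec
    phi, s1, s2 = gamma
    return (vecadd(vecmat(b, phi), vecmat(a, s1)), vecmat(b, s2))
```

With these definitions, the compatibility of the action with the multiplication reads `act(act(v, g), h) == act(v, mul(g, h))` for any element `v = (a, b)` and any triples `g`, `h`.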
2. Let us pass to the study of the triangular product of representations of algebras.

PROPOSITION 3.12. If the pairs (A, Σ1) and (B, Σ2) are faithful then so is the pair (A, Σ1) ▽ (B, Σ2).

PROOF. We repeat word for word the reasoning in the proof of Proposition 3.2 and arrive at the required result.

PROPOSITION 3.13. Let (A, Σ1) and (B, Σ2) be two pairs and let (G, Γ) = (A, Σ1) ▽ (B, Σ2) be their triangular product. Then for each Γ-submodule H in G we have either H ⊂ A or A ⊂ H.

PROOF. 1) Essentially the same reasoning as in the proof of Proposition 3.3 gives the required result. We can carry over completely the notations and reasonings in the first part of that proof, with a single exception: while considering the action of the element (ϕ, σ1, σ2) of the algebra Φ ⋋ Σ on A ⊕ B we think of it here as the endomorphism ϕ + σ1^f + σ2^f.

2) Using the remark in the preceding subsection, we embed the pairs (A, Σ1) and (B, Σ2) into (A, Σ1∗) and (B, Σ2∗) respectively. In a natural way we extend the action on G of the algebra Γ = Φ ⋋ Σ1 ⊕ Σ2 to the action of the algebra Γ∗ = Φ ⋋ Σ1∗ ⊕ Σ2∗; this is done using the already known scheme

$$(B, A)\begin{pmatrix} \Sigma_2^{*} & \Phi \\ 0 & \Sigma_1^{*} \end{pmatrix}.$$

Thus we get the pair (G, Γ∗). Next, we remark that from the Γ-invariance of the submodule H ⊂ G it follows that H is invariant with respect to the action of all elements of Γ∗, and vice versa. In fact, assume that H ◦ Γ ⊂ H; then for any g = a + b ∈ H and γ∗ = (ϕ, (σ1, κ), (σ2, κ)) ∈ Γ∗, where ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2 and κ ∈ K, we have

$$g\circ\gamma^{*} = (a + b)\circ(\varphi, (\sigma_1, \kappa), (\sigma_2, \kappa)) = b\varphi + a\circ(\sigma_1, \kappa) + b\circ(\sigma_2, \kappa) = b\varphi + a\circ\sigma_1 + \kappa a + b\circ\sigma_2 + \kappa b = (a + b)\circ(\varphi, \sigma_1, \sigma_2) + \kappa(a + b) \in H.$$

Conversely, if we start with an arbitrary g = a + b ∈ H and choose for γ∗ the element (ϕ, (σ1, 0), (σ2, 0)), then it follows immediately from the relation g ◦ γ∗ = (a + b) ◦ (ϕ, σ1, σ2) that H is a Γ-invariant submodule.

3) Let us pass to the main reasoning in the proof. To this end, using the previous construction of the element ϕ′ ∈ Hom(B, A), we find in Γ∗ the element γ∗ = (ϕ′, (0, 1), (0, 1)) and apply it to h = a1 + b. By the Γ-invariance of H and the remark just made, it follows that h ◦ γ∗ ∈ H, from which, in view of the equalities

$$h\circ\gamma^{*} = (a_1 + b)\circ(\varphi', (0, 1), (0, 1)) = b\varphi' + a_1\circ(0, 1) + b\circ(0, 1) = a + h,$$

we have a = h ◦ γ∗ − h ∈ H. The relation A ⊂ H is proved. This completes the proof.

3. An easy modification of the proof apparatus in Paragraph 5 of Section 3.1.2 above allows us to derive here some features of the functorial behavior of the ▽-product for representations of algebras.

PROPOSITION 3.14. Let there be given a morphism ν : (A, Σ1) → (A′, Σ1′) and an arbitrary pair (B, Σ2). There exists a morphism μ : (A, Σ1) ▽ (B, Σ2) → (A′, Σ1′) ▽
(B, Σ2), which coincides with the map ν on (A, Σ1) and is an identity on (B, Σ2). If ν is injective (surjective), then so is μ.

PROOF. The proof is obtained by repeating almost word for word the proof of Proposition 3.4, the notations of which are preserved here with an immediate modification of their interpretation. We stop only at a fragment of the reasoning. Let us assume that for all ϕ ∈ Φ, σ1 ∈ Σ1, σ2 ∈ Σ2,

$$(\varphi, \sigma_1, \sigma_2)^{\mu} = (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2).$$

We have to verify that μ : Γ → Γ′ is a morphism of algebras which agrees with the action of these algebras on G and G′ respectively. We restrict the verification to the fact that μ is compatible with multiplication in the algebras Γ and Γ′; the rest follows even more simply by checking the definitions. We have

$$(\varphi, \sigma_1, \sigma_2)^{\mu}\cdot(\varphi', \sigma_1', \sigma_2')^{\mu} = (\varphi^{\mu}, \sigma_1^{\nu}, \sigma_2)\cdot({\varphi'}^{\mu}, {\sigma_1'}^{\nu}, \sigma_2') = ((\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi')^{\mu},\ (\sigma_1\sigma_1')^{\nu},\ \sigma_2\sigma_2') = [(\varphi, \sigma_1, \sigma_2)(\varphi', \sigma_1', \sigma_2')]^{\mu}.$$

In these computations we used the relation

$$\varphi^{\mu}\cdot{\sigma_1'}^{\nu} + \sigma_2\cdot{\varphi'}^{\mu} + \varphi^{\mu}{\varphi'}^{\mu} = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi' + \varphi\varphi')^{\mu};$$

it follows from the identities

$$\varphi^{\mu}\cdot{\sigma_1'}^{\nu} = (\varphi\cdot\sigma_1')^{\mu}; \qquad \sigma_2\cdot{\varphi'}^{\mu} = (\sigma_2\cdot\varphi')^{\mu} \qquad\text{and}\qquad \varphi^{\mu}{\varphi'}^{\mu} = (\varphi\varphi')^{\mu},$$

of which only the first two require proof. The first of them follows from the series of equations, valid for any b ∈ B,

$$b^{\varphi^{\mu}\cdot{\sigma_1'}^{\nu}} = b^{\varphi^{\mu}}\circ{\sigma_1'}^{\nu} = (b^{\varphi})^{\nu}\circ{\sigma_1'}^{\nu} = (b^{\varphi}\circ\sigma_1')^{\nu} = b^{(\varphi\cdot\sigma_1')^{\mu}}.$$

Moreover, for all b ∈ B we have

$$b^{\sigma_2\cdot{\varphi'}^{\mu}} = (b\circ\sigma_2)^{{\varphi'}^{\mu}} = ((b\circ\sigma_2)^{\varphi'})^{\nu} = (b^{\sigma_2\cdot\varphi'})^{\nu} = b^{(\sigma_2\cdot\varphi')^{\mu}},$$

from which the second equation follows. The statement is proved. We allow ourselves to omit the remaining details.

The category of changes of substitutions of pairs of representations of algebras is defined completely in analogy with the semigroup case (cf. Subsection 3.2.5).

PROPOSITION 3.15. An arbitrary object (A, Σ1) and a morphism ν : (B, Σ2) → (B′, Σ2′) in the category of substitutions of pairs of representations of algebras induce a morphism μ : (A, Σ1) ▽ (B, Σ2) → (A, Σ1) ▽ (B′, Σ2′) of the same category.

PROOF. The proof is obtained by carrying over verbatim to the present situation the notations and reasonings of Section 3.1.4, with the difference that here Φ and Φ′ are thought of as subalgebras in EndK(A ⊕ B) and EndK(A ⊕ B′) respectively; all the rest of the proof mentioned is preserved in the present interpretation.
4. The behavior of radicals and verbals with respect to the triangular product of representations of algebras is the same as in the semigroup case described in Proposition 3.6; we omit its formulation as well as the proof, because it repeats the semigroup case in an obvious way. The same remark applies to the following statement.

PROPOSITION 3.16. Let there be given two pairs (A, Σ1) and (B, Σ2), and set (G, Γ) = (A, Σ1) ▽ (B, Σ2). For any Γ-submodule H in G containing A, there exists a right epimorphism (H, Γ) → (A, Σ1) ▽ (B ∩ H, Σ2).

5. The embedding theorems referred to in Paragraph 6 of Section 3.1.2 hold true also for representations of algebras.

PROPOSITION 3.17. Let there be given an arbitrary faithful pair (G, Γ) and a Γ-submodule A of G, and let Σ1 and Σ2 be the algebras of endomorphisms induced by the K-algebra Γ in A and in G/A respectively. Then the pair (G, Γ) can be embedded as a subpair in the triangular product (A, Σ1) ▽ (G/A, Σ2).

PROOF. We take advantage of the fact that the pair (G, Γ) is faithful and replace the algebra Γ by the corresponding subalgebra in EndK G; in this way we obtain a pair which is right isomorphic to the original pair (G, Γ). Therefore we may assume, in what follows, that Γ is already contained in the algebra EndK G. The action of the elements of the algebra Γ on the module G induces their actions on A and on G/A; the morphisms arising in this way will be denoted by μ and ν respectively. We set Im μ = Σ1 and Im ν = Σ2. We select in G a subspace B complementary to A, which gives a direct decomposition G = A + B, with the accompanying natural epimorphism α : G → G/A and projection β : G → B. The map α may be viewed as an isomorphism B → G/A, giving a unique meaning to the notation α⁻¹; in particular, for each g ∈ G we have

$$(g^{\alpha})^{\alpha^{-1}} = g^{\beta}.$$

The pair (G/A, Σ2) and the map α induce the pair (B, Σ2); for b ∈ B and γ ∈ Γ with γ^ν = σ2 ∈ Σ2 we have

$$b\circ\sigma_2 = (b^{\alpha}\circ\sigma_2)^{\alpha^{-1}} = ((b\circ\gamma)^{\alpha})^{\alpha^{-1}} = (b\circ\gamma)^{\beta}.$$

Take an arbitrary element γ ∈ Γ and let γ^μ = σ1, γ^ν = σ2. Let us associate to the elements σ1 ∈ Σ1 and σ2 ∈ Σ2 respectively the elements

$$\begin{pmatrix} 0 & 0 \\ 0 & \sigma_1 \end{pmatrix} \qquad\text{and}\qquad \begin{pmatrix} \sigma_2 & 0 \\ 0 & 0 \end{pmatrix}$$

in End G, which determine the embeddings Σi → End G, i = 1, 2. Further, if Φ = Hom(B, A) is embedded in End G in the manner indicated in Section 3.1.1, then the subalgebra Φ ⊂ End G may be treated as the annihilator of the series 0 ⊂ A ⊂ G. Next, we remark that by the construction of the embedding of Σ1 ⊕ Σ2 in End G, for each g = a + b ∈ G we have g ◦ σ1 = a ◦ σ1 and g ◦ σ2 = b ◦ σ2. Moreover, from b ◦ σ2 = (b ◦ γ)^β there follows the existence of an a′ ∈ A such that b ◦ γ = a′ + b ◦ σ2. From these remarks we obtain the relations

$$b\circ(\gamma - \sigma_2 - \sigma_1) = a' \qquad\text{and}\qquad a\circ(\gamma - \sigma_2 - \sigma_1) = 0;$$

they show that γ − σ2 − σ1 ∈ Φ, because Φ is the annihilator of the series 0 ⊂ A ⊂ G. Therefore there exists a ϕ ∈ Φ such that γ = ϕ + σ1 + σ2. There arises an embedding of
algebras Γ → Φ ⋋ Σ1 ⊕ Σ2, which we denote by π; for each γ ∈ Γ, γ = ϕ + σ1 + σ2, it is given by the formula γ^π = (ϕ, σ1, σ2). This morphism π, together with the isomorphism (A, Σ1) ▽ (B, Σ2) → (A, Σ1) ▽ (G/A, Σ2), induces the embedding of pairs (G, Γ) → (A, Σ1) ▽ (G/A, Σ2).

The proof of each of the following three propositions is essentially a transfer of the corresponding proof in the semigroup case, sometimes with slight modifications; the difficulties are overcome without effort, so we leave them to the Reader. We limit ourselves to the formulations.

PROPOSITION 3.18. Let there be given an arbitrary family of pairs (Ai, Σi), i ∈ I, and a pair (B, Σ′). Then the pair

$$\Big(\prod_{i\in I}(A_i, \Sigma_i)\Big) \triangledown (B, \Sigma')$$

can be embedded into the pair

$$\prod_{i\in I}\big((A_i, \Sigma_i) \triangledown (B, \Sigma')\big).$$

COROLLARY 3.19. Let there be given arbitrary pairs (A, Σ1) and (B, Σ2). Then for each set of indices I one has the embedding

$$(A, \Sigma_1)^{I} \triangledown (B, \Sigma_2) \to \big((A, \Sigma_1) \triangledown (B, \Sigma_2)\big)^{I}.$$

PROPOSITION 3.20. Let there be given arbitrary pairs (A, Σ1) and (B, Σ2). Then for each set of indices I the pair (A, Σ1) ▽ (B, Σ2)^I can be embedded in the pair ((A, Σ1) ▽ (B, Σ2))^I.

3.1.4. Connections between ▽-constructions

1. The constructions of the triangular product of pairs of representations of groups, semigroups and algebras, as considered in the previous sections of this paper, are, as it seems to us, not isolated technical tools but particular manifestations of a single, more general concept. Here we shall indicate some correlations between these three constructions.

2. PROPOSITION 3.21. Let there be given pairs (representations of semigroups) (A, Σ1) and (B, Σ2), and let (G, Γ) be their triangular product. The acting semigroup Γ = Φ ⋋ Σ1 × Σ2 is a group if and only if Σ1 and Σ2 are groups and the semigroup Φ = Hom⁺K(B, A) is treated as a group. If these conditions are fulfilled, then (G, Γ) is isomorphic to the triangular product of (A, Σ1) and (B, Σ2) as group pairs.

PROOF. Let us first make an observation.
Let Σ1 and Σ2 be groups and let us treat Φ = HomK(B, A) as an additive Abelian group. Let us show that then Γ = Φ ⋋ Σ1 × Σ2 is also a group. To this end, we remark that the element (ϕ′, σ1′, σ2′) ∈ Γ is a unity of Γ exactly when for each element (ϕ, σ1, σ2) ∈ Γ we have

$$(\varphi, \sigma_1, \sigma_2) = (\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2') \quad\text{and}\quad (\varphi, \sigma_1, \sigma_2) = (\varphi'\cdot\sigma_1 + \sigma_2'\cdot\varphi,\ \sigma_1'\sigma_1,\ \sigma_2'\sigma_2). \tag{4}$$

From these relations it follows that, in particular, σi′ = εi, where εi are the units of Σi, i = 1, 2. Taking account of this, the equality of the first components in the triples in (4) takes the form

$$\varphi\cdot\varepsilon_1 + \sigma_2\cdot\varphi' = \varphi'\cdot\sigma_1 + \varepsilon_2\cdot\varphi = \varphi.$$
The equation ϕ = ϕ · ε1 and the arbitrariness of the choice of the element σ2 ∈ Σ2 now imply that ϕ + ε2 · ϕ′ = ϕ, i.e. ϕ′ = 0. Consequently, the unity of Γ must be the triple (0, ε1, ε2), where 0 is the zero homomorphism in HomK(B, A); that this triple is indeed the unity is verified by an immediate check. The question of inverse elements is solved in an analogous way. Indeed, for the triple (ϕ′, σ1′, σ2′) to be the inverse of (ϕ, σ1, σ2) it is necessary and sufficient that the following equations be fulfilled:

$$(\varphi\cdot\sigma_1' + \sigma_2\cdot\varphi',\ \sigma_1\sigma_1',\ \sigma_2\sigma_2') = (0, \varepsilon_1, \varepsilon_2) \quad\text{and}\quad (\varphi'\cdot\sigma_1 + \sigma_2'\cdot\varphi,\ \sigma_1'\sigma_1,\ \sigma_2'\sigma_2) = (0, \varepsilon_1, \varepsilon_2). \tag{5}$$

It follows from (5) that σ1′ = σ1⁻¹ and σ2′ = σ2⁻¹. It follows then from the equalities for the first components in (5) that ϕ · σ1⁻¹ = −σ2 · ϕ′ and ϕ′ · σ1 = −σ2⁻¹ · ϕ, which are both equivalent to ϕ′ = −σ2⁻¹ · ϕ · σ1⁻¹. We conclude that the inverse of the triple (ϕ, σ1, σ2) is given by

$$(-\sigma_2^{-1}\cdot\varphi\cdot\sigma_1^{-1},\ \sigma_1^{-1},\ \sigma_2^{-1}).$$

The first statement of the proposition is now proved in a standard way in both directions of the implication.

Let us pass to the proof of the second statement of the proposition. First, it is clear that for the subgroup Σ generated in Γ by Σ1 and Σ2 the subrepresentation (G, Σ) splits, (G, Σ) = (A ⊕ B, Σ1 × Σ2). Second, for each (ϕ, ε1, ε2) ∈ Φ and (ϕ1, σ1, σ2) ∈ Γ one has

$$(\varphi_1, \sigma_1, \sigma_2)^{-1}\cdot(\varphi, \varepsilon_1, \varepsilon_2)\cdot(\varphi_1, \sigma_1, \sigma_2) = (-\sigma_2^{-1}\cdot\varphi_1\cdot\sigma_1^{-1},\ \sigma_1^{-1},\ \sigma_2^{-1})\cdot(\varphi, \varepsilon_1, \varepsilon_2)\cdot(\varphi_1, \sigma_1, \sigma_2) = (\sigma_2^{-1}\cdot\varphi\cdot\sigma_1,\ \varepsilon_1,\ \varepsilon_2),$$

which shows the invariance of the subgroup Φ in Γ. Moreover, one checks immediately that the pair (G, Φ) is faithful, along with the fact that the image of Φ in Aut G coincides with the centralizer of the series 0 ⊂ A ⊂ G. Third, let us introduce the map f of the pair (G, Γ) into the pair (G, Γ∗), the latter being the triangular product of (A, Σ1) and (B, Σ2) as group pairs, defining it as the identity map on G and on Γ by the formula

$$(\varphi, \sigma_1, \sigma_2)^{f} \overset{\mathrm{def}}{=} (\varphi\cdot\sigma_1^{-1},\ \sigma_1,\ \sigma_2).$$

A check shows that the map f is a morphism of the group pairs (G, Γ) and (G, Γ∗) and, furthermore, is bijective. With this our statement is proved, and at the same time the proof of Proposition 3.21 is finished as well.

3. In the study of the interrelations between the triangular product of semigroup pairs (A, Σ1) ▽ (B, Σ2) and of pairs (or representations) of algebras (A, S1) ▽ (B, S2) an essential role is played by the following remark. In the semigroup Φ ⋋ Σ1 × Σ2 the elements (ϕ, σ1, σ2) and their components ϕ, σ1, σ2 are thought of as the endomorphisms

$$\begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix},\quad \begin{pmatrix} \varepsilon_2 & \varphi \\ 0 & \varepsilon_1 \end{pmatrix},\quad \begin{pmatrix} \varepsilon_2 & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sigma_2 & 0 \\ 0 & \varepsilon_1 \end{pmatrix}$$

in End G, respectively. In the algebra Φ ⋋ S1 ⊕ S2 one has a different interpretation for the elements (ϕ, σ1, σ2) and their components:

$$\begin{pmatrix} \sigma_2 & \varphi \\ 0 & \sigma_1 \end{pmatrix},\quad \begin{pmatrix} 0 & \varphi \\ 0 & 0 \end{pmatrix},\quad \begin{pmatrix} 0 & 0 \\ 0 & \sigma_1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} \sigma_2 & 0 \\ 0 & 0 \end{pmatrix}.$$
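The inverse and conjugation formulae from the proof of Proposition 3.21 can be checked in the simplest possible case. The sketch below assumes (this is not in the text) that A and B are one-dimensional over the rationals, so that ϕ, σ1, σ2 are just rational numbers with σ1, σ2 nonzero; the names `mul`, `inv` and `UNIT` are hypothetical.

```python
from fractions import Fraction

# The unity (0, eps1, eps2) of the group Gamma in the one-dimensional model.
UNIT = (Fraction(0), Fraction(1), Fraction(1))

def mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (phi.s1' + s2.phi', s1 s1', s2 s2')."""
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (phi * s1_ + s2 * phi_, s1 * s1_, s2 * s2_)

def inv(g):
    """Inverse triple (-s2^{-1}.phi.s1^{-1}, s1^{-1}, s2^{-1}); needs s1, s2 != 0."""
    phi, s1, s2 = g
    return (-phi / (s1 * s2), 1 / s1, 1 / s2)

def conjugate(psi, g):
    """g^{-1} (psi, eps1, eps2) g, which should equal (s2^{-1}.psi.s1, eps1, eps2)."""
    return mul(mul(inv(g), (psi, Fraction(1), Fraction(1))), g)
```

For a triple `g = (phi, s1, s2)` one expects `mul(g, inv(g)) == UNIT` and `conjugate(psi, g) == (psi * s1 / s2, 1, 1)`. The scalar case cannot see the non-commutativity of Σ1 and Σ2, so it illustrates the formulae rather than proving them.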
Next, let there be given two semigroup pairs (A, Σ1) and (B, Σ2), which we lift, in the well-known manner, to the corresponding monoid pairs (A, Σ1∗) and (B, Σ2∗). The linear extension of the actions of Σ1∗ on A and of Σ2∗ on B gives pairs (A, KΣ1∗) and (B, KΣ2∗), where the acting objects are the corresponding semigroup algebras. Let us consider the triangular products

$$(A, \Sigma_1) \triangledown (B, \Sigma_2) = (A \oplus B,\ \Phi \leftthreetimes \Sigma_1 \times \Sigma_2)$$

(where the semigroup Φ = HomK(B, A) is treated as the centralizer of the series 0 ⊂ A ⊂ G in End G) and

$$(A, K\Sigma_1^{*}) \triangledown (B, K\Sigma_2^{*}) = (A \oplus B,\ \Phi \leftthreetimes (K\Sigma_1^{*} \oplus K\Sigma_2^{*}))$$

(where Φ is treated as the annihilator of the series 0 ⊂ A ⊂ G in End G). It is easy to verify the following fact.

PROPOSITION 3.22. The map π_∗ : (ϕ, σ1, σ2) → (ϕ − ε, σ1 − ε2, σ2 − ε1) gives an embedding of Φ ⋋ Σ1 × Σ2 into the multiplicative semigroup of the algebra Φ ⋋ (KΣ1∗ ⊕ KΣ2∗), which agrees with the actions in the pairs (A ⊕ B, Φ ⋋ Σ1 × Σ2) and (A ⊕ B, Φ ⋋ (KΣ1∗ ⊕ KΣ2∗)).

4. Let there be given any two pairs (A, S1) and (B, S2) whose acting objects S1 and S2 are unitary K-algebras. Using the natural "cutting" functor, we obtain the semigroup pairs (A, Σ1) and (B, Σ2), where Σi is the multiplicative semigroup of the algebra Si, i = 1, 2. We form anew the corresponding triangular products

$$(A, S_1) \triangledown (B, S_2) = (A \oplus B,\ \Phi \leftthreetimes (S_1 \oplus S_2)) \qquad\text{and}\qquad (A, \Sigma_1) \triangledown (B, \Sigma_2) = (A \oplus B,\ \Phi \leftthreetimes \Sigma_1 \times \Sigma_2).$$

Then we obtain

PROPOSITION 3.23. The map π^∗ : (ϕ, σ1, σ2) → (ϕ + ε, σ1 + ε2, σ2 + ε1) gives an isomorphism of the multiplicative semigroup of the algebra Φ ⋋ (S1 ⊕ S2) with the semigroup Φ ⋋ Σ1 × Σ2, which agrees with the action in the pairs (A ⊕ B, Φ ⋋ (S1 ⊕ S2)) and (A ⊕ B, Φ ⋋ Σ1 × Σ2).

The proof is easily obtained by an immediate check of the definitions, and will be omitted.

3.1.5. Comments

1. Under the influence of the view of representations of algebraic structures as two-sorted systems (or pairs) there arose the language of pairs, which, as a working tool in the systematic study of representations by B. I. Plotkin in his book [35], balances the role of the inner structure of groups and their outer properties of actions on the representation modules.
2. The extension of the theme of varieties of groups (cf., e.g., [34]) to varieties of linear pairs (representations of groups) required the statement of questions there which were suggested by group theory. Their solution, however, leads rather far from the original and requires new tools. So, in [36] there arose the construction of the triangular product of group representations. This carries in itself the analogue of the properties and the role of the wreath product of groups: cf. [38, 43, 49]. This construction is the natural model for the ▽-construction for representations of semigroups and algebras considered in the present Section. The connections found in Section 3.1.4 between the three constructions may be interpreted as an argument for the advantage of these constructions. The term, but not the notion, of triangular product is borrowed from Eilenberg [66], but its appearance is (according to [66]) connected with Schützenberger (1965). The books of I. B. Menskiĭ [32] and S. Eilenberg [66] point to further paths for developing this theme, important for the applications.
3.2. The arithmetics of varieties of representations of semigroups and algebras

The topic of this Section concerns the arithmetic properties of families of varieties of linear representations (over a field K) of semigroups and algebras, and also the same properties of varieties. In the study of varieties the machinery of the triangular products, as developed in the previous Sections, is applied. The main goal is to prove the "theorem of generators" for varieties of representations of semigroups as well as of algebras. In order to make its formulation more precise we remark that the set of varieties of pairs admits an associative multiplication: the pair (G, Γ) is contained in Θ1 · Θ2 if G admits a Γ-submodule H such that (H, Γ) ∈ Θ1 and (G/H, Γ) ∈ Θ2. Furthermore, let us agree to denote by Var 𝒦 the variety generated by the class of pairs 𝒦. The formula we are interested in is

$$\operatorname{Var}(\mathcal{K}_1 \triangledown \mathcal{K}_2) = \operatorname{Var}\mathcal{K}_1 \cdot \operatorname{Var}\mathcal{K}_2,$$

which holds for arbitrary classes of pairs 𝒦1 and 𝒦2. From this one can derive, in particular, that the semigroup of non-trivial varieties of linear representations of semigroups is a semigroup with unique decomposition into factors. As an application of this fact we prove Theorem 3.37 on the structure of the semigroup of varieties of linear automata. In the case of algebras this leads to a new proof of the theorem of Bergman and Lewin on the freedom of the semigroup of T-ideals in a free associative K-algebra of countable rank. Our approach puts this theorem and its proof on a par with the corresponding results for varieties of representations of groups and semigroups, and gives supplementary information on varieties of algebras, which is hard to obtain in the language of T-ideals (cf. Theorems 3.49, 3.50 and 3.51 below).

Everywhere in this Section, K is a field. We speak here of a pair (G, Γ) if the semigroup (algebra) Γ acts as a semigroup (algebra) of K-endomorphisms on the K-module G; also, in Sections 3.2.5 the acting object is a semigroup, while in Section 3.2.6 it is a K-algebra. Unless the contrary is stated, the word "variety" means a variety of pairs which is distinct from the unit variety (the class of all pairs with zero domain of action) and from the variety of all pairs.

3.2.1. Varieties of linear pairs and automata

1. Here we introduce a connection (of Galois type) whose closed objects are varieties of representations of semigroups and special ideals in suitable algebras.
2. Let X = {x1, x2, …} be a countable set. Let Ψ and Ψ∗ be the free semigroup and the free monoid, respectively, with the elements of X as free generators. Let $u = u(x_{i_1}, \dots, x_{i_k})$ be an arbitrary element of the semigroup ring KΨ∗. By definition, the (special) bi-identity y ◦ u ≡ 0 holds in the pair (G, Γ) if for each specialization $y \to g \in G$, $x_{i_j} \to \gamma_{i_j} \in \Gamma$ the equality $g \circ u(\gamma_{i_1}, \dots, \gamma_{i_k}) = 0$ holds in (G, KΓ∗). Here we can consider also bi-identities of a more general type, parallel to what was done in [35, p. 566–572]. However, in the case when K is a field each such system of bi-identities can easily be replaced by a system of special bi-identities equivalent to it.

To each class of pairs Θ we associate in KΨ∗ the set UΘ of all u ∈ KΨ∗ such that in each pair in Θ the bi-identity y ◦ u ≡ 0 is fulfilled; we call UΘ the indicator of the class Θ. The subset UΘ is a two-sided ideal in KΨ∗, invariant with respect to all endomorphisms of the ring KΨ∗ which are induced by an endomorphism of the monoid Ψ∗; the endomorphisms of the ring KΨ∗ with this property will be called special. Furthermore, we likewise call special those ideals of KΨ∗ which are invariant with respect to all special endomorphisms. Thus to each class of pairs Θ there corresponds a special ideal UΘ in the ring KΨ∗.

On the other hand, let U be an arbitrary subset of KΨ∗. We associate with it a class ΘU according to the following rule: the pair (G, Γ) belongs to the class ΘU if and only if all bi-identities y ◦ u ≡ 0, u ∈ U, are fulfilled in this pair. We remark that if U is a two-sided ideal in KΨ∗, then the class ΘU is closed with respect to subpairs, homomorphic images and Cartesian products of pairs, and is furthermore saturated. Saturation means, by definition, that the class is also closed with respect to complete pre-images of its pairs under right homomorphisms. In other words, the class ΘU is a variety of pairs.

Next, let Θ be a variety of pairs, and U ⊂ KΨ∗ a two-sided special ideal. We have the relations

$$\Theta \to U_{\Theta} \to \Theta_{(U_{\Theta})} = \Theta' \qquad\text{and}\qquad U \to \Theta_{U} \to U_{(\Theta_{U})} = U'.$$

It turns out that one has the equalities U = U′ and Θ = Θ′. Hence, varieties of pairs (representations of semigroups) are in a bijective correspondence with special ideals in KΨ∗.

On the set of classes of linear representations of semigroups one can define a multiplication as follows. By definition, a pair (G, Γ) is contained in the class Θ1 · Θ2 if in G there exists a Γ-submodule H such that (H, Γ) ∈ Θ1 and (G/H, Γ) ∈ Θ2. Varieties of pairs form a semigroup with respect to this multiplication, which we denote by M = M(K). We remark that the indicator of the variety Θ1 · Θ2 is the ideal U2 · U1, where U1 and U2 are the indicators of the varieties Θ1 and Θ2, respectively. We have the following result.

PROPOSITION 3.24 ([17]). The semigroup M(K) of varieties of representations of semigroups is anti-isomorphic to the semigroup of special ideals of the ring KΨ∗.

In the case of a fixed acting semigroup Γ the requirement of saturation in the definition of variety becomes trivial, and in this case the variety of pairs is the Birkhoff class of the corresponding Γ-modules. Here we have the following.

PROPOSITION 3.25 ([17]). The varieties of Γ-modules are in one-to-one correspondence with the two-sided ideals of the semigroup ring KΓ∗.
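The notion of a special bi-identity can be made concrete for a finite acting semigroup. The sketch below is an illustration under assumptions not made in the text: G = Kⁿ is a space of row vectors with integer entries, Γ is a finite list of n×n matrices, and an element u ∈ KΨ∗ is stored as a dictionary sending a word (a tuple of variable indices, with the empty tuple standing for the unit of Ψ∗) to its coefficient. The function names are hypothetical.

```python
from itertools import product

def matmul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def evaluate(u, assignment, n):
    """Value of u in End(K^n) under the specialization x_i -> assignment[i]."""
    total = [[0] * n for _ in range(n)]
    for word, coeff in u.items():
        m = identity(n)  # the empty word is sent to the identity
        for i in word:
            m = matmul(m, assignment[i])
        total = [[t + coeff * e for t, e in zip(tr, mr)]
                 for tr, mr in zip(total, m)]
    return total

def holds_bi_identity(u, gammas, n):
    """True if y o u == 0 in (K^n, Gamma): u vanishes under every specialization."""
    variables = sorted({i for word in u for i in word})
    for choice in product(gammas, repeat=len(variables)):
        value = evaluate(u, dict(zip(variables, choice)), n)
        if any(entry != 0 for row in value for entry in row):
            return False
    return True
```

For example, with Γ consisting of the nilpotent matrix N = [[0, 1], [0, 0]] and the zero matrix, the bi-identity y ◦ x1x2 ≡ 0 holds while y ◦ x1 ≡ 0 does not. This agrees with the remark on indicators of products: Γ acts as zero both on the invariant line spanned by (0, 1) and on the quotient by it.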
3. Let us pass to linear automata, which constitute a partial generalization of the linear systems in [18]. A linear automaton (semigroup Mealy automaton) is a three-sorted algebraic system 𝒜 = (A, Γ, B), where A (the states) and B (the outputs) are K-modules, Γ is the semigroup of input signals, and there are given a K-linear transition operation A ◦ Γ → A and an output operation A ∗ Γ → B with the properties: for all a ∈ A; γ1, γ2 ∈ Γ,

$$a\circ(\gamma_1\gamma_2) = (a\circ\gamma_1)\circ\gamma_2, \qquad a*(\gamma_1\gamma_2) = (a\circ\gamma_1)*\gamma_2.$$

A linear automaton 𝒜′ = (A′, Γ′, B′) is a subautomaton of 𝒜 = (A, Γ, B) if A′ ⊂ A, Γ′ ⊂ Γ, B′ ⊂ B are subobjects of the corresponding algebraic structures, and A′ ◦ Γ′ ⊂ A′ and A′ ∗ Γ′ ⊂ B′. Let there be given two linear automata 𝒜 = (A, Γ, B) and 𝒜′ = (A′, Γ′, B′) and a triple of morphisms σ = (σ1, σ2, σ3), σ1 : A → A′, σ2 : Γ → Γ′, σ3 : B → B′. By definition, σ : 𝒜 → 𝒜′ is a morphism of automata if the following conditions are fulfilled for all a ∈ A, γ ∈ Γ:

$$(a\circ\gamma)^{\sigma_1} = a^{\sigma_1}\circ\gamma^{\sigma_2} \qquad\text{and}\qquad (a*\gamma)^{\sigma_3} = a^{\sigma_1}*\gamma^{\sigma_2}.$$
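The two axioms of a linear automaton can be illustrated by a small model. This is a sketch under assumptions that are not part of the text: states and outputs are row vectors with integer entries, Γ is the free semigroup of nonempty words over a finite alphabet, and each letter x is given a transition matrix and an output matrix; the class name `LinearAutomaton` and its methods are hypothetical.

```python
def vecmat(v, m):
    """Row vector v times matrix m (list of rows)."""
    return [sum(v[k] * m[k][j] for k in range(len(v)))
            for j in range(len(m[0]))]

class LinearAutomaton:
    """Linear Mealy automaton with inputs from a free semigroup of words."""

    def __init__(self, transition, output):
        self.transition = transition  # letter -> n x n matrix (state update)
        self.output = output          # letter -> n x m matrix (output map)

    def step(self, a, word):
        """The transition a o word, applying the letters from left to right."""
        for x in word:
            a = vecmat(a, self.transition[x])
        return a

    def out(self, a, word):
        """The output a * word = (a o prefix) * last letter; word must be nonempty."""
        if not word:
            raise ValueError("the output operation needs a nonempty word")
        return vecmat(self.step(a, word[:-1]), self.output[word[-1]])
```

In this model both axioms hold by construction: `step(a, u + v)` equals `step(step(a, u), v)`, and `out(a, u + v)` equals `out(step(a, u), v)` for nonempty words `u`, `v`.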
It is clear that the submodules Ker σ1 = Aσ ⊂ A, Ker σ3 = Bσ ⊂ B and the kernel congruence Ker σ2 = κ on Γ satisfy the following requirement: for all a, a′ ∈ A, γ, γ′ ∈ Γ,

$$\big((a - a' \in A_{\sigma})\ \&\ (\gamma\,\kappa\,\gamma')\big) \implies \big((a\circ\gamma - a'\circ\gamma' \in A_{\sigma})\ \&\ (a*\gamma - a'*\gamma' \in B_{\sigma})\big).$$

Conversely, if in the components of the linear automaton 𝒜 = (A, Γ, B) there is chosen a family of congruences Λ = (Aσ, κ, Bσ) satisfying the requirements mentioned, then Λ is called a congruence of the automaton 𝒜. In this case the system 𝒜/Λ = (A/Aσ, Γ/κ, B/Bσ), in which all operations on the equivalence classes are induced by the corresponding ones in 𝒜, is a linear automaton. This is, by definition, the factor automaton of 𝒜 by Λ. It is clear that for linear automata one can formulate and prove the homomorphism theorems and Remak's theorem.

The Cartesian product of the family of linear automata 𝒜i = (Ai, Γi, Bi), i ∈ I, is the system $\prod_{i\in I}\mathcal{A}_i = (A, \Gamma, B)$, where $A = \sum^{c}_{i\in I} A_i$ and $B = \sum^{c}_{i\in I} B_i$ are the complete direct sums of the modules Ai and Bi, i ∈ I, respectively, while $\Gamma = \prod_{i\in I}\Gamma_i$ is the Cartesian product of the semigroups Γi, i ∈ I, the operations A ◦ Γ → A and A ∗ Γ → B being defined componentwise.

By definition, a class Θ of linear automata is called a Birkhoff class if it is closed with respect to epimorphic images, subautomata and Cartesian products. Furthermore, we say that a class Θ of linear automata is saturated if together with the automaton 𝒜 = (A, Γ, B) it contains all automata of the form (A, Γ, B′), where B′ ⊃ B, and if for each epimorphism (A, Γ, B) → (A, Σ, B) which is the identity on A and B, it follows from (A, Σ, B) ∈ Θ that (A, Γ, B) ∈ Θ. Saturated Birkhoff classes of linear automata will be called varieties of linear automata.

4. Let 𝒜 = (A, Γ, B) be a linear automaton, accompanied by the maps μ0 : Γ → EndK A and μ∗ : Γ → Hom(A, B), given for all a ∈ A by the formulae

$$\gamma^{\mu_0}(a) = a\circ\gamma \qquad\text{and}\qquad \gamma^{\mu_*}(a) = a*\gamma.$$
We extend them by linearity to maps μ0 : KΓ → EndK A and μ∗ : KΓ → HomK(A, B). Then there arises the automaton 𝒜L = (A, KΓ, B). We say that in the automaton 𝒜 the bi-identity y ◦ u ≡ 0 (the bi-identity z ∗ u ≡ 0) is fulfilled if for the linear extension σ : KΨ∗ → KΓ of an arbitrary homomorphism σ : Ψ → Γ induced by a specialization σ : X → Γ the following condition is satisfied: for all a ∈ A we have in the automaton 𝒜L the relation a ◦ u^σ = 0 (a ∗ u^σ = 0).

To a class of automata Θ we associate in F = KΨ∗ the pair of subsets (UΘ, VΘ), the indicator of the class Θ. Here UΘ (the indicator of states of the class Θ) is the subset of all u ∈ KΨ such that in every automaton in Θ the bi-identity y ◦ u ≡ 0 is fulfilled. Similarly, VΘ (the indicator of outputs for Θ) is the subset of all v ∈ KΨ such that in every automaton in Θ the bi-identity z ∗ v ≡ 0 is fulfilled. One sees readily that UΘ is a two-sided special ideal in F, while VΘ is a left special ideal in F. For the pair (UΘ, VΘ) we have further UΘ · F ⊂ VΘ (compatibility condition). Indeed, for all a ∈ A, u ∈ UΘ, f ∈ F we have

$$a*(uf)^{\sigma} = a*(u^{\sigma}f^{\sigma}) = (a\circ u^{\sigma})*f^{\sigma} = 0*f^{\sigma} = 0,$$

proving the required statement. A pair (U, V), where U is a two-sided special ideal in F and V is a left special ideal in F, will be called an ideal pair.

On the other hand, let (U, V) be any compatible pair of subsets of KΨ; compatibility means that UF ⊂ V. We associate to such a pair a class of automata Θ by the following rule: the automaton 𝒜 = (A, Γ, B) belongs to the class Θ if in 𝒜 all bi-identities y ◦ u ≡ 0, u ∈ U, hold, and likewise all bi-identities z ∗ v ≡ 0, v ∈ V. It is clear that the class of automata Θ = Θ(U,V) obtained in this way is a variety. Let (UΘ, VΘ) be the indicator of Θ, let Ū be the minimal special ideal in F containing the set U, and let V̄ be the minimal special left ideal in F containing V. Then Ū = UΘ and V̄ = VΘ.

The equation Ū = UΘ is proved by the following reasoning. As the indicator UΘ is special it suffices, in view of the inclusion U ⊂ UΘ, to show that UΘ is contained in each special ideal I containing U. To this end, we consider the pair (KΨ∗/I, Ψ) induced by the regular action of Ψ on KΨ∗; from U ⊂ I it follows that this pair is contained in Θ. Let J be the set of all u ∈ KΨ∗ such that the bi-identity y ◦ u ≡ 0 holds in the given pair. We have UΘ ⊂ J. It is not hard to convince oneself that I = J. Indeed, clearly I ⊂ J. Furthermore, for arbitrary g ∈ KΨ∗, v ∈ J we have I = (g + I) ◦ v = gv + I. Taking g = ε ∈ KΨ∗, we deduce that v ∈ I, which implies J ⊂ I. The statement is proved.

Next, we show that V̄ = VΘ. We note that the elements of V̄ are sums of the form $\sum_i f_i v_i$, fi ∈ F, vi ∈ V. Therefore in each automaton 𝒜 = (A, Γ, B) ∈ Θ we have, for all a ∈ A,

$$a*\Big(\sum_i f_i v_i\Big)^{\sigma} = a*\Big(\sum_i f_i^{\sigma} v_i^{\sigma}\Big) = \sum_i (a\circ f_i^{\sigma})*v_i^{\sigma} = \sum_i a_i*v_i^{\sigma} = 0, \qquad a_i = a\circ f_i^{\sigma}.$$

This proves that V̄ ⊂ VΘ. We begin the verification of the converse inclusion with the following observation. From the condition UF ⊂ V it is easy to see that ŪF ⊂ V̄. Therefore, if V′ is any special left ideal containing the set V, we have the linear automaton 𝒜′ = (F/Ū, Ψ, F/V′) with the regular action in the role of ◦ and ∗. From V ⊂ V′ it follows that 𝒜′ ∈ Θ. Regarding the automaton 𝒜′ we prove further that its
indicator W coincides with V′. By definition, W = {w ∈ KΨ | (g + Ū) ∗ w = 0 for all g ∈ F}. Next, for each v ∈ V′ we have (g + Ū) ∗ v = (g + Ū)v = gv + Ūv ⊂ V′, from which it follows that V′ ⊂ W. Conversely, for each w ∈ W we have V′ = (ε + Ū) ∗ w = (ε + Ū)w = w + Ūw, but Ūw ⊂ Ū ⊂ V̄ ⊂ V′. Hence, we find w ∈ V′, and thus W ⊂ V′. Consequently, we have proved the required equality W = V′. Furthermore, we have the following obvious fact: if the ideal pair (UΘ, VΘ) is the indicator of some class of linear automata Θ, while the ideal pair (U′, V′) is the indicator of some concrete automaton in the class Θ, then UΘ ⊂ U′ and VΘ ⊂ V′. From this it follows that

$$V_{\Theta} \subset \bigcap_{V\subset V'} V' = \bar{V}.$$

The equality V̄ = VΘ is proved.

5. This Subsection is devoted to the proof of the following proposition.

PROPOSITION 3.26. The nontrivial varieties of linear semigroup automata are in bijective correspondence with the ideal pairs of the ring F.

We require an auxiliary result.

LEMMA 3.27. If for a linear automaton 𝒜 = (A, Γ, B) all its subautomata of the form 𝒜a = (a ◦ KΓ∗, Γ, a ∗ KΓ) are contained in the variety Θ, then also 𝒜 ∈ Θ.

PROOF. We select for each a ∈ A an automaton (Aa, Γ, Ba) isomorphic to 𝒜a, and form the automaton

$$(A', \Gamma, B') = \Big(\sum^{c}_{a\in A} A_a,\ \Gamma,\ \sum^{c}_{a\in A} B_a\Big).$$

Each element γ ∈ Γ can be viewed as a constant function: γ(a) = γ for all a ∈ A. In this way the semigroup Γ is embedded in the Cartesian power Γ^A, which induces an embedding of automata (A′, Γ, B′) → (A′, Γ^A, B′). But $(A', \Gamma^{A}, B') = \prod_{a\in A}(A_a, \Gamma, B_a) \in \Theta$, hence also (A′, Γ, B′) ∈ Θ. The automaton (A′, Γ, B′) contains the subautomaton

$$\mathcal{A}_A = \Big(\sum^{d}_{a\in A} A_a,\ \Gamma,\ \sum^{d}_{a\in A} B_a\Big),$$

where $\sum^{d}_{a\in A}$ denotes the discrete direct sum of the corresponding modules. However, the class Θ is closed with respect to subautomata, from which it follows that 𝒜A ∈ Θ. The isomorphisms Aa → a ◦ KΓ∗ and Ba → a ∗ KΓ induce epimorphisms

$$\sum^{d}_{a\in A} A_a \to \sum_{a\in A} a\circ K\Gamma^{*} = A \qquad\text{and}\qquad \sum^{d}_{a\in A} B_a \to B'' \subset B,$$

so that we obtain an epimorphism of automata 𝒜A → (A, Γ, B″). In view of 𝒜A ∈ Θ it follows now that (A, Γ, B″) ∈ Θ, and hence also (A, Γ, B) = 𝒜 ∈ Θ.

Proof of Proposition 3.26. Let there be given an arbitrary variety of linear automata Θ and an ideal pair (U, V). We have the correspondences

$$(U, V) \to \Theta_{(U,V)} \to (U_{\Theta_{(U,V)}}, V_{\Theta_{(U,V)}}) \qquad\text{and}\qquad \Theta \to (U_{\Theta}, V_{\Theta}) \to \Theta_{(U_{\Theta}, V_{\Theta})}.$$

In the previous subsection we have actually shown that $U = U_{\Theta_{(U,V)}}$ and $V = V_{\Theta_{(U,V)}}$. We show now the equality $\Theta = \Theta_{(U_{\Theta}, V_{\Theta})}$; for simplicity of notation, we shall denote the right hand side by the symbol Θ′.
It is clear that it suffices to show that Θ′ ⊂ Θ, where Θ′ = Θ_(U_Θ,V_Θ). To this end it is in turn sufficient to prove that, for each automaton 𝔄 = (A, Γ, B) ∈ Θ′, all its subautomata of the form 𝔄_a = (a ∘ KΓ*, Γ, a ∗ KΓ), a ∈ A, lie in Θ; the inclusion 𝔄 ∈ Θ then follows from Lemma 3.27. Next, we prove this statement itself. Let a map τ = (τ1, τ2, τ3) of the automaton (KΓ*, Γ, KΓ) to the automaton 𝔄_a be given by the formula

\[ u^{\tau_1} = a\circ u, \qquad \gamma^{\tau_2} = \gamma, \qquad v^{\tau_3} = a * v \qquad \text{for all } u\in K\Gamma^{*},\ \gamma\in\Gamma,\ v\in K\Gamma. \]
It is clear that τ is an epimorphism of automata. Set Ker τ1 = U and Ker τ3 = V. As 𝔄_a ∈ Θ′ and one has the isomorphism 𝔅 = (KΓ*/U, Γ, KΓ/V) ≅ 𝔄_a, we have 𝔅 ∈ Θ′. In the remaining deductions we shall use the following notation. Let W be an arbitrary special ideal in KΨ* and Γ a semigroup. We denote by W_Γ the set of images (values) of all elements of W under the homomorphisms KΨ* → KΓ* induced by all possible specializations X → Γ. Note that W_Γ is a special ideal in KΓ*. By definition of the class Θ′ all bi-identities y ∘ u ≡ 0, u ∈ U_Θ, as well as all bi-identities z ∗ v ≡ 0, v ∈ V_Θ, are fulfilled in the automaton 𝔅. Hence, the ideal (U_Θ)_Γ in the regular action ∘ annihilates the module KΓ*/U, and we have for each u ∈ (U_Θ)_Γ

\[ U = (\varepsilon + U)\circ u = \varepsilon\circ u + U = u + U, \]

which implies u ∈ U. Thus we have shown that (U_Θ)_Γ ⊂ U. In an analogous manner one proves (V_Θ)_Γ ⊂ V. The relations proved guarantee the existence of an epimorphism of the automaton (KΓ*/(U_Θ)_Γ, Γ, KΓ/(V_Θ)_Γ), contained in Θ, onto the automaton 𝔅. Therefore 𝔅 ∈ Θ, and hence, in view of 𝔄_a ≅ 𝔅, it follows that 𝔄_a ∈ Θ. The proof of the proposition formulated at the beginning of this subsection is complete.

3.2.2. Technical results

1. First of all, we mention the following result on the triangular product of pairs, which is going to be used.

PROPOSITION 3.28. For arbitrary subpairs (A′, Σ1′) and (B′, Σ2′) of (A, Σ1) and (B, Σ2) respectively, the pair (A′, Σ1′) ▽ (B′, Σ2′) belongs to the variety Var((A, Σ1) ▽ (B, Σ2)).

PROOF. Let us introduce the notation (G, Γ) = (A, Σ1) ▽ (B, Σ2). The statement will be established in several steps. First, we note that the embeddings Σi′ → Σi, i = 1, 2, induce in an obvious way the embedding of pairs (A, Σ1′) ▽ (B, Σ2′) → (G, Γ). Let Γ′ be the acting semigroup of the pair (A, Σ1′) ▽ (B, Σ2′). Set H = A + B′. Clearly, H ∩ B = B′, while an immediate verification shows that H ∘ Γ′ ⊂ H. Therefore we have the epimorphism of pairs (A ⊕ B′, Γ′) ↠ (A, Σ1′) ▽ (B′, Σ2′); cf. Proposition 3.7. The acting semigroup of the pair to the right of the arrow will be denoted Γ″. We remark further that A′ is a Γ″-submodule of A ⊕ B′. Let us consider the pair (A′, Σ1′) ▽ (B′, Σ2′) and distinguish in the semigroup Φ = Hom⁺(B′, A) the subsemigroup Φ′ of all elements ϕ such that Im ϕ ⊂ A′. Clearly, we have the natural isomorphism Hom⁺(B′, A′) → Φ′, which again induces an isomorphism of pairs (A′, Σ1′) ▽ (B′, Σ2′) → (A′ ⊕ B′, Φ′ ⋋ Σ1′ × Σ2′), from which, in view of the fact that (A′ ⊕ B′, Φ′ ⋋ Σ1′ × Σ2′) is a subpair of (A, Σ1′) ▽ (B′, Σ2′), it follows that
there exists an embedding (A′, Σ1′) ▽ (B′, Σ2′) → (A, Σ1′) ▽ (B′, Σ2′). In view of the properties of Var(G, Γ), the constructed morphism of pairs gives the inclusion required in the proposition.

2. Let X = {x1, x2, …} be a countable set, while Ψ and Ψ* are the free semigroup and the free monoid respectively with the elements of X as free generators. Furthermore, let Θ be a variety of pairs and U the corresponding special ideal in KΨ*. The pair (KΨ*/U, Ψ), apparently, is a cyclic pair, and, as is readily seen, free in the variety Θ. It is easy to see that Θ = Var(KΨ*/U, Ψ).

PROPOSITION 3.29. Let (A, Σ) be an arbitrary pair and (R, Ψ) a free pair in the variety Θ2. Then Var((A, Σ) ▽ (R, Ψ)) = Var(A, Σ) · Θ2.

PROOF. Let us denote Θ1 = Var(A, Σ) and Θ3 = Var((A, Σ) ▽ (R, Ψ)). Using Proposition 3.4 and the Corollary to Proposition 3.9 together with Proposition 3.28 just proved, we deduce that Θ1 · Θ2 ⊂ Θ3. On the other hand, we have Θ3 = Var((A, Σ) ▽ (R, Ψ)) ⊂ Θ1 · Θ2. Hence Θ3 = Θ1 · Θ2.

3. The results of the preceding subsection widen our understanding of the structure of the semigroup of varieties of representations of semigroups. We can at once establish a useful property of this semigroup: it is a semigroup with two-sided cancellation. We formulate this as the following theorem.

THEOREM 3.30. Let Θ, Θ1, Θ2 be arbitrary varieties. The following implications are true:
(a) Θ1 · Θ = Θ2 · Θ ⟹ Θ1 = Θ2;
(b) Θ · Θ1 = Θ · Θ2 ⟹ Θ1 = Θ2.

Let the proof be preceded by two remarks on special ideals in the ring KΨ*. First, an immediate check of the definitions shows that each special ideal U in KΨ* is contained in the fundamental ideal Δ of the semigroup ring KΨ*. Second, for the semigroup ring KΨ*, regarded as a ring of polynomials in noncommuting variables from X, we have the relation ∩_{n=1}^∞ Δ^n = 0. This allows us to introduce the notion of the weight v(U) of an ideal U, defining it as the index κ such that U ⊂ Δ^κ and U ⊄ Δ^{κ+1}.
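The weight can be made concrete for a single element. The following is a minimal sketch, under the assumption (not stated explicitly in the text) that the fundamental ideal Δ is the augmentation ideal of KΨ*, that is, the kernel of the map sending every generator to 1. Under the substitution x_i = 1 + y_i the ideal Δ^κ goes over into the span of words of length at least κ in the y_i, so the largest κ with u ∈ Δ^κ is the lowest degree of u after the substitution. The dictionary encoding of polynomials and the function names are illustrative choices only.

```python
from itertools import combinations

def substitute_one_plus_y(poly):
    """Apply x_i -> 1 + y_i to a noncommutative polynomial.

    A polynomial is a dict {word: coefficient}; a word is a tuple of
    variable indices. Expanding (1 + y_a)(1 + y_b)... produces one term
    for every choice of positions kept in the word.
    """
    out = {}
    for word, coeff in poly.items():
        for r in range(len(word) + 1):
            for positions in combinations(range(len(word)), r):
                sub = tuple(word[p] for p in positions)
                out[sub] = out.get(sub, 0) + coeff
    return {w: c for w, c in out.items() if c != 0}

def weight(poly):
    """Largest kappa with poly in Delta^kappa (Delta assumed to be the augmentation ideal)."""
    shifted = substitute_one_plus_y(poly)
    if not shifted:
        raise ValueError("the zero polynomial lies in every power of Delta")
    return min(len(word) for word in shifted)

def multiply(p, q):
    """Product of two noncommutative polynomials (concatenation of words)."""
    out = {}
    for w1, c1 in p.items():
        for w2, c2 in q.items():
            out[w1 + w2] = out.get(w1 + w2, 0) + c1 * c2
    return {w: c for w, c in out.items() if c != 0}

x1_minus_1 = {(1,): 1, (): -1}            # x1 - 1
commutator = {(1, 2): 1, (2, 1): -1}      # x1 x2 - x2 x1
print(weight(x1_minus_1), weight(commutator),
      weight(multiply(x1_minus_1, commutator)))
```

In this sketch the weight of the product equals the sum of the weights of the factors, which is the behaviour the comparison of weights in the proof of Theorem 3.30 relies on; the weight of an ideal is the minimum of these values over its elements.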
It is easy to see that if a special ideal U splits into the product of two other proper special ideals, then the weight of each factor is less than the weight of U itself.

Proof of Theorem 3.30. (a) We must show that Θ1 ⊂ Θ2 and Θ2 ⊂ Θ1. Let us assume that, for instance, Θ1 ⊄ Θ2. Choose an arbitrary pair (A, Σ) generating the variety Θ1 and let (R, Ψ) be a free pair in Θ. Then, in view of Proposition 3.29, the pair (G, Γ) = (A, Σ) ▽ (R, Ψ) generates the variety Θ1 · Θ = Θ2 · Θ. Let Θ2′ be the radical of the variety Θ2. Let us consider the submodule H = Θ2′(A, Σ) ⊂ A. If we have H = A, then (A, Σ) ∈ Θ2, hence Θ1 = Var(A, Σ) ⊂ Θ2, which contradicts the assumption. Consequently, we must have H < A and we can apply Proposition 3.6; as a result we obtain the relation H = Θ2′(G, Γ), which together with (G, Γ) ∈ Θ2 Θ gives (G/H, Γ) ∈ Θ. The natural epimorphism (A, Σ) ↠ (A/H, Σ)
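induces an epimorphism (G, Γ) ↠ (A/H, Σ) ▽ (R, Ψ); cf. Proposition 3.4. The submodule H lies, in view of the construction, in the kernel of this epimorphism. But then we have the following commutative diagram of epimorphisms:

\[ (G,\Gamma) \twoheadrightarrow (G/H,\Gamma) \twoheadrightarrow (A/H,\Sigma)\,\triangledown\,(R,\Psi), \]

in which the composite of the two arrows is the epimorphism (G, Γ) ↠ (A/H, Σ) ▽ (R, Ψ) just mentioned.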
Therefore it follows from (G/H, Γ) ∈ Θ that (A/H, Σ) ▽ (R, Ψ) ∈ Θ, and from this again we find (in view of Proposition 3.29) that Θ ⊂ Var(A/H, Σ) · Θ = Var(A/H, Σ) · Var(R, Ψ) = Var((A/H, Σ) ▽ (R, Ψ)) ⊂ Θ, i.e. Θ = Var(A/H, Σ) · Θ. Next, let us show that the last equality leads to a contradiction. To this end we notice that in view of H < A the variety Var(A/H, Σ) is not the identity, and that it follows from Var(A/H, Σ) ⊂ Θ that it cannot be the variety of all pairs. Consequently, to the variety Var(A/H, Σ) there corresponds in KΨ* a proper special ideal U2. The special ideal corresponding to Θ shall be denoted U1. In view of Proposition 3.24 we have U1 = U1 U2. Comparison of the weights on the left-hand side and the right-hand side of this equality gives v(U1) = v(U1 U2) ≥ v(U1) + v(U2) > v(U1). This is a contradiction. Hence, it is true that Θ1 ⊂ Θ2. As the varieties Θ1 and Θ2 enter this argument in a symmetric fashion, we obtain analogously Θ2 ⊂ Θ1.

(b) Let us assume that Θ1 ⊄ Θ2. Take any pair (A, Σ) generating the variety Θ, and let (R, Ψ) be a free pair in Θ1. According to Proposition 3.29, the pair (G, Γ) := (A, Σ) ▽ (R, Ψ) then generates the variety ΘΘ1 = ΘΘ2. Let *Θ2 be the verbal of Θ2. Consider the submodule R0 = *Θ2(R, Ψ). If R0 = (0), then (R, Ψ) ∈ Θ2. Then Θ1 = Var(R, Ψ) ⊂ Θ2, contradicting the assumption. Hence R0 > (0). Using Proposition 3.6 we obtain H := *Θ2(G, Γ) = A + R0. From (G, Γ) ∈ ΘΘ2 it follows now that (H, Γ) ∈ Θ. We have, however, the natural right epimorphism (H, Γ) → (A, Σ) ▽ (R0, Ψ); so the pair to the right of the arrow also belongs to the variety Θ. Furthermore, note that the free cyclic pair (B, Ψ) in the variety Var(R0, Ψ) is contained in VSC(R0, Ψ); the proof of this fact is obtained by carrying over Lemma 1.3 in [49], word by word, to the semigroup case. Next, according to Proposition 3.11 the pair (A, Σ) ▽ (R0, Ψ)^I can be embedded into ((A, Σ) ▽ (R0, Ψ))^I. From the above-mentioned relation (B, Ψ) ∈ VSC(R0, Ψ) follows the existence of a subpair (B, Σ2) in (R0, Ψ)^I such that there exists a right epimorphism μ : (B, Ψ) ↠ (B, Σ2). The map μ, apparently, induces an epimorphism of pairs (A, Σ) ▽ (B, Ψ) ↠ (A, Σ) ▽ (B, Σ2),
while it follows from the relations (B, Σ2) ⊂ (R0, Ψ)^I and (A, Σ) ▽ (R0, Ψ)^I ∈ Θ that (A, Σ) ▽ (B, Σ2) ∈ Θ; cf. Proposition 3.28. Hence (A, Σ) ▽ (B, Ψ) ∈ Θ. Let us use Proposition 3.29; as in (a) we deduce that Θ = Θ · Var(B, Ψ). Assume that U1 and U2 are the special ideals corresponding to the varieties Θ and Var(B, Ψ), respectively. We obtain the equality U1 = U2 · U1, which, however, is a contradiction, as a comparison of the weights on the left and on the right shows. Consequently, Θ1 ⊂ Θ2. The roles of Θ1 and Θ2 being symmetric, we derive in an analogous fashion Θ2 ⊂ Θ1. This completes the proof of the theorem.

4. Let us now pass to the presentation of a technical result, which will be necessary in the proof of the Theorem of generators. Namely, we study in detail the form of the bi-identities satisfied by the triangular products of pairs. Let there be given two arbitrary pairs (A, Σ1) and (B, Σ2) and let (G, Γ) be their triangular product. Furthermore, select arbitrary elements γi ∈ Γ, γi = (ϕi, σi, σi′), where ϕi ∈ Φ = Hom(B, A), σi ∈ Σ1, σi′ ∈ Σ2, i = 1, …, n; and let u = u(x1, …, xn) be some fixed element in the semigroup algebra KΨ*. As a first step in this direction let us compute the element u(γ1, …, γn) ∈ KΓ*. It is easy to understand that in the basic case when u = f(x1, …, xn) ∈ KΨ* is a single word, the element f(γ1, …, γn) has the form

\[ f(\gamma_1,\dots,\gamma_n) = \Bigl( \sum_{i=1}^{n}\sum_{j=1}^{m_i} r_{ij}(\sigma'_1,\dots,\sigma'_n)\cdot\varphi_i\cdot s_{ij}(\sigma_1,\dots,\sigma_n),\ f(\sigma_1,\dots,\sigma_n),\ f(\sigma'_1,\dots,\sigma'_n) \Bigr); \tag{6} \]

here m1 + ··· + mn is the length of the word f ∈ Ψ*, while each of the elements rij(x1, …, xn) and sij(x1, …, xn) is defined by the word f and the pair of indices i, j only. The details of the necessary verification are left to the reader. Formula (6) may be written more compactly by taking account of the following. Let us set out from the fact that Φ is an additive Abelian group: therefore, together with the elements of Σk, there act on Φ also the elements of Z0Σk ⊂ ZΣk*, k = 1, 2, where we denote by Z0 the set of non-negative integers. Setting

\[ \bar r_i = \sum_{j=1}^{m_i} r_{ij}(\sigma'_1,\dots,\sigma'_n) \qquad\text{and}\qquad \bar s_i = \sum_{j=1}^{m_i} s_{ij}(\sigma_1,\dots,\sigma_n), \]

we get the following formula:

\[ f(\gamma_1,\dots,\gamma_n) = \Bigl( \sum_{i=1}^{n} \bar r_i\cdot\varphi_i\cdot\bar s_i,\ f(\sigma_1,\dots,\sigma_n),\ f(\sigma'_1,\dots,\sigma'_n) \Bigr). \]
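The expanded form (6) can be checked numerically in a small matrix model. The sketch below rests on assumptions that are not part of the text: A and B are spaces of row vectors, elements of Σ1 and Σ2 act on the right as square matrices, and ϕ ∈ Hom(B, A) is a q × p matrix. An element γ = (ϕ, σ, σ′) is then multiplied by the rule (ϕ, σ1, σ2)(ϕ′, σ1′, σ2′) = (σ2·ϕ′ + ϕ·σ1′, σ1σ1′, σ2σ2′), and for a word f the first component of f(γ1, …, γn) should equal the sum, over the positions of the word, of (prefix evaluated in Σ2) · ϕ · (suffix evaluated in Σ1). All function names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def mat_add(X, Y):
    """Entrywise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def tri_mul(g, h):
    """(phi, s1, s2)(phi', s1', s2') = (s2*phi' + phi*s1', s1*s1', s2*s2')."""
    phi, s1, s2 = g
    phi2, t1, t2 = h
    return (mat_add(mat_mul(s2, phi2), mat_mul(phi, t1)),
            mat_mul(s1, t1), mat_mul(s2, t2))

def eval_word(word, gammas):
    """Evaluate the word x_{i1} ... x_{im} on the triples gammas[i]."""
    result = gammas[word[0]]
    for i in word[1:]:
        result = tri_mul(result, gammas[i])
    return result

def phi_by_formula(word, gammas):
    """First component computed as in (6): sum of prefix * phi * suffix."""
    _, s1_first, s2_first = gammas[word[0]]
    p, q = len(s1_first), len(s2_first)
    total = [[0] * p for _ in range(q)]
    for t, i in enumerate(word):
        prefix = identity(q)                 # letters before position t, in Sigma_2
        for j in word[:t]:
            prefix = mat_mul(prefix, gammas[j][2])
        suffix = identity(p)                 # letters after position t, in Sigma_1
        for j in word[t + 1:]:
            suffix = mat_mul(suffix, gammas[j][1])
        total = mat_add(total, mat_mul(mat_mul(prefix, gammas[i][0]), suffix))
    return total

# Arbitrary illustrative data: gamma_i = (phi_i, sigma_i, sigma_i')
gammas = {
    1: ([[1, 0], [2, 1]], [[1, 1], [0, 1]], [[2, 0], [1, 1]]),
    2: ([[0, 1], [1, 3]], [[0, 1], [1, 0]], [[1, 2], [0, 1]]),
}
word = (1, 2, 1, 1)
product = eval_word(word, gammas)
print(product[0] == phi_by_formula(word, gammas))
```

The comparison only tests the double sum in (6); the second and third components of the product are simply the word evaluated in Σ1 and in Σ2.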
A renewed attempt allows us now also to settle the general case. Indeed, let there be given a fixed element

\[ u = u(x_1,\dots,x_n) = \sum_k \lambda_k f_k(x_1,\dots,x_n), \qquad \lambda_k\in K, \]

in the semigroup algebra KΨ*, and elements γi ∈ Γ as before. It is not hard to see that there exist in Z0Ψ* elements r̄ik(x1, …, xn) and s̄ik(x1, …, xn) such that their values r̄ik = r̄ik(σ1′, …, σn′) and s̄ik = s̄ik(σ1, …, σn) allow us to write the element u(γ1, …, γn) ∈ KΓ* in the form

\[ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k f_k(\gamma_1,\dots,\gamma_n) = \Bigl( \sum_k \lambda_k \Bigl( \sum_{i=1}^{n} \bar r_{ik}\cdot\varphi_i\cdot\bar s_{ik} \Bigr),\ \sum_k \lambda_k f_k(\sigma_1,\dots,\sigma_n),\ \sum_k \lambda_k f_k(\sigma'_1,\dots,\sigma'_n) \Bigr). \]

Below we denote the element Σ_{i=1}^{n} r̄ik · ϕi · s̄ik by ψk; the expression for u(γ1, …, γn) can then be written more concisely as

\[ u(\gamma_1,\dots,\gamma_n) = \Bigl( \sum_k \lambda_k\psi_k,\ u(\sigma_1,\dots,\sigma_n),\ u(\sigma'_1,\dots,\sigma'_n) \Bigr). \tag{7} \]
Let us make explicit how the element u(γ1, …, γn) acts on G. To this end we apply it to the element g = a + b, a ∈ A, b ∈ B. The action of elements in the ring KΓ* on G is the linear extension of the action of the elements of Γ*; therefore, using (7), we see that

\[ g\circ u(\gamma_1,\dots,\gamma_n) = (a+b)\circ\Bigl( \sum_k \lambda_k\psi_k,\ u(\sigma_1,\dots,\sigma_n),\ u(\sigma'_1,\dots,\sigma'_n) \Bigr) = a\circ u(\sigma_1,\dots,\sigma_n) + \sum_k \lambda_k b^{\psi_k} + b\circ u(\sigma'_1,\dots,\sigma'_n). \]
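The action just computed can be illustrated in the same kind of matrix model. This is only a sketch under assumptions of our own: a ∈ A and b ∈ B are row vectors, a triple γ = (ϕ, σ, σ′) acts by (a + b) ∘ γ = (a∘σ + b^ϕ) + b∘σ′, and u is a dictionary mapping words to coefficients. The example takes u = x1x2 − x2x1 with commuting diagonal matrices in both factors, so that a ∘ u(σ) and b ∘ u(σ′) vanish and only the term Σ λk b^{ψk} survives. The names `act`, `act_u` and the sample data are hypothetical.

```python
def vec_mat(v, M):
    """Row vector times matrix."""
    return [sum(v[k] * M[k][j] for k in range(len(v))) for j in range(len(M[0]))]

def vec_add(u, v):
    return [x + y for x, y in zip(u, v)]

def act(g, gamma):
    """(a + b) o (phi, s1, s2) = (a*s1 + b*phi) + b*s2, stored as the pair (A-part, B-part)."""
    a, b = g
    phi, s1, s2 = gamma
    return (vec_add(vec_mat(a, s1), vec_mat(b, phi)), vec_mat(b, s2))

def act_word(g, word, gammas):
    """Act successively by the letters of the word."""
    for i in word:
        g = act(g, gammas[i])
    return g

def act_u(g, u, gammas):
    """Linear extension: u is a dict {word: coefficient}."""
    a_out = [0] * len(g[0])
    b_out = [0] * len(g[1])
    for word, lam in u.items():
        a_w, b_w = act_word(g, word, gammas)
        a_out = vec_add(a_out, [lam * x for x in a_w])
        b_out = vec_add(b_out, [lam * x for x in b_w])
    return (a_out, b_out)

# A = K^2, B = K^1; the sigma-matrices commute, so y o (x1 x2 - x2 x1) = 0 in both factors
gammas = {
    1: ([[1, 0]], [[2, 0], [0, 3]], [[2]]),
    2: ([[0, 1]], [[5, 0], [0, 7]], [[3]]),
}
u = {(1, 2): 1, (2, 1): -1}

result = act_u(([1, 1], [1]), u, gammas)
print(result)
```

With this data ψ1 = σ1′·ϕ2 + ϕ1·σ2 and ψ2 = σ2′·ϕ1 + ϕ2·σ1, and the value obtained depends on b only, in agreement with the displayed formula.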
After these preparatory calculations let us pass to the main issue of this subsection: the form of the bi-identities in (G, Γ) = (A, Σ1) ▽ (B, Σ2). More exactly, we seek the form of the element g ∘ u(γ1, …, γn) under the assumption that in both factors of the triangular product the bi-identity y ∘ u ≡ 0 is satisfied. From this assumption it follows, in particular, that

\[ a\circ u(\sigma_1,\dots,\sigma_n) = 0 \qquad\text{and}\qquad b\circ u(\sigma'_1,\dots,\sigma'_n) = 0. \]

Thus, we are here led to the formula

\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k b^{\psi_k}. \]

The terms on the right and the left side of this equation can be processed further. Assume that we have

\[ \bar r_{ik}(x_1,\dots,x_n) = \sum_p n_{ikp}\, v_{ikp}(x_1,\dots,x_n) \qquad\text{and}\qquad \bar s_{ik}(x_1,\dots,x_n) = \sum_q m_{ikq}\, w_{ikq}(x_1,\dots,x_n), \]

where all n_ikp, m_ikq ∈ Z0 and all v_ikp(x1, …, xn) and all w_ikq(x1, …, xn) belong to the monoid Ψ*. For simplicity we write

\[ \bar v_{ikp} = v_{ikp}(\sigma'_1,\dots,\sigma'_n) \qquad\text{and}\qquad \bar w_{ikq} = w_{ikq}(\sigma_1,\dots,\sigma_n); \]

in this notation we have

\[ \bar r_{ik} = \sum_p n_{ikp}\,\bar v_{ikp} \qquad\text{and}\qquad \bar s_{ik} = \sum_q m_{ikq}\,\bar w_{ikq}. \]

In this way we obtain the element in a form of interest to us:

\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_k \lambda_k b^{\psi_k} = \sum_k \lambda_k b^{(\sum_i \bar r_{ik}\cdot\varphi_i\cdot\bar s_{ik})} = \sum_{k,i,p,q} (n_{ikp}\cdot m_{ikq}\cdot\lambda_k)\, b^{\bar v_{ikp}\cdot\varphi_i\cdot\bar w_{ikq}}. \tag{8} \]
3.2.3. The fundamental lemma

1. Let K be an arbitrary class of pairs, DK the class of all direct products of pairs in K, Θ = Var K and (A, Σ) a free pair in Θ. Under these assumptions we have the following.

LEMMA 3.31. If there is given in A a finite linearly independent system of elements a1, …, an, then there exists a pair (B, Σ′) ∈ DK and a homomorphism of pairs μ : (A, Σ) → (B, Σ′) such that the elements a1^μ, …, an^μ are linearly independent in B.

PROOF. The varieties of semigroup pairs are in bijective correspondence with special ideals in the ring KΨ*; cf. Section 3.1.2. Thus, if the variety Θ corresponds to the special ideal U, then the pair (KΨ*/U, Ψ) is a free cyclic pair in Θ; therefore the given pair (A, Σ) is a subpair of the Cartesian power of the pair (KΨ*/U, Ψ). However, using Remak's theorem, we readily see that in K there are pairs (Ai, Σi), i ∈ I, such that there exists a right homomorphism ν of the pair (A, Σ) into the pair ∏_{i∈I} (Ai, Σi); this pair will be denoted (Ā, Σ̄). As ν is the identity map on A, the elements āi = ai^ν, i = 1, …, n, must be linearly independent in Ā. Furthermore, for any set of indices F ⊂ I, let π_F be the natural projection of Ā onto the Cartesian sum of the subspaces Ai whose index i lies in F. Moreover, let Ā(F) be the kernel of π_F; apparently, for any subsets F′, F″ ⊂ I, we have the relation Ā(F′) ∩ Ā(F″) = Ā(F′ ∪ F″). Finally, let V be the linear hull of the vectors ā1, …, ān. Let us show the existence of a finite subset of I such that the projection corresponding to it induces a monomorphism on V. Indeed, we observe that one has the equalities

\[ \bigcap_{F\subset I} \bar A(F) = \bar A\Bigl(\bigcup_{F\subset I} F\Bigr) = \bar A(I) = (0). \]

From this it follows that

\[ 0 = V\cap\bigcap_{F\subset I} \bar A(F) = \bigcap_{F\subset I} \bigl(V\cap\bar A(F)\bigr). \]

As V is a finite dimensional space, it follows from this that there exists a finite subset F* ⊂ I such that V ∩ Ā(F*) = 0. It is not hard to see that the map π_{F*} is a monomorphism on V. Similarly to what was done above for the domain of action, we define a projection π_F : Σ̄ → ∏_{i∈F} Σi for the acting semigroups; in this way we get a projection of pairs π_F : (Ā, Σ̄) → ∏_{i∈F} (Ai, Σi). Let us set (B, Σ′) = ∏_{i∈F*} (Ai, Σi). It is clear that
(B, Σ′) ∈ DK and that the homomorphism μ = νπ_{F*} : (A, Σ) → (B, Σ′) satisfies the desired requirement.

2. We are in a position to formulate and prove a fundamental lemma en route to the Theorem on a generating pair.

LEMMA 3.32. Let the variety Θ1 be generated by the single pair (A, Σ1) and assume that the variety Θ2 is generated by an arbitrary class of pairs K2, subject to the condition DK2 = K2. Then Θ1 Θ2 = Var((A, Σ1) ▽ K2).

PROOF. Clearly, we have the inclusion Var((A, Σ1) ▽ K2) ⊂ Θ1 Θ2. However, if (R, Ψ) is a free pair in Θ2, then we have by virtue of Proposition 3.29 Θ1 Θ2 = Var(A, Σ1) Var(R, Ψ). Consequently, every bi-identity of the pair (A, Σ1) ▽ (R, Ψ) is also true in the pairs (A, Σ1) ▽ (B, Σ2), where (B, Σ2) ∈ K2. All this leads thus to the verification of the following statement: if a certain bi-identity y ∘ u ≡ 0 is not fulfilled in the pair (G̃, Γ̃) = (A, Σ1) ▽ (R, Ψ), then there exists a pair (B, Σ2) ∈ K2 such that this bi-identity is not fulfilled in the pair (A, Σ1) ▽ (B, Σ2) either.

First, we may assume that in both varieties Θ1 and Θ2 the bi-identity y ∘ u ≡ 0 is fulfilled. Indeed, if the bi-identity y ∘ u ≡ 0 is not fulfilled in Θ2, then there exists a pair (B, Σ2) ∈ K2 in which the said bi-identity is not fulfilled. But then this bi-identity cannot be fulfilled in (A, Σ1) ▽ (B, Σ2) either, and our assertion is proved. If, however, the bi-identity y ∘ u ≡ 0 is not fulfilled in Θ1, then it can hold neither in (A, Σ1) nor in (A, Σ1) ▽ (B, Σ2), for any choice of (B, Σ2) ∈ K2, and in this case all is proved anew.

Using this observation we assume that the bi-identity y ∘ u ≡ 0 is not fulfilled in (G̃, Γ̃), but holds true in Θ1 and Θ2. This means that there exist g* ∈ G̃ and γ1*, …, γn* ∈ Γ̃ such that g* ∘ u(γ1*, …, γn*) ≠ 0. In view of this condition, if g* = a + h, where a ∈ A, h ∈ R, and γi* = (ϕi*, σi, σi′), where ϕi* ∈ Hom⁺(R, A), σi ∈ Σ1, σi′ ∈ Ψ, i = 1, …, n, then

\[ a\circ u(\sigma_1,\dots,\sigma_n) = 0 \qquad\text{and}\qquad h\circ u(\sigma'_1,\dots,\sigma'_n) = 0. \]

With the aid of formula (8) of the previous subsection we have

\[ g^{*}\circ u(\gamma^{*}_1,\dots,\gamma^{*}_n) = \sum_{i,k,p,q} (n_{ikp}\cdot m_{ikq}\cdot\lambda_k)\, h^{\bar v_{ikp}\cdot\varphi^{*}_i\cdot\bar w_{ikq}}. \]

Let V be the linear hull in R of the finite subset {h ∘ v̄ikp : i, k, p}. This is a finite dimensional subspace in R and so we may apply Lemma 3.31. In view of this result there exists a pair (B, Σ2) ∈ K2 and a homomorphism μ : (R, Ψ) → (B, Σ2) which is a monomorphism on V. It turns out that in the pair (G, Γ) = (A, Σ1) ▽ (B, Σ2) the bi-identity y ∘ u ≡ 0 is not fulfilled. In order to prove this let us consider a K-morphism ν : B → R which is inverse to μ on V^μ and defined in an arbitrary, but fixed manner outside V^μ; such a morphism can be defined in a corresponding way on a basis of B obtained by complementing a basis of V^μ ⊂ B. Moreover, we put
(1) ϕi := νϕi*, i = 1, …, n; it is clear that ϕi ∈ Hom⁺(B, A). We further remark that

\[ [(h\circ\bar v_{ikp})^{\mu}]^{\varphi_i} = [(h\circ\bar v_{ikp})^{\mu\nu}]^{\varphi^{*}_i} = (h\circ\bar v_{ikp})^{\varphi^{*}_i}; \]

(2) b := h^μ, and g := a + b ∈ A ⊕ B;
(3) τi := (σi′)^μ ∈ Σ2, i = 1, …, n;
(4) γi := (ϕi, σi, τi) ∈ Hom⁺(B, A) ⋋ Σ1 × Σ2;
(5) ṽikp := vikp(τ1, …, τn) = [vikp(σ1′, …, σn′)]^μ ∈ Σ2.

According to formula (8) we have in this notation

\[ g\circ u(\gamma_1,\dots,\gamma_n) = \sum_{i,k,p,q} (n_{ikp}\, m_{ikq}\, \lambda_k)\, b^{\tilde v_{ikp}\cdot\varphi_i\cdot\bar w_{ikq}}. \]
The sum to the right in this equation admits a not very difficult transformation (which, however, is hard to write down and so will be omitted) showing that it equals g* ∘ u(γ1*, …, γn*). However, g* ∘ u(γ1*, …, γn*) ≠ 0, so we conclude that g ∘ u(γ1, …, γn) ≠ 0. Hence, y ∘ u ≡ 0 cannot hold in (G, Γ). Thereby our statement is proved, and so also Lemma 3.32.

3.2.4. The theorem on generating representations of semigroups

This subsection is devoted to the proof of one of the fundamental results of this section, Theorem 3.33 below. It is a key result and admits a series of consequences for the structure of classes of linear representations of semigroups, and also gives means for the study of interesting individual representations.

THEOREM 3.33. Let K1 and K2 be any two classes of linear representations (over a field K) of semigroups. The following formula holds true: Var K1 · Var K2 = Var(K1 ▽ K2).

PROOF. Let us introduce the notations Θ = Var(K1 ▽ K2) and Θi = Var Ki, i = 1, 2. As was shown in Paragraph 3 of Section 3.1.2, for arbitrary pairs (A, Σ1) ∈ Θ1 and (B, Σ2) ∈ Θ2 it holds that (A, Σ1) ▽ (B, Σ2) ∈ Θ1 Θ2. Therefore we have the inclusion K1 ▽ K2 ⊂ Θ1 Θ2, which also implies that Θ ⊂ Θ1 Θ2. It remains to prove the converse inclusion Θ1 Θ2 ⊂ Θ. The corresponding reasoning will be given in two steps. In the first of them we assume temporarily that one can remove the restriction DK2 = K2 in Lemma 3.32, and show that this can be used in the proof at hand. In the second step we show that this refined version of Lemma 3.32, indeed, holds true.

The first step is a reduction. Let (A, Σ1) be a faithful pair generating the variety Θ1; as (A, Σ1) we may take, for instance, the "faithfulling" of a free cyclic pair in Θ1. One sees readily that under these assumptions there exists a family of pairs (Ai, Σi) ∈ K1, i ∈ I, and a subpair (A′, Σ′) in the Cartesian product (A, Σ) = ∏_{i∈I} (Ai, Σi) such that there is an epimorphism of pairs (A′, Σ′) ↠ (A, Σ1). Next, let us fix an arbitrary pair (B, Σ2) in the class K2. According to Proposition 3.9 there exists an embedding

\[ (A,\Sigma)\,\triangledown\,(B,\Sigma_2) \to \prod_{i\in I} \bigl((A_i,\Sigma_i)\,\triangledown\,(B,\Sigma_2)\bigr), \]
where the pair to the right of the arrow, clearly, lies in Θ. But then the same is also true for the pair (A, Σ) ▽ (B, Σ2) and also for the pair (A′, Σ′) ▽ (B, Σ2); cf. Proposition 3.28. Finally, we use Proposition 3.4: the epimorphism (A′, Σ′) ↠ (A, Σ1) guarantees the relation (A, Σ1) ▽ (B, Σ2) ∈ Θ. To sum up, we see that (A, Σ1) ▽ K2 ⊂ Θ, from which, on the basis of our above assumption and Lemma 3.32, the relation Θ1 Θ2 ⊂ Θ follows at once.

The second step is the refinement of Lemma 3.32. Assume that the class K1 consists of the single pair (A, Σ), the class K2 being arbitrary. Furthermore, let us denote K̄2 = DK2, Θ = Var(K1 ▽ K2) and Θi = Var Ki, i = 1, 2. It follows immediately from Lemma 3.32 that Var(K1 ▽ K̄2) = Θ1 Θ2. Therefore we have Θ ⊂ Θ1 Θ2. Let us show that we have also the converse inclusion Θ1 Θ2 ⊂ Θ.

Let (Bi, Σi), i = 1, …, n, be an arbitrary finite family in the class K2; (B̄, Σ̄) = ∏_{i=1}^{n} (Bi, Σi); G = A + B̄; Γ = Hom(B̄, A) ⋋ Σ × Σ̄. An easy verification shows that the subspaces A + Bi ⊂ G, i = 1, …, n, are Γ-invariant; thus we have the pairs (A + Bi, Γ). We show that all these pairs lie in the variety Θ. Indeed, the pairs (A, Σ) ▽ (Bi, Σi) lie, apparently, in Θ. Let us prove the existence of epimorphisms μi : (A + Bi, Γ) → (A, Σ) ▽ (Bi, Σi), from which it will follow that (A + Bi, Γ) ∈ Θ, i = 1, …, n.

On the domains of action we define μi as the identity. The natural projections Σ̄ → Σi give homomorphisms μi : Σ × Σ̄ → Σ × Σi. The association to each ϕ ∈ Hom(B̄, A) of its restriction to Bi defines an epimorphism μi : Hom(B̄, A) ↠ Hom(Bi, A); it suffices to recall that a K-morphism may be given on a basis of B̄ obtained by completing a basis of Bi ⊂ B̄. Let us check that the triple of maps μi thus defined gives a morphism of the triangular products μi : Γ → Hom⁺(Bi, A) ⋋ (Σ × Σi). Take any two elements (ϕ, σ1, σ2) and (ϕ′, σ1′, σ2′) in Γ. We compute

\[ [(\varphi,\sigma_1,\sigma_2)(\varphi',\sigma'_1,\sigma'_2)]^{\mu_i} = \bigl((\sigma_2\cdot\varphi' + \varphi\cdot\sigma'_1)^{\mu_i},\ \sigma_1\sigma'_1,\ (\sigma_2\sigma'_2)^{\mu_i}\bigr) = \bigl(\sigma_2^{\mu_i}\cdot\varphi'^{\mu_i} + \varphi^{\mu_i}\cdot\sigma'_1,\ \sigma_1\sigma'_1,\ \sigma_2^{\mu_i}\cdot\sigma_2'^{\mu_i}\bigr) = (\varphi,\sigma_1,\sigma_2)^{\mu_i}\cdot(\varphi',\sigma'_1,\sigma'_2)^{\mu_i}. \]

For this it suffices to justify the relation (σ2 · ϕ′ + ϕ · σ1′)^{μi} = σ2^{μi} · ϕ′^{μi} + ϕ^{μi} · σ1′, which we used in these computations. Indeed, for each b ∈ Bi we have b · σ2 = b · σ2^{μi} ∈ Bi, from which it follows that

\[ b^{(\sigma_2\cdot\varphi' + \varphi\cdot\sigma'_1)^{\mu_i}} = b^{\sigma_2\cdot\varphi' + \varphi\cdot\sigma'_1} = (b\circ\sigma_2)^{\varphi'} + (b^{\varphi})\cdot\sigma'_1 = (b\circ\sigma_2^{\mu_i})^{\varphi'^{\mu_i}} + (b^{\varphi^{\mu_i}})\circ\sigma'_1 = b^{\sigma_2^{\mu_i}\cdot\varphi'^{\mu_i} + \varphi^{\mu_i}\cdot\sigma'_1}. \]
Our statement is completely proved. It remains to establish that the map μi agrees with the action in the pairs considered. Take arbitrary elements a + b ∈ A + Bi and (ϕ, σ1, σ2) ∈ Γ, and let us provide the necessary verification, taking into account that the map is the identity on the domains of action. We have

\[ (a+b)^{\mu_i}\circ(\varphi,\sigma_1,\sigma_2)^{\mu_i} = (a+b)\circ(\varphi^{\mu_i},\sigma_1,\sigma_2^{\mu_i}) = b^{\varphi^{\mu_i}} + a\circ\sigma_1 + b\circ\sigma_2^{\mu_i} = b^{\varphi} + a\circ\sigma_1 + b\circ\sigma_2 = (a+b)\circ(\varphi,\sigma_1,\sigma_2). \]

To sum up, we have proved the relation (A + Bi, Γ) ∈ Θ, i = 1, …, n. But the Γ-modules A + Bi, i = 1, …, n, generate the module G. Hence, repeating the train of thought in the proof of Lemma 3.27, we deduce that (G, Γ) ∈ Θ. Consequently, we have K1 ▽ K̄2 ⊂ Θ, which at once implies the desired relation Θ1 Θ2 = Var(K1 ▽ K̄2) ⊂ Θ.
3.2.5. Consequences. Connections with linear automata

1. The rising interest in the arithmetic and the geometry of non-commutative rings gives the stimulus for the study of the question of unique factorization of elements in semigroups. Here we shall prove that, in particular, unique factorization holds in the semigroup of varieties of linear representations of semigroups. We are going to use the following lemma.

LEMMA 3.34. Assume that the relations Θ1 Θ2 = Θ1′ Θ2′ and Θ2 ⊄ Θ2′ hold for the varieties Θ1, Θ2, Θ1′, Θ2′. Then there exists a variety Θ3 such that Θ1 Θ2 = Θ1 Θ3 Θ2′.

PROOF. Let (Ri, Ψ) be a free pair in Θi, i = 1, 2, and set (G, Γ) = (R1, Ψ) ▽ (R2, Ψ). In view of the relations Θ2 = Var(R2, Ψ) and Θ2 ⊄ Θ2′ we have (R2, Ψ) ∉ Θ2′. We denote by *Θ2′ the verbal of the variety Θ2′, and set B = *Θ2′(R2, Ψ). One can check that B ≠ 0; otherwise we would have (R2, Ψ) = (R2/B, Ψ) ∈ Θ2′, which is a contradiction. We take Θ3 = Var(B, Ψ) and prove first that Θ1 Θ3 Θ2′ ⊂ Θ1′ Θ2′. Indeed, from B > 0 it follows that *Θ2′(G, Γ) = R1 + B, and there exists a right epimorphism (R1 + B, Γ) → (R1, Ψ) ▽ (B, Ψ), as follows from Propositions 3.6 and 3.7. However, by virtue of Theorem 3.33 the pair (R1, Ψ) ▽ (B, Ψ) generates the variety Θ1 Θ3, from where, due to the epimorphism indicated above, it follows that Var(R1 + B, Γ) = Θ1 Θ3. Note also that (G, Γ) ∈ Θ1 Θ2 = Θ1′ Θ2′, which is equivalent to the inclusion (*Θ2′(G, Γ), Γ) ∈ Θ1′. To sum up, we have shown that Θ1 Θ3 ⊂ Θ1′, from which the relation required here follows. On the other hand, by virtue of Theorem 3.33 we have Var(G, Γ) = Θ1 Θ2, so the relation (*Θ2′(G, Γ), Γ) ∈ Θ1 Θ3 obtained above is equivalent to (G, Γ) ∈ Θ1 Θ3 Θ2′. We obtain Θ1′ Θ2′ = Θ1 Θ2 ⊂ Θ1 Θ3 Θ2′, which together with the inclusion proved above gives Θ1 Θ2 = Θ1 Θ3 Θ2′.

A variety is called indecomposable if it cannot be presented as the product of two non-trivial factors.
The main consequence of the Theorem of Generating Representations is the following.
THEOREM 3.35. Each variety of linear representations (over a field K) of semigroups can uniquely be decomposed as a product of finitely many indecomposable varieties.

PROOF. Let us first show the possibility of decomposing every variety as a product of finitely many indecomposable varieties. The anti-isomorphism between the semigroup of varieties of pairs and the semigroup of proper special ideals of KΨ* permits us to translate this statement to the language of ideals: we have to replace the word "variety" by "proper special ideal". In this new formulation the statement is readily proved by induction over the weight of the special ideal considered.

It remains to prove the uniqueness of the decomposition. It is not hard to see that this follows from the following fact: if the varieties Θ1 and Θ1′ are indecomposable, then, for any varieties Θ2 and Θ2′, the equality Θ1 Θ2 = Θ1′ Θ2′ implies Θ1 = Θ1′ and Θ2 = Θ2′. In order to prove this statement we replace Θ2 = Θ2′ by the equivalent pair of inclusions Θ2 ⊂ Θ2′ and Θ2′ ⊂ Θ2, and assume that Θ2 ⊄ Θ2′. In view of Lemma 3.34 there exists then a variety Θ3 such that Θ1 Θ3 Θ2′ = Θ1′ Θ2′. Next, using Theorem 3.30 and cancelling this identity on the right by Θ2′, we obtain Θ1′ = Θ1 Θ3, which contradicts the condition. The relation Θ2′ ⊂ Θ2 is proved analogously. Thus, the equation Θ2 = Θ2′ is established. Applying anew Theorem 3.30, we deduce from Θ1 Θ2 = Θ1′ Θ2′ that Θ1 = Θ1′.

THEOREM 3.36. The semigroup of varieties of linear representations (over a field K) of semigroups is free.

This theorem follows at once from Theorem 2.3.
2. The role of the wreath product of groups in the proof of the theorem of Shmel'kin and the Neumanns on the possibility of free generation of nontrivial varieties of groups by indecomposable varieties of groups is well known; cf. [51, Theorem 23.4]. However, there is a different path of proof using another technique [64]. Guided by this, we give, for the sake of completeness, another proof of Theorem 3.36, which, moreover, works inside the ring KΨ*; it is a suitable reinterpretation of the argument in [56]. It is convenient to give here a new formulation of Theorem 3.36: the semigroup of proper special ideals of the ring KΨ* is free.

Second proof of Theorem 3.36. It is mentioned in [56] that the semigroup ring F = KΨ* (it is a free associative algebra with unit on X over K) is a left and right FI-ring without non-trivial elements invariant from the right. Consequently, one can apply Theorem 5 in the same paper [56]; according to this theorem, the semigroup R of all nonzero two-sided ideals of the ring F is free, with the set of all indecomposable proper ideals of F in the role of the system of free generators. Furthermore, we note that the product of proper special ideals of F is again a proper special ideal, and in this way one distinguishes in R a subsemigroup S of such ideals. Clearly, our theorem is proved if we show that, for arbitrary ideals A and B in R such that AB ∈ S, it is true that A ∈ S and B ∈ S. This will be proved below.
We remark that from the uniqueness of decomposition of an ideal A ∈ S into indecomposable factors follows the invariance of these factors with respect to each special13 automorphism of the ring F. Moreover, it is expedient to introduce the following notion. An endomorphism of the ring F is called particular if it is induced by an endomorphism η of the monoid Ψ∗ such that X ⊂ X η . Let us show that for each particular endomorphism η holds A ⊂ Aη . Indeed, let u be an arbitrary element of A and S = {x1 , · · · , xn } ⊂ X be such that u ∈ Kx1 , . . . , xn . In view of the particularity of η there exist xi ∈ X such that xηi = xi , i = 1, . . . , n. Consider a permutation γ on X permutation such that xγi = xi , i = 1, . . . , n, and extend it to an automorphism of F. It is clear that γ is a special automorphism of F and according to the above remark we therefore have Aγ = A. By our construction uγη = u; hence, u = uγη ∈ Aγη = Aη . Thus we have proved that A ⊂ Aη . Next, we complete the proof of our main statement. The map η, being particular and AB a special ideal, we deduce that AB ⊃ (AB)η = Aη B η ⊃ Aη B ⊃ AB, hence AB = Aη B. The particularity of η further forces that F η = F . Therefore Aη is an ideal in F and, next, from the freedom of the semigroup R it follows, in particular, that A = Aη . Furthermore, let μ be a special endomorphism of the ring F . For each u ∈ A one can construct a particular endomorphism η : F → F , which coincides with μ on the element u. Indeed, for all xi ∈ S we set xηi = xμi ∈ Ψ∗ , and on the complement X\S we define η as an arbitrary surjective map X\S X. The map η : X → Ψ∗ obtained in this way is extended to a special endomorphism η : F → F which is particular by the construction. We have uμ = uη ∈ Aη = A which proves that A is a special ideal. In an analogous way one proves that B is special. This completes the proof. 3. Let us consider the relation between the above mentioned material and the theory of automata. 
An automaton A′ = (A′, Γ, B′) is called an invariant subautomaton of the linear automaton A = (A, Γ, B) if the following conditions are fulfilled: (1) A′ ⊂ A and B′ ⊂ B are K-submodules; (2) A′ is Γ-invariant with respect to the action ∘; (3) for any a ∈ A′ and γ ∈ Γ we have a ∗ γ ∈ B′. Every invariant subautomaton A′ ⊂ A is accompanied by a factor automaton A/A′ = (A/A′, Γ, B/B′), where for all ā ∈ A/A′ and γ ∈ Γ we let ā ∘ γ be the class of a ∘ γ in A/A′ and ā ∗ γ the class of a ∗ γ in B/B′. It is apparent that this definition is consistent. With this notion at our disposal, we can define a corresponding associative multiplication of varieties of linear automata. Let Θ1 and Θ2 be any two varieties of linear automata. Then, by definition, A = (A, Γ, B) ∈ Θ1·Θ2 if there exists an invariant subautomaton A′ ⊂ A, A′ ∈ Θ1, such that A/A′ ∈ Θ2. We denote by M^a(K) the semigroup of varieties of linear automata over K. Each linear automaton (A, Γ, B) is accompanied by a linear pair (A, Γ), and the semigroup M(K) of the varieties of such pairs is free (Theorem 3.36).

¹³ Such a name is given to those automorphisms (endomorphisms) of the ring F which are induced by automorphisms (endomorphisms) of the monoid Ψ*.

It is natural to try to settle the question of the freedom
of the semigroup of varieties of linear automata M^a(K). The answer is given in the following theorem.

THEOREM 3.37. The semigroup M^a(K) of varieties of linear automata (over the field K) is not free, but it contains a maximal free subsemigroup isomorphic to the semigroup of varieties of linear representations of semigroups.

PROOF. Introduce on the set I^a(K) of ideal pairs (cf. Section 3.1.4) the following multiplication: (U1, V1) ∗ (U2, V2) = (U1U2, U1V2). It is clear that I^a(K) equipped with this multiplication is a semigroup, the semigroup of ideal pairs. It turns out that this semigroup is anti-isomorphic to the semigroup M^a(K). In order to see this we have to show: if the varieties of linear automata Θi are defined by the ideal pairs (Ui, Vi), i = 1, 2, then the variety Θ1·Θ2 is defined by the ideal pair (U2, V2) ∗ (U1, V1) = (U2U1, U2V1). Let us denote by Θ the variety defined by the latter ideal pair. It is easy to check that the automaton A = (F/U2U1, Ψ, F/U2V1) ∈ Θ is a free¹⁴ linear automaton in the variety Θ. Let us also take into consideration the following two automata: A1 = (F/U1, Ψ, F/V1) and A2 = (F/U2, Ψ, F/V2); it is clear that the Ai are free in the varieties Θi, i = 1, 2, respectively.

Let us show that A ∈ Θ1·Θ2. Let us consider in A the invariant subautomaton A3 = (U2/U2U1, Ψ, V2/U2V1); the properties of the ideal pair (U2, V2) guarantee its existence. One has A3 ∈ Θ1. Indeed, we have the relations

(U2/U2U1) ∘ U1 = (U2/U2U1)·U1 = U2U1/U2U1

and

(U2/U2U1) ∗ V1 = (U2/U2U1)·V1 = U2V1/U2V1.

This means that in A there is an invariant subautomaton A3, A3 ∈ Θ1, such that A/A3 = (F/U2, Ψ, F/V2) ∈ Θ2. Hence, it follows by definition that A ∈ Θ1·Θ2. So we have proved that Θ ⊂ Θ1·Θ2.

Let us show the converse inclusion Θ1·Θ2 ⊂ Θ. Take any automaton A = (A, Γ, B) ∈ Θ1·Θ2. By definition, there exists an invariant subautomaton A′ = (A′, Γ, B′) ⊂ A, A′ ∈ Θ1, such that A/A′ = (A/A′, Γ, B/B′) ∈ Θ2. With the help of this we show that A ∈ Θ. We have to verify that in A hold all bi-identities y ∘ u ≡ 0, u ∈ U2U1, and all bi-identities z ∗ v ≡ 0, v ∈ U2V1. The interpretation of the relations A′ ∈ Θ1, A/A′ ∈ Θ2 gives A′ ∘ U1^σ = 0, A′ ∗ V1^σ = 0 and A ∘ U2^σ ⊂ A′, A ∗ V2^σ ⊂ B′ for each specialization homomorphism σ. This implies that

A ∘ (U2U1)^σ = A ∘ (U2^σ U1^σ) = (A ∘ U2^σ) ∘ U1^σ ⊂ A′ ∘ U1^σ = 0

and

A ∗ (U2V1)^σ = A ∗ (U2^σ V1^σ) = (A ∘ U2^σ) ∗ V1^σ ⊂ A′ ∗ V1^σ = 0.
¹⁴ The notion of a free linear automaton (free in a given variety) is formulated in the known categorical scheme, and is left to the reader.
From these computations it follows that A ∈ Θ. The relation Θ1·Θ2 ⊂ Θ has been checked, and thus we have established the statement made at the beginning of the proof.

In view of the anti-isomorphism of the semigroups M^a = M^a(K) and I^a = I^a(K), it is sufficient to prove the non-freedom of the semigroup I^a. We assume the converse, and take three arbitrary ideal pairs (U1, V1), (U1, V1′) and (U2, V2) with V1 ≠ V1′. Then we have (U1, V1) ∗ (U2, V2) = (U1U2, U1V2) = (U1, V1′) ∗ (U2, V2). But the semigroup I^a, being free by assumption, is a semigroup with cancellation. Therefore, from the equality (U1, V1) ∗ (U2, V2) = (U1, V1′) ∗ (U2, V2) we deduce that (U1, V1) = (U1, V1′), contradicting the condition V1 ≠ V1′.

At the same time, there is an epimorphism τ : M^a ↠ M of the semigroup M^a onto the free semigroup M, which in the language of ideal pairs is given by the formula (U, V)^τ = U. Moreover, in the semigroup M^a we can distinguish the free subsemigroup M0, isomorphic to M; in the same language of ideal pairs M0 is described as the subsemigroup of all ideal pairs of the form (U, U). It is easy to see that M0 is a maximal free subsemigroup of M^a. Indeed, in the opposite case one can embed M0 into a larger free subsemigroup M1 ⊂ M^a, whose anti-isomorphic copy I1^a in I^a contains ideal pairs of the form (U1, V1) with V1 ≠ U1. But then we have, for each pair (U2, V2) ∈ I^a, (U1, V1) ∗ (U2, V2) = (U1U2, U1V2) = (U1, U1) ∗ (U2, V2). We obtain a relation which, by virtue of V1 ≠ U1, cannot hold true in the free semigroup I1^a. Thus Theorem 3.37 is proved.

4. The fact established in Theorem 3.35 brings up the question of the description of indecomposable varieties of linear representations of semigroups. There exists a discussion of the corresponding question for varieties of group pairs, [30]. It turns out that these arguments remain in force also for semigroup pairs.
First, let us remark that in the ring KΨ* one can build up a Fox calculus [70]¹⁵ and deduce, in particular, all results which are reviewed in the first two pages of [30]. We omit the details of this translation of the fundamentals of the free differential calculus to the semigroup case. In view of this one can prove the following facts.

THEOREM 3.38. Let Θ be a variety of linear representations of semigroups given by bi-identities of the form y · u ≡ 0, where the expression of the elements u ∈ KΨ* involves only n elements (variables) in X. Then the equation Θ = Θ1Θ2...Θm with m > n is not possible for any varieties of pairs Θ1, Θ2, ..., Θm.

From this we obtain at once the following

COROLLARY 3.39. If, in the conditions and notations of the previous theorem, we impose in addition n = 1, then the variety Θ is indecomposable.

¹⁵ Translators’ Remark. For the work of Ralph Fox (1913-1973), see [90].

PROOF. The proof of Theorem 3.38 runs parallel to the corresponding proof in the group case. It is necessary to alter slightly only the proof of Lemma 1 on pp. 1209-1210 in [8], where an expression of a certain special form for the element u is derived.
But this expression of the element u exists also in the ring KΨ*; it suffices only to take into account the relations

x_i x_j ≡ x_j x_i (mod Δ^2) and x_i^t − 1 ≡ t(x_i − 1) (mod Δ^2),
which are fulfilled for arbitrary x_i, x_j ∈ X and any natural number t. Further details are omitted.

3.2.6. The theorem on generating representations of algebras

1. The facts on linear semigroup pairs, in the form in which they were presented in the previous section, have analogues also for representations of algebras. The degree of parallelism with the semigroup case is high here, and all statements and their verifications can in practice be carried over word by word to the algebra case. Therefore we limit ourselves to formulating the results and making remarks. A central role is again played by the triangular product construction; for representations of algebras this construction was introduced in Section 3.1.3. As before, throughout this section K will be a fixed field and all algebras considered are associative K-algebras.

2. A variety of representations of associative algebras is, by definition, a class of pairs (G, 𝒢), where 𝒢 is an algebra and G a K- and 𝒢-module, satisfying a condition of saturation, this class being closed with respect to Cartesian products, subpairs and homomorphic images. In the case of algebras we have the following results.

PROPOSITION 3.40. For any subpairs (A′, S1′) and (B′, S2′) in (A, S1) and (B, S2) respectively, the pair (A′, S1′) ▽ (B′, S2′) belongs to the variety Var((A, S1) ▽ (B, S2)).

Let Θ be a variety of representations of algebras and U the special ideal corresponding to it in F = KΨ*, the free algebra of countable rank. The regular pair (F/U, F) is a cyclic and free pair in the variety Θ, and Θ = Var(F/U, F).

PROPOSITION 3.41. Let (A, S) be an arbitrary pair, and (R, F) a free pair in the variety Θ2. Then Var((A, S) ▽ (R, F)) = Var(A, S) · Θ2.

In complete analogy to the semigroup case we can define a multiplication of varieties of representations of algebras and the semigroup A(K). Similarly to Theorem 3.30 we prove

THEOREM 3.42. The semigroup of varieties of representations of algebras is a semigroup with cancellation.

If we take into account that in the subalgebra Φ = Hom_K(B, A) ⊂ End_K(A ⊕ B), arising in the definition of the pair (A, S1) ▽ (B, S2), the multiplication is zero, then the necessary reasonings in Sections 3.2.3 and 3.2.4 can easily be carried over to the situation of representations of algebras, so in the same way we prove

THEOREM 3.43 (Theorem of generators of algebras). Let K1 and K2 be two classes of representations of algebras. Then the following formula holds: Var K1 · Var K2 = Var(K1 ▽ K2).
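The remark above that the multiplication in Φ = Hom_K(B, A) ⊂ End_K(A ⊕ B) is zero can be checked on a small example. The following is only a sketch under stated assumptions: integer matrices stand in for K-linear maps, operators act on row vectors (a, b) from the right, and the helper names mat_mul and embed_hom are introduced here for illustration and do not come from the text.

```python
# Sketch: elements of Hom_K(B, A), viewed inside End_K(A ⊕ B), multiply to zero.
# Assumption: row vectors (a, b) are acted on from the right, so a map
# phi : B -> A becomes the operator (a, b) -> (b*phi, 0).

def mat_mul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def embed_hom(phi, dim_a, dim_b):
    """Embed phi (a dim_b x dim_a matrix) as an operator on A ⊕ B."""
    n = dim_a + dim_b
    m = [[0] * n for _ in range(n)]
    for i in range(dim_b):
        for j in range(dim_a):
            m[dim_a + i][j] = phi[i][j]  # rows of B, columns of A
    return m

DIM_A, DIM_B = 2, 3
phi = [[1, 2], [3, 4], [5, 6]]    # hypothetical map B -> A
psi = [[-1, 0], [2, 7], [0, 1]]   # another hypothetical map B -> A
P = embed_hom(phi, DIM_A, DIM_B)
Q = embed_hom(psi, DIM_A, DIM_B)
ZERO = [[0] * (DIM_A + DIM_B) for _ in range(DIM_A + DIM_B)]
PRODUCT = mat_mul(P, Q)           # composition of the two operators
```

Both P and Q send A ⊕ B into A and vanish on A, which is why their composition vanishes in either order.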
From this basic result, similarly to the implication “Theorem 3.33 =⇒ Theorem 3.35”, one obtains

THEOREM 3.44. Each variety of representations of algebras can be uniquely decomposed into a finite product of indecomposable varieties of representations.

COROLLARY 3.45. The semigroup A(K) of varieties of representations of algebras is free.

3. We indicate some applications of these results. To this end, we first briefly describe the known connections between varieties of algebras and varieties of their representations, [41]. To each variety of representations Θ one associates the class ω^{-1}Θ of algebras admitting faithful representations in Θ; parallel to ω^{-1}Θ we also use the notation $\overrightarrow{\Theta}$. It is immediate to verify that ω^{-1}Θ is a variety of algebras. On the other hand, to each variety N of algebras we associate the variety of representations ωN, stipulating that (G, 𝒢) ∈ ωN if the algebra 𝒢, up to the kernel of the corresponding representation, belongs to N. It turns out that for any N and Θ there hold the relations ω(ω^{-1}Θ) = Θ and ω^{-1}(ωN) = N. From this follows the existence of a bijective correspondence between the varieties of algebras and the varieties of their representations.

However, the set K(K) of all proper varieties of K-algebras is in one-to-one correspondence with the set J(K) of all (non-zero) T-ideals in the free algebra F, [83]. The usual multiplication of ideals in J(K), with respect to which this set is closed, induces on K(K) an associative multiplication of varieties of algebras, which we denote by the symbol “·”. Next, let N1 and N2 be the varieties of algebras corresponding to the T-ideals U1 and U2 respectively. Let us consider the variety of algebras N = ω^{-1}(ωN1 · ωN2) and let U be the T-ideal corresponding to it in F. One can prove that U = U2 · U1. This means that there exists an anti-isomorphism between the semigroups A(K) and J(K).
Using the previous connections, one can easily deduce from Theorem 3.44 the following

THEOREM 3.46. Every proper T-ideal can be uniquely written as a product of finitely many indecomposable T-ideals.

and, as a consequence of Theorem 3.44, an interesting result of Bergman and Lewin (cf. [56, Theorem 7]).

THEOREM 3.47. The semigroup J(K) is free.

In addition to this, we obtain the following. The remarks in Paragraph 5 of Section 3.2.5 remain in force and so, in the case of algebras considered here, each T-ideal, apparently, is special, and by the same type of reasoning as in [8] one proves a theorem (in a variant for algebras) whose original form for groups can be found in [8], p. 1209. In order to formulate this result we give a definition. A family of elements u_α ∈ F is called a special basis of the ideal U ⊂ F if U as an ideal is generated by all elements of the form u_α^η ∈ F, that is, by the images of the elements u_α under all special endomorphisms η of the algebra F. It turns out that if a special basis of a T-ideal U can be written in terms of only n variables in X, then the equality U = U1 · U2 · ... · Um, m > n, is not possible for any choice of T-ideals U1, ..., Um. From this it follows, in particular, that a variety of algebras is indecomposable if it is defined using identities
in only one variable (from X). Using the Nagata-Higman theorem (cf., e.g., [83, Appendix C]), we deduce that there exist indecomposable varieties consisting of radical algebras.

4. In order to exhibit still another series of indecomposable varieties of algebras, let us first prove an auxiliary statement. We denote by var K the variety generated by the class of algebras K; at the same time we denote, as before, by Var K the variety of representations of algebras generated by the class K. Furthermore, if one adjoins a unit to the algebra S, one gets as the result an algebra S* and the regular representation (S*, S).

LEMMA 3.48. For any algebra S and any faithful representation (L, S) of it we have the formula

var S = $\overrightarrow{\mathrm{Var}(L, S)}$.

PROOF. Denote Θ1 = ω var S and Θ2 = Var(L, S). It follows from the definition of Θ1 that (L, S) ∈ Θ1, which is sufficient for Θ2 ⊂ Θ1. Let us prove the converse inclusion. Using the existence, for any c ∈ L, of an isomorphism of pairs (S*/Ann_S(c), S) ≅ (c·S*, S) and Remak’s theorem, we see that the regular pair (S*/∩_{c∈L} Ann_S(c), S) is contained in Θ2. But from the fact that (L, S) is faithful it follows that ∩_{c∈L} Ann_S(c) = 0, giving that the pair (S*, S) lies in Θ2.

Next, let (A, T) be an arbitrary pair in the variety Θ1, while (A, T1) is the corresponding faithful pair. It follows from (A, T1) ∈ Θ1 that T1 ∈ var S and, therefore, T1 ∈ QSC S. Let us show that from the latter it follows that (T1*, T1) ∈ Θ2. In the case T1 = S^I (Cartesian power) the statement is easy to prove if we use the embedding (S^I)* → (S*)^I and the fact that Θ2 is closed with respect to subpairs and Cartesian products. If T1 is a subalgebra of S^I, then the embedding (T1*, T1) → ((S^I)*, S^I) shows that (T1*, T1) ∈ Θ2. Finally, let T1 be an epimorphic image of a subalgebra S1 ⊂ S^I. We have an epimorphism of pairs (S1*, S1) ↠ (T1*, T1), so in view of (S1*, S1) ∈ Θ2 it follows that (T1*, T1) ∈ Θ2.
Thus we have proved that T1 ∈ var S implies (T1*, T1) ∈ Θ2. Now it is not hard to see that (A, T1) ∈ Θ2. Indeed, for each a ∈ A the cyclic subpair (a ∘ T1*, T1) in (A, T1) is isomorphic to the pair (T1*/Ann_{T1}(a), T1), and so lies in Θ2. But from the membership in the variety Θ2 of all cyclic subpairs of the pair (A, T1) it follows that (A, T1) ∈ Θ2; for the proof one applies a variation of the argument in the proof of Lemma 3.27. As a result we get the inclusion Θ1 ⊂ Θ2, and along with it also the equality ω(var S) = Var(L, S). Applying the operator ω^{-1} to both sides of this equation, we are led to the formula of interest to us.

5. THEOREM 3.49. If the algebra A is semi-simple (in the sense of Jacobson), then the variety var A is indecomposable.

PROOF. 1) Assume that the algebra A is primitive. In this case there exists an irreducible representation (G, A). The variety of representations generated by this pair will be denoted by Θ. From the relation A ∈ ω^{-1}Θ we deduce that (A*, A) ∈ Θ; cf. Lemma 3.48 for this type of proof. On the other hand, there exists an epimorphism of
pairs (A*, A) ↠ (G, A), from which it follows that (G, A) ∈ Var(A*, A), hence also Θ = Var(A*, A). Lemma 3.48 now gives ω^{-1}Θ = var A. Let us assume that var A = N2 · N1 and that to the variety Ni there corresponds in F the T-ideal Ui, i = 1, 2. Consider the variety N = ω^{-1}(ωN1 · ωN2). In Paragraph 3 of this section it was stated that the T-ideal corresponding to this variety of algebras is U2 · U1. But this T-ideal corresponds to the variety N2 · N1 = var A. Hence, N = var A. From this we deduce that Θ = ωN1 · ωN2, that is, the decomposability of the variety of representations Var(G, A). But this is a contradiction, as will be proved in the second half of the proof.

2) Let the algebra A be semi-simple. Then it is a subdirect sum of primitive algebras:

A = Σ^{sd}_{i∈I} A_i,   A_i ≅ A/D_i,   ∩_{i∈I} D_i = 0.
Furthermore, let (Gi, Ai) be a faithful irreducible representation corresponding to the primitive summand Ai. Repeating the argument of the first part of the proof, we derive the equalities Var(Gi, Ai) = Var(Ai*, Ai), i ∈ I. However, the pairs (Ai*, Ai) are contained in the variety Var(A*, A); this follows from the existence of an epimorphism (A*, A) ↠ (Ai*, Ai). Moreover, it follows from Remak’s theorem that Var(A*, A) ⊂ Var(∏_{i∈I} (Ai*, Ai)). In view of the equalities Var(Ai*, Ai) = Var(Gi, Ai), i ∈ I, it follows from this that the variety Var(A*, A) is generated by the irreducible pairs (Gi, Ai).

Using these facts we show that Var(A*, A) is indecomposable. Let us assume that Var(A*, A) = Θ1 · Θ2. Introduce the notation Ω = ∪_{i∈I} (Gi, Ai) and Ωl = Θl ∩ Ω, l = 1, 2. It turns out that Ω = Ω1 ∪ Ω2. Apparently, we have only to comment on the inclusion Ω ⊂ Ω1 ∪ Ω2. If a pair (Gi, Ai) is not contained in Ω1, then it is not contained in Θ1 either. But at the same time, it follows from (Gi, Ai) ∈ Θ1 · Θ2 that there exists a subpair (Hi, Ai) in (Gi, Ai) such that (Hi, Ai) ∈ Θ1 and (Gi/Hi, Ai) ∈ Θ2. As (Gi, Ai) is irreducible, it follows, however, that either Hi = 0 or Hi = Gi. In the second case (Gi, Ai) ∈ Θ1, which is excluded. Hence Hi = 0 and so (Gi, Ai) ∈ Θ2, hence also (Gi, Ai) ∈ Ω2. The inclusion Ω ⊂ Ω1 ∪ Ω2 is established.

From the equality Ω = Ω1 ∪ Ω2 and the relation Var(A*, A) ⊂ Var Ω it follows that Θ1 · Θ2 = Var Ω1 · Var Ω2. As the semigroup of varieties of representations is free, this then shows that Θl = Var Ωl, l = 1, 2. Furthermore, from the definitions we deduce that Θ1 · Θ2 ⊂ Var(Θ1 ∪ Θ2) ⊂ Θ2 · Θ1. In the case of incidence of the varieties Θ1 and Θ2, the preceding relation gives a contradiction. Indeed, if, for example, Θ1 ⊂ Θ2, then Ω ⊂ Θ2 and so Θ1 · Θ2 = Θ2, which is a contradiction.
Therefore, in order to complete the proof of the theorem it suffices to show that Θ1 and Θ2 are incident. We argue by contradiction and choose arbitrary (G_{i′}, A_{i′}) ∈ Ω1\Θ2 and (G_{i″}, A_{i″}) ∈ Ω2\Θ1; here i′, i″ ∈ I. In the triangular product (G, 𝒢) = (G_{i′}, A_{i′}) ▽ (G_{i″}, A_{i″}) we take the verbal with respect to Θ1. As (G, 𝒢) ∈ Θ1 · Θ2 ⊂ Θ2 · Θ1, we have (*Θ1(G, 𝒢), 𝒢) ∈ Θ2. The irreducibility of all pairs in Ω implies that the only 𝒢-submodules of G are 0, G_{i′} and G. If now *Θ1(G, 𝒢) is 0 or G_{i′}, then, together with the pair (G, 𝒢) or the pair (G/G_{i′}, 𝒢) ≅ (G_{i″}, 𝒢) respectively, also the pair (G_{i″}, A_{i″}) lies in Θ1, which was excluded by the choice. If *Θ1(G, 𝒢) = G, then, likewise, (G_{i′}, A_{i′}) ∈ Θ2, which also was excluded. The statement is proved.
Let us note that the reasoning given works also in the case |I| = 1, guaranteeing the indecomposability of Var(A*, A) in the first part of the proof. This proves the theorem.

6. For associative K-algebras one can introduce an operation of wreath product. Namely, for the algebras A and B we consider the K-module Hom_K(B*, A*) as an algebra with zero multiplication and set, by definition,

A wr B = Hom_K(B*, A*) ⋋ (A ⊕ B);

we call this algebra A wr B the wreath product of the algebras A and B. The operation of the wreath product of algebras permits us to make explicit the generating algebra of the product of varieties of algebras. Indeed, let 𝒜 = var A and ℬ = var B. Using Lemma 3.48 and the theorem of generating representations of algebras, we obtain

ω(ℬ · 𝒜) = ω𝒜 · ωℬ = Var(A*, A) · Var(B*, B) = Var((A*, A) ▽ (B*, B)) = Var(A* ⊕ B*, A wr B) = Var((A wr B)*, A wr B) = ω var(A wr B).

Let us add that in this computation we used the equation Var(A* ⊕ B*, A wr B) = Var((A wr B)*, A wr B), the verification of which is immediate on the basis of the properties of the corresponding generating pairs. These computations prove the following

THEOREM 3.50. For any two algebras A and B the formula (var B) · (var A) = var(A wr B) holds.

Finally, let us indicate yet another application of the wreath product of algebras. A T-ideal is called finitary if the variety of algebras defined by it is generated by a finite dimensional algebra.

THEOREM 3.51. The product of finitely many T-ideals in F is finitary if and only if all the factors are finitary.

PROOF. It is clearly sufficient to prove the theorem for two T-ideals U1 and U2. Thus, let the T-ideals Ui be finitary, and let the varieties Ni defined by them be generated by the finite dimensional algebras Ai, i = 1, 2. The product U1 · U2 is a T-ideal defining the variety N1 · N2. According to Theorem 3.50 the variety N1 · N2 is generated by the algebra A2 wr A1 = Hom(A1*, A2*) ⋋ (A2 ⊕ A1), which clearly is finite dimensional. Thus the finitarity of U1 · U2 is established.

Conversely, assume that the T-ideal U1U2 is finitary. Then the variety N1 · N2 defined by it is generated by some finite dimensional algebra G, N1 · N2 = var G. Let us then consider the regular pair (G*, G), which in view of Lemma 3.48 generates the variety ωN2 · ωN1.
Take in G ∗ a right ideal A such that (A, G) ∈ ωN2 and (G ∗ /A, G) ∈ ωN1 .
According to Propositions 3.17 and 3.40 we have (G*, G) ∈ Var((A, G) ▽ (G*/A, G)). However, using the theorem of generating representations, we have Var((A, G) ▽ (G*/A, G)) ⊂ ωN2 · ωN1, from which it follows that ωN2 · ωN1 = Var(G*, G) ⊂ Var(A, G) · Var(G*/A, G) ⊂ ωN2 · ωN1. Thus we have proved the equality Var(A, G) · Var(G*/A, G) = ωN2 · ωN1, where Var(A, G) ⊂ ωN2 and Var(G*/A, G) ⊂ ωN1. By the Corollary to Theorem 3.44 it follows from this that ωN2 = Var(A, G) and ωN1 = Var(G*/A, G). Thus, the varieties ωN1 and ωN2 are generated by finite dimensional pairs. Let (C1, H1) be a finite dimensional faithful pair generating the variety of representations ωN1. Then the algebra H1 is a finite dimensional K-algebra, and it is not hard to see that N1 = var H1. Thus, the ideal U1 is finitary. In an analogous manner one shows that the T-ideal U2 is finitary.

7. Let T = T(n) be the algebra of upper triangular matrices of order n over the field K. The natural representation (L, T) of this algebra is faithful and there is the isomorphism of representations

(L, T) ≅ (K, K) ▽ ··· ▽ (K, K)   (n times).

According to Theorem 3.43 we get from this the equation Var(L, T) = (Var(K, K))^n. Moreover, we remark that the variety of algebras var T and the variety of representations of algebras Var(L, T) correspond in the free algebra F to one and the same T-ideal Tn; the proof consists in unwrapping the definitions. This remark and the anti-isomorphism of the semigroup of varieties of representations with the semigroup of T-ideals of the algebra F allow us to rewrite the above equation as Tn = T1^n, T1 being the ideal of identities of the algebra K. Thus we have proved the following.

THEOREM 3.52. The ideal of identities of the algebra of upper triangular matrices of order n over the field K coincides with T1^n, T1 being the ideal of identities of the algebra K.

For char K = 0 this theorem coincides with the result of Yu. N. Mal’cev (1971) stating that the ideal Tn is generated by the polynomial [x1, x2][x3, x4] · ... · [x_{2n−1}, x_{2n}], where we have written [x, y] = xy − yx. In the case char K > 0 this constitutes an answer to Problem 109 in [6].
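The identity [x1, x2][x3, x4] · ... · [x_{2n−1}, x_{2n}] ≡ 0 on upper triangular matrices of order n can be tested numerically. The sketch below is only a sanity check on random integer matrices, not a proof and not part of the original argument; the function names (commutator, commutator_product, identity_holds) are introduced here for illustration.

```python
import random

def mat_mul(x, y):
    """Product of two square matrices given as lists of rows."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(x, y):
    """[x, y] = xy - yx."""
    xy, yx = mat_mul(x, y), mat_mul(y, x)
    n = len(x)
    return [[xy[i][j] - yx[i][j] for j in range(n)] for i in range(n)]

def random_upper_triangular(n, rng):
    """Random integer upper triangular matrix of order n."""
    return [[rng.randint(-5, 5) if j >= i else 0 for j in range(n)]
            for i in range(n)]

def commutator_product(mats):
    """[x1, x2][x3, x4]...[x_{2n-1}, x_{2n}] for a list of 2n matrices."""
    n = len(mats[0])
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for k in range(0, len(mats), 2):
        result = mat_mul(result, commutator(mats[k], mats[k + 1]))
    return result

def identity_holds(n, trials=50, seed=0):
    """Check the identity on `trials` random 2n-tuples of matrices."""
    rng = random.Random(seed)
    zero = [[0] * n for _ in range(n)]
    return all(
        commutator_product(
            [random_upper_triangular(n, rng) for _ in range(2 * n)]) == zero
        for _ in range(trials))
```

The check succeeds because each commutator of upper triangular matrices is strictly upper triangular, and a product of n strictly upper triangular matrices of order n is zero. A single commutator need not vanish, so fewer than n factors do not suffice in general.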
3.2.7. Comments

1. The notion of automaton as an object of mathematical inquiry arose long ago and simultaneously in many papers; cf. [31]. The connections of this circle of questions in Theoretical Computer Science with Algebra (as indicated by V. M. Glushkov [9]) led to the proof of the basic theorems on decomposition of automata; a systematic study of these questions and their connection with linear systems was given in [18]. The point of view of automata as three-sorted systems and their connections with pairs, as in the work of B. I. Plotkin and others in [42], has led to a systematic application in automata theory of the techniques available in the theory of representations. In particular, the ▽-product introduced in Section 3.1 for semigroups leads to the construction of the ▽-product of Moore automata and to a proof of theorems analogous to the theorems of Kaluzhnin-Krasner and Krohn-Rhodes (B. I. Plotkin, unpublished). In problems of classification of linear automata, as indicated in the present chapter, the bijection between the varieties of linear automata and ideal pairs in the algebra F proves to be useful. Here this bijection is used for the study of the multiplicative properties of the set of varieties of linear automata.

2. The subject of this chapter is related to the theme of unique factorization in rings and semigroups. The unique factorization in the ring of integers is beautiful and useful, and its properties have been known for a long time, but already in some rings of algebraic integers it is not easy to establish this property. The study of the rings of linear differential operators (Edmund Landau, 1902) led to the question of the unique decomposition of elements in certain non-commutative rings. Much attention has been devoted to unique factorization in semigroups, as the unique factorization of a ring is a property of its multiplicative semigroup; cf. the paper [62] and the book of P. Cohn [5]. Theorems 3.35 and 3.46 proved in this chapter are natural reformulations of the theme indicated, parallel to Theorem 23.4 in [34] and to the main theorem in [43].

3. It is clear that the approach described in Paragraph 7 of Section 3.2.6 allows us to reduce the search for a basis of the identities of the algebra T of upper block-triangular matrices to the corresponding case of diagonal blocks. A definitive answer can be obtained in the case when the sizes of the upper triangular matrices from T do not exceed two, while the field K is either finite or has characteristic 0, because thanks to work by Yu. N. Mal’cev, E. N. Kuzmin and Yu. P. Razmyslov one knows a basis for the identities of square matrices for such fields.

4. In the theory of varieties of algebras one usually encounters another multiplication. Let us denote by T(A) the T-ideal defining a variety of (associative) K-algebras A. Then for any two varieties of algebras A1 and A2, their product A1 ∗ A2 is the variety of algebras defined by the T-ideal T(A1) ∗ T(A2), generated by the set {f(g1, ..., gn) | f(x1, ..., xn) ∈ T(A1), g1, ..., gn ∈ T(A2)} ⊂ F. It is possible that an improvement of the approach of this Section 3.2, together with the use of the notion of the free Menger system¹⁶, will allow progress in the description of the “∗-structure” arising here (cf. Problems 18 and 25 in [6]).

¹⁶ Translators’ note. Cf. Jaak Henno. Free G-commutative Menger systems. In: Mathematics and Theoretical Mechanics, VIII, Proc. Estonian Acad. Sci., Phys. Math., 373, 1975, 19–26.
5. Our treatment of M(K) and L(K) as locally finite partially ordered sets together with the information on indecomposable elements in these semigroups permits us, really, to use with advantage ideas in [98] and to develop the analytic side of a question in the spirit of the book [86].
3.3. Powers of the fundamental ideal and stability of representations of groups and semigroups

Let ZΓ be the integral group ring of a group Γ. The fundamental ideal Δ in the ring ZΓ is the kernel of the homomorphism ZΓ → Z; in other words, Δ is the set of all possible finite sums Σ_i n_i γ_i, where n_i ∈ Z, γ_i ∈ Γ, such that Σ_i n_i = 0. Powers of Δ are defined inductively, that is, Δ^ν = Δ^{ν−1} · Δ for a non-limit ordinal ν and Δ^ν = ∩_{μ<ν} Δ^μ for limit ordinals ν. In the ring ZΓ there is a decreasing series of ideals

ZΓ ⊃ Δ ⊃ Δ^2 ⊃ ··· ⊃ Δ^ν ⊃ Δ^{ν+1} ⊃ ... .   (9)

Let τ = τ(Γ) be the index of stabilization of the series (9), that is, τ is the ordinal number beginning from which Δ^τ = Δ^{τ+1} = ... . Partially following [71], we use the following notation and terminology: τ = τ(Γ) is the terminal of the group Γ; Δ^∞ = Δ^{τ(Γ)} is the terminal of the ring ZΓ; D_ν = Γ ∩ (1 + Δ^ν) is the ν-th (generalized) dimension subgroup; in particular, D_∞ = D_∞(Γ) = Γ ∩ (1 + Δ^∞) is the limit dimension subgroup (shorter: the limit) of Γ.

The goal of this Section is the computation of the terminal and the limit of various finite groups, and also of Artinian groups. A main role in these computations is played by the apparatus of triangular products and the connection of the question with stability, as described in Section 3.1. For this approach it is highly essential that for some classes of nilpotent groups it is possible to carry out the computation of the terminal exactly. This constitutes the main subject of Section 3.3.1. There we also give the definitions and known facts necessary in the proofs and, furthermore, the proof of a refinement of a theorem of Gruenberg [69, Theorem B]. The main part of the following two sections concerns the terminal and the limit of finite groups. Section 3.3.4 is devoted to an extension of the theme of stability (and likewise, in perspective, also the problem of the terminal) to semigroups. Here the technique of quasi-rings (distributively generated near-rings) is useful, [68].

Everywhere in Sections 3.3.1–3.3.3, while studying pairs (G, Γ), the acting object Γ will be a group, and the domain of action G an Abelian group. In this way, pairs are representations of groups by automorphisms of Abelian groups. The symbol¹⁷ ω denotes the first infinite ordinal and, likewise, the operator associating to a subgroup in Γ the right ideal in ZΓ. Finally, we remark that writing A ⊂ B does not exclude the equality of the sets A and B.

¹⁷ Such is the tradition, whose meaning is readily understood from the context.
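The definitions of Δ and of its finite powers can be made concrete on the cyclic group C2 = {1, x} of order 2, an example chosen here for illustration and not taken from the text. In Z[C2] the ideal Δ consists of the integer multiples of 1 − x, and (1 − x)^2 = 2(1 − x), so Δ^n consists of the multiples of 2^{n−1}(1 − x). The finite powers therefore decrease strictly and have zero intersection. The sketch below checks the underlying identity; the helper names gr_mul, augmentation and power are assumptions of this sketch.

```python
def gr_mul(a, b):
    """Product in Z[C_n]; a[i] is the coefficient of x**i."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj  # x**i * x**j = x**((i + j) mod n)
    return out

def augmentation(a):
    """The homomorphism Z[C_n] -> Z sending every group element to 1."""
    return sum(a)

def power(a, k):
    """k-th power of a in Z[C_n], for k >= 0."""
    result = [1] + [0] * (len(a) - 1)  # the unit element
    for _ in range(k):
        result = gr_mul(result, a)
    return result

ONE_MINUS_X = [1, -1]  # the element 1 - x of Z[C_2], which spans Δ
```

For instance, power(ONE_MINUS_X, 3) is the coefficient list of 4 − 4x, that is, 2^2 (1 − x).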
3.3.1. Preliminary topics; on the terminal of nilpotent groups 1. Let there be given a pair (G, Γ). We fix γ ∈ Γ and assume that in G there is a finite decreasing series of subgroups (10)
G = S 0 ⊃ S1 ⊃ · · · ⊃ Sm = 0
such that for all $i = 0, 1, \dots, m-1$ the condition $g \circ \gamma - g \in S_{i+1}$ is fulfilled for all $g \in S_i$. In such a situation we say that $\gamma$ acts finitely stably with respect to the series (10). If all $\gamma \in \Gamma$ satisfy this condition, then we say that the group $\Gamma$ is finitely stable with respect to (10); the pair $(G, \Gamma)$ is then also called, by definition, finitely stable (more exactly, $m$-stable), which will be written $(G, \Gamma) \in \mathfrak{S}^m$. In the case of a faithful pair $(G, \Gamma)$ the group $\Gamma$ is a subgroup of the stabilizer of the series (10). The latter consists of all automorphisms of $G$ which act stably on (10). By a well-known theorem of L. A. Kaluzhnin ([19, p. 144]; [84]) it is a nilpotent group. The fact that a faithful pair $(G, \Gamma)$ is finitely stable with respect to (10) will be written $(G, \Gamma) \in \mathfrak{S}^m$. Moreover, if for a given group $\Gamma$ only the existence of such a pair is important, and not the concrete nature of $G$, we write $\Gamma \in \overrightarrow{\mathfrak{S}}^m$.

What can be said about the structure of the stabilizer of an infinite decreasing series of subgroups in an Abelian group? What group-theoretic properties does a group $\Gamma$ enjoy if it embeds in the stabilizer of a series of the type
$$G = S_0 \supset S_1 \supset \dots \supset S_\alpha \supset \dots \supset S_\sigma = 0\,? \tag{11}$$
In this case we write $(G, \Gamma) \in \mathfrak{S}^\sigma$ and also simply $\Gamma \in \overrightarrow{\mathfrak{S}}^\infty$. It appears that this is tightly connected with the problem of calculating the limit of the group $\Gamma$.

For the pair $(G, \Gamma)$ we introduce the notation $[g, \gamma] = -g + g \circ \gamma$, $g \in G$, $\gamma \in \Gamma$; the $\mathbb{Z}$-module generated by all $[g, \gamma]$, $g \in G$, $\gamma \in \Gamma$, will be called the mutual commutator of $G$ and $\Gamma$ and denoted by $[G, \Gamma]$. One can introduce the submodule $[G, \Gamma, \nu]$ for all ordinal numbers $\nu$. To this end we set $[G, \Gamma, 0] = G$, $[G, \Gamma, 1] = [G, \Gamma]$, and, furthermore, we define by induction $[G, \Gamma, \nu] = [[G, \Gamma, \nu - 1], \Gamma]$ for each non-limit ordinal $\nu$ and $[G, \Gamma, \nu] = \bigcap_{\mu<\nu} [G, \Gamma, \mu]$ for each limit $\nu$. The series
$$G \supset G_1 \supset \dots \supset G_\nu \supset G_{\nu+1} \supset \dots, \tag{12}$$
where $G_\nu = [G, \Gamma, \nu]$,
is called the lower stable series of the pair $(G, \Gamma)$ (one speaks also of the lower $\Gamma$-stable series of $G$). For example, in the case of the regular pair $(\mathbb{Z}\Gamma, \Gamma)$ the series (12) coincides with (9). The stability of the action of $\Gamma$ on (11) means, by definition, that $[S_\nu, \Gamma] \subset S_{\nu+1}$ for all $\nu < \sigma$. A trivial induction shows that $[G, \Gamma, \nu] \subset S_\nu$ for all $\nu \le \sigma$. We remark that together with $\Gamma$ the group ring $\mathbb{Z}\Gamma$ also acts on $G$. We can, in particular, write $[g, \gamma] = g \circ (\gamma - 1)$, so that $[G, \Gamma, n] = G \circ \Delta^n$ for $n = 1, 2, 3, \dots$. However, in general we only have $G \circ \Delta^\omega \subset G_\omega = \bigcap_n G \circ \Delta^n$. Transfinite induction shows that $G \circ \Delta^\nu \subset G_\nu$ for all ordinals $\nu$.

It turns out that the condition $\Gamma \in \overrightarrow{\mathfrak{S}}^\infty$ is equivalent to the triviality of the limit of $\Gamma$, i.e. to the condition $D_\infty(\Gamma) = 1$. Indeed, let $\Gamma \in \overrightarrow{\mathfrak{S}}^\infty$; then there exists a faithful pair $(G, \Gamma)$ such that $\Gamma$ is embedded in the stabilizer of a series of the type (11). In particular, we have $G \circ \Delta^\sigma \subset S_\sigma = 0$. The faithfulness of $(G, \Gamma)$ now gives $D_\sigma = 1$, since for each $\gamma \in D_\sigma$ (in view of $\gamma - 1 \in \Delta^\sigma$) we have $G \circ (\gamma - 1) = 0$. Clearly, $D_\infty \subset D_\sigma$ then implies $D_\infty = 1$. In order to prove the converse implication we consider
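the pair $(\mathbb{Z}\Gamma/\Delta^\infty, \Gamma/D_\infty)$. This is a faithful pair, and the group $\Gamma/D_\infty$ acts stably on the series
$$\mathbb{Z}\Gamma/\Delta^\infty \supset \Delta/\Delta^\infty \supset \dots \supset \Delta^\infty/\Delta^\infty = 0. \tag{13}$$
Thus we have $\Gamma/D_\infty \in \overrightarrow{\mathfrak{S}}^\infty$, whence it follows for $D_\infty = 1$ that $\Gamma \in \overrightarrow{\mathfrak{S}}^\infty$.

We note the possibility of more general formulations considered in [39]: the variety $\mathfrak{S}$ of pairs with identical action can be replaced by an arbitrary variety of pairs $\mathfrak{X}$. The fundamental ideal $\Delta$ is then replaced by an ideal $D_{\mathfrak{X}} \subset \mathbb{Z}\Gamma$, namely by the indicator of the class $\mathfrak{X}$ in $\mathbb{Z}\Gamma$, cf. Section 3.1.2.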
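As a small illustration of the series considered above (it is not part of the original argument), one may take for $\Gamma$ the cyclic group of order 2 with generator $x$; this choice is an assumption made only for the example. The computation sketches why the lower stable series of the regular pair $(\mathbb{Z}\Gamma, \Gamma)$ decreases strictly at every finite step and vanishes at $\omega$.

```latex
\begin{align*}
\Delta &= \mathbb{Z}(x-1), \\
(x-1)^2 &= x^2 - 2x + 1 = 2 - 2x = -2(x-1), \\
\Delta^n &= 2^{\,n-1}\Delta \qquad (n \ge 1), \\
\Delta^\omega &= \bigcap_{n \ge 1} \Delta^n
             = \bigcap_{n \ge 1} 2^{\,n-1}\,\mathbb{Z}(x-1) = 0 .
\end{align*}
```

In this example no element $\gamma \ne 1$ satisfies $\gamma - 1 \in \Delta^\omega$, so the regular pair itself is a faithful pair on which $\Gamma$ acts stably along a series of the type (11).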
2. Here we present a list of known facts which will be used frequently below. The following three statements are due to V. G. Vilyatser (cf. [35, p. 458–464]).

LEMMA 3.53. If for a pair $(G, \Gamma)$ the element of finite order $\sigma \in \Gamma$ is an outer nil-element with respect to $G$, and $[G, \sigma]$ is torsion free, then $\sigma$ is a pure element.

LEMMA 3.54. Let the pair $(G, \Gamma)$ be finitely stable. An element $\sigma \in \Gamma$ is an almost $\pi$-element if and only if $[G, \sigma]$ is a $\pi$-group.

We are also going to use a corollary of this lemma.

LEMMA 3.55. For a finitely stable pair $(G, \Gamma)$ the group $\Gamma$ is a relative $\pi$-group if and only if $[G, \Gamma]$ is a $\pi$-group.

LEMMA 3.56. If a pair $(G, \Gamma)$ is contained in $\mathfrak{S}^\infty$, then the group $\Gamma$ is residually nilpotent.

PROOF. By hypothesis, for the members $G_k$ of the lower $\Gamma$-stable series of $G$ one has $\bigcap_{k \ge 0} G_k = 0$. Let $\Sigma_k$ be the kernel of $(G/G_k, \Gamma)$. Then the pair $(G/G_k, \Gamma/\Sigma_k)$ is faithful and finitely stable, and so the group $\Gamma/\Sigma_k$ is nilpotent (according to Kaluzhnin's theorem). For any $k \ge 0$, $g \in G$, $\sigma \in \Sigma = \bigcap_{k \ge 0} \Sigma_k$ there exists a $g_k \in G_k$ such that $g \circ \sigma = g + g_k$. Consequently, $g \circ \sigma - g \in \bigcap_{k \ge 0} G_k$, that is, $g \circ \sigma - g = 0$. Hence $\sigma \in \operatorname{Ker}(G, \Gamma)$, so that, $(G, \Gamma)$ being faithful, we must have $\sigma = 1$ and $\Sigma = 1$.

LEMMA 3.57 (Connell, [63]). If $\Gamma$ is infinite, then the left annihilator of the fundamental ideal $\Delta$ in the ring $\mathbb{Z}\Gamma$ equals zero.

LEMMA 3.58 (Hartley, [77]). Assume that $\Gamma$ contains an element $x$ of prime order $p$. Then for the fundamental ideal $\Delta \subset \mathbb{Z}\Gamma$ one has
(a) $p(1 - x) \in \Delta^p$;
(b) $(1 - x)(1 - y^{p^n}) \in \Delta^{n+2}$ for all $y \in \Gamma$ and $n \ge 0$.
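Part (a) of Lemma 3.58 can be checked by hand for the smallest primes. The following sketch is an illustration only and not Hartley's proof; it assumes that $\Gamma$ is the cyclic group generated by $x$, so that $\mathbb{Z}\Gamma = \mathbb{Z}[x]/(x^p - 1)$, and the helper names `mul`, `power` and `scale` are hypothetical. It verifies the identities $2(1-x) = (1-x)^2$ for $p = 2$ and $3(1-x) = -x^2(1-x)^3$ for $p = 3$; since $1 - x \in \Delta$ and $\Delta^p$ is an ideal, each right-hand side lies in $\Delta^p$.

```python
def mul(a, b):
    """Multiply two elements of Z[C_p] = Z[x]/(x^p - 1).

    Elements are coefficient lists of length p; index k holds the
    coefficient of x^k, and exponents are reduced modulo p.
    """
    p = len(a)
    out = [0] * p
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % p] += ai * bj
    return out


def power(a, n):
    """n-th power of a in Z[C_p]."""
    result = [1] + [0] * (len(a) - 1)
    for _ in range(n):
        result = mul(result, a)
    return result


def scale(c, a):
    """Integer multiple c * a."""
    return [c * ai for ai in a]


# p = 2: 2(1 - x) = (1 - x)^2
lhs_2 = scale(2, [1, -1])
rhs_2 = power([1, -1], 2)

# p = 3: 3(1 - x) = (-x^2) * (1 - x)^3
lhs_3 = scale(3, [1, -1, 0])
rhs_3 = mul([0, 0, -1], power([1, -1, 0], 3))

print(lhs_2 == rhs_2, lhs_3 == rhs_3)
```

For larger primes the unit factor is less obvious, so the sketch stops at $p = 3$.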
LEMMA 3.59 (Buckley, [60]). Let $\gamma_1$ and $\gamma_2$ be a pair of commuting elements of relatively prime orders in $\Gamma$. Then the element $(\gamma_1 - 1)(\gamma_2 - 1)$ is contained in the ideal $\Delta^{\omega+1} \subset \mathbb{Z}\Gamma$.

LEMMA 3.60 (Plotkin, [39]). Assume that $(A, \Gamma)$ is a pair, $\Sigma$ a normal subgroup of $\Gamma$ having a central (in $\Gamma$) series of length $m$, $(A, \Sigma) \in \mathfrak{S}^n$, and that $A^*$ is the submodule of $\Sigma$-invariant elements of $A$. If the pair $(A^*, \Gamma)$ belongs to the variety $\mathfrak{X}$, then $(A, \Gamma) \in \mathfrak{X}^{n^m}$.
Below this lemma will be used in the case $m = 1$, in which case $\Sigma$ is a central subgroup of $\Gamma$. The proof of the lemma in this special case runs by induction on the length $n$ of the upper $\Sigma$-stable series of $A$, $0 \subset A^* \subset H_2 \subset \dots \subset H_n = A$. Consider the case $n = 2$. For any $\sigma \in \Sigma$ introduce the map $f(\sigma) : A \to A$ by the rule $a f(\sigma) = a \circ (\sigma - 1)$ for all $a \in A$. In view of the centrality of the subgroup $\Sigma$ every map $f(\sigma)$, $\sigma \in \Sigma$, commutes with the action of $\Gamma$ on $A$ and so is an endomorphism of the $\Gamma$-module $A$. Let $A_\sigma = \operatorname{Ker} f(\sigma)$; it is clear that $\bigcap_{\sigma \in \Sigma} A_\sigma = A^*$. In view of the properties of the class $\mathfrak{X}$ we have $(A/A_\sigma, \Gamma) \in \mathfrak{X}$ and, applying Remak's theorem, deduce $(A/A^*, \Gamma) \in \mathfrak{X}$. This, together with $(A^*, \Gamma) \in \mathfrak{X}$, gives the required statement $(A, \Gamma) \in \mathfrak{X}^2$.

The general case. It is clear that $0 \subset H_2/A^* \subset \dots \subset A/A^*$ is the upper $\Sigma$-stable series of length $\le n - 1$ in $A/A^*$; in particular, $H_2/A^*$ is the submodule of $\Sigma$-invariant elements in $A/A^*$. Hence $(A/A^*, \Sigma) \in \mathfrak{S}^{n-1}$. The relation $(H_2/A^*, \Gamma) \in \mathfrak{X}$ is derived in the same way as $(A/A^*, \Gamma) \in \mathfrak{X}$ was obtained in the first part of the proof. By the induction hypothesis we have here $(A/A^*, \Gamma) \in \mathfrak{X}^{n-1}$. This proves the Lemma for $m = 1$.

LEMMA 3.61 (Gruenberg, [70]). The terminal of a group whose factor group by the commutator subgroup is complete and torsion equals two.

The following theorem finds essential application in the proofs.

THEOREM 3.62 (P. Hall, [72]). The integral group ring of a group which admits an invariant polycyclic subgroup of finite index satisfies the ascending chain condition for right ideals.

3. We give a simple proof, essentially based on the following fact.

PROPOSITION 3.63 ([71] or [39]). The group $\Gamma$ has a finite terminal if and only if either $\Gamma = [\Gamma, \Gamma]$, and then $\tau(\Gamma) = 1$, or $\Gamma \ne [\Gamma, \Gamma]$ and $\Gamma/[\Gamma, \Gamma]$ is a complete torsion group, in which case $\tau(\Gamma) = 2$.

PROOF.
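In the case $\Gamma = \Gamma' = [\Gamma, \Gamma]$ we use the following identity, valid for all $\gamma_1, \gamma_2 \in \Gamma$:
$$\gamma_1^{-1}\gamma_2^{-1}\gamma_1\gamma_2 - 1 = \gamma_1^{-1}\gamma_2^{-1}\,[(\gamma_1 - 1)(\gamma_2 - 1) - (\gamma_2 - 1)(\gamma_1 - 1)],$$
and conclude at once that $\Delta_\Gamma = \Delta_\Gamma^2$. Next, assume that $\Gamma \ne \Gamma'$. First, we remark the following. Let $\varphi : \Gamma \to \Sigma$ be an arbitrary epimorphism and $M = \operatorname{Ker} \varphi$. From the assumption $\Delta_\Gamma^n \subset \Delta_\Gamma^{n+1}$ it follows that
$$(\Delta_\Gamma^n + \omega M)/\omega M \subset (\Delta_\Gamma^{n+1} + \omega M)/\omega M.$$
Under the isomorphism $\mathbb{Z}\Gamma/\omega M \cong \mathbb{Z}(\Gamma/M)$ the ideal $(\Delta_\Gamma^n + \omega M)/\omega M$ is mapped onto $\Delta_\Sigma^n$. We conclude that $\Delta_\Sigma^n \subset \Delta_\Sigma^{n+1}$. Thus, every homomorphic image of a group with a finite terminal also has a finite terminal. In particular, the terminal of the Abelian group $\bar\Gamma = \Gamma/\Gamma'$ must be finite. If $\bar\Gamma$ is not complete, then there exists an epimorphism $\bar\Gamma \to T$,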
where $T$ is a cyclic group of prime order, say $|T| = p$. Then $\mathbb{Z}T \cong \mathbb{Z}[x]/(x^p - 1)$, where $\mathbb{Z}[x]$ is the ring of integral polynomials in one variable. The assumption $\Delta_T = \Delta_T^2$ implies that there exist polynomials $f(x)$ and $g(x)$ such that
$$x - 1 = (x - 1)^2 \cdot f(x) + (x^p - 1) \cdot g(x).$$
Cancelling the factor $x - 1$ on both sides and setting $x = 1$ in the resulting relation yields $1 = p \cdot g(1)$, which is a contradiction. Thus $\Delta_T \ne \Delta_T^2$. In an analogous manner one shows that $\Delta_T^2 \ne \Delta_T^3, \dots$, and so $\tau(\Gamma) \ge \omega$. We arrive at the conclusion that $\bar\Gamma$ is complete. If $\bar\Gamma$ is non-torsion, then there exists an epimorphism $\bar\Gamma \to \mathbb{Q}(+)$, which in view of $\tau(\mathbb{Q}(+)) = \omega$ (cf. [67]) again gives a contradiction. Hence $\bar\Gamma$ must be a complete torsion Abelian group, which by Lemma 3.61 implies that $\tau(\Gamma) = 2$.

4. Let us pass to the investigation of the terminal of nilpotent groups. We need the following generalization of Lemma 7 in [39].

LEMMA 3.64. Let $\Gamma$ be a nilpotent group and assume that there exists a pair $(A, \Gamma)$ containing a finitely stable subpair $(F, \Gamma)$ such that $A/F$ is a $\Gamma$-Noetherian module. Furthermore, assume that in $A$ there is a submodule $B$ consisting of $\Gamma$-invariant elements and having a non-trivial intersection with every non-zero $\Gamma$-submodule of $A$. Then $(A, \Gamma)$ is a finitely stable pair.

PROOF. Let $\Sigma$ be the center of the group $\Gamma$ and $\sigma \in \Sigma$. All members $\{A_i \mid i = 0, 1, 2, \dots\}$ of the upper $\sigma$-stable series of $A$ are $\Gamma$-admissible, and from the finite stability of $(F, \Gamma)$ it follows that there exists $k \in \mathbb{N}$ such that we have the series $F \subset A_k \subset \dots \subset A_m \subset A_{m+1} \subset \dots \subset A$. Factoring this series by $F$ gives a series of $\Gamma$-submodules $0 \subset A_k/F \subset \dots \subset A_m/F \subset A_{m+1}/F \subset \dots \subset A/F$ in the module $A/F$, which is $\Gamma$-Noetherian. Therefore there exists $m \in \mathbb{N}$ such that $A_m/F = A_{m+1}/F = \dots$. In view of $F \subset A_m$ this gives $A_m = A_{m+1} = \dots$. Let us show that $A_m = A$. We observe that $a \in A_m$ if and only if $a \circ (\sigma - 1)^m = 0$.
This means that from the assumption $A_m < A$ it follows that $B_1 \ne 0$, where, for brevity, $B_1 = A \circ (\sigma - 1)^m$. By assumption the $\Gamma$-submodule $B_1$ has a non-zero intersection with $B$. Hence, for some non-zero $b \in B_1 \cap B$ we have simultaneously $b = a \circ (\sigma - 1)^m$ and $b \circ (\sigma - 1) = 0$, which in view of $A_m = A_{m+1}$ shows that $b = 0$, a contradiction. The equality $A_m = A$ is proven.

As a consequence, all $\sigma \in \Sigma$ act finitely stably on $A$, so that, $A/F$ being $\Gamma$-Noetherian, it follows that the pair $(A/F, \Sigma)$ is finitely stable; this statement is established in [39, p. 201–202]. For the reader's convenience we provide the corresponding argument here. Choose any $\bar a \in A/F$. As $A/F$ is $\Gamma$-Noetherian, we can find a finitely generated subgroup $\Sigma^* \le \Sigma$ such that $\bar a \circ (\omega\Sigma) = \bar a \circ (\omega\Sigma^*)$. Assume that $(A/F, \Sigma^*) \in \mathfrak{S}^{n(\bar a)}$; in particular, $\bar a \circ (\omega\Sigma^*)^{n(\bar a)} = 0$. For any $\sigma_1, \dots, \sigma_{n(\bar a)} \in \Sigma$ there exist $u_1, \dots, u_{n(\bar a)} \in \omega\Sigma^*$ such that $\bar a \circ (\sigma_i - 1) = \bar a \circ u_i$, $i = 1, \dots, n(\bar a)$. Using that the subgroup $\Sigma$ is central
this yields
$$\bar a \circ (\sigma_1 - 1)(\sigma_2 - 1) \cdots (\sigma_{n(\bar a)} - 1) = (\bar a \circ u_1) \circ ((\sigma_2 - 1) \cdots (\sigma_{n(\bar a)} - 1)) = (\bar a \circ (\sigma_2 - 1) \cdots (\sigma_{n(\bar a)} - 1)) \circ u_1 = \dots = \bar a \circ u_{n(\bar a)} \cdot u_{n(\bar a) - 1} \cdots u_1 = 0.$$
Let $\bar a_1, \dots, \bar a_m$ be the generators of the $\Gamma$-module $A/F$. Choose $n$ in such a way that for arbitrary $n$ elements $\sigma_1, \dots, \sigma_n \in \Sigma$ the equation $\bar a_j \circ (\sigma_1 - 1) \cdots (\sigma_n - 1) = 0$ is fulfilled simultaneously for all $j = 1, \dots, m$. Using again the centrality of $\Sigma$ we easily derive now $\bar a \circ (\sigma_1 - 1) \cdots (\sigma_n - 1) = 0$ for every $\bar a \in A/F$. Hence $(A/F) \circ (\omega\Sigma)^n = 0$, as was required. We thus obtain the finite stability of the pairs $(A/F, \Sigma)$ and $(F, \Sigma)$. From this we derive that the pair $(A, \Sigma)$ has the same property: the finite $\Sigma$-stable series of $A/F$ lifts to a series of $A$ ending in $F$, which is then continued by the finite $\Sigma$-stable series of $F$,
$$A \supset \dots \supset F \supset \dots \supset 0.$$
Next, continue by induction over the nilpotency class n of the group Γ. Let us assume that the assertion is already proved for groups nilpotent of degree ≤ n − 1 and let Σ be the center of the group Γ, A∗ the submodule of Σ-invariant elements in A, F ∗ = F ∩ A∗ . We have the pair (A∗ , Γ/Σ), the embeddings B ⊂ AΓ ⊂ A∗ and the relations A∗ /F ∗ ∼ = (A∗ + F )/F ⊂ A/F . As the pair (F, Γ) is finitely stable, finitely stable the same holds also for (F ∗ , Γ/Σ), while from A∗ /F ∗ ⊂ A/F it follows that A∗ /F ∗ is Γ-Noetherian, but then also Γ/Σ-Noetherian. This means that for the given data {(A∗ , Γ/Σ); f ∗ ; B} the conditions of our lemma are fulfilled, from which it follows by the induction assumption that the pair (A∗ , Γ/Σ) is finitely stable. Again from this it follows by Lemma 3.60 the same thing for (A, Γ/Σ), but then also for (A, Γ), because the classes S t are saturated. 5. T HEOREM 3.65. If in the pair (G, Γ) the group Γ is nilpotent, while the group G contains an Γ-Artinian Γ-submodule D such that G/D is Γ-Noetherian, then the length of the lower stable series of this pair does not exceed ω. P ROOF. The lower stable series of G, G = G0 ⊃ G1 ⊃ · · · ⊃ Gω ⊃ Gω ⊃ Gω+1 ⊃ . . . generates a decreasing Γ-stable series for the module A, A = G/Gω+1 ⊃ G1 /Gω+1 ⊃ · · · ⊃ Gω /Gω+1 ⊃ 0. This series admits then also the submodule F ⊂ A, F = (D + Gω+1 )/Gω+1 . But the module F is Γ-Artinian, because it is a factor module of the Γ-Artinian module D by the module D ∩ Gω+1 . Hence, the series considered in F is finite, considered which establishes the finite stability of the pair (F, Γ).
Set B = Gω /Gω+1 and let H be a maximal Γ-submodule in A, having zero intersection with B; such an H does exist in view of Zorn’s lemma. Next, let us set ¯ = (B + H)/H. A¯ = A/H, F¯ = (F + H)/H and B As the pair (F, Γ) is finitely stable, it follows that (F¯ , Γ) is likewise finitely stable. It suffices to remark that there exist epimorphisms of Γ-modules ¯ F¯ G/D A/F A/ ¯ F¯ is Γ-Noetherian. Thus, to the triple {A, ¯ F¯ , B} ¯ Lemma 3.64 and that by this A/ is applicable. We deduce that the pair (A/H, Γ) is finitely stable. This shows that Gn /Gω+1 ⊂ H for some n ∈ N. This again, apparently, gives B ⊂ H. In view of the choice of H from this it must follow that B = 0, i.e. Gω = Gω+1 . 6. If in the assumptions of Theorem 3.65 D = 0, we are led to the following. T HEOREM 3.66 (Plotkin, [39]). If in the pair (G, Γ) the group Γ is nilpotent, while the module G is Γ-Noetherian, then the lower stable series of this pair has length not exceeding ω. In turn, an immediate consequence of Theorems 3.62 and 3.66 is the following. T HEOREM 3.67 (Smith, [105]). The terminal of a Noetherian nilpotent group equals ω. On the other hand, we have. T HEOREM 3.68. The terminal of a complete Artinian Abelian group equals 2. If Γ is a non-complete nilpotent group, then τ (Γ) = ω. P ROOF. The first statement of the theorem follows from Lemma 3.61; here it is formulated in order to make the picture complete. Next, assume that Γ is not full. Then (cf. [22, p. 370-371]) Γ can be represented in the form Γ = Σ · Φ, where Σ is a central complete Artinian subgroup of Γ, and Φ a finite invariant subgroup of Γ; by our assumption Φ = 1. The fundamental ideal in ZΓ will be written Δ. By Proposition 3.63, Δn = Δω for all natural n. Let us show that Δω = Δω+1 , to this end reducing the proof of this claim to a situation in which Theorem 3.65 is applicable. Let T = {γ1 , . . . , γt } be a complete system of representatives of the cosets of Σ in Γ, ΔΣ being the fundamental ideal in the group ring ZΣ. 
Let us demonstrate the existence of direct decompositions of the form
$$(\omega\Sigma)^r = \Delta_\Sigma^r \cdot \gamma_1 + \dots + \Delta_\Sigma^r \cdot \gamma_t, \qquad r = 1, 2, \dots. \tag{14}$$
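Let us denote the sum on the right in (14) by $M^{(r)}$. It is clear that $M^{(r)} \subset (\omega\Sigma)^r$, because $\Delta_\Sigma^r \cdot \gamma_i \subset (\omega\Sigma)^r$. We have to check that $(\omega\Sigma)^r \subset M^{(r)}$. We note that for any $z_1, \dots, z_r \in \Delta_\Sigma$ and $\gamma_{i_1}, \dots, \gamma_{i_r} \in T$ one has $z_1 \cdots z_r \in \Delta_\Sigma^r$ and there exist $\sigma \in \Sigma$, $\gamma_k \in T$ such that $\gamma_{i_1} \cdots \gamma_{i_r} = \sigma\gamma_k$. Therefore the product $(z_1\gamma_{i_1}) \cdots (z_r\gamma_{i_r})$, which, as $\Sigma$ is central, equals $(z_1 \cdots z_r)(\gamma_{i_1} \cdots \gamma_{i_r})$, lies in $M^{(r)}$. This gives $(\omega\Sigma)^r \subset M^{(r)}$. The relation $(\Delta_\Sigma^r \cdot \gamma_i) \cap (\Delta_\Sigma^r \cdot \gamma_j) = 0$ for $i \ne j$ is verified by contradiction. In particular, we have the direct decomposition (14) for $r = 1$ and $2$, and we obtain the isomorphism of $\mathbb{Z}$-modules
$$\omega\Sigma/(\omega\Sigma)^2 \cong \Delta_\Sigma/\Delta_\Sigma^2 \cdot \gamma_1 + \dots + \Delta_\Sigma/\Delta_\Sigma^2 \cdot \gamma_t.$$
By a well-known result, $\Delta_\Sigma/\Delta_\Sigma^2 \cong \Sigma$. Hence the $\mathbb{Z}$-module $\omega\Sigma/(\omega\Sigma)^2$ is Artinian.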
Let us consider the $\Gamma$-module $G = \mathbb{Z}\Gamma/\Delta^{\omega+1}$ and in it the submodule $D = (\omega\Sigma + \Delta^{\omega+1})/\Delta^{\omega+1}$. By Lemma 3.61 we have $\Delta_\Sigma^2 = \Delta_\Sigma^{\omega+1}$. Therefore we obtain
$$(\omega\Sigma)^2 = \Delta_\Sigma^2 \cdot \gamma_1 + \dots + \Delta_\Sigma^2 \cdot \gamma_t = \Delta_\Sigma^{\omega+1} \cdot \gamma_1 + \dots + \Delta_\Sigma^{\omega+1} \cdot \gamma_t \subset \Delta_\Gamma^{\omega+1}.$$
Thus the $\mathbb{Z}$-submodule $D$, being an epimorphic image of the module $\omega\Sigma/(\omega\Sigma)^2$, is also Artinian. The module $G/D$, being an epimorphic image of the $\Gamma$-Noetherian module $\mathbb{Z}\Gamma/\omega\Sigma \cong \mathbb{Z}\Phi$, is $\Gamma$-Noetherian; that $\mathbb{Z}\Phi$ is $\Phi$-Noetherian (and, likewise, $\Gamma$-Noetherian) follows from Theorem 3.62. As for the data $\{G, D\}$, all conditions of Theorem 3.65 are fulfilled. Therefore we may conclude that $G_\omega = G_{\omega+1}$, i.e. $\Delta^\omega = \Delta^{\omega+1}$.

7. In this subsection we prove Theorem 3.69, which generalizes a result of K. Gruenberg.

THEOREM 3.69. Let $\Delta$ be the fundamental ideal in the integral group ring of a group $\Gamma$ satisfying the descending chain condition for normal subgroups, let $k$ be a nonnegative integer and $\lambda = \omega(k + 1)$. The relation $\Delta^\lambda = 0$ holds in the ring $\mathbb{Z}\Gamma$ if and only if the group $\Gamma$ is finite and primary.

Remark. For $k = 0$ this theorem reduces to a result of Gruenberg ([69, Theorem B]); another proof of it is given in [74]. We note that for a non-limit $\lambda$ the relation $\Delta^\lambda = 0$ is not possible in the integral group ring of an infinite group; this follows from Lemma 3.57.

PROOF. a) The condition is sufficient. Let $\Gamma$ be a finite $p$-group. Let us denote the additive group of the ring $\mathbb{Z}_{p^n}\Gamma$ by $A$. The semidirect product of $A$ and $\Gamma$ is a finite $p$-group and, thus, nilpotent. But then the pair $(A, \Gamma)$ is stable, which is equivalent to the nilpotency of the fundamental ideal $\Delta_n \subset \mathbb{Z}_{p^n}\Gamma$. Thus there exists a number $m = m(n)$ such that $\Delta_n^m = 0$. Furthermore, let us remark that for every natural number $n$ the natural homomorphism of the rings of coefficients $\mathbb{Z} \to \mathbb{Z}_{p^n}$ induces a homomorphism of group rings $\nu_n : \mathbb{Z}\Gamma \to \mathbb{Z}_{p^n}\Gamma$. One verifies at once that for every natural number $m$ there holds the relation $(\Delta^m)^{\nu_n} \subset \Delta_n^m$. Together with $\Delta_n^{m(n)} = 0$ this shows that $\Delta^{m(n)} \subset \operatorname{Ker} \nu_n$. But $\operatorname{Ker} \nu_n$ consists of all elements of the ring $\mathbb{Z}\Gamma$ whose coefficients are divisible by $p^n$. Thus one has $\bigcap_{n=1}^\infty \operatorname{Ker} \nu_n = 0$. In view of the embeddings
$$\Delta^\omega \subset \bigcap_{n=1}^\infty \Delta^{m(n)} \subset \bigcap_{n=1}^\infty \operatorname{Ker} \nu_n$$
this gives the condition of the Theorem.

b) The condition is necessary. Let us set $\Delta^{\omega \cdot 0} = \mathbb{Z}\Gamma$. For each $i = 0, 1, \dots, k$ we denote by $\Sigma_i$ the subgroup of $\Gamma$ consisting of all elements which act as the identity on the factor $\Delta^{\omega i}/\Delta^{\omega(i+1)}$; we have $\Sigma_i \trianglelefteq \Gamma$. We apply Lemma 3.56 to the pair $(\Delta^{\omega i}/\Delta^{\omega(i+1)}, \Gamma/\Sigma_i)$. It follows that the groups $\Gamma/\Sigma_i$ are residually nilpotent and hence nilpotent, as these groups satisfy the descending chain condition for normal subgroups. Set $\Sigma^* = \bigcap_{i=0}^k \Sigma_i$; using Remak's theorem we see that the group $\Gamma/\Sigma^*$ is nilpotent. We remark further that the pair $(\mathbb{Z}\Gamma, \Sigma^*)$ is finitely stable, while the domain of action of this pair is torsion-free. It follows from Lemma 3.53 that $\Sigma^*$ acts trivially on $\mathbb{Z}\Gamma$. Consequently, $\Sigma^* \subset \operatorname{Ker}(\mathbb{Z}\Gamma, \Gamma)$. Again, the faithfulness of $(\mathbb{Z}\Gamma, \Gamma)$ shows that $\Sigma^* = 1$. Thus the group $\Gamma$ is nilpotent. However, in the class of nilpotent groups the descending chain condition for normal subgroups implies the same condition for all subgroups. Thus $\Gamma$ is an Artinian nilpotent
group. From the assumption and the fact that the terminal of an Artinian nilpotent group does not exceed ω it follows that Δω = 0. Again, arguing by contradiction and applying Lemma 3.59, it follows that Γ is a Chernikov p-group. Let Σ be a complete subgroup of primary index in Γ. Then Γ = Σ · Φ, where Φ is a finite p-group. Let us assume that indΓ Σ > 1. Then there exists a pair of non-unity elements σ ∈ Σ, ϕ ∈ Φ with ϕp = 1, which in view of Lemma 3.58 implies that (1 − ϕ)(1 − σ) ∈ Δω Γ . This contradicts the condition Δω Γ = 0. We conclude that either Σ = Γ or Σ = 1. In the first case ω 2 Lemma 3.61 gives Δω Γ = ΔΣ = ΔΣ = 0, which is a contradiction. Consequently, Σ = 1 and so Γ = Φ. 8. P. Smith [105] has shown that the nilpotency of a finite group is equivalent to the existence of x ∈ ΔΓ ⊂ ZΓ such that (15)
$$\Delta_\Gamma^{\omega} \cdot (1 - x) = 0.$$
It is clear that (15) implies that τ (Γ) ≤ ω. However, the converse is not true. P ROPOSITION 3.70. Let Γ be a group. In the group ring ZΓ holds the relation τ (Γ) = ω if and only if Γ has an invariant subgroup Σ such that [Σ, Σ] = Σ and τ (Γ/Σ) = ω. P ROOF. Taking Σ = 1, it is clear that the condition is necessary. ¯ = Γ/Σ and G = ZΓ. ¯ Let us, first of all, Let us show that it is sufficient. Denote Γ ω+1 show that ωΣ ⊂ ΔΓ . By the assumption, for each σ ∈ Σ there exist elements σ1 and σ2 in Σ such that σ = σ1−1 σ2−1 σ1 σ2 . We have (16) σ − 1 = σ1−1 σ2−1 (σ1 − 1)(σ2 − 1) − (σ2 − 1)(σ1 − 1) . This relation shows that ωΣ ⊂ Δ2Γ . In turn, this inclusion together with (16) gives ωΣ ⊂ Δ3Γ etc. We see that ωΣ ⊂ ∩n ΔnΓ = Δω Γ . Using (16) once more we deduce ω+1 . Hence, we have ΔnΓ + ωΣ = ΔnΓ for each natural from ωΣ ⊂ Δω Γ that ωΣ ⊂ ΔΓ number n. Therefore, under the isomorphism G ∼ = ZΓ/ωΓ the ideal Δω ¯ is identified Γ ω ω ¯ /ωΣ = Δ /ωΣ · Δ with ΔΓ /ωΣ. In view of τ (Γ) = ω we have Δω Γ /ωΣ, which Γ Γ ω+1 ω ⊂ Δ · Δ + ωΣ ⊂ Δ . Thus τ (Γ) ≤ ω. implies that Δω Γ Γ Γ Γ ¯ ∈ S n+1 \ S n On the other hand, let us remark the following: The relation (G, Γ) ¯ implies that (G, Γ) ∈ together with the right epimorphism of pairs (G, Γ) (G, Γ) S n+1 \ S n . In these conditions (ZΓ, Γ) ∈ S n , because we have the epimorphism of ¯ = ω we obtain from this τ (Γ) ≥ ω. pairs (ZΓ, Γ) (G, Γ). As a consequence of τ (Γ) This establishes the equality τ (Γ) = ω. Concretizing, let us consider, in the group of all substitutions of a set of cardinality ν, the subgroup Fν of all those substitutions which permute only a finite number of elements. Moreover, let Aν be the set of those elements in Fν which can be written as the product of an even number of transpositions; the group Aν is simple for all ν, except for ν = 4, Aν Fν and |Fν /Aν | = 2. Using Proposition 3.70 and the Theorem of Gruenberg mentioned in the previous Subsection we deduce that τ (Fν ) = ω. 
Thus, the series of groups $F_\nu$, $\nu \ge 5$, provides an example of non-nilpotent groups of arbitrarily large cardinality whose terminal is $\omega$.
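The commutator identity (16) used in the proof of Proposition 3.70 is purely formal, so it can be checked mechanically. The sketch below is an illustration only; it assumes the symmetric group on three letters as a test group, and the helper names (`pmul`, `rmul`, `identity_16_holds`, ...) are hypothetical. It verifies, for every pair $\sigma_1, \sigma_2$, that $\sigma - 1 = \sigma_1^{-1}\sigma_2^{-1}[(\sigma_1 - 1)(\sigma_2 - 1) - (\sigma_2 - 1)(\sigma_1 - 1)]$ with $\sigma = \sigma_1^{-1}\sigma_2^{-1}\sigma_1\sigma_2$.

```python
from collections import defaultdict
from itertools import permutations


def pmul(a, b):
    """Product of permutations given as tuples: apply a first, then b."""
    return tuple(b[i] for i in a)


def pinv(a):
    """Inverse permutation."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)


def rmul(u, v):
    """Product in the integral group ring; elements are {permutation: coefficient}."""
    out = defaultdict(int)
    for g, c in u.items():
        for h, d in v.items():
            out[pmul(g, h)] += c * d
    return {g: c for g, c in out.items() if c != 0}


def radd(u, v, sign=1):
    """Sum u + sign * v in the group ring, with zero coefficients dropped."""
    out = defaultdict(int, u)
    for g, c in v.items():
        out[g] += sign * c
    return {g: c for g, c in out.items() if c != 0}


def minus_one(g, e):
    """The element g - 1 of the group ring (e is the identity permutation)."""
    return radd({g: 1}, {e: 1}, sign=-1)


def identity_16_holds(s1, s2):
    """Check identity (16) for the commutator of s1 and s2."""
    e = tuple(range(len(s1)))
    prefix = pmul(pinv(s1), pinv(s2))          # s1^{-1} s2^{-1}
    sigma = pmul(prefix, pmul(s1, s2))         # the group commutator
    a, b = minus_one(s1, e), minus_one(s2, e)
    bracket = radd(rmul(a, b), rmul(b, a), sign=-1)
    return minus_one(sigma, e) == rmul({prefix: 1}, bracket)


elements = list(permutations(range(3)))
all_hold = all(identity_16_holds(s1, s2) for s1 in elements for s2 in elements)
print(all_hold)
```

Since both factors $\sigma_i - 1$ lie in the fundamental ideal, the identity places every $\sigma - 1$ with $\sigma$ a commutator into its square, which is the first step of the argument.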
3.3.2. Construction of stable representations of groups with the aid of the triangular product

1. In this Section the technique of triangular products and the connections with stability described in the last Subsection are applied to exhibit the possible values of the terminal of finite groups. The result is somewhat unexpected: in the non-trivial case (the meaning of this expression is revealed by Proposition 3.63) they comprise all ordinals $\tau$ with $\omega \le \tau < \omega 2$.

2. For given pairs $(A, P)$ and $(B, Q)$ consider their triangular product $(G, \Gamma) = (A, P) \triangledown (B, Q) = (A \oplus B, \Phi \rtimes \Sigma)$, where $\Phi = \operatorname{Hom}_{\mathbb{Z}}(B, A)$ and $\Sigma = P \times Q$. Let $G = G_0 \supset G_1 \supset \dots \supset G_\omega = \bigcap_k G_k \supset G_{\omega+1} \supset \dots$, where $G_\nu = [G, \Gamma; \nu]$, be the lower stable series of this pair $(G, \Gamma)$. Moreover, let $B^*$ be the subgroup of all $Q$-fixed elements in $B$, $B_k = [B, Q; k]$ and $A_k = [A, P; k]$, $k \in \mathbb{N}$. Fix prime numbers $p$ and $q$, $p \ne q$, and let us assume that the following conditions are fulfilled for the pairs $(A, P)$ and $(B, Q)$:
a) $A$ is an Abelian $p$-group, the pair $(A, P)$ being finitely stable; more exactly, $A_{n-1} \ne 0 = A_n$ for some $n \in \mathbb{N}$.
b) $B$ and $B/B_1$ are free Abelian groups, $B^*$ being a direct summand in $B$, all $B_1/B_k$ (for $k \ge 2$) being $q$-groups and $\bigcap_k B_k = 0$.
In the notation just defined and under the conditions stated we have the following.

THEOREM 3.71. There hold the relations $G_\omega = A$ and $G_{\omega+n-1} > G_{\omega+n} = 0$; if, in addition, it is required that $A$ is a vector space over a field of characteristic $p$ and that $Q$ is a finite $q$-group, then it is also true that $G \circ \Delta_\Gamma^\omega = A$.

PROOF. The proof of the Theorem is given in several steps.

(1) We show that $A \subset G_1$. We remark that $[B, \Phi] \subset G_1$ and $[b, \varphi] = -b + b \circ \varphi = -b + (b + b^\varphi) = b^\varphi$ for all $b \in B$ and $\varphi \in \Phi$. Take as the element $b$ a basis element in the free Abelian group $B$. Then for each $a \in A$ the map $b \mapsto a$ can be extended to a homomorphism $\varphi$ of $B$ to $A$. Hence we have $A \subset [B, \Phi] \subset G_1$.

(2) We show that $A \subset G_2$. The group $B_1$, being a subgroup of the free Abelian group $B$, is likewise free Abelian. The same reasoning as in (1) shows that $A \subset [B_1, \operatorname{Hom}(B_1, A)]$. The factor group $B/B_1$ is free by assumption, and so the subgroup $B_1$ is a direct summand of $B$, $B = T \oplus B_1$. Hence $\Phi = \operatorname{Hom}(T, A) \oplus \operatorname{Hom}(B_1, A)$, so $\operatorname{Hom}(B_1, A) \subset \Phi$. It follows that $A \subset [B_1, \operatorname{Hom}(B_1, A)] \subset [B_1, \Phi] \subset [G_1, \Gamma] = G_2$.

(3) The inclusion $A \subset G_k$ holds for all $k \ge 3$. Indeed, let $b_1$ be a generating element of the free Abelian group $B_1$. For any $a \in A$ there is a $\varphi \in \operatorname{Hom}(B_1, A)$ such that $b_1^\varphi = a$. Moreover, there exists a number $m$ such that $b := q^m b_1 \in B_{k-1}$, because $B_1/B_{k-1}$ is (by assumption) a $q$-group. Therefore
$$b^\varphi = (q^m b_1)^\varphi = q^m b_1^\varphi = q^m a.$$
In view of $p \ne q$, when the element $a$ runs through the whole group $A$, the element $x = q^m a$ will also run through the whole group $A$. We saw above that for any such element $x \in A$
there exists ϕ ∈ Hom(B1 , A) such that bϕ = x. Hence, for any x ∈ A, x = q m a, we have x = bϕ = −b+(b+bϕ) = −b+b◦ϕ = [b, ϕ] ∈ [Bk−1 , Hom(B1 , A)] ⊂ [Gk−1 , Γ] = Gk , which gives A ⊂ Gk . (4) Together with the obvious relation Bk ⊂ Gk , what is proved in (1)-(3) also gives A + Bk ⊂ Gk for all k. Let us show by induction over k that Gk = A + Bk , k = 0, 1, 2, . . . . For k = 0 we have trivially G0 = A + B0 . Let us assume that the equality Gs = A + Bs is true for all s ≤ k. In order to prove that Gk+1 = A + Bk+1 it suffices to check the validity of Gk+1 ⊂ A + Bk+1 . Take any a ∈ A, b ∈ Bk , ϕ ∈ Φ, σ ∈ Σ. Using the relations in Paragraph 2 of Section 3.1.3, we find [a + b, ϕσ] = −(a + b) + (a + b) ◦ ϕσ = −a − b + a ◦ ϕσ + b ◦ ϕσ = = −a − b + (a ◦ ϕ) ◦ σ + (b ◦ ϕ) ◦ σ = = −a + a ◦ σ − b + b ◦ σ + (bϕ ) ◦ σ = = [a, σ] + [b, σ] + [bϕ , σ] + bϕ ∈ A + Bk+1 . From this it follows that Gk+1 = [Gk , Γ] = [A + Bk , Γ] ⊂ A + Bk+1 , which completes the induction. (5) Next, we show that Gω = A. In view of the results (1)-(3) it is clear that it suffices to verify that Gω ⊂ A. Let x ∈ Gω . Then x = a1 + b1 where a1 ∈ A, b1 ∈ B1 . On the other hand, in view of (4), for every k > 1 there exist ak ∈ A, bk ∈ Bk such that x = ak + bk . We have a1 − ak = bk − b1 ∈ A ∩ B = 0, whence we obtain b1 = bk ∈ Bk . We conclude that b1 ∈ ∩k Bk . But Bω = 0 by assumption. So b1 = 0, and, by the same token, x = a ∈ A. (6) For any a ∈ A, γ ∈ Γ, γ = ϕσ we compute [a, γ] = −a + (a ◦ ϕ) ◦ σ = −a + a ◦ σ = [a, σ] ∈ A1 . This computation shows that [A, Γ] = A1 . Hence, we have Gω+1 = [Gω , Γ] = [A, Γ] = A1 . By induction over k we conclude that Gω+k = Ak for all k ≥ 1. In particular, Gω+n = An = 0 and Gω+n−1 = An−1 = 0. This concludes the proof of the first statement of the theorem. Let us pass to the proof of the second statement of the theorem. Thus, below we assume that A is a vector space over a field of characteristic p and that Q is a finite qgroup. In particular, Φ = Hom(B, A) is a p-group. 
Let Φ1 be the commutator of the subgroups Φ and Q, and Φ2 be the commutator of the subgroups Φ1 and Q in Γ; setting ¯ is a p-group. ¯ = Φ/Φ2 , it is clear that Φ Φ (7) One has the equality Φ1 = Φ2 . With the goal to prove this, let us remark that conjugation in the semidirect product Γ = Φ Σ induces an action of Σ on Φ which is 2-stable by the construction; for this reason cf. [70, p. 2] and Paragraph 2 of Section 3.1.3. We make the following observation. For any 2-stable pair (M, Σ) of Z-modules, where M is Abelian and Σ a q-group, we consider the commutator [M, Σ], that is, the
$\mathbb{Z}$-module generated by the commutators $[a, \sigma] = -a + a \circ \sigma$, $a \in M$, $\sigma \in \Sigma$. As a consequence of the 2-stability of $(M, \Sigma)$ we derive from $a \circ \sigma = a + [a, \sigma]$ that
$$a \circ \sigma^2 = (a + [a, \sigma]) \circ \sigma = a \circ \sigma + [a, \sigma] = a + 2[a, \sigma], \ \dots$$
and (by induction on $k$) $a \circ \sigma^k = a + k[a, \sigma]$. For some $n = n(\sigma)$ we have $\sigma^{q^n} = \varepsilon$, because $\Sigma$ is a $q$-group. Hence $a = a \circ \varepsilon = a \circ \sigma^{q^n} = a + q^n[a, \sigma]$, whence $q^n[a, \sigma] = 0$. Therefore $[M, \Sigma]$ is a $q$-group.

When applied to the pair $(\bar\Phi, Q)$ this observation shows that $\Phi_1/\Phi_2$ is a $q$-group. However, at the same time $\Phi_1/\Phi_2$ must be a $p$-group, as it is a subgroup of $\bar\Phi$. We conclude that $\Phi_1 \subset \Phi_2$. As $\Phi_2 \subset \Phi_1$ is evident, the equality $\Phi_1 = \Phi_2$ is proved.

(8) We present some auxiliary computations for the pair $(G, \Gamma)$, cf. Paragraph 2 of Section 3.1.3. For any $b \in B$, $\varphi \in \Phi$, $\sigma \in Q$ we have $b = b \circ \varphi\varphi^{-1} = (b + b^\varphi) \circ \varphi^{-1} = b \circ \varphi^{-1} + b^\varphi$, whence $b \circ \varphi^{-1} = b - b^\varphi$. Furthermore
$$b \circ [\varphi, \sigma] = (b - b^\varphi) \circ (\sigma^{-1}\varphi\sigma) = (b \circ \sigma^{-1} - b^\varphi) \circ (\varphi\sigma) = (b \circ \sigma^{-1} + (b \circ \sigma^{-1})^\varphi - b^\varphi) \circ \sigma = b + (b \circ \sigma^{-1})^\varphi - b^\varphi.$$
Take now arbitrary $b \in B$ and $\varphi_1 \in \Phi_1$. There exist $\varphi \in \Phi$ and $\sigma \in Q$ such that $\varphi_1 = [\varphi, \sigma]$, and so we find
$$[b, \varphi_1] = -b + b \circ [\varphi, \sigma] = (-b + b \circ \sigma^{-1})^\varphi = [b, \sigma^{-1}]^\varphi.$$
These computations show that $[B, \Phi_1] \subset [B, Q]^\Phi$. But the group $[B, Q] \subset B$ is free, and so $[B, Q]^\Phi = A$. Thus the module $A$ contains $[B, \Phi_1]$.

(9) Let us show that $A \subset [B, \Phi_1]$. This was shown in step (2) of this proof in the case $\Phi_1 = \Phi$. Let $\Phi_1 < \Phi$. Using the classical theorem of Maschke ([19, p. 182]) for the pair $(\Phi, Q)$, we obtain the existence of a $Q$-invariant decomposition $\Phi = \Phi_0 \oplus \Phi_1$. Moreover, $(\Phi_0, Q) \in \mathfrak{S}$. Indeed, for any $\varphi \in \Phi_0$ and $\sigma \in Q$ we have $-\varphi + \varphi \circ \sigma \in \Phi_1$, and also $-\varphi + \varphi \circ \sigma \in \Phi_0$, as $\Phi_0$ is $Q$-invariant. From $\Phi_1 \cap \Phi_0 = 0$ we conclude that $\varphi \circ \sigma = \varphi$. Consider the pair $(B, Q)$ and let $B^*$ be the set of all $Q$-fixed points of $B$. We check now that $\operatorname{Ann}_\Phi B^* = \Phi_1$. On the one hand, in view of the definition of the action of $\Phi$ on $B$ (Subsection 3.1.1.2) we must have $b \circ \varphi_1 = b + b^{\varphi_1}$ for any $b \in B^*$ and $\varphi_1 \in \Phi_1$. The elements $\varphi_1 = [\varphi, \sigma]$, where $\varphi \in \Phi$ and $\sigma \in Q$, generate $\Phi_1$, and so in view of the computations in step (8) we have $b \circ \varphi_1 = b \circ [\varphi, \sigma] = b + (b \circ \sigma^{-1})^\varphi - b^\varphi = b + b^\varphi - b^\varphi = b$, because $b \circ \sigma^{-1} = b$.
We come to the equalities $b + b^{\varphi_1} = b \circ \varphi_1 = b$, from which it follows that $b^{\varphi_1} = 0$. Hence $\varphi_1 \in \operatorname{Ann}_\Phi B^*$. So we have verified that $\Phi_1 \subset \operatorname{Ann}_\Phi B^*$. On the other hand, an immediate verification shows that for any $b \in B$ the element $\bar b = \sum_{\sigma \in Q} b \circ \sigma^{-1}$ is $Q$-invariant. Moreover, an arbitrary $\varphi \in \operatorname{Ann}_\Phi B^*$, as well as each other element of $\Phi = \Phi_0 \oplus \Phi_1$, can be written in the form $\varphi = \varphi_0 + \varphi_1$, where $\varphi_i \in \Phi_i$, $i = 0, 1$. Taking now into account the relations $\Phi_1 \subset \operatorname{Ann}_\Phi B^*$ and $(\Phi_0, Q) \in \mathfrak{S}$ obtained before, along with the formula
$$b^{\psi \circ \sigma} = (b \circ \sigma^{-1})^\psi \qquad \text{for all } b \in B,\ \psi \in \Phi,\ \sigma \in Q$$
in Paragraph 2 of Section 3.1.3, we obtain
$$0 = \bar b^{\varphi} = \bar b^{\varphi_0 + \varphi_1} = \bar b^{\varphi_0} + \bar b^{\varphi_1} = \bar b^{\varphi_0} = \sum_{\sigma \in Q} (b \circ \sigma^{-1})^{\varphi_0} = \sum_{\sigma \in Q} b^{\varphi_0 \circ \sigma} = \sum_{\sigma \in Q} b^{\varphi_0} = |Q| \cdot b^{\varphi_0}.$$
From this it follows that $b^{\varphi_0} = 0$, because $b^{\varphi_0}$ lies in the $p$-group $A$, while $|Q| = q^k$ for some $k \in \mathbb{N}$, and this is true for all $b \in B$. Hence $\varphi_0 = 0$. This argument shows that $\operatorname{Ann}_\Phi B^* \subset \Phi_1$ and so we have $\Phi_1 = \operatorname{Ann}_\Phi B^*$.

(10) The subgroup $B^*$ is servant (that is, pure) in $B$. Indeed, if for some $b \in B$ and non-zero $n \in \mathbb{Z}$ the element $nb$ is contained in $B^*$, then for any $\sigma \in Q$ we have $nb = n(b \circ \sigma)$, hence $n(b - b \circ \sigma) = 0$. For a free Abelian group $B$ this is possible only if $b \circ \sigma = b$. Hence $b \in B^*$ and we have established that $B^*$ is servant in $B$. We add, however, that this fact follows here from condition b), according to which there exists a subgroup $B_* \le B$ such that $B = B^* \oplus B_*$. From this we conclude that $\operatorname{Hom}(B, A) = \operatorname{Hom}(B^*, A) \oplus \operatorname{Hom}(B_*, A)$. We remark that for each $a \in A$ and a basis element $b_* \in B_*$ the map $b_* \mapsto a$ extends to a $\mathbb{Z}$-homomorphism $\varphi_* : B_* \to A$. We obtain from this $A \subset [B_*, \operatorname{Hom}(B_*, A)]$. The equality $\Phi_1 = \operatorname{Ann}_\Phi B^*$ proved above, along with the obvious relation $\operatorname{Hom}(B_*, A) \subset \Phi_1$, shows, however, that $A \subset [B, \Phi_1]$.

(11) Using the relation $\Phi_1 = \Phi_2$ we see that $\Psi = \Phi_1 \rtimes Q$ is a subgroup of the group $\Gamma$ and that $\Phi_1 \trianglelefteq \Psi$. We shall find the ideal $\Delta_\Psi^\omega$ in the ring $\mathbb{Z}\Psi$. For any subgroup $\Psi^* \le \Psi$ we denote by $\tilde\omega\Psi^*$ the right ideal in $\mathbb{Z}\Psi$ generated by all $\psi - 1$ where $\psi \in \Psi^*$. In the same way as in the proof of Proposition 3.70, one can prove that $\tilde\omega\Phi_1 \subset \Delta_\Psi^\omega$. Let us prove the converse inclusion. First, we remark that $\Psi/\Phi_1$ is a finite $q$-group, so that $\Delta_{\Psi/\Phi_1}^\omega = 0$. But
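$$\Delta_{\Psi/\Phi_1}^\omega = \bigcap_{n=1}^\infty (\Delta_\Psi^n + \tilde\omega\Phi_1)/\tilde\omega\Phi_1,$$
which gives
$$\tilde\omega\Phi_1 \supset \bigcap_{n=1}^\infty (\Delta_\Psi^n + \tilde\omega\Phi_1) \supset \bigcap_{n=1}^\infty \Delta_\Psi^n = \Delta_\Psi^\omega.$$
Thus we have the equality $\Delta_\Psi^\omega = \tilde\omega\Phi_1$.

(12) By what was set out above, it follows that
$$G_\omega = A = [B, \Phi_1] \subset B \circ \tilde\omega\Phi_1 = B \circ \Delta_\Psi^\omega \subset B \circ \Delta_\Gamma^\omega \subset G \circ \Delta_\Gamma^\omega.$$
From this we obtain $A = G_\omega = G \circ \Delta_\Gamma^\omega$, because the inclusion converse to $G_\omega \subset G \circ \Delta_\Gamma^\omega$ is always true, cf. Subsection 3.3.1.1.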
3. Let us pass to estimating the terminal of the group $\Gamma$ introduced in the previous Subsection; to this end we assume that the auxiliary requirements indicated in the statement of Theorem 3.71 are fulfilled. Assume that $\Delta_\Gamma^{\omega+n-1} = \Delta_\Gamma^{\omega+n}$; then we have

$$0 \ne G_{\omega+n-1} \subset G_\omega \circ \Delta_\Gamma^{n-1} = (G \circ \Delta_\Gamma^{\omega}) \circ \Delta_\Gamma^{n-1} = G \circ (\Delta_\Gamma^{\omega} \cdot \Delta_\Gamma^{n-1}) = G \circ \Delta_\Gamma^{\omega+n-1} = G \circ \Delta_\Gamma^{\omega+n} \subset G_{\omega+n} = 0.$$

This contradiction proves that $\Delta_\Gamma^{\omega+n-1} \ne \Delta_\Gamma^{\omega+n}$. A quantitative reformulation of the result of these computations gives the following.

THEOREM 3.72. The terminal of the group $\Gamma$, introduced in Subsection 3.3.2.2, is not less than $\omega + n$.

4. We give an application of this result. Fix $n \in \mathbb{N}$, let $A$ be an $n$-dimensional vector space over $Z_p$, and let $P = UT^1(n, Z_p)$ be the group of $(n \times n)$-matrices with elements in $Z_p$ with unit main diagonal and zeros under it (the unitriangular group). We denote by $UT^r(n, Z_p)$ the subset of matrices in $P$ with $r - 1$ zero diagonals above the main diagonal. In view of (12) in [19, p. 38] we have the relation

$$[UT^r(n, Z_p), UT^1(n, Z_p)] = UT^{r+1}(n, Z_p),$$

showing that the nilpotency class of $P$ equals $n - 1$. Let us remark that, in a natural way, there appears the pair $(A, P)$, which is faithful and has a stable series of length $n$. However, $A$ does not have a $P$-stable series of length less than $n$, because otherwise by Kaluzhnin’s theorem ([19, p. 144]) the nilpotency class of $P$ would be less than $n - 1$. The pair $(A, P)$ thus satisfies requirement a) in Theorem 3.71; cf. the beginning of Subsection 3.3.2.2.

Furthermore, let us take for $B$ the additive group of the integral group ring of a finite $q$-group $Q$, leading to the regular pair $(B, Q)$; then $B = B_0 = \mathbb{Z}Q(+)$ and $B_k = \Delta_Q^k$, $k = 1, 2, \dots$, while the Abelian group $B/B^*$ decomposes into a direct sum of cyclic subgroups, because together with $B$ also $B/B^*$ is finitely generated. Hence, the servant subgroup $B^*$ is a direct summand of $B$ ([22, p. 150]). It is easy to see that for such a group $B$ the whole of requirement b) in Theorem 3.71 is fulfilled. In [61, p. 277] one finds, writing $\exp(Q/[Q, Q]) = n$, the simple fact that the additive group $\Delta_Q/\Delta_Q^k$ has an exponent dividing $n^k$ (and hence is a $q$-group), but this can also be proved by the argument in step (7) of the proof of Theorem 3.71, by applying it to the pairs

$$(\Delta_Q^{k-2}/\Delta_Q^{k},\ Q/[Q, Q]), \qquad k = 2, 3, \dots,$$

and making an induction over $k$. Hence, the results of the two previous Subsections are also true for the triangular product $(G, \Gamma) = (A, P) \triangledown (B, Q)$ introduced here. We obtain the following result.

THEOREM 3.73. For each natural number $n$ there exists a finite group such that in its integral group ring the $(\omega + n - 1)$-th and $(\omega + n)$-th powers of the fundamental ideal are distinct.

An example in [71, p. 223] shows that all values $\omega + n$, $n = 0, 1, 2, \dots$, indeed appear as terminals of finite groups.
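The claim that the unitriangular group over $Z_p$ has nilpotency class $n - 1$ can be checked by brute force for small parameters. The following is a minimal sketch, not part of the original text: all function names are illustrative, and the lower central series is computed naively by closing the set of commutators under multiplication.

```python
from itertools import product


def identity(n):
    return tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))


def mat_mul(a, b, p):
    """Product of two n x n matrices with entries reduced modulo p."""
    n = len(a)
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(n)) % p for j in range(n))
        for i in range(n)
    )


def inverse(a, p):
    """Inverse in a finite group, found as the last power of a before the identity."""
    e = identity(len(a))
    x = a
    while True:
        nxt = mat_mul(x, a, p)
        if nxt == e:
            return x
        x = nxt


def unitriangular_group(n, p):
    """All upper unitriangular n x n matrices over Z_p."""
    slots = [(i, j) for i in range(n) for j in range(i + 1, n)]
    group = set()
    for values in product(range(p), repeat=len(slots)):
        m = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
        for (i, j), v in zip(slots, values):
            m[i][j] = v
        group.add(tuple(map(tuple, m)))
    return group


def generated_subgroup(gens, n, p):
    """Closure of gens under multiplication (a subgroup, since everything is finite)."""
    e = identity(n)
    elems, frontier = {e}, [e]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = mat_mul(x, g, p)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def commutator_subgroup(h, g, n, p):
    """Subgroup generated by all x^-1 y^-1 x y with x in h, y in g."""
    comms = set()
    for x in h:
        xi = inverse(x, p)
        for y in g:
            yi = inverse(y, p)
            comms.add(mat_mul(mat_mul(xi, yi, p), mat_mul(x, y, p), p))
    return generated_subgroup(comms, n, p)


def lower_central_series_orders(n, p):
    """Orders of the terms of the lower central series of UT(n, Z_p)."""
    g = unitriangular_group(n, p)
    term, orders = g, [len(g)]
    while len(term) > 1:
        term = commutator_subgroup(term, g, n, p)
        orders.append(len(term))
    return orders


if __name__ == "__main__":
    print(lower_central_series_orders(3, 3))
    print(lower_central_series_orders(4, 2))
```

The number of non-trivial steps in the returned list is the nilpotency class, so for $n = 3$ and $n = 4$ one expects series of lengths 2 and 3 respectively, in line with the relation $[UT^r, UT^1] = UT^{r+1}$ quoted above.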
5. Groups in which there exists an invariant nilpotent subgroup with a nilpotent factor group are usually called metanilpotent. We denote in this Subsection by $\Gamma_n$ the $n$-th term of the lower central series of the group $\Gamma$. Along with Theorem 3.73 one may now formulate the following statement, which gives further properties of terminals of finite groups.

THEOREM 3.74. Let there be given a representation $(G, \Gamma)$ of the group $\Gamma$, with all metanilpotent factor groups torsion and the factor group $\Gamma/\bigcap_n \Gamma_n$ nilpotent, by automorphisms of the $\mathbb{Z}$-module $G$ whose torsion part $B$ is $\Gamma$-Artinian, while the factor module $G/B$ is $\Gamma$-Noetherian. If $G$ has a $\Gamma$-stable descending series of length $\le \omega n$ ($n \in \mathbb{N}$) which reaches zero, then the lower stable series of the pair $(G, \Gamma)$ stabilizes to zero at a term of index $< \omega 2$.

PROOF. The statement of the theorem is obvious for $n = 1$, because the terms of the lower stable series of the pair $(G, \Gamma)$ are contained in the corresponding terms of the given $\Gamma$-stable series of $G$. Hence we may assume that $n \ge 2$. By assumption, there exists in the module $G$ a descending stable series of length $\le \omega n$,

$$G \supset G_1 \supset \cdots \supset G_\nu \supset \cdots \supset G_\lambda \supset \cdots \supset G_\mu = 0. \tag{17}$$

Consider the family of submodules $\{G_\lambda + B \mid \lambda \le \mu\}$. In view of the $\Gamma$-stability of the series (17) and the $\Gamma$-invariance of the module $B$ we have for a non-limit $\lambda$

$$[G_{\lambda-1} + B, \Gamma] \subset [G_{\lambda-1}, \Gamma] + [B, \Gamma] \subset G_\lambda + B. \tag{18}$$

Let us show that for a limit $\lambda$ the following relation holds:

$$G_\lambda + B = \bigcap_{\alpha<\lambda} (G_\alpha + B). \tag{19}$$

We introduce the factor modules $\bar G \overset{\mathrm{def}}{=} G/G_\lambda$, $\bar G_\alpha \overset{\mathrm{def}}{=} G_\alpha/G_\lambda$ for $\alpha \le \lambda$, and $\bar B \overset{\mathrm{def}}{=} (B + G_\lambda)/G_\lambda$; the module $\bar B$ is $\Gamma$-Artinian, in view of the $\Gamma$-Artinicity of $B$. It is clear that (19) is equivalent to the equality $\bar B = \bigcap_{\alpha<\lambda} (\bar G_\alpha + \bar B)$, which we shall now prove. We remark that in $\bar B$ one has the descending series of submodules

$$\bar B \supset \bar G_1 \cap \bar B \supset \cdots \supset \bar G_\alpha \cap \bar B \supset \cdots,$$

which, as $\bar B$ is Artinian, must stabilize at some index $\beta$, $\beta < \lambda$. From the fact that the series (17) exists we obtain the relation $\bigcap_{\alpha<\lambda} \bar G_\alpha = \bar 0$. This gives $\bigcap_{\alpha<\lambda} (\bar B \cap \bar G_\alpha) = \bar 0$. So the stabilization mentioned above occurs at the term equal to $\bar 0$. Hence, we have $\bar G_\alpha \cap \bar B = \bar 0$ for all $\alpha$ with $\beta \le \alpha \le \lambda$. As $\lambda$ is a limit ordinal, there are infinitely many such ordinal numbers $\alpha$.

It is now easy to see that $\bigcap_{\alpha<\lambda} (\bar G_\alpha + \bar B) \subset \bar B$. Indeed, take any element $\bar x$ in $\bigcap_{\alpha<\lambda} (\bar G_\alpha + \bar B)$. Then $\bar x \in \bar G_\beta + \bar B$, and so $\bar x = \bar g + \bar b$ for some $\bar g \in \bar G_\beta$ and $\bar b \in \bar B$. On the other hand, as $\bar x$ belongs to $\bigcap_{\alpha<\lambda} (\bar G_\alpha + \bar B)$, it follows that for each $\alpha$, $\beta < \alpha < \lambda$, there exist elements $\bar g_\alpha \in \bar G_\alpha$ and $\bar b_\alpha \in \bar B$ such that $\bar x = \bar g_\alpha + \bar b_\alpha$. We have

$$\bar g - \bar g_\alpha = \bar b_\alpha - \bar b \in \bar G_\beta \cap \bar B = \bar 0,$$

whence $\bar g = \bar g_\alpha$. As a consequence, the element $\bar g$ lies in every $\bar G_\alpha$, $\beta < \alpha < \lambda$, and therefore also in the module $\bigcap_{\alpha<\lambda} \bar G_\alpha$, which equals $\bar 0$. We obtain $\bar x = \bar g + \bar b = \bar b \in \bar B$. The reverse inclusion being obvious, the equality $\bigcap_{\alpha<\lambda} (\bar G_\alpha + \bar B) = \bar B$ is now proved.
Let $\omega m$, $m \in \mathbb{N}$, be the last limit ordinal not exceeding $\mu$. Then one can find a non-negative integer $e$ such that $\mu = \omega m + e$. It follows from the relations (18) and (19) that in $G$ one has the $\Gamma$-stable series

$$G \supset G_1 + B \supset \cdots \supset G_\omega + B \supset \cdots \supset G_{\omega 2} + B \supset \cdots \supset G_{\omega m} + B \supset \cdots \supset B, \tag{20}$$

where the link from $G_{\omega m} + B$ to $B$ contains $e$ members. Let $\Sigma_0$ be the kernel of the pair $(G/B, \Gamma)$, $\Sigma_1$ the kernel of the pair $(G/(G_\omega + B), \Gamma)$, $\Sigma_k$ the kernel of the pair $((G_{\omega(k-1)} + B)/(G_{\omega k} + B), \Gamma)$ for $k = 2, \dots, m$, and $\Sigma_{m+1}$ the kernel of the pair $((G_{\omega m} + B)/B, \Gamma)$. An application of Lemma 3.56 implies that the groups $\Gamma/\Sigma_k$, $k = 1, \dots, m$, are residually nilpotent. This yields the relations $\bigcap_n \Gamma_n \subset \Sigma_k$, thanks to which it follows from the nilpotency of $\Gamma/\bigcap_n \Gamma_n$ that all groups $\Gamma/\Sigma_k$, $k = 1, \dots, m$, are nilpotent; the nilpotency of $\Gamma/\Sigma_{m+1}$ follows again from Kaluzhnin’s Theorem. Let $\Sigma^* = \bigcap_{k=1}^{m+1} \Sigma_k$; it is clear that $\Sigma_0 \subset \Sigma^*$, while an application of Remak’s theorem shows that $\Gamma/\Sigma^*$ is nilpotent. Likewise, the factor group $\Sigma^*/\Sigma_0$ is nilpotent, because the subpair $(G/B, \Sigma^*/\Sigma_0)$ of $(G/B, \Gamma)$ is faithful and finitely stable. Thus the group $\Gamma/\Sigma_0$ is metanilpotent, but then according to the condition of the Theorem it is also a torsion group. Its subgroup $\Sigma^*/\Sigma_0$ acts faithfully and finitely stably on the torsion-free module $G/B$, which according to Lemma 3.53 implies that $\Sigma^*/\Sigma_0 = 1$. Thus, we have shown that the group $\Gamma/\Sigma_0$ of automorphisms of the $\Gamma$-Noetherian module $G/B$ is nilpotent. We may apply Theorem 3.66; we find that the lower stable series of the pair $(G/B, \Gamma/\Sigma_0)$ must stabilize at a term of index $\le \omega$,

$$G/B \supset \cdots \supset (G/B)_\omega = (G/B)_{\omega+1} = \cdots. \tag{21}$$

We have, however, the equation $(G/B)_\omega = \bar 0$, because the terms of the series (21) are contained in the corresponding terms of the $\Gamma/\Sigma_0$-stable series

$$G/B \supset (G_1 + B)/B \supset \cdots \supset (G_\omega + B)/B \supset \cdots \supset \bar 0.$$

Let us consider the series of preimages of the terms of the series (21) under the epimorphism $G \to G/B$, and let us add to it the finite chain from $B$ down to $0$ obtained from the original series (17); such a chain exists as $B$ is a $\Gamma$-Artinian module. We obtain a descending $\Gamma$-stable series of length $< \omega 2$ for the module $G$,

$$G \supset \cdots \supset B \supset \cdots \supset 0. \tag{22}$$

As the terms of the lower $\Gamma$-stable series of $G$ are contained in the corresponding terms of the series (22), the lower stable series becomes zero at a term of index $< \omega 2$. This concludes the proof of Theorem 3.74.

6. For representations of an Artinian group $\Gamma$ by automorphisms of the $\mathbb{Z}$-module $G$ we consider the corresponding lower stable series,

$$G \supset G_1 \supset \cdots \supset G_\omega \supset \cdots \supset G_{\omega 2} \supset \cdots.$$

Let us assume that the torsion part of the module $G/G_{\omega 2}$ is a $\Gamma$-Artinian $\mathbb{Z}$-module, and that its quotient by the torsion part is $\Gamma$-Noetherian. Then it follows from Theorem 3.74 that the lower stable series of $G/G_{\omega 2}$ stabilizes to zero at a term of index $< \omega 2$. Hence, in the initial series we have for some $n \in \mathbb{N}$

$$G_{\omega+n} = G_{\omega+n+1} = \cdots.$$
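For a regular pair $(\mathbb{Z}\Gamma, \Gamma)$ the lower stable series is the series of powers of the fundamental ideal, so the stabilization index is the terminal of $\Gamma$. As a minimal worked illustration (added here, and assuming the usual conventions $\Delta^\omega = \bigcap_n \Delta^n$, $\Delta^{\omega+1} = \Delta^\omega \cdot \Delta$, with the terminal the least ordinal $\tau$ such that $\Delta^\tau = \Delta^{\tau+1}$), take the cyclic group $C_2 = \{1, g\}$:

```latex
\Delta = \mathbb{Z}\,(g-1), \qquad (g-1)^2 = g^2 - 2g + 1 = -2\,(g-1),
\\[4pt]
\Delta^{n} = 2^{\,n-1}\,\mathbb{Z}\,(g-1) \quad (n \ge 1), \qquad
\Delta^{n} \supsetneq \Delta^{n+1} \ \text{for all finite } n,
\\[4pt]
\Delta^{\omega} = \bigcap_{n \ge 1} 2^{\,n-1}\,\mathbb{Z}\,(g-1) = 0 = \Delta^{\omega+1}.
```

Under these assumptions the series strictly decreases at every finite step and first stabilizes at $\omega$, so $\tau(C_2) = \omega$, which is below the bound $\omega 2$ discussed in this Subsection.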
Let $\Gamma$ be an arbitrary finite group. The regular pair $(\mathbb{Z}\Gamma, \Gamma)$ can be taken as the representation $(G, \Gamma)$ considered above in this Subsection. The lower stable series of this pair is the series (9). As $\Gamma$ is a finite group, the $\mathbb{Z}$-module $\mathbb{Z}\Gamma/\Delta^{\omega 2}$ is Noetherian and so its torsion part must be finite. Thus the series (9) stabilizes at a term of index $< \omega 2$. In other words, we have proved the following.

THEOREM 3.75. The terminal of a finite group is less than $\omega 2$.

3.3.3. Generalized measure subgroups of finite groups

1. In this Section we show that the results set forth above on the terminal and the triangular product construction of linear pairs allow one to give a completely closed description of the limit groups in the class of finite groups, a task referred to by B. Hartley in [80, p. 15]. The nilpotent coradical of a group $\Gamma$ is the subgroup $N(\Gamma)$ of $\Gamma$ which is the intersection of all normal subgroups of $\Gamma$ such that the quotient groups of $\Gamma$ by them are nilpotent.

PROPOSITION 3.76. If $\Gamma$ is a finite group, then the kernel of the regular pair $(\mathbb{Z}\Gamma/\Delta^\omega, \Gamma)$ is the nilpotent coradical of $\Gamma$.

PROOF. Let $\Sigma$ be the kernel of $(\mathbb{Z}\Gamma/\Delta^\omega, \Gamma)$. The group $\Gamma/\Sigma$ acts in $\mathbb{Z}\Gamma/\Delta^\omega$ faithfully and $\omega^*$-stably. An application of Lemma 3.56 shows that $\Gamma/\Sigma$ is nilpotent. Hence, we have $\Sigma \supset N(\Gamma)$.

In order to prove the inclusion $\Sigma \subset N(\Gamma)$, we require an observation: for each finite nilpotent group $\Gamma$ there exists a natural number $n$ such that $(\mathbb{Z}\Gamma/\Delta^n, \Gamma)$ is faithful. Indeed, if $\Gamma_p$ is the $p$-component in the primary decomposition of $\Gamma$, then $\Gamma_p$ acts faithfully on the basis subgroup $B_p$ of the wreath product $Z_p \,\mathrm{wr}\, \Gamma_p$, because the action is regular. This action is also stable, since $B_p \Gamma_p$ is a finite $p$-group. The pair $(\bigoplus_p B_p, \Gamma)$ gives an $n$-stable representation of $\Gamma$, where for $n$ we take the maximal length of the $\Gamma$-stable series of the groups $B_p$. But then, likewise, the free representation corresponding to the pair $(\mathbb{Z}\Gamma/\Delta^n, \Gamma)$ is faithful.

Next, let $\Gamma$ again be an arbitrary finite group. Then the group $\Gamma/N(\Gamma)$ is nilpotent. Therefore, by our observation, there exists a number $n$ such that the pair $(\mathbb{Z}\Gamma/\Delta^n, \Gamma/N(\Gamma))$ is faithful. In other words, the coradical $N(\Gamma)$ is the kernel of the pair $(\mathbb{Z}\Gamma/\Delta^n, \Gamma)$. But the kernel of the pair $(\mathbb{Z}\Gamma/\Delta^\omega, \Gamma)$ evidently must be contained in the kernel of the pair $(\mathbb{Z}\Gamma/\Delta^n, \Gamma)$, so we must have $\Sigma \subset N(\Gamma)$.

One can also look on the reasoning given above as a new proof of a theorem of Buckley [60], based on the technique of linear pairs. Indeed, let $D_k$ be the kernel of $(\mathbb{Z}\Gamma/\Delta^k, \Gamma)$, $D_\omega = \bigcap_k D_k$, while $D^*$ is the kernel of $(\mathbb{Z}\Gamma/\Delta^\omega, \Gamma)$; it is readily seen that $D^* = D_\omega$ (the proof amounts to unwinding the definitions of $D^*$ and $D_\omega$). Moreover, let $\Gamma_\omega$ be the intersection of the terms of the lower central series of the finite group $\Gamma$; it is clear that $\Gamma_\omega = N(\Gamma)$. Proposition 3.76 now yields the following.

COROLLARY 3.77 ([60, Theorem 2]). For each finite group $\Gamma$ one has the equality $D_\omega = \Gamma_\omega$.
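The group-theoretic side of this equality, $\Gamma_\omega = N(\Gamma)$, is easy to compute for small groups. The sketch below is an added illustration, not taken from the source; it represents symmetric groups by permutation tuples, the helper names are hypothetical, and it finds the term at which the lower central series stops decreasing.

```python
from itertools import permutations


def compose(p, q):
    """Product of permutations given as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)


def invert(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def closure(gens, degree):
    """Subgroup generated by gens (closure under products suffices in a finite group)."""
    e = tuple(range(degree))
    elems, frontier = {e}, [e]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def commutator_subgroup(h, g, degree):
    """Subgroup generated by all commutators x^-1 y^-1 x y, x in h, y in g."""
    comms = {
        compose(compose(invert(x), invert(y)), compose(x, y))
        for x in h
        for y in g
    }
    return closure(comms, degree)


def nilpotent_coradical_of_symmetric_group(degree):
    """Stable term of the lower central series of S_degree."""
    g = set(permutations(range(degree)))
    term = g
    while True:
        nxt = commutator_subgroup(term, g, degree)
        if nxt == term:
            return term
        term = nxt


if __name__ == "__main__":
    for d in (3, 4):
        print(d, len(nilpotent_coradical_of_symmetric_group(d)))
```

For $S_3$ and $S_4$ the series stabilizes at the alternating subgroup, whose factor group has order 2 and is therefore nilpotent.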
Remark. It remains to settle the question whether every group $\Gamma$ admits a representation of class $S^\omega$ exactly when it is residually nilpotent. It is clear that this is equivalent to the question about the equality $D_\omega = \Gamma_\omega$, and that a positive answer to it would follow from a positive solution of the problem of the existence, for each nilpotent group, of a faithful finitely stable representation. The existence of such a representation is evident for finite nilpotent groups (cf. the observation in the proof of the previous Proposition); it was proven by A. I. Mal’cev [27] for torsion-free nilpotent groups and, likewise, for nilpotent groups with finite exponent, while B. I. Plotkin [39] established it for nilpotent groups of finite special rank.

2. Let $\Gamma$ be a torsion group, $\Omega$ the set of all prime numbers, $\pi \subset \Omega$, $\pi' = \Omega \setminus \pi$, while $\Gamma_{\pi'}$ is the subgroup of $\Gamma$ generated by all $\pi'$-elements in $\Gamma$. It is clear that $\Gamma_{\pi'} \trianglelefteq \Gamma$. Let us introduce the notation $\Pi_2(\Gamma)$ for $\bigcap_{\pi, |\pi|=2} \Gamma_{\pi'}$, and consider the class of torsion groups $\Gamma$ satisfying $\Pi_2(\Gamma) = 1$. We show that these are precisely the residually biprimary groups.

Indeed, on the one hand, $\Gamma/\Gamma_{\pi'}$ is a $\pi$-group. We denote by $\Pi(n)$ the set of all prime divisors of $n$, and by $\Pi(\Gamma)$ the set of prime divisors of the orders of all elements of $\Gamma$. Let $\gamma \in \Gamma$ be an element of order $n$, $\gamma \notin \Gamma_{\pi'}$. Then $\Pi(n) \not\subset \pi'$, whence $\Pi(n) \cap \pi \ne \emptyset$. Let $n = m \cdot n'$, where $\Pi(m) \subset \Pi(n) \cap \pi$ and $\Pi(n') \cap \pi = \emptyset$. Then it follows from $1 = \gamma^n = (\gamma^m)^{n'}$ that $\gamma^m \in \Gamma_{\pi'}$, as $\Pi(n') \subset \pi'$. We have $(\gamma\Gamma_{\pi'})^m = 1$, which again, in view of $\Pi(m) \subset \pi$, implies that $\Pi(\gamma\Gamma_{\pi'}) \subset \pi$. As a consequence, each element of $\Gamma/\Gamma_{\pi'}$ is a $\pi$-element.

On the other hand, let $\Gamma$ be a residually biprimary torsion group: there exist subgroups $\Sigma_i \trianglelefteq \Gamma$, $i \in I$, such that every $\Gamma/\Sigma_i$ is a $\pi_i$-group, $|\pi_i| = 2$ and $\bigcap_{i \in I} \Sigma_i = 1$. We show that $\Gamma_{\pi_i'} \subset \Sigma_i$ for all $i \in I$. It suffices to check that each $\pi_i'$-element in $\Gamma$ lies in $\Sigma_i$. Take any $\gamma \in \Gamma$, $\gamma^m = 1$, $\Pi(m) \subset \pi_i'$. It follows from $\Pi(\Gamma/\Sigma_i) \subset \pi_i$ that there exists a $\pi_i$-number $n$ such that $\gamma^n \in \Sigma_i$. In view of $\Pi(m) \cap \Pi(n) = \emptyset$ there exist integers $u, v$ such that $um + vn = 1$. We have

$$\gamma = \gamma^{um+vn} = (\gamma^m)^u (\gamma^n)^v = (\gamma^n)^v \in \Sigma_i,$$

i.e. $\gamma \in \Sigma_i$. This reasoning proves that $\Gamma_{\pi_i'} \subset \Sigma_i$, which immediately implies that $\Pi_2(\Gamma) = 1$. The original statement is completely proved.

Let us now study some auxiliary properties of the class $N^{(2)}$, introduced by B. Hartley [75]. This class of finite groups is defined via the following conditions (see footnote 20):

$$\Gamma \in N^{(2)} \iff \Gamma \in AN \ \text{ and } \ \Pi_2(\Gamma) = 1. \tag{23}$$
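The step $\gamma = (\gamma^m)^u(\gamma^n)^v = (\gamma^n)^v$ used above is a Bezout-coefficient argument: if $\gamma^m = 1$ and $\gcd(m, n) = 1$, then $\gamma$ is a power of $\gamma^n$ and so lies in every subgroup containing $\gamma^n$. A minimal sketch in the multiplicative group of integers modulo a prime (an illustrative setting chosen here, not one used in the source; the function names are hypothetical):

```python
def bezout(m, n):
    """Return (u, v) with u*m + v*n == gcd(m, n), by the extended Euclidean algorithm."""
    old_r, r = m, n
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    return old_u, old_v


def recover_from_power(gamma, m, n, modulus):
    """Given gamma**m == 1 (mod modulus) and gcd(m, n) == 1, rebuild gamma from gamma**n.

    Uses gamma = (gamma**m)**u * (gamma**n)**v = (gamma**n)**v with u*m + v*n = 1.
    """
    u, v = bezout(m, n)
    if u * m + v * n != 1:
        raise ValueError("m and n must be coprime")
    gamma_n = pow(gamma, n, modulus)
    # A negative exponent with a modulus is supported by pow() in Python 3.8+.
    return pow(gamma_n, v, modulus)


if __name__ == "__main__":
    # 2 has order 5 modulo 31 (2**5 = 32), and 6 is a {2, 3}-number coprime to 5.
    print(recover_from_power(2, 5, 6, 31))
```

In the notation of the text, `gamma` plays the role of a $\pi_i'$-element, `m` is its order, and `n` is the $\pi_i$-number with $\gamma^n \in \Sigma_i$.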
(Footnote 20. As usual, $A$ denotes the class of Abelian groups, and $N$ the class of nilpotent groups.)

As a consequence of what was said above and the properties of the variety $AN$ we see that the $N^{(2)}$-groups are subdirect products of biprimary $AN$-groups, and, conversely, every such product is an $N^{(2)}$-group.

THEOREM 3.78 ([75, Theorem 2]). Each $N^{(2)}$-group has a faithful stable representation of type $(\omega + n)^*$ in a finitely generated Abelian group.

PROOF. The proof consists of a reduction of $N^{(2)}$-groups to groups of a rather special form, for which the desired representation is constructed with the aid of the triangular product construction. Let us introduce the corresponding class of finite groups $R$: by
definition $T \in R$ if for some prime numbers $p$ and $q$ there exist an Abelian $p$-group $A$ and a nilpotent $(p, q)$-group $B$ such that $T = A \,\mathrm{wr}\, B$.

Let us first show (see footnote 21) that $N^{(2)} \le R_0 S R$. For any prime numbers $p$ and $q$ set $\pi = \{p, q\}$ and $\bar\Gamma = \Gamma/\Gamma_{\pi'}$. As the properties of being $AN$- and $\pi$-groups are preserved under epimorphisms, we have $\bar\Gamma \in AN$ and $\Pi(\bar\Gamma) \subset \pi$. Therefore there exists a normal subgroup $\bar A \trianglelefteq \bar\Gamma$ such that $\bar A \in A$, and $\bar\Sigma = \bar\Gamma/\bar A$ is a nilpotent $\pi$-group. Let $\bar A = \bar A(p) \times \bar A(q)$ be the primary decomposition of $\bar A$. It is easy to observe that $\bar\Gamma/\bar A(q)$ is an extension of $\bar A(p)$ by $\bar\Sigma$, from which, by a well-known theorem ([19, p. 70]), it follows that the wreath product $\bar A(p) \,\mathrm{wr}\, \bar\Sigma$ contains a subgroup isomorphic to $\bar\Gamma/\bar A(q)$. Hence, $\bar\Gamma/\bar A(q) \in SR$, and, as $\bar A(p) \cap \bar A(q) = 1$, then $\bar\Gamma \in R_0 S R$. It follows now from $\Pi_2(\Gamma) = 1$ that also $\Gamma \in R_0 S R$.

Fix the number $n$. It is not hard to see that for the class of groups $\overrightarrow{S^{\omega+n}}$ we have the equality $R_0 S\, \overrightarrow{S^{\omega+n}} = \overrightarrow{S^{\omega+n}}$. Indeed, the class $\overrightarrow{S^{\omega+n}}$ is, trivially, $S$-closed. Moreover, assume that for some group $\Gamma$ there exist two normal subgroups $\Sigma_1$ and $\Sigma_2$ such that $\Sigma_1 \cap \Sigma_2 = 1$ and the factor groups $\Gamma/\Sigma_1$ and $\Gamma/\Sigma_2$ lie in $\overrightarrow{S^{\omega+n}}$. Then we have the faithful pairs $(G_i, \Gamma/\Sigma_i) \in S^{\omega+n}$, $i = 1, 2$. We denote $G = G_1 \oplus G_2$ and $\bar\Gamma = \Gamma/\Sigma_1 \times \Gamma/\Sigma_2$; we have the pair $(G, \bar\Gamma)$, which is faithful and lies in $S^{\omega+n}$. As the group $\Gamma$ may be viewed as a subgroup of $\bar\Gamma$, we obtain the pair $(G, \Gamma)$, having the same properties. We conclude that $\Gamma \in \overrightarrow{S^{\omega+n}}$ and our equality is proved.

From the equality just proven and the relation $N^{(2)} \le R_0 S R$ it follows that it suffices to prove the theorem for $R$-groups. Let $T = A \,\mathrm{wr}\, B$ be such a group and let $P \times Q$ be the primary decomposition for $B$, where $P = B(p)$ and $Q = B(q)$. Furthermore, let $A^P$ be the basis subgroup of the wreath product $A \,\mathrm{wr}\, P$. We then deal with the pairs $(A^P, P)$ and $(\mathbb{Z}Q, Q)$, which are also faithful. The group $A^P P$ is a $p$-group and so nilpotent, which implies that $(A^P, P) \in S^n$ for a suitable $n$. The fact that the pair $(\mathbb{Z}Q, Q)$ is contained in the class $S^\omega$ follows from Theorem 3.69. The triangular product of these pairs, $(G, \Gamma) = (A^P, P) \triangledown (\mathbb{Z}Q, Q)$, is contained in the class $S^n \cdot S^\omega = S^{\omega+n}$ and is faithful, because the initial pairs are faithful (cf. Paragraph 2 of Section 3.1.3). Hence $\Gamma \in \overrightarrow{S^{\omega+n}}$, and since by (3) the group $\Gamma$ is isomorphic to $A \,\mathrm{wr}\, (P \times Q)$, we have $T \in \overrightarrow{S^{\omega+n}}$. Taking into account the fact that the group $A^P \oplus \mathbb{Z}Q$ is finitely generated, it is not hard to deduce, by retracing the argument of the proof from its beginning, that the required representation of the given $N^{(2)}$-group can be chosen with these properties.

PROPOSITION 3.79. A group of class $N^{(2)}$ is nilpotent precisely when its terminal equals $\omega$. For a non-nilpotent $N^{(2)}$-group $\Gamma$ the inequality $\tau(\Gamma) \ge \omega + 1$ holds, and there exists an ordinal number $\nu$ such that the $\nu$-th member of the central series of $\Gamma$ is not contained in the corresponding (generalized) dimension subgroup $D_\nu$.

PROOF. If the $N^{(2)}$-group $\Gamma$ is nilpotent then it follows from Theorem 3.67 that $\tau(\Gamma) = \omega$. Conversely, let the terminal of some $N^{(2)}$-group $\Gamma$ equal $\omega$. By Theorem 3.78 there exist a number $n \ge 0$ and a faithful representation $(G, \Gamma)$ such that $G_{\omega+n} = 0$.

(Footnote 21. Here $S$ is the operator of taking subgroups, while the operator $R_0$ is defined by the following rule: a class $K$ is called $R_0$-closed precisely when it contains each group $\Gamma$ having invariant subgroups $\Psi$ and $\Sigma$ such that $\Psi \cap \Sigma = 1$ and $\Gamma/\Psi \in K$ and $\Gamma/\Sigma \in K$.)
We have also $G \circ \Delta_\Gamma^{\omega+n} \subset G_{\omega+n}$. In view of $\tau(\Gamma) = \omega$ we deduce that $G \circ \Delta_\Gamma^\omega = 0$. This means that $D_\omega \subset \mathrm{Ker}\,(G, \Gamma)$, and as $\Gamma_\omega \subset D_\omega$ and $\mathrm{Ker}\,(G, \Gamma) = 1$, then $\Gamma_\omega = 1$. Therefore $\Gamma$ is nilpotent.

For a non-nilpotent $N^{(2)}$-group $\Gamma$ we have, thus, the inequality $\tau(\Gamma) \ge \omega + 1$. For such a group let the number $n$ and the pair $(G, \Gamma)$ once more be given by Theorem 3.78; it is not hard to see that here $n \ge 1$. We show that $\Gamma_{\omega+n} \not\subset D_{\omega+n}$. Indeed, in the opposite case we would have $\Gamma_\omega = \Gamma_{\omega+n} \subset D_{\omega+n} \subset \mathrm{Ker}\,(G, \Gamma) = 1$, i.e. $\Gamma_\omega = 1$, which contradicts the non-nilpotency of $\Gamma$.

Remark. In [63] it was erroneously stated (Proposition 17) that $\Gamma_\nu \subset D_\nu$ holds true for all groups $\Gamma$ and all ordinals $\nu$.

B. Hartley [77] solved positively Gruenberg’s conjecture to the effect that the intersection of all powers of the fundamental ideal of the integral group ring of an arbitrary torsion-free nilpotent group is zero. This led to a conjecture of A. A. Bovdi, formulated during the XI Algebraic Symposium in Kishinev: $\Delta_\Phi^\omega = 0$ for each torsion-free group $\Phi$. The answer is, however, negative. Let us consider B. I. Plotkin’s example of a weakly stable automorphism group $\Phi$, having no central system, of an Abelian group $G$ which is the direct sum of a countable family of groups $\mathbb{Q}(+)$; cf. for details [35, pp. 470–471]. One can show that the group $\Phi$ in this example is torsion-free. There arises a faithful pair $(G, \Phi) \in S^{\omega+1}$. If $\Delta_\Phi^\omega = \Delta_\Phi^{\omega+1}$, we obtain $G \circ \Delta_\Phi^\omega = G \circ \Delta_\Phi^{\omega+1} \subset G_{\omega+1} = 0$, which gives $\Phi_\omega \subset D_\omega \subset \mathrm{Ker}\,(G, \Phi) = 1$, i.e. $\Phi_\omega = 1$. This contradicts the fact that $\Phi$ has no central system. Therefore we have $\Delta_\Phi^\omega \ne \Delta_\Phi^{\omega+1}$ and also $\Delta_\Phi^\omega \ne 0$.

3. Let $\Gamma^*$ be a finite group, $\Delta$ the fundamental ideal of the ring $\mathbb{Z}\Gamma^*$ and $\tau$ the terminal of $\Gamma^*$. Let us make explicit the group-theoretical structure of the limit subgroup $D_\infty = (1 + \Delta^\tau) \cap \Gamma^*$. In view of Proposition 3.63 and the fact that the structure of the dimension subgroups $D_1$ and $D_2$ is known, one can restrict oneself to those finite groups $\Gamma^*$ whose terminal is $\ge \omega$.

THEOREM 3.80. The limit of a finite group $\Gamma^*$ with infinite terminal $\tau$ is the smallest normal subgroup of $\Gamma^*$ whose factor group is an $N^{(2)}$-group.

PROOF. Let us consider the regular pair $(\mathbb{Z}\Gamma^*/\Delta^\omega, \Gamma^*)$. In view of Proposition 3.76 its kernel is the nilpotent coradical of $\Gamma^*$, which we denote by $\Sigma^*$. As $\Delta^\tau \subset \Delta^\omega$ we have $D_\infty \subset \Sigma^*$, which makes it possible to pick in $\Gamma^*/D_\infty$ the subgroup $\Sigma^*/D_\infty$. As was established in Theorem 3.75, there exists a non-negative integer $n$ such that $\tau = \omega + n$. The additive group of the ring $\mathbb{Z}\Gamma^*/\Delta^{\omega+n}$ will be denoted by $G$, and the factor groups $\Gamma^*/D_\infty$ and $\Sigma^*/D_\infty$ by $\Gamma$ and $\Sigma$, respectively. It is not hard to see that $\Sigma$ is the nilpotent coradical of $\Gamma$. Therefore we have

$$\Sigma = \prod_{p \ne q} [\Gamma_p, \Gamma_q] = \bigcap_p \Gamma_{p'} \quad \text{and} \quad \Sigma_p = \Gamma_{p',p}, \tag{24}$$

where we denote by $\Gamma_{p',p}$ the subgroup generated by the $p$-elements in $\Gamma_{p'}$; the straightforward verification of these equations can be found in [80, p. 5].

As a first step in the proof we establish that $\Sigma \in A$. The regular pair $(G, \Gamma)$ is faithful and admits the lower stable series

$$G \supset G_1 \supset \cdots \supset G_\omega \supset \cdots \supset G_{\omega+n} = 0, \tag{25}$$
where $G_\nu = \Delta^\nu/\Delta^{\omega+n}(+)$, $\nu = 1, 2, \dots$. The subpair $(G, \Sigma)$ in $(G, \Gamma)$ is faithful, because $(G, \Gamma)$ is faithful. It is finitely stable, because in the factor $G/G_\omega$ of the series (25) the subgroup $\Sigma$ acts as the identity. By Kaluzhnin’s theorem it follows that $\Sigma \in N$. In view of this the subgroup $\Sigma_p$ coincides with the Sylow $p$-subgroup $\Sigma(p)$. Thus, it suffices to establish that $\Sigma_p \in A$, as $\Sigma$ is the direct product of the groups $\Sigma(p)$.

The group $\Sigma_p$, being a $p$-group, is also a relative $p$-group ([35, p. 144]), while the pair $(G, \Sigma_p)$ is faithful and finitely stable, since it is a subpair of $(G, \Sigma)$. By Lemma 3.55 it follows from this that the commutator $[G, \Sigma_p]$ is a $p$-group, which we denote by $H$. Next, we show that $[H, \Gamma_{p'}] = 0$. Indeed, for all $g \in G$, $\sigma \in \Sigma_p$ and $\gamma \in \Gamma_{p'}$ we have

$$[g, \sigma] \circ \gamma = -g \circ \gamma + g \circ \sigma\gamma = -g \circ \gamma + g \circ (\gamma \cdot \gamma^{-1}\sigma\gamma) = -(g \circ \gamma) + (g \circ \gamma) \circ \gamma^{-1}\sigma\gamma = [g \circ \gamma, \gamma^{-1}\sigma\gamma].$$

This computation shows that the subgroup $H \le G$ is $\Gamma_{p'}$-invariant: indeed, $\gamma^{-1}\sigma\gamma \in \Sigma_p$, as $\Sigma_p \trianglelefteq \Gamma_{p'}$, while the elements of the form $[g, \sigma]$, where $g \in G$, $\sigma \in \Sigma_p$, generate $H$. For each $\nu = 1, 2, \dots$ the term $H_\nu$ of the lower stable series of the pair $(H, \Gamma_{p'})$ is contained in $G_{\nu+1}$, and as $G_{\tau+1} = 0$, then also $H_\tau = 0$.

Let $K$ be the kernel of $(H, \Gamma_{p'})$. The factor group $\Gamma_{p'}/K$, which we denote by $\Phi$, is a $p$-group. Indeed, for any $h \in H$ we consider the $\Phi$-invariant subgroup $H^* \le H$ generated by all $h \circ \varphi$, $\varphi \in \Phi$. Clearly, $H^*$ is a finite $p$-group, as $\Phi$ is finite and $H$ is an Abelian $p$-group. Therefore the intersection of $H^*$ with the terms of the lower stable series of the pair $(H, \Phi)$ gives a finite $\Phi$-stable series of $H^*$. Lemma 3.55, applied to $(H^*, \Phi)$, shows that $\Phi$ acts on $H^*$ as a $p$-group. In other words, for each $\varphi \in \Phi$ there exists a number $m$ with $\Pi(m) = \{p\}$ such that $\varphi^m$ is contained in the kernel of the pair $(H^*, \Phi)$; we denote this kernel by $\Psi$. Assume now that there exists in $\Phi$ a non-unit $p'$-element $\varphi \ne 1$. Then, for some $n$ with $\Pi(n) \subset p'$, we have $\varphi^n = 1$, and as $n$ and $m$ are relatively prime there exist $u, v \in \mathbb{Z}$ such that $nu + mv = 1$. This gives $\varphi = \varphi^{nu} \cdot \varphi^{mv} = (\varphi^m)^v \in \Psi$, i.e. $\varphi \in \Psi$. Thus, the $p'$-element $\varphi$ acts trivially on $H^*$. The choice of the element $h \in H$ being arbitrary in our construction, we conclude that $\varphi$ acts trivially on $H$. This means that $\varphi$ lies in the kernel of the pair $(H, \Phi)$ and, as this pair is faithful, we deduce $\varphi = 1$. Contradiction. Thus we have proved our statement concerning the group $\Phi$.

The equality $[H, \Gamma_{p'}] = 0$ now follows readily. Indeed, let $\varphi$ be an arbitrary $p'$-element of $\Gamma$. For some $p'$-number $n$ we have $\varphi^n = 1$, and, as by what was proved above $\Gamma_{p'}/K$ is a $p$-group, there exists a $p$-number $m$ such that $\varphi^m \in K$. In view of $(m, n) = 1$ it follows (as above) that $\varphi \in K$. The group $\Gamma_{p'}$ is generated by the $p'$-elements of $\Gamma$, and so it is entirely contained in $K$. This means that $\Gamma_{p'}$ acts trivially on $H$. In particular, we have $[H, \Sigma_p] = 0$. This shows that $\Sigma_p$ is Abelian. Indeed, the subpair $(G, \Sigma_p)$ of the faithful pair $(G, \Gamma)$ must be faithful also. In view of what was proved above, we have $[[G, \Sigma_p], \Sigma_p] = 0$. This implies that the faithful pair $(G, \Sigma_p)$ is 2-stable. From Kaluzhnin’s theorem it follows that $\Sigma_p$ is Abelian. Thereby we have also proved the relation $\Gamma \in AN$.

A second step in the proof is the verification of the equality $\Pi_2(\Gamma) = 1$. We observe that $\Sigma$ is a normal subgroup of $\Gamma$ on which the lower central series of $\Gamma$ stabilizes. Above we have established that $\Sigma \in A$. Hence, by a theorem of Shenkman [104] there exists a subgroup $\Theta \le \Gamma$ such that $\Gamma$ is the semidirect product $\Gamma = \Sigma \rtimes \Theta$. Let $p$ and $q$ be two arbitrary prime numbers.
The subgroup $\Sigma_{p'}$ is characteristic in $\Sigma$ and, hence, $\Sigma_{p'} \trianglelefteq \Gamma$. One can therefore consider the subgroup $\Sigma_{p'} \cdot \Theta_{(p,q)'}$. Invoking the relation $\Sigma \cap \Theta = 1$, a simple argument (by contradiction) shows that

$$\bigcap_{p \ne q} \Sigma_{p'} \cdot \Theta_{(p,q)'} = 1. \tag{26}$$

The isomorphism of $\Theta$ with $\Gamma/\Sigma$ gives $\Theta \in N$. Therefore we have $\Sigma = \Sigma_p \times \Sigma_{p'}$ and $\Theta = \Theta_{(p,q)} \times \Theta_{(p,q)'}$, which along with $\Sigma_{p'} \trianglelefteq \Gamma$ gives the equality (see footnote 22) $\Gamma = \mathrm{gp}\{\Sigma_p \cdot \Theta_{(p,q)},\ \Sigma_{p'} \cdot \Theta_{(p,q)'}\}$. Hence,

$$\Gamma/(\Sigma_{p'} \cdot \Theta_{(p,q)'}) \cong \Sigma_p \cdot \Theta_{(p,q)} / (\Sigma_p \cdot \Theta_{(p,q)} \cap \Sigma_{p'} \cdot \Theta_{(p,q)'}). \tag{27}$$

(Footnote 22. Translators’ note. The symbol gp here, apparently, means taking the subgroup spanned by the groups indicated within the curly brackets.)

One verifies, however, immediately that $\Sigma_p \cdot \Theta_{(p,q)} \cap \Sigma_{p'} \cdot \Theta_{(p,q)'} = 1$. Therefore the factor group on the right of (27) is isomorphic to $\Sigma_p \cdot \Theta_{(p,q)}$, while the last group is biprimary, in view of the facts that $\Sigma \in A$, $\Theta \in N$ and $\Sigma_p \trianglelefteq \Gamma$. The relations (26) and (27) now show that $\Pi_2(\Gamma) = 1$.

By what has been said we have $\Gamma \in N^{(2)}$. In other words, there exists a family $\Omega$ of normal subgroups $X$ of $\Gamma$ having trivial intersection, the factor groups $\Gamma/X$ by which are biprimary and belong to the class $AN$. Let $\Omega^* = \{X^*\}$ be the family of complete pre-images in $\Gamma^*$ of the subgroups $X \in \Omega$. It is clear that each $\Gamma^*/X^*$ is biprimary and contained in the class $AN$, and that $\bigcap_{X^* \in \Omega^*} X^* = D_\infty$. We denote by $\tilde\Omega^*$ the family of all invariant subgroups $X^*$ in $\Gamma^*$ with the properties described. We have $\bigcap_{X^* \in \tilde\Omega^*} X^* \subset \bigcap_{X^* \in \Omega^*} X^*$, which gives $\bigcap_{X^* \in \tilde\Omega^*} X^* \subset D_\infty$.

Let us prove the converse inclusion. It suffices to verify that $D_\infty$ is contained in every $X^* \in \tilde\Omega^*$. We remark that $\Gamma^*/X^* \in N^{(2)}$ for every $X^* \in \tilde\Omega^*$. Therefore, for some non-negative $n$ the group $\Gamma^*/X^*$ admits a faithful $(\omega + n)^*$-stable representation in an Abelian group (cf. Theorem 3.78). We obtain the faithful pair $(A, \Gamma^*/X^*)$, which can be naturally lifted by the epimorphism $\Gamma^* \to \Gamma^*/X^*$ to the pair $(A, \Gamma^*)$ with kernel $X^*$. By what has been said, $A_{\omega+n} = [A, \Gamma^*; \omega + n] = 0$. However, we have also $A \circ \Delta_{\Gamma^*}^{\omega+n} \subset A_{\omega+n}$, which implies that $A \circ \Delta_{\Gamma^*}^{\infty} = 0$, because $\Delta_{\Gamma^*}^{\infty} \subset \Delta_{\Gamma^*}^{\omega+n}$. The result obtained, $A \circ (D_\infty - 1) = 0$, means that $D_\infty$ lies in the kernel of $(A, \Gamma^*)$, that is, $D_\infty \subset X^*$. The Theorem is proven.

3.3.4. Mal’cev nilpotency and stability of semigroups

1. The problems on the terminal and the dimension subgroups can also be carried over to semigroups. In order to formulate them we have to make precise some notions. For a congruence $A$ of a semigroup $\Gamma$ we consider in the ring $R = \mathbb{Z}\Gamma$ the ideal $I(A)$ generated by all differences $\gamma - \sigma$, where $\gamma, \sigma \in \Gamma$ and $\gamma \sim \sigma\ (A)$. It is clear that $I(A)$ is a two-sided ideal of $R$ and that it arises in a natural way as the kernel of the homomorphism of semigroup rings $\mathbb{Z}\Gamma \to \mathbb{Z}(\Gamma/A)$. We observe some properties of the correspondence $A \mapsto I(A)$, their proofs being straightforward verifications:

(1) $A \le B \implies I(A) \subset I(B)$;
(2) $I(A \cap B) \subset I(A) \cap I(B)$;
(3) $I(A \cup B) \supset I(A) \cup I(B)$.

In the case when $A$ is the zero congruence on $\Gamma$, i.e. it is defined by the set $\Gamma \times \Gamma$ of all pairs, it is natural to call the ideal $I(A)$ the fundamental ideal (see footnote 23) and denote it by $\Delta(\Gamma, \mathbb{Z})$ or simply $\Delta$. For every natural number $n$ one defines on $\Gamma$ a binary relation $\vartheta_n$:

$$\gamma \sim \sigma\ (\vartheta_n) \text{ on } \Gamma \iff \gamma - \sigma \in \Delta^n \text{ in } \mathbb{Z}\Gamma.$$

It is clear that $\vartheta_n$ is a congruence on $\Gamma$. We call it the $n$-th dimension congruence of the semigroup $\Gamma$ with respect to $\mathbb{Z}$. If $\Gamma$ is a monoid, then the class of $\vartheta_n$ containing the unity element is a submonoid; we denote it by $D_n(\Gamma, \mathbb{Z})$. For a group $\Gamma$ the submonoid $D_n(\Gamma, \mathbb{Z})$ is the well-known $n$-th dimension subgroup, which has been the object of much attention (cf. [33, 91, 92, 96], as well as the literature given there).

Nilpotence of semigroups will be considered in the sense of A. I. Mal’cev. Let $x, y, u_1, u_2, \dots, u_n, \dots$ be arbitrary variables. We set $X_0 = x$, $Y_0 = y$, and define further

$$X_{n+1} = X_n u_{n+1} Y_n, \qquad Y_{n+1} = Y_n u_{n+1} X_n.$$
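The words $X_n$, $Y_n$ defined by this recursion can be evaluated mechanically, which makes it possible to test the identity $X_n = Y_n$ (Mal’cev's criterion for nilpotency of class $n$, discussed in the text that follows) on small examples. The sketch below is an added illustration with hypothetical helper names; it uses permutation groups as test objects, namely the dihedral group of order 8 (nilpotent of class 2) and the symmetric group $S_3$ (not nilpotent).

```python
from itertools import permutations, product


def compose(p, q):
    """Product of permutations given as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)


def closure(gens, degree):
    """Subgroup generated by gens inside the symmetric group of the given degree."""
    e = tuple(range(degree))
    elems, frontier = {e}, [e]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(x, g)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return elems


def malcev_words(x, y, us):
    """Evaluate (X_n, Y_n), n = len(us), via X_{k+1} = X_k u_{k+1} Y_k, Y_{k+1} = Y_k u_{k+1} X_k."""
    X, Y = x, y
    for u in us:
        X, Y = compose(compose(X, u), Y), compose(compose(Y, u), X)
    return X, Y


def satisfies_malcev_identity(elements, n):
    """Check X_n = Y_n for all substitutions of elements for x, y, u_1, ..., u_n."""
    elements = list(elements)
    for x, y in product(elements, repeat=2):
        for us in product(elements, repeat=n):
            X, Y = malcev_words(x, y, us)
            if X != Y:
                return False
    return True


# Dihedral group of the square: rotation r and a reflection s (vertices 0..3).
D4 = closure([(1, 2, 3, 0), (0, 3, 2, 1)], 4)
S3 = set(permutations(range(3)))

if __name__ == "__main__":
    print(satisfies_malcev_identity(D4, 1), satisfies_malcev_identity(D4, 2))
    print(satisfies_malcev_identity(S3, 2))
```

With $u_1$ equal to the identity, $X_1 = Y_1$ reduces to $xy = yx$, so the class-1 check is simply a commutativity test; the exhaustive search is only feasible for very small semigroups.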
According to Mal’cev [28], a semigroup $\Gamma$ whose elements satisfy the identity $X_n = Y_n$ is said to be nilpotent of class $n$. For a group $\Gamma$ we then obtain the usual notion of nilpotence of class $n$ for groups ([28, Theorem 1]). For an arbitrary semigroup $\Gamma$ one can consider the congruence $C_{n+1} = N_n(\Gamma)$ with respect to the variety $N_n$ of nilpotent semigroups of class $n$: it is the minimal congruence $A$ on $\Gamma$ with the property that $\Gamma/A \in N_n$. Moreover, we agree that $C_1$ is the zero relation on $\Gamma$. This gives rise to a decreasing series of congruences

$$C_1 \ge C_2 \ge \cdots \ge C_n \ge C_{n+1} \ge \cdots,$$

which is called the lower central series of the semigroup $\Gamma$. We remark that for a group $\Gamma$ the $C_n$-classes ($n \ge 1$) containing the unity element coincide with the terms of the lower central series of this group $\Gamma$.

The comparison of the mutual relations between the congruences $C_n$ and $\vartheta_n$ seems to be an interesting problem. One must, however, add that already the equation $C_2 = \vartheta_2$ for a semigroup $\Gamma$ requires some separativity type conditions on $\Gamma$. As neither the series (9) nor the definition of its terminal $\tau(\Gamma)$ changes in the case when $\Gamma$ is a semigroup, there arises the question of the behavior of the terminal in the class of finite semigroups. In particular, we conjecture that $\tau(\Gamma, \mathbb{Z}) \le \omega 2$ for each finite monoid $\Gamma$. Moreover, considering the limit congruence $D_\infty(\Gamma)$ on $\Gamma$, which is the kernel of the pair $(\mathbb{Z}\Gamma/\Delta^{\tau(\Gamma)}, \Gamma)$, one can also state the problem of describing (in terms of semigroups) this limit congruence for the class of all finite semigroups.

(Footnote 23. Translators’ note. The fundamental ideal is also known as the augmentation ideal.)

2. In this Subsection we consider pairs whose domain of action is an (arbitrary) group, while the acting object is a semigroup. In the study of pairs of this kind it will be convenient to use the language of quasi-rings [68]. A set $K$ equipped with two binary operations (addition and multiplication) is, by definition, a quasi-ring, if:
(1) K is a group with respect to addition, generated by right distributive elements, that is, elements d such that (a + b)d = ad + bd for arbitrary a, b ∈ K;
(2) K is a semigroup under multiplication;
(3) the operations of addition and multiplication on K are connected by left distributivity.

Let us look at one example that is essential for what follows, [34]. Let G be a group and Γ the semigroup of all maps of G into itself. For all σ, τ ∈ Γ and g ∈ G we set g^{σ+τ} = g^σ · g^τ, which equips the semigroup Γ also with an addition. With respect to this new operation Γ is a group; we note only that for each σ ∈ Γ the additive inverse (−σ) is defined by the formula g^{−σ} = (g^σ)^{−1}. The two operations on Γ are connected by left distributivity (the right distributive law, generally speaking, fails), and so Γ is a near-ring. We distinguish in Γ the additive subgroup E(G) generated by all endomorphisms of the group G; the elements of E(G) are called quasi-endomorphisms of the group G. This additive subgroup is closed with respect to multiplication, i.e. E(G) is a quasi-ring. We remark that, for an Abelian group G, E(G) coincides with the ring End G.

The class of all pairs (G, Γ) of the kind described at the beginning of this Subsection, the kernel of which is the zero congruence in Γ, is a variety which we denote by S. Starting with S one can define classes S^n, n ≥ 1. By definition, (G, Γ) ∈ S^n if there exists in the group G an ascending Γ-admissible invariant series of length ≤ n

(28)    G_0 < G_1 < · · · < G_{i−1} < G_i < · · · < G_m = G;    m ≤ n,
such that all pairs (G_i/G_{i−1}, Γ), i = 1, 2, . . . , m, lie in S. In other words, the elements of the semigroup Γ act as identity endomorphisms in the factors of the series (28). Several times we have used the fact that for a group Γ in (G, Γ) ∈ S^n the nilpotency of Γ is a consequence of (G, Γ) being faithful. The example of matrix semigroups shows, however, that in general a similar conclusion is not true for a semigroup Γ.

We call a pair (G, Γ) ∈ S^n focal (more exactly, n-focal) if there exists a quasi-endomorphism Θ ∈ E(G) such that the series (28) is Θ-invariant, while in all factors of this series the elements of Γ act as the quasi-endomorphism Θ. Let f : Γ → End G be the morphism of semigroups accompanying the pair (G, Γ). By definition, (G, Γ) is stable (n-stable) if it is focal (n-focal) and the corresponding quasi-endomorphism Θ commutes in the quasi-ring E(G) with all differences α^f − β^f; α, β ∈ Γ.

Let (G, Γ) be a pair. For each Γ-admissible normal subgroup H in G we have the pair (H, Γ), the kernel of which we denote by k. Moreover, let us consider the subgroup Z(H) ≤ G, Z(H) = {g ∈ G | ∀h ∈ H, gh = hg}.

LEMMA 3.81. For an arbitrary g ∈ G and elements γ_1, γ_2 ∈ Γ with γ_1 ∼ γ_2 (k), one has the relation (g ◦ γ_1)(g ◦ γ_2)^{−1} ∈ Z(H ◦ γ_1).

PROOF. We use the following notation: for elements x, y ∈ G and γ ∈ Γ set x^y = y^{−1}xy, [x, γ] = x^{−1} · (x ◦ γ) and z = (g ◦ γ_1)(g ◦ γ_2)^{−1}. It is clear that for (G, Γ)
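one has the relation ∀x, y ∈ G, γ ∈ Γ, [xy, γ] = [x, γ]^y · [y, γ]; we use this twice in the calculations below. For an arbitrary h ∈ H one has

\[
\begin{aligned}
h^{-1} z h &= (g^{-1}h)^{-1} \cdot [g,\gamma_1] \cdot [g,\gamma_2]^{-1} \cdot (g^{-1}h) \\
&= [g,\gamma_1]^{g^{-1}h} \cdot \bigl([g,\gamma_2]^{g^{-1}h}\bigr)^{-1} \\
&= [h,\gamma_1] \cdot [g^{-1}h,\gamma_1]^{-1} \cdot [g^{-1}h,\gamma_2] \cdot [h,\gamma_2]^{-1} \\
&= [h,\gamma_1] \cdot \bigl([g^{-1}hg,\gamma_1]^{g^{-1}} \cdot [g^{-1},\gamma_1]\bigr)^{-1} \cdot [g^{-1}hg,\gamma_2]^{g^{-1}} \cdot [g^{-1},\gamma_2] \cdot [h,\gamma_2]^{-1} \\
&= [h,\gamma_1] \cdot [g^{-1},\gamma_1]^{-1} \cdot [g^{-1},\gamma_2] \cdot [h,\gamma_2]^{-1} \\
&= [h,\gamma_1] \cdot z \cdot [h,\gamma_2]^{-1},
\end{aligned}
\]

from which the required relation z = (h ◦ γ_1) z (h ◦ γ_1)^{−1} follows.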
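As a sanity check, Lemma 3.81 and the commutator identity used in its proof can be tested by brute force on a small example. The sketch below is only an illustration under assumptions that are not in the text: G is taken to be the symmetric group S_3, H its normal subgroup A_3, and Γ the semigroup of all endomorphisms of G (each of which maps H into H). All function names are hypothetical.

```python
from itertools import permutations, product

G = list(permutations(range(3)))      # the six elements of S_3 as tuples
E = (0, 1, 2)                         # identity permutation

def mul(p, q):
    """Group product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(3))

def inv(p):
    """Inverse permutation."""
    r = [0, 0, 0]
    for i, image in enumerate(p):
        r[image] = i
    return tuple(r)

def is_even(p):
    """True if p has an even number of inversions."""
    return sum(p[i] > p[j] for i in range(3) for j in range(i + 1, 3)) % 2 == 0

H = [p for p in G if is_even(p)]      # the normal subgroup A_3

def endomorphisms():
    """All endomorphisms of S_3 by brute force; each is a dict g -> g∘γ."""
    others = [g for g in G if g != E]
    found = []
    for images in product(G, repeat=len(others)):
        f = dict(zip(others, images))
        f[E] = E
        if all(f[mul(x, y)] == mul(f[x], f[y]) for x in G for y in G):
            found.append(f)
    return found

GAMMA = endomorphisms()               # assumed acting semigroup Γ = End(S_3)

def conj(x, y):
    """x^y = y^{-1} x y."""
    return mul(mul(inv(y), x), y)

def comm(x, f):
    """[x, γ] = x^{-1} (x∘γ)."""
    return mul(inv(x), f[x])

def identity_holds():
    """Check [xy, γ] = [x, γ]^y [y, γ] for all x, y in G and γ in Γ."""
    return all(comm(mul(x, y), f) == mul(conj(comm(x, f), y), comm(y, f))
               for f in GAMMA for x in G for y in G)

def lemma_holds():
    """Check that z = (g∘γ1)(g∘γ2)^{-1} centralizes H∘γ1 whenever γ1, γ2 agree on H."""
    for f1, f2 in product(GAMMA, repeat=2):
        if any(f1[h] != f2[h] for h in H):
            continue                  # γ1 and γ2 are not congruent modulo k
        for g in G:
            z = mul(f1[g], inv(f2[g]))
            if any(mul(z, f1[h]) != mul(f1[h], z) for h in H):
                return False
    return True

print(len(GAMMA), identity_holds(), lemma_holds())
```

For the pairs of inner automorphisms that agree on A_3 the check is not vacuous, since there H ◦ γ_1 = A_3 and its centralizer in S_3 is A_3 itself.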
THEOREM 3.82. If a faithful pair is n-stable, then the acting semigroup is nilpotent of class ≤ n − 1 in the sense of Mal’cev.

PROOF. The proof will be given by induction over n. For n = 1 the statement of the Theorem is trivial. Let us consider the case n = 2. As (G, Γ) is a faithful pair, we may assume that Γ is a subset of E(G), the elements of Γ being distributive elements of E(G). Take arbitrary α, β and γ in Γ, g in G, let Θ be the quasi-endomorphism in E(G) associated with the 2-stable pair (G, Γ), and set g^Θ = h. Then there exists g_1 ∈ G_1 such that g^{γβ} = g_1 h. We have further g^{α−γ} ∈ G_1. Using Lemma 3.81, we obtain the following computation:

\[
\begin{aligned}
g^{\alpha\beta\gamma-\gamma\beta\alpha}
&= g^{(\alpha\beta\gamma-\gamma\beta\gamma)+(\gamma\beta\gamma-\gamma\beta\alpha)} \\
&= g^{(\alpha-\gamma)\beta\gamma} \cdot g^{\gamma\beta(\gamma-\alpha)}
 = g^{(\alpha-\gamma)\Theta} \cdot g^{\gamma\beta(\gamma-\alpha)} \\
&= g^{\Theta(\alpha-\gamma)} \cdot g^{\gamma\beta(\gamma-\alpha)}
 = h^{\alpha-\gamma} \cdot (g_1 h)^{\gamma-\alpha} \\
&= h^{\alpha} \cdot h^{-\gamma} \cdot \bigl(g_1^{\gamma} h^{\gamma} h^{-\alpha} g_1^{-\alpha}\bigr) \\
&= h^{\alpha} \cdot h^{-\gamma} \cdot \bigl(g_1^{\gamma} h^{\gamma} h^{-\alpha} g_1^{-\gamma}\bigr) \\
&= h^{\alpha} \cdot h^{-\gamma} \bigl(h^{\gamma} h^{-\alpha}\bigr) = 1,
\end{aligned}
\]

from which it follows that, (G, Γ) being a faithful pair, one has αβγ = γβα. Hence the identity X_1 = Y_1 holds in the semigroup Γ, and so it must be 1-nilpotent.

Assume that the statement holds true for all m-stable faithful pairs, m < n. Furthermore, let (G, Γ) be an arbitrary faithful n-stable pair. By definition, in the group G there is a series (28) with respect to which Γ acts stably. We introduce in Γ the congruences k_1 = Ker (G_{n−1}, Γ) and k_2 = Ker (G/G_1, Γ). In view of the induction hypothesis, the factor semigroups Γ/k_1 and Γ/k_2 both lie in the class N_{n−2}, which yields Γ/(k_1 ∩ k_2) ∈ N_{n−2}, because Γ/(k_1 ∩ k_2) is a subsemigroup of Γ/k_1 × Γ/k_2. In the identity X_{n−2} = Y_{n−2} defining (n − 2)-nilpotency of semigroups we give to the variables x, y, u_1, u_2, . . . , u_{n−2} encountered in the left and the right hand side (arbitrary) fixed values in Γ. Let σ and τ be the corresponding values of X_{n−2} and
Y_{n−2}. From Γ/(k_1 ∩ k_2) ∈ N_{n−2} it follows that σ ∼ τ (k_1 ∩ k_2). Hence, we have for any g ∈ G the congruence g^σ ≡ g^τ (mod G_1) and deduce that g^{σ−τ} = g_2 ∈ G_1. For g ∈ G_{n−1} we have also g^σ = g^τ. Furthermore, fix an element γ ∈ Γ. The stability of the action of Γ with respect to the series (28) allows us to carry out the following computation:

\[
\begin{aligned}
g^{\sigma\gamma\tau} &= (g^{\sigma})^{\gamma\tau} = (g_2 \cdot g^{\tau})^{\gamma\tau}
 = g_2^{\Theta} (g^{\tau\gamma})^{\tau}
 = g_2^{\Theta} \bigl(g^{\Theta} \cdot g^{-\Theta} g^{\tau\gamma}\bigr)^{\tau} \\
&= g_2^{\Theta} g^{\Theta\tau} \cdot \bigl(g^{-\Theta+\tau\gamma}\bigr)^{\tau}
 = g_2^{\Theta} g^{\Theta\tau} \bigl(g^{-\Theta+\tau\gamma}\bigr)^{\sigma}
 = g_2^{\Theta} g^{\Theta(\tau-\sigma)} \cdot g^{\tau\gamma\sigma} \\
&= g^{(\sigma-\tau)\Theta} \cdot g^{\Theta(\tau-\sigma)} \cdot g^{\tau\gamma\sigma}
 = g^{\Theta(\sigma-\tau)+\Theta(\tau-\sigma)+\tau\gamma\sigma}
 = g^{\tau\gamma\sigma}.
\end{aligned}
\]

As (G, Γ) is faithful, it follows from this that σγτ = τγσ. Again, from this we deduce that in Γ we have the identity X_{n−2} · u_{n−1} · Y_{n−2} = Y_{n−2} · u_{n−1} · X_{n−2}, that is, Γ ∈ N_{n−1}.

Remark. The quasi-endomorphism Θ introduced in the definition of the stability of the pair (G, Γ) may not belong to the semigroup Γ. In the special case when Γ is a monoid, the notion of the stability of the pair (G, Γ) acquires its usual meaning (in the factors of the series (28) the elements of Γ act identically). In view of Theorem 3.82 we have Γ ∈ N_{n−1}. However, one can also make use of the following observation: the series (28) is admissible for the elements of Γ, which, acting identically on the factors of this series, are automorphisms ([35, p. 222]). Hence, Γ is a cancellative nilpotent semigroup of class n − 1 and so can be viewed as a subsemigroup of a nilpotent group ([28, Theorem 2]).

3. Which properties of a semigroup are equivalent to the absence of zero divisors in its semigroup ring? We recall the necessary definitions. A semigroup S is called a Kaplansky semigroup if from the absence of zero divisors in a ring K follows their absence also in the ring KS. Kaplansky semigroups are evidently cancellative: for arbitrary a, b, x ∈ S either of the equations ax = bx or xa = xb gives a = b. The class of cancellative semigroups will be denoted by B, the class of Kaplansky semigroups by K.
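The reason why a Kaplansky semigroup must be cancellative can be made concrete: if xa = xb with a ≠ b, then x(a − b) = 0 in the semigroup ring, although neither factor is zero. The following minimal sketch illustrates this under assumptions of our own choosing: the semigroup is {0, 1, 2, 3} under multiplication modulo 4, the coefficient ring is Z, and the helper names `op` and `ring_mul` are hypothetical.

```python
from collections import defaultdict

def op(x, y):
    """Semigroup operation on S = {0, 1, 2, 3}: multiplication modulo 4."""
    return (x * y) % 4

def ring_mul(u, v):
    """Product in the semigroup ring ZS; an element is a dict {s: integer coefficient}."""
    out = defaultdict(int)
    for s, c in u.items():
        for t, d in v.items():
            out[op(s, t)] += c * d
    # Drop zero coefficients so that the zero element is the empty dict.
    return {s: c for s, c in out.items() if c != 0}

# Cancellation fails in S: op(2, 1) == op(2, 3) although 1 != 3.
x = {2: 1}                    # the basis element "2" of ZS (not the integer 2)
a_minus_b = {1: 1, 3: -1}     # the formal difference of the basis elements "1" and "3"

print(ring_mul(x, a_minus_b))
```

Both `x` and `a_minus_b` are non-zero elements of ZS, yet their product is the zero element, so this S is not a Kaplansky semigroup even though Z has no zero divisors.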
By a direct argument by contradiction one may show that the class of Kaplansky semigroups is closed with respect to subdirect products. Using the notions of index and period of an element in a semigroup ([4, p. 39]) it is not hard to see that in a Kaplansky semigroup all cyclic subsemigroups, with at most one exception, are infinite. Furthermore, a semigroup S is called an A-semigroup if for arbitrary two finite subsets F and H there exists a pair of elements a ∈ F, b ∈ H such that from ab = xy, where x ∈ F, y ∈ H, it always follows that x = a, y = b. It is easy to see that linearly ordered cancellative semigroups (the class of linearly ordered semigroups will be denoted O) are A-semigroups, and the latter, in turn, are Kaplansky semigroups. A semigroup S is called an R-semigroup if for each natural number m and elements a, b ∈ S it follows from the relation a^m = b^m that a = b. In the class of R-semigroups one can distinguish a class of E-semigroups. By definition a semigroup S belongs to the class E if any non-empty finite subset F in it contains an element a such that for any natural number k it always follows from the equation a^k = f_1 f_2 . . . f_k, where f_i ∈ F, that f_1 = f_2 = · · · = f_k = a. Generalizing a result of Banaschewski [53] we have
THEOREM 3.83. For locally nilpotent (in the sense of Mal’cev) semigroups S, the following conditions are equivalent:
(1) S is a Kaplansky semigroup;
(2) S is an A-semigroup;
(3) S is a cancellative R-semigroup;
(4) S is a cancellative E-semigroup;
(5) S is a cancellative O-semigroup.

The proof of this Theorem is based on two lemmata.

LEMMA 3.84. Each Kaplansky semigroup which is nilpotent of class n is embeddable into a torsion-free nilpotent group of class n.

PROOF. Let a Kaplansky semigroup S be nilpotent of class n. Then S ∈ B, and, in view of a theorem of Mal’cev ([28, Theorem 2]), S can be embedded into a group G_S of (right) fractions of S which is nilpotent of class n; this group G_S is uniquely determined by S up to isomorphism, contains S as a subsemigroup, and each element of G_S can be written in the form ab^{−1}; a, b ∈ S.

We show that the center Z of G_S is a torsion-free group. To this end we remark that if ab^{−1} ∈ Z then the elements a and b commute. Indeed, in view of z := ab^{−1} ∈ Z we have (ab^{−1})b = b(ab^{−1}), i.e. a = bab^{−1}, whence ab = ba. Next, for such a pair of elements a, b ∈ S one has the relation ab^m = b^m a for each natural number m, which yields z^m = (ab^{−1})^m = a^m b^{−m}. So the relation 1 = z^m is equivalent to a^m = b^m.

Let us assume that some element z = ab^{−1} ∈ Z, z ≠ 1, has finite order m. Then m is the least positive integer such that a^m = b^m. As the elements a and b are permutable, we obtain the relation 0 = (a − b)(a^{m−1} + a^{m−2}b + · · · + ab^{m−2} + b^{m−1}) in the ring KS; here K is any cancellative ring with unity. From this equality it follows that 0 = a^{m−1} + a^{m−2}b + · · · + ab^{m−2} + b^{m−1}, because a − b ≠ 0 and S ∈ K. But the last equality cannot hold true if all terms in its right hand side are distinct. Consequently, for some i and j, 1 ≤ i < j ≤ m, one must have a^{m−i}b^{i−1} = a^{m−j}b^{j−1}, from which we deduce, in view of S ∈ B, that a^{j−i} = b^{j−i}.
Here 0 < j − i < m, and so there arises a contradiction to the choice of m. Our statement about Z is proven. As the center of the group G_S is torsion-free, one obtains easily that the group of fractions G_S is also torsion-free. Indeed, in a nilpotent group the elements of finite order form a normal subgroup, which, if non-trivial, must have a non-trivial intersection with the center.

The following lemma is proved in an analogous way.

LEMMA 3.85. Each nilpotent R-semigroup of class n can be embedded into a torsion-free nilpotent group of class n.

Proof of Theorem 3.83. Our objective shall be to prove for the class L of locally nilpotent semigroups the chain of inclusions O ∩ B ⊂ A ⊂ K ⊂ R ∩ B ⊂ O ∩ B ⊂ E ∩ B ⊂ R ∩ B. It is clear that it suffices to show that K ∩ L ⊂ R, B ∩ R ∩ L ⊂ O and O ∩ B ⊂ E.
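The reduction to these three inclusions can be spelled out as follows. This is only our reading of the argument: the unlabelled inclusions are those already noted in the text (O ∩ B ⊂ A ⊂ K ⊂ B and E ⊂ R), while the labels (i), (ii), (iii) are introduced here for K ∩ L ⊂ R, B ∩ R ∩ L ⊂ O and O ∩ B ⊂ E respectively.

```latex
% (i)   K \cap L \subset R      (together with K \subset B)
% (ii)  B \cap R \cap L \subset O
% (iii) O \cap B \subset E
\[
\mathcal{O}\cap\mathcal{B}\cap\mathcal{L}
\;\subset\; \mathcal{A}\cap\mathcal{L}
\;\subset\; \mathcal{K}\cap\mathcal{L}
\;\overset{(i)}{\subset}\; \mathcal{R}\cap\mathcal{B}\cap\mathcal{L}
\;\overset{(ii)}{\subset}\; \mathcal{O}\cap\mathcal{B}\cap\mathcal{L},
\]
\[
\mathcal{O}\cap\mathcal{B}\cap\mathcal{L}
\;\overset{(iii)}{\subset}\; \mathcal{E}\cap\mathcal{B}\cap\mathcal{L}
\;\subset\; \mathcal{R}\cap\mathcal{B}\cap\mathcal{L}
\;\overset{(ii)}{\subset}\; \mathcal{O}\cap\mathcal{B}\cap\mathcal{L}.
\]
```

Each line is a closed cycle of inclusions, so all the classes occurring in it coincide inside L, which is exactly the equivalence of conditions (1) to (5).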
Let S be an arbitrary semigroup of the class K ∩ L, a and b being elements of S. Let also m be a natural number such that a^m = b^m. Consider in S the subsemigroup T generated by the elements a and b. Clearly T is a nilpotent Kaplansky semigroup and so, by virtue of Lemma 3.84, it can be embedded into a torsion-free nilpotent group G_T. But torsion-free nilpotent groups are R-groups. Therefore, having the embedding T → G_T, it follows from a^m = b^m that a = b. As a consequence, S ∈ R.

Next, let S be an arbitrary semigroup in B ∩ R ∩ L, and let T be any of its finitely generated subsemigroups. Then T is a cancellative R-semigroup. In view of Lemma 3.85 it can be embedded into a torsion-free nilpotent group G_T. By a known theorem G_T must be an O-group. By the embedding T → G_T, we can order T linearly. However, for O-semigroups the local theorem holds true (cf. [44, Theorem 2.4.3]) and so S ∈ O. The required implication is established.

Finally, each linearly ordered cancellative semigroup S is an E-semigroup. The proof is carried out by verifying for S the conditions in the definition of E-semigroups. Let F ⊂ S be an arbitrary non-empty finite subset. For |F| = 1 the conditions are trivial. Therefore we assume that |F| ≥ 2, F = {f_0, f_1, . . . , f_n}, where, moreover, f_0 is the least element in F. The implication f_0^k = f_1 . . . f_k ⟹ f_0 = f_1 = · · · = f_k shall be proved by induction over k. We assume that f_0 ≠ f_1. Then f_0 < f_1, which yields f_0^k = f_0 · f_0^{k−1} < f_1 f_2 . . . f_k, which is a contradiction. As a consequence f_0 = f_1. But then it follows from f_0^k = f_1 f_2 . . . f_k that f_0^{k−1} = f_2 . . . f_k. So the induction hypothesis gives f_0 = f_2 = · · · = f_k. Our statement is proved, which, together with all that has been proved above, establishes the whole Theorem.

3.3.5. Comments and remarks

1. About a group Γ one can obtain definite information by passing to the factor group Γ/Γ′. However, for different Γ much may be “glued together” in this approach. An attempt to invoke Γ′/Γ″, . . .
does not always help, because these factors may have a rather involved construction. Ph. Hall suggested, in 1933, to study, instead of Γ′/Γ″, . . . , the lower central series of Γ, which, joined with the ideas of W. Magnus (1940), led to a series of beautiful developments; [25, Chap. 5]. Thus one has Magnus’ Theorem on the residual nilpotence of the free group. In terms of the group Γ and the field K, A. I. Mal’cev [27] completely settled the question concerning ∩_n Δ^n(Γ, K) = 0, which is closely connected with what we have said above. In particular, from his results follows Magnus’ Theorem with a new proof, cf. [22, p. 230].

2. The question of the powers of the fundamental ideal was taken up anew (now over Z) by Gruenberg [69], who for a noncyclic free group Γ and Σ ⊴ Γ studied the structure of the lower central series for factor groups Γ/Σ with a view to a deeper understanding of the connections between the commutator structure and the arithmetical structure of Γ/Σ. Essentially relying on “Gruenberg’s Theorem” (cf. Theorem 3.69) he proved that if ind_Γ Σ < ∞ the intersection of the terms of the lower central series of the groups Γ/Σ equals unity if and only if Γ/Σ is primary. The systematic study of the terminal of groups (the stabilization of the powers Δ(Γ, Z)) was begun by Gruenberg and Roseblade in [71], and, independently, by the present author [13, 14]. The technical means applied were different: the computations in [71] were done inside the group algebra itself, while our approach makes use of the language
and technique of the general theory of group representations [35], and the circle of ideas connected with Kaluzhnin’s Theorem [84]. A program for the study of this connection was set up by B. I. Plotkin in [39].

3. The connection of the theme studied with dimension subgroups is well known, cf. [96, Chapter 3], and also [51, 92]. In the author’s paper [14] a group theoretic description of the limit group for finite groups is given. Independently of this, the same description was obtained by Sandling [102]. Based on the technique and results of [71], Hartley [80] gives a complete description of the limit of locally finite groups. An essential role is played here by the abstract characterization of locally finite groups Γ admitting a faithful representation in a group in which there exists an infinite descending invariant Γ-stable series (Hartley [75]). In Sections 3.3.1–3.3.3 a detailed study of these questions was given.

4. The search for a proof of the inclusion R ∩ B ⊂ A for semigroups is apparently difficult, because its success would also mean the solution of the known problem of zero divisors [85] in the class of R-groups. Some hope of progress on this more particular question is based on the following observation. The insulator of an arbitrary non-unity element of an R-group is a torsion-free Abelian group of rank one, and serves as the insulator of each of its non-unity elements, while the insulators of any two elements of an R-group either coincide or intersect at the unity; [22, p. 413]. In any case, this, together with the fact that the absence of zero divisors in a group ring KS of a torsion-free group S is equivalent to the absence in KS of non-zero elements with zero square (cf. [95, p. 176]), allows us to derive, for any R-group, the absence of zero divisors in its group ring with coefficients in a field of characteristic 2.
References

Publications in Russian.

[1] L. A. Bokut. Associative rings, Vol. I, Novosibirsk, 1977.
[2] A. A. Bovdi. The intersection of the powers of the fundamental ideal of an integral group ring. Mat. Zametki 2, 1967, 129–132.
[3] A. A. Bovdi. Group rings. University of Uzhgorod, Uzhgorod, 1974.
[4] A. H. Clifford and G. B. Preston. The algebraic theory of semigroups. Vol. I., 1961. Russian translation: Algebraic theory of semigroups, 1-2, Moscow, 1972.
[5] P. M. Cohn. Free rings and their relations. London Mathematical Society Monographs 2. Academic Press, London, New York, 1971. Russian translation: Mir, Moscow, 1975.
[6] The Dnestrovskiĭ tetrad, Novosibirsk, 1976.
[7] L. Fuchs. Infinite abelian groups, Vol. 1, 2. Academic Press, New York, 1970, 1973. Russian translation: Mir, Moscow, 1974 (Vol. 1), 1977 (Vol. 2).
[8] A. S. Grinberg. On multiplication of varieties of pairs. Sib. Mat. Zh. 14 (6), 1973, 1207–1215.
[9] V. M. Glushkov. Abstract theory of automata. Usp. Mat. Nauk 16 (5), 1961, 3–62.
[10] L. M. Gluskin. Semigroups and rings of endomorphisms of linear spaces. Izv. Akad. Nauk SSSR, Ser. Math. 23, 25, 1959, 1961, 841–870, 809–814.
[11] U. Kaljulaid. On the absence of zero divisors in certain semigroup rings. Acta Comm. Univ. Tartuensis 281, 1971, 49–57. (see [K71a]).
[12] U. Kaljulaid. On the absence of zero divisors in some semigroup rings. In: All Union Colloquium of Algebra, Kishinev, 1971, 138–139. (see [K71c]).
[13] U. Kaljulaid. On the powers of the augmentation ring of the integral group ring for finite groups. Acta Comm. Univ. Tartuensis 281, 1971, 58–62. (see [K71b]).
[14] U. Kaljulaid. On the powers of the augmentation ideal. Proc. Estonian Acad. Sci. Phys. Math. 22, 1973, 3–21. (see [K73c]).
[15] U. Kaljulaid. On wreath type constructions for algebras. In: Abstracts of the Third All Union Symposium of Rings, Algebras and Modules, Tartu, 1976, 49–50. (see [K76]).
[16] U. Kaljulaid. Triangular products of representations of semigroups and associative algebras. Uspehi Mat. Nauk 32, no. 4 (196), 1977, 253–254. (see [K77a] and Sec. 2).
[17] U. Kaljulaid. Remarks on the varieties of semigroup representations and automata. Acta Comm. Univ. Tartuensis 431, 1977, 47–67. (see [K77b]).
[18] R. Kalman, P. L. Falb, and M. A. Arbib. Topics in mathematical system theory. McGraw-Hill Book Co., New York, Toronto, Ont., London, 1969. Russian translation: Mir, Moscow, 1971.
[19] M. I. Kargapolov and Yu. I. Merzlyakov. General group theory, Moscow, 1972. English translation: Fundamentals of the theory of groups, translated from the second Russian edition by Robert G. Burns. Graduate Texts in Mathematics, 62. Springer-Verlag, New York, Berlin, 1979.
[20] A. I. Kostrikin. Introduction to algebra. Nauka, Moscow, 1977. English translation: Springer-Verlag, New York-Berlin, 1982.
[21] A. G. Kurosh. Lectures on general algebra. Fizmatgiz, Moscow, 1962. English translation: Chelsea Pub. Co., New York, 1965.
[22] A. G. Kurosh. Theory of groups, 1967. English translation: Vol. 1-2., Chelsea Pub. Co., New York, 1979.
[23] J. Lambek. Lectures on rings and modules (with an appendix by Ian G. Connell). Blaisdell Pub. Co., Waltham, Toronto, London, 1966. Russian translation: Rings and modules, Mir, Moscow, 1971.
[24] E. S. Lyapin. Semigroups. Fizmatgiz, Moscow, 1960. English translation: In Translations of Mathematical Monographs, Vol. 3, American Math. Soc., 1963.
[25] W. Magnus, A. Karrass, and D. Solitar. Combinatorial group theory. Presentations of groups in terms of generators and relations, 1976. Russian translation: Combinatorial theory of groups, Nauka, Moscow, 1974.
[26] A. I. Mal’cev. On the embedding of associative system in groups. Mat. Sbornik 6, 8, 1939, 1940, 311–336, 251–264.
[27] A. I. Mal’cev. Generalized nilpotent algebras and their associated groups. Mat. Sb., Nov. Ser. 25, 1949, 347–366.
[28] A. I. Mal’cev. Nilpotent semigroups. Uch. Zap. Ivanovskogo Pedinstituta 4, 1953, 107–111.
[29] A. I. Mal’cev. On some classes of infinite solvable groups. Mat. Sb., Nov. Ser. 28 (3), 1951, 567–588.
[30] A. I. Mal’cev. On the multiplication of classes of algebraic systems. Sib. Mat. Zh. 7, 1967, 346–365.
[31] Mathematical Encyclopedia, I. Edited by I. M. Vinogradov. Sovetskaya Encyklopedia, Moscow, 1976.
[32] M. B. Menskiĭ. The method of induced representations: space-time and the particle concept, Moscow, 1976.
[33] A. V. Mikhalev. Isomorphisms of semigroups by endomorphisms of modules. Algebra i Logika 5, 6 (5, 2), 1966, 1967, 59–67, 35–48.
[34] H. Neumann. Varieties of groups. Springer-Verlag, New York, 1967. Russian translation: Mir, Moscow, 1969.
[35] B. I. Plotkin. Groups of automorphisms of algebraic systems. Nauka, Moscow, 1966.
[36] B. I. Plotkin. The triangular product of pairs. P. Stučkas Latvijas Valsts universitātes Zinātniskie raksti (Acta Universitatis Latviensis) 151, 1971, 140–170.
[37] B. I. Plotkin. Radicals and varieties of representations of groups. Latvian mathematics yearbook 10, 1972, 75–131.
[38] B. I. Plotkin. Group varieties and varieties of pairs connected with group representations. Sib. Mat. Zh. 13 (5), 1972, 1030–1053.
[39] B. I. Plotkin. Remarks on stable representations of nilpotent groups. Transactions of the Moscow Math. Soc. 29, 1973, 191–205.
[40] B. I. Plotkin. Radicals in groups, operations on groups and radical classes. In: Book in memory of A. I. Mal’cev, Novosibirsk, 1973. English translation: Am. Math. Soc., Ser. 2, 119, 1983, 89–118.
[41] B. I. Plotkin. Varieties of group representations. Usp. Mat. Nauk 32 (5), 1977, 3–68. English translation: Russian Math. Surveys 32 (1977), no. 5, 1–72.
[42] B. I. Plotkin, C. E. Dididze, and E. M. Kublanova. Varieties of automata. Dokl. Akad. Nauk SSSR 221 (6), 1975, 1284–1287.
[43] B. I. Plotkin and A. S. Grinberg. On groups of varieties and varieties of pairs connected with group representations. Sib. Mat. Zh. 13 (4), 1972, 841–858.
[44] A. Robinson. Introduction to model theory and to the metamathematics of algebra. North-Holland Publishing Co., Amsterdam, 1963. Russian translation: Nauka, Moscow, 1967.
[45] V. I. Shestakov. On a universal method of a symbolic representation of cascadic two step chains. Vestnik Mosk. Univ., Ser. III 18, 1977, 11–19.
[46] L. A. Skornyakov. On the homological classification of monoids. Sib. Mat. Zh. 10 (5), 1969, 1139–1043.
[47] B. I. Spasskiĭ and A. V. Moslovskiĭ. Quantum physics and the dilemma of near action and action on distance. Vestnik Mosk. Univ., Ser. III 18, 1977.
[48] D. I. Suprunenko. Matrix groups, 1972. English translation: (Monographs 45.) American Mathematical Society, Providence, R.I., 1976.
[49] S. M. Vovsi. Semigroup of prevarieties of linear representations of groups. Mat. Sb. (N.S.) 93 (135), 1974, 405–421.
[50] H. Weyl. The classical groups. Their invariants and representations. Princeton University Press, Princeton, N.J., 1939. Russian translation: Gos. Izdat. Inostr. Lit., Moscow, 1947.
[51] A. E. Zaleskiĭ and A. V. Mikhalev. Group rings. In: Contemporary Mathematics, 2. VINITI, Moscow, 1973, 5–118. English translation: J. Sov. Math. 4, 1–78, 1975.
Publications in English.

[52] S. Amitsur. The T-ideals of the free ring. J. London Math. Soc. 30, 1955, 470–475.
[53] B. Banaschewski. On proving the absence of zero divisors for semigroup rings. Canad. Math. Bull. 4, 1961, 225–231.
[54] A. O. Barut and R. Rączka. Theory of group representations and applications. PWN – Polish Scientific Publishers, Warsaw, 1977. Second revised edition: World Scientific, Singapore, 1986.
[55] G. Baumslag. Lecture notes on nilpotent groups. In: Regional Conference Series in Mathematics 2. Am. Math. Soc., Providence, R.I., 1971.
[56] G. Bergman and J. Lewin. The semigroup of ideals of a fir is (usually) free. J. London Math. Soc. 11 (2), 1975, 21–31.
[57] G. Birkhoff. The role of algebra in computing. In: Computers in algebra and number theory, SIAM-AMS Proc., Amer. Math. Soc., Vol. IV, 1971, 1–47.
[58] G. Birkhoff. Current trends in algebra. Am. Math. Monthly 88, 1973, 760–762.
[59] L. S. Bobrow and M. A. Arbib. Discrete mathematics: applied algebra for computer and information science. W. B. Saunders, Philadelphia, 1974.
[60] J. Buckley. On the D-series of a finite group. Proc. Am. Math. Soc. 18, 1967, 185–186.
[61] J. Buckley. Polynomial functions and wreath products. Illinois J. Math. 14, 1970, 274–282.
[62] P. M. Cohn. Factorization in general rings and strictly cyclic modules. J. Reine Angew. Math. 239/240, 1970, 185–200.
[63] I. G. Connell. On the group ring. Canad. J. Math. 15, 1963, 650–685.
[64] M. J. Dunwoody. On product varieties. Math. Zeit. 104, 1968, 91–97.
[65] S. Eilenberg. Algebraic problems in the theory of automata. In: Spezialtagung über algebraische Structuren und ihre Anwendungen, Potsdam, 1970.
[66] S. Eilenberg. Automata, languages and machines, Vol. A, B. Academic Press, New York, London, 1974, 1976.
[67] E. Formanek. A short proof of a theorem of Jennings. Proc. Am. Math. Soc. 26, 1970, 405–407.
[68] A. Fröhlich. Distributively generated near-rings. Proc. London Math. Soc. 8, 1958, 76–108.
[69] K. W. Gruenberg. The residual nilpotence of certain presentations of finite groups. Arch. Math. 13, 1962, 408–417.
[70] K. W. Gruenberg. Cohomological topics in group theory. In: Lecture Notes in Mathematics, Vol. 143. Springer-Verlag, Berlin, New York, 1970.
[71] K. W. Gruenberg and J. Roseblade. The augmentation terminal of certain locally finite groups. Can. J. Math. 24, 1972, 221–238.
[72] P. Hall. Finiteness conditions for soluble groups. Proc. London Math. Soc. 4, 1954, 419–436.
[73] P. Hall. Some sufficient conditions for a group to be nilpotent. Illinois J. Math. 2, 1958, 787–801.
[74] P. Hall and B. Hartley. The stability group of a series of subgroups. Proc. London Math. Soc. 16, 1966, 19–39.
[75] B. Hartley. Locally finite groups embedded in stability groups. J. Algebra 3, 1966, 187–205.
[76] B. Hartley. The stability group of a descending invariant series of subgroups. J. Algebra 5 (2), 1967, 133–156.
[77] B. Hartley. The residual nilpotence of wreath products. Proc. London Math. Soc. 20, 1970, 365–392.
[78] B. Hartley and D. McDougall. Injective modules and soluble groups satisfying the minimal condition for normal subgroups. Bull. Austral. Math. Soc. 4, 1971, 113–135.
[79] B. Hartley. A class of modules over a locally finite group. Bull. Austral. Math. Soc. 14, 1976, 95–110.
[80] B. Hartley. Augmentation powers of locally finite groups. Proc. London Math. Soc. 32, 1976, 1–24.
[81] M. Henle. Dissection of generating functions. Studies in Appl. Math. 51, 1972, 397–410.
[82] N. Jacobson. The structure of rings. Am. Math. Soc., Providence, RI, 1964. Revised edition.
[83] S. A. Jennings. The group ring of a class of infinite nilpotent groups. Canad. J. Math. 7, 1955, 169–187.
[84] L. Kalujnin (Kaluznin). Über gewisse Beziehungen zwischen lineare Gruppen und ihren Automorphismen. In: Bericht über die Mathematiker-Tagung in Berlin. Deutscher Verlag der Wissenschaften, Berlin, 1953, 164–172.
[85] I. Kaplansky. “Problems in the theory of rings” revisited. Am. Math. Monthly 77 (5), 1970, 445–454.
[86] A. Kerber. Representations of permutation groups, I, II. Lect. Notes in Math. 240, 495, 1971, 1975.
[87] J. Knopfmacher. Abstract analytic number theory (North-Holland Mathematical Library, 12). North-Holland Publishing Co., 1975. Second edition: Dover Publications, Inc., New York, 1990.
[88] S. Mac Lane. Extensions and obstructions for rings. Illinois J. Math. 2, 1958, 316–345.
[89] I. Mohamed. On series of subgroups related to groups of automorphisms. Proc. London Math. Soc. 13, 1963, 711–723.
[90] T. S. Motzkin and O. Taussky. On representations of finite groups. Nederl. Akad. Wetensch. Proc. Ser. A 55 (5), 1952, 511–512.
[91] I. Passi. Polynomial maps on groups. J. Algebra 9 (2), 1968, 121–151.
[92] I. Passi. Dimension subgroups. J. Algebra 9 (2), 1968, 152–182.
[93] D. Passman. Infinite group rings (Pure and Applied Mathematics 6). Marcel Dekker, Inc., New York, 1971.
[94] D. Passman. Advances in group rings. Israel J. Math. 19 (1–2), 1974, 67–107.
[95] D. Passman. What is a group ring? Am. Math. Monthly 83 (3), 1976, 173–184.
[96] D. Passman. The algebraic structure of group rings. Wiley-Interscience, New York, 1977.
[97] D. Robinson. Finiteness conditions and generalized soluble groups, Part 2. (Ergebnisse der Mathematik und ihrer Grenzgebiete, Band 63). Springer-Verlag, Berlin, Heidelberg, New York, 1972.
[98] G.-C. Rota. On the foundations of combinatorial theory I. Theory of Möbius functions. Z. Wahrscheinlichkeitstheorie 2, 1964, 340–368.
[99] G.-C. Rota. Baxter algebras and combinatorial identities. Bull. Am. Math. Soc. 75, 1969, 325–334.
[100] G.-C. Rota. On the combinatorics of the Euler characteristic. In: Studies in Pure Mathematics. Academic Press, London, 1971, 221–233.
[101] W. Rudin and H. Schneider. Idempotents in group rings. Duke Math. J. 31, 1964, 585–602.
[102] R. Sandling. Note on the integral group ring problem. Math. Z. 124, 1972, 255–258.
[103] R. Sandling. Dimension subgroups over arbitrary coefficient rings. J. Algebra 21, 1972, 250–265.
[104] E. Schenkman. The splitting of certain solvable groups. Proc. Am. Math. Soc. 6, 1955, 286–290.
[105] P. F. Smith. On the intersection theorem. Proc. London Math. Soc. 21, 1970, 385–389.
4. [K79b] Triangular products and stability of representations. (Author review of Candidate thesis in Physico-Mathematical Sciences)

Translation revised by B. I. Plotkin
The urgency of the theme.²⁴ The group representation is an important mathematical notion with many applications outside algebra. Representations of associative algebras and Lie algebras, semigroups and other algebraic objects are also studied. In the classical theory much attention was given to the problem of decomposition of representations, and the corresponding results on irreducible linear representations of groups and algebras play a great role in algebra. Parallel to the traditional computational apparatus, varieties and the bi-identities of representations as two-sorted systems were applied for the study of individual representations and their systematization.

One of the ways of reducing classes of systems to simpler ones is the introduction and study of composition of classes (for a general formulation, see A. I. Mal’tsev [7]). A forerunner of this approach is the well-known result of A. L. Shmel’kin and H. Neumann (1962) on the multiplication of varieties of groups (freeness of their semigroup), where a fundamental technical role is played by the wreath product of groups. In the case of linear representations of groups, wreath products are replaced by the triangular product of representations, introduced by B. I. Plotkin (1971), which allows one to reduce arbitrary varieties (with a varying group) to indecomposable ones, [10]. Questions of decomposition of varieties and of defining them by means of identities continue to lead to new applications; [15], [21], etc.

The object of this thesis is the decomposition of varieties of semigroups and of algebras, and further, the study of the augmentation ideal²⁵ of the integral group ring. The introduction of the triangular product of linear representations of semigroups and associative algebras makes it possible to prove decomposition theorems about their varieties, which, in turn, will be used in the study of linear automata and algebras.
²⁴ Editors’ note. According to Soviet tradition all Candidate Dissertations were presented as manuscripts. However, Author Reviews based on the Dissertation (maximal length 16 pages) were published before the defence. Besides a description of the paper’s main result, such a review was supposed to contain a special chapter on the importance of the paper, on its novelties and the possibility of applications; further, information about the place and time of the defence; and the names of the opponents and of the so-called Leading Institution, which was supposed to have acquainted itself beforehand with the Dissertation and given it its approval. This Leading Institution was as a rule a scientific establishment, one of whose principal scientific directions of research was connected with the Dissertation’s theme. The Department where the work had been done was not allowed to be the Leading Institution.

²⁵ Translators’ note. Throughout the translation the term fundamental ideal, in the Russian original, has been replaced by augmentation ideal, which is customary in Western literature.

The connection of the results of the dissertation with the theory of varieties of algebras opens up a possibility for a new approach to some problems in this active domain, studied by Soviet as well as foreign authors. In particular, this makes it possible to find the ideal of identities of upper triangular matrices over an arbitrary field; cf. Problem 109 in [2]. From the very beginning, an important role in the theory of group representations was played by group rings, which now constitute an intensively evolving branch of
algebra (the recent survey [12], the lectures [1] and the book [20] are entirely devoted to this topic). Such a rapid development was to a great extent stimulated by the problem of Kaplansky and Mal'cev on group rings. The theme of the third chapter of the paper under review is the application of tools of representation theory to the study of the stabilization of the series of powers of the augmentation ideal in an integral group ring. Such a statement of the problem arises from the conditions implying the triviality of the intersection of the finite powers of the augmentation ideal. The beautiful and deep works of A. I. Mal'cev [5] and K. Gruenberg [16] are devoted to this subject. The main results of the dissertation are contained in Theorems 3.33, 3.43, 3.49 and 3.65, 3.71, 3.74.
The goal of this research. The aim of this dissertation is the study of questions related to the decomposition of the varieties of linear representations and, further, the application of the construction of the triangular product arising in this connection to the study of varieties of algebras, and to the terminal and the limit of groups.
Scientific novelty and practical significance. The main results of this paper are new. We introduce triangular products of representations of semigroups and algebras, investigating their properties and applications to the problem of decomposition of the varieties of the corresponding representations, to the problem of the description of indecomposable varieties of algebras, and further to the determination of the ideal of identities of the algebra of triangular matrices. We study the connection of the results obtained with automata theory. An approach different from direct computational methods is developed in this dissertation; it is based on triangular products and on the technique of stable representations. The paper has a theoretical character.
Its results can be used in the theory of varieties of algebras, in the study of group and semigroup rings, and further in automata theory.
Approval of the thesis. The results of the dissertation were presented at the All Union Algebraic Colloquium (Kishinev, 1971), at the XI All Union Symposium on Ring Theory, Algebras and Modules (Kääriku, 1976); at the Algebraic Seminars of Tartu and Riga (1977); at the Seminars of Higher Algebra and Rings and Modules at Moscow State University; at the Minsk Algebra Seminar and the Combined Seminar of the Department of Algebra and Number Theory of the Latvian State University and the Laboratory of Algebraic Methods of LOMI (1978). The material of the first two chapters was used in lecture courses in automata theory, which the author gave twice at Tartu State University; the main aspects of this course were set forth at the Third Regional Conference-Seminar of leading lecturers of mathematics of the Belorussian, Latvian, Lithuanian and Estonian Soviet Republics and the Kaliningrad Oblast of the Soviet Union (Minsk, 1977).
Size of the thesis. The thesis comprises 142 pages, and has three chapters consisting of 18 sections. The bibliography carries 105 items.
The Contents of the Thesis. In Section 3.1,²⁶ which has a preparatory character, we introduce the operation of the triangular product for representations of semigroups and algebras, and study its properties and its connection with the triangular product for groups. By definition, a representation (G, Γ) is the triangular product of the subrepresentations (A, Σ1) and (B, Σ2) if:
(1) for the subgroup Σ = {Σ1, Σ2} ≤ Γ, the representation (G, Σ) decomposes into the direct product of its subrepresentations (A, Σ1) and (B, Σ2);
(2) in the group Γ there exists a normal divisor Φ such that the subrepresentation (G, Φ) is faithful, and the image of Φ in Aut G coincides with the centralizer of the series 0 ⊂ A ⊂ G;
(3) the group Γ coincides with the semi-direct product Φ ⋋ Σ.
²⁶ Editors' note. Throughout this paper, references to the corresponding section numbers in this volume are used instead of the original ones. For example, Section 3.1 refers to Chapter 1 of the Dissertation itself.
Next, we give a survey of the contents of Section 3.1. Let there be fixed an arbitrary associative and commutative ring K with unity, a K-module G and a semigroup Γ. By a representation of the semigroup Γ we understand a two-sorted system (G, Γ), where there is defined a composition G × Γ → G denoted by ◦ with the following properties: (1) for a fixed γ ∈ Γ the map g → g ◦ γ is a K-endomorphism of the module G, and (2) for all g ∈ G and γ1, γ2 ∈ Γ there holds the identity g ◦ (γ1γ2) = (g ◦ γ1) ◦ γ2. In order to indicate the variable character of the semigroup Γ and the balance of the roles of G and Γ, we introduce for (G, Γ) the term "pair". By a morphism of pairs μ : (G, Γ) → (G′, Γ′) we mean a couple of homomorphisms μ : G → G′ and μ : Γ → Γ′ subject to the condition

(g ◦ γ)^μ = g^μ ◦ γ^μ for all g ∈ G, γ ∈ Γ.
For this category of pairs one introduces, similarly to the case when Γ is a group (cf. [9, Chapter 1]), a series of notions: kernel of a pair, congruence of a pair, subpair, Cartesian product of pairs, Birkhoff class of pairs, etc. In a similar way, one defines pairs where the acting object Γ is an associative algebra. Representations of semigroups and algebras by module endomorphisms are a classical object of study, interest in which persists; see [3], [8], etc.
Let us give the definition of the triangular product for representations of semigroups and algebras. For representations (A, Σ1) and (B, Σ2) of semigroups we interpret the semigroup Φ = Hom⁺_K(B, A) as the centralizer of the series 0 ⊂ A ⊂ A ⊕ B in the semigroup End(A ⊕ B). The natural action of the semigroups Σ1 and Σ2 on Φ makes it possible to define a multiplication on the set Φ × Σ1 × Σ2:

(ϕ, σ1, σ2) · (ϕ′, σ1′, σ2′) = (σ2 · ϕ′ + ϕ · σ1′, σ1σ1′, σ2σ2′).

There arises the semigroup Γ = Φ ⋋ (Σ1 × Σ2); its action on G = A ⊕ B according to the formula

(a + b) ◦ (ϕ, σ1, σ2) = bϕ + a ◦ σ1 + b ◦ σ2

leads to a representation (G, Γ), called the triangular product of the given representations and denoted (A, Σ1) ▽ (B, Σ2). Among the properties of this construction proved in Section 3.1 we mention the following propositions.
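The multiplication rule and the action formula above can be checked on a small concrete case. The following is a minimal sketch, not taken from the thesis: it assumes A = Z² and B = Z written as row vectors, takes Σ1 and Σ2 to be a few integer matrices acting by right multiplication, and represents ϕ : B → A by a 1×2 matrix. All function names and the sample data are hypothetical.

```python
from itertools import product

def matmul(x, y):
    """Product of two matrices given as tuples of row tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(len(y)))
              for j in range(len(y[0])))
        for i in range(len(x))
    )

def matadd(x, y):
    """Entrywise sum of two matrices of the same shape."""
    return tuple(tuple(p + q for p, q in zip(r, s)) for r, s in zip(x, y))

def mul(g, h):
    """(phi, s1, s2) * (phi', s1', s2') = (s2*phi' + phi*s1', s1*s1', s2*s2')."""
    phi, s1, s2 = g
    phi_, s1_, s2_ = h
    return (matadd(matmul(s2, phi_), matmul(phi, s1_)),
            matmul(s1, s1_),
            matmul(s2, s2_))

def act(v, g):
    """(a + b) o (phi, s1, s2) = b*phi + a o s1 + b o s2, as (A-part, B-part)."""
    a, b = v
    phi, s1, s2 = g
    return (matadd(matmul(b, phi), matmul(a, s1)), matmul(b, s2))

# Hypothetical sample data: A = Z^2 (1x2 rows), B = Z (1x1 rows).
phis = [((0, 0),), ((1, 0),), ((2, -1),)]
sigma1 = [((1, 0), (0, 1)), ((1, 1), (0, 1)), ((0, 1), (0, 0))]
sigma2 = [((1,),), ((2,),)]
elements = list(product(phis, sigma1, sigma2))
vectors = [(((1, 0),), ((1,),)), (((3, -2),), ((5,),))]

def is_associative():
    return all(mul(mul(g, h), k) == mul(g, mul(h, k))
               for g in elements for h in elements for k in elements)

def action_is_compatible():
    return all(act(act(v, g), h) == act(v, mul(g, h))
               for v in vectors for g in elements for h in elements)
```

Under these assumptions both checks succeed, which is a quick sanity test that the reconstructed order of factors in the first component (σ2 · ϕ′ + ϕ · σ1′) is the one compatible with the action formula.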
PROPOSITION 3.8. Let there be given an arbitrary faithful pair (G, Γ), a Γ-submodule A of G, and let Σ1, Σ2 be the semigroups of endomorphisms induced by Γ in A and G/A respectively. Then the pair (G, Γ) can be embedded as a subpair in the triangular product (A, Σ1) ▽ (G/A, Σ2).
Let there be given K-algebras Φ and Σ*, where Σ* acts from the left and from the right on Φ, these actions commuting with each other and making Φ a bimodule algebra in the sense of Hochschild. On Γ* = Φ ⊕ Σ* we keep the existing definitions of addition and multiplication by scalars, but define multiplication anew, setting

(ϕ, σ) · (ϕ′, σ′) = (ϕ · σ′ + σ · ϕ′ + ϕϕ′, σσ′).

There arises the K-algebra Φ ⋋ Σ*, the semidirect product of Φ and Σ*. For given pairs (A, Σ1) and (B, Σ2), where the Σi are K-algebras, let (A, Σ1*) and (B, Σ2*) be the corresponding faithful pairs and set G = A ⊕ B. We treat Σ* = Σ1* ⊕ Σ2* as a subalgebra and Φ = Hom_K(B, A) as the annihilator of the series 0 ⊂ A ⊂ G in End_K G. Multiplication in End_K G defines the left and right actions of Σ* on Φ; these actions intertwine and give a bimultiplication on Φ. Setting Σ = Σ1 ⊕ Σ2, we obtain a natural epimorphism f : Σ → Σ*, which allows us to "understand" the action of Σ* on Φ as an action of Σ on Φ. We arrive at the K-algebra Φ ⋋ Σ = Γ, the action of which on G = A ⊕ B is defined by the formula (a + b) ◦ (ϕ, σ) = bϕ + (a + b) ◦ σ. This action agrees with the operations in Γ. There arises the pair (G, Γ), which is the triangular product of the pairs (A, Σ1) and (B, Σ2) of representations of algebras. We likewise denote this pair by (A, Σ1) ▽ (B, Σ2).
One can speak of a cryptomorphism (in the parlance of G. Birkhoff) of the "theories" of the three constructions noted, within the limits of the list of properties that are exploited in the proofs in the following chapter. Apparently, the reason for this phenomenon is the existence of a general (category-theoretic) construction, of which the three indicated ones are presentations.
As an example of this correlation we mention the following.
PROPOSITION 3.21. Let there be given the pairs of semigroup representations (A, Σ1) and (B, Σ2), and let (G, Γ) be their triangular product. The acting semigroup Γ is a group if and only if Σ1 and Σ2 are groups and the semigroup Φ = Hom⁺_K(B, A) can be treated as a group. If this condition is fulfilled, the pair (G, Γ) is isomorphic to the triangular product of (A, Σ1) and (B, Σ2) viewed as group pairs.
The list of properties just noted, of the construction introduced, is applicable if K is a field. This requirement relates also to the results of Section 3.2. The following two chapters of our thesis are devoted to applications of the tools indicated. Such a structure of presentation of the material has been chosen in order to underline the independent value of the notions introduced, beyond the use made of them in this paper.
Section 3.2 is devoted to the arithmetic properties of classes of linear representations (over the field K) of semigroups and likewise algebras. The main result of the Section is the "formula of generating of representations" Var(𝒦1) · Var(𝒦2) = Var(𝒦1 ▽ 𝒦2), which is valid for arbitrary classes 𝒦1 and 𝒦2 of representations of semigroups (representations of algebras). This is the content of Theorems 3.33 and 3.43 in the dissertation;
an essential part of the corresponding proofs consists of an analysis of the form of the bi-identities satisfied by triangular products of pairs. As an application we obtain facts about the structure of the semigroups of the corresponding varieties of representations.
We now pass to the definitions required. A variety of representations of semigroups (algebras) is a saturated Birkhoff class of corresponding pairs. By definition, a class 𝒦 is saturated if for any epimorphism of pairs (G, Γ) → (G, Γ′) it follows from (G, Γ′) ∈ 𝒦 that (G, Γ) ∈ 𝒦. The variety generated by the class of pairs 𝒦 will be denoted Var(𝒦). Multiplication of the varieties Θ1 and Θ2 is defined by the rule: the pair (G, Γ) is contained in Θ1 · Θ2 if there is in G an invariant submodule H such that (H, Γ) ∈ Θ1 and (G/H, Γ) ∈ Θ2. There arises a semigroup M(K) (a semigroup L(K)) of varieties of representations of semigroups (algebras).
THEOREM 3.35. Each variety of linear representations (over the field K) can be uniquely decomposed into a product of finitely many indecomposable varieties.
Here the indecomposability of a variety means that it cannot be written as the product (in the semigroup M(K)) of two non-trivial factors. In this Section we introduce and study the semigroup of varieties of linear automata. A linear automaton is a partial extension of linear systems. The exact definition reads as follows. A linear semigroup automaton 𝔄 = (A, Γ, B) is a 3-sorted algebraic system where A (the states) and B (the outputs) are K-modules, Γ (the inputs) is a semigroup, and there are given K-linear operations ◦ : A × Γ → A and ∗ : A × Γ → B such that (A, Γ) is a (linear) pair with respect to the action ◦ and one has a ∗ (γ1γ2) = (a ◦ γ1) ∗ γ2 for all a ∈ A and γ1, γ2 ∈ Γ.
THEOREM 3.37. The semigroup of varieties of linear automata (over K) is not free but contains a maximal free subsemigroup isomorphic to M(K).
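The defining identity a ∗ (γ1γ2) = (a ◦ γ1) ∗ γ2 of a linear semigroup automaton can be illustrated by a small sketch. The construction below is only one hypothetical instance, not the general notion: it assumes the states are integer row vectors in Z², the inputs are 2×2 integer matrices under multiplication, and the output operation is obtained by applying a fixed readout matrix C after the state transition.

```python
def matmul(x, y):
    """Product of two matrices given as tuples of row tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(len(y)))
              for j in range(len(y[0])))
        for i in range(len(x))
    )

# Hypothetical readout matrix from the states Z^2 to the outputs Z.
C = ((1,), (2,))

def step(a, gamma):
    """State transition a o gamma: the row vector a times the matrix gamma."""
    return matmul(a, gamma)

def out(a, gamma):
    """Output a * gamma: the new state read out through C (an assumed form)."""
    return matmul(step(a, gamma), C)

# Hypothetical sample inputs (elements of the semigroup) and states.
inputs = [((1, 1), (0, 1)), ((0, 1), (0, 0)), ((2, 0), (1, 1))]
states = [((1, 0),), ((0, 1),), ((3, -2),)]

def check_automaton_law():
    """Verify a * (g1 g2) == (a o g1) * g2 on all sample data."""
    return all(out(a, matmul(g1, g2)) == out(step(a, g1), g2)
               for a in states for g1 in inputs for g2 in inputs)
```

In this special case the identity is just associativity of matrix multiplication; general linear automata need not have an output operation of this factorized form.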
In the case of algebras the formula of generators of representations gives information about the semigroup L(K). We establish a fact analogous to Theorem 3.35, just described, for varieties of representations of algebras. From this we derive, using the known connection between T-ideals and varieties of algebras, on the one hand, and the connection of the latter with varieties of representations of algebras, on the other hand, the following.
THEOREM 3.46 (Bergman-Lewin [14]). The semigroup of T-ideals of a free countably infinitely generated K-algebra is free.
This approach not only places the Bergman-Lewin theorem and its proof in line with the results on varieties of representations of groups, but also gives supplementary information about varieties of algebras which is hard to discover in the language of T-ideals.
THEOREM 3.49. If the K-algebra A is semi-simple (in the sense of Jacobson), then the variety of algebras var A generated by A is indecomposable.
Turning to the connection between T-ideals and varieties, we obtain from the formula of generators of representations the following.
THEOREM 3.52. The ideal Tn of identities of the algebra of upper triangular matrices of order n over the field K coincides with T1^n, where T1 is the ideal of identities of K.
For char K = 0 this statement reduces to the well-known result of Yu. N. Mal'cev (1971), to the effect that the ideal Tn is generated by the polynomials [x1, x2] · [x3, x4] ·
. . . · [x_{2n−1}, x_{2n}], where [x, y] = xy − yx, while for char K > 0 it constitutes the answer to Question 3 in [2].
Section 3.3 contains results obtained by application of the technique of the theory of representations (in particular, their triangular product) to the study of the stabilization of the powers of the augmentation ideal of the integral group ring. Let ZΓ be the integral group ring of the group Γ. The augmentation ideal Δ in the ring ZΓ is the kernel of the augmentation homomorphism ZΓ → Z. Let us set Δ^ν = Δ^{ν−1} · Δ for ν not a limit ordinal, and Δ^ν = ∩_{μ<ν} Δ^μ for ν a limit ordinal. In particular, Δ^ω = ∩_n Δ^n; here the letter ω stands for the first infinite ordinal. There arises the decreasing system of ideals

ZΓ ⊃ Δ ⊃ Δ² ⊃ · · · ⊃ Δ^ν ⊃ Δ^{ν+1} ⊃ . . .   (29)
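For a concrete feeling for the series (29), consider the simplest non-trivial case Γ = C2 = {1, g}, which is not treated explicitly in the text and is used here only as an illustration. An element a + bg of ZC2 is stored as the pair (a, b). In this case Δ consists of the integer multiples of g − 1, and the sketch below verifies that (g − 1)^n = (−2)^{n−1}(g − 1), so that Δ^n = 2^{n−1}Δ.

```python
def mul(x, y):
    """Multiply a + b*g and c + d*g in Z[C2], using g*g = 1."""
    a, b = x
    c, d = y
    return (a * c + b * d, a * d + b * c)

def power(x, n):
    """The n-th power of x in Z[C2], for n >= 1."""
    result = x
    for _ in range(n - 1):
        result = mul(result, x)
    return result

def augmentation(x):
    """The augmentation homomorphism Z[C2] -> Z, sending a + b*g to a + b."""
    return x[0] + x[1]

D = (-1, 1)  # the element g - 1, which spans the augmentation ideal

def delta_power_matches(n):
    """Check that (g - 1)^n equals (-2)^(n-1) * (g - 1)."""
    coeff = (-2) ** (n - 1)
    return power(D, n) == (coeff * D[0], coeff * D[1])
```

Since the finite powers Δ^n = 2^{n−1}Δ are strictly decreasing and intersect in zero, the series for C2 stabilizes exactly at Δ^ω = 0.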
In this Section we investigate the stabilizing index (as a function of ν) of the series (29), that is, the ordinal τ from which on one has Δ^τ = Δ^{τ+1} = . . . We shall use the following notation and terminology: τ = τ(Γ) is the terminal of the group Γ; D_ν = Γ ∩ (1 + Δ^ν) is the ν-th (generalized) dimension subgroup of Γ; D_∞ = Γ ∩ (1 + Δ^{τ(Γ)}) is the limit dimension subgroup (shorter, the limit) of Γ. The acting object Γ of the pair (G, Γ), studied in Sections 3.1–3.3, is a group, while the domain of action G is a Z-module. In other words, here we have a pair which is a group representation by automorphisms of an Abelian group. Let G0 = G, and let G1 = [G, Γ] be the Z-module generated in G by all elements [g, γ] = −g + g ◦ γ, g ∈ G, γ ∈ Γ, and define by induction G_ν = [G_{ν−1}, Γ] for a non-limit ν, and G_ν = ∩_{μ<ν} G_μ for a limit ordinal number ν. The series

G ⊃ G1 ⊃ · · · ⊃ G_ν ⊃ G_{ν+1} ⊃ . . .   (30)
is called the lower stable series of the pair (G, Γ). For example, for the regular pair (ZΓ, Γ) the series (30) coincides with (29). Together with Γ, the whole ring ZΓ acts on G; in particular, we can write [g, γ] = g ◦ (γ − 1). For ν = 1, 2, 3, . . . we have G_ν = G ◦ Δ^ν; for infinite ν, however, we have only G ◦ Δ^ν ⊂ G_ν. If σ is the stabilizing index of the series (30), then we say also that σ is the length of the series (30).
THEOREM 3.65. If in the pair (G, Γ) the group Γ is nilpotent, while the module G contains a Γ-Artinian Γ-submodule D such that G/D is Γ-Noetherian, then the length of the lower stable series of the pair does not exceed ω.
From Theorem 3.65 one derives as consequences the following results.
THEOREM 3.66 (B. I. Plotkin). If in the pair (G, Γ) the group Γ is nilpotent, while the module G is Γ-Noetherian, then the length of the lower stable series of the pair does not exceed ω.
THEOREM 3.67 (P. Smith). The terminal of a Noetherian nilpotent group equals ω.
THEOREM 3.68. The terminal of a complete Artinian Abelian group equals two. If Γ is a non-complete Artinian group, then τ(Γ) = ω.
The first statement of Theorem 3.68 is mentioned for the completeness of the picture. This is known, as are all facts concerning groups with a finite terminal; see e.g. [17]. Below we shall call non-trivial only the case of groups with infinite terminal. The triangular product and the possibility mentioned above of interpreting the series (29) as the lower stable series of the regular pair (ZΓ, Γ) make it possible to clear up the issue of the possible values of the terminal in the class of finite groups: in the non-trivial case these
are all ordinal numbers τ subject to the condition ω ≤ τ < ω·2. This unexpected result agrees with the more general result of Gruenberg and Roseblade [17] on the terminal of a class of locally finite groups, but was obtained independently in the author's papers [25, 26]. The facts about the terminals of finite groups follow as consequences from the main results of this Section, formulated as Theorems 3.71 and 3.74. Let us now set them forth. For given pairs (A, P) and (B, Q) let us consider their triangular product (G, Γ) = (A, P) ▽ (B, Q) and let G_ν be the terms of the lower stable series for the pair (G, Γ). Furthermore, let B* be the subgroup of all Q-invariant points of B, and set Bk = [B, Q; k] and Ak = [A, P; k], k ∈ N. Let us fix two prime numbers p and q, p ≠ q, and impose the following conditions on the pairs (A, P) and (B, Q):
(a) A is an Abelian p-group, while the pair (A, P) is finitely stable; more exactly, A_{n−1} ≠ 0 = A_n for some n ∈ N.
(b) The Abelian groups B and B/B1 are free, B* appears as a direct summand in B, all B1/Bk are q-groups and ∩_{k=1}^∞ Bk = 0.
In this notation and under these assumptions the following holds true.
THEOREM 3.71. One has the relations G_ω = A and G_{ω+n−1} > G_{ω+n} = 0; if we assume in addition that A is a vector space over a field of characteristic p and Q is a finite q-group, then one has G ◦ Δ^ω = A.
This result is supplemented by the following.
THEOREM 3.74. Let there be given a representation (G, Γ) of the group Γ by automorphisms of the Z-module G, whose periodic part B is Γ-Artinian, the factor module G/B being Γ-Noetherian. Suppose that all metanilpotent factor groups of Γ are periodic and that the factor group Γ/∩_n Γ_n is nilpotent. Then, if in G one has a Γ-stable decreasing series of length ≤ ω·n (n ∈ N) descending to zero, then the lower stable series of (G, Γ) stabilizes to zero at a term of number < ω·2.
On the basis of the results indicated on the terminal, a group-theoretic description of the limit of finite groups is obtained. A major role in this is played by the class of N^(2)-groups: these are subdirect products of biprimary AN-groups (as usual, we denote by A the class of Abelian groups and by N the class of nilpotent groups). Each N^(2)-group admits a faithful representation in a finitely generated Abelian group whose lower stable series has a length not exceeding ω + n for some n ∈ N. In this dissertation there is given a simple proof of a result of Hartley [18], which amounts to a reduction of N^2-groups to groups of a rather special form, for which the required representation is constructed with the aid of the triangular product. The description of the limit is given by the following.
THEOREM 3.80. The limit of a finite group with an infinite terminal is the least of its normal divisors with the property that the factor group is an N^2-group.
This result is contained in the author's paper [26]. In a somewhat different form it was obtained by R. Sandling [22], and later by Hartley [19], who extended it to locally finite groups.
Let us, however, add that our knowledge of the terminal of locally finite groups is still rather fragmentary; cf. [17] and [19].
Questions about the terminal and the limit can be formulated also for semigroups. In Section 3.3.4 there is a theorem showing that the most used fact in this Section, namely that the finite stability of a faithful group action implies its nilpotence, extends to semigroups. Let us introduce the classes S^n (n > 4) of pairs-representations (G, Γ), where the semigroup Γ is represented by quasi-endomorphisms of an (arbitrary) group G. By definition, (G, Γ) ∈ S^n if in the group G there exists an increasing Γ-admissible invariant series

1 = G0 < G1 < · · · < G_{i−1} < G_i < · · · < G_m = G,  m ≤ n,   (31)
such that the kernels of all the pairs (G_i/G_{i−1}, Γ), i = 1, 2, . . . , are unit congruences on Γ. The representation (G, Γ) is called n-stable if (G, Γ) ∈ S^n and if in the distributively generated near-ring E(G) of endo-isomorphisms of the group G there exists an element θ which commutes in E(G) with all differences α^f − β^f, α, β ∈ Γ, the series (31) is θ-invariant, while in the factors of this series the elements of Γ have the same effect as θ. We have the following generalization of Kaluzhnin's theorem.
THEOREM 3.82. If a faithful representation of a semigroup by endomorphisms of some group is n-stable, then this semigroup is nilpotent of class at most n − 1 in the sense of Mal'cev.
For semigroups locally nilpotent in the sense of Mal'cev we generalize in the same section a result of B. Banaschewski [13] on zero divisors of the semigroup ring of a commutative semigroup.
Acknowledgement. The author is obliged to Prof. B. I. Plotkin for supervising this work, and for his valuable advice and interesting discussions.

References
[1] A. A. Bovdi. Group rings. University of Uzhgorod, Uzhgorod, 1974.
[2] The Dnestrovskiĭ tetrad, Novosibirsk, 1976.
[3] L. M. Gluskin. Semigroups and rings of endomorphisms of linear spaces. Izv. Akad. Nauk SSSR, Ser. Math. 23, 25, 1959, 1961, 841–870, 809–814.
[4] A. G. Kurosh. Theory of groups, 1967. English translation: Vol. 1-2, Chelsea Pub. Co., New York, 1979.
[5] A. I. Mal'cev. Generalized nilpotent algebras and their associated groups. Mat. Sb., Nov. Ser. 25, 1949, 347–366.
[6] A. I. Mal'cev. Nilpotent semigroups. Uch. Zap. Ivanovskogo Pedinstituta 4, 1953, 107–111.
[7] A. I. Mal'cev. On the multiplication of classes of algebraic systems. Sib. Mat. Zh. 7, 1967, 346–365.
[8] A. V. Mikhalev. Isomorphisms of semigroups by endomorphisms of modules. Algebra i Logika 5, 6 (5, 2), 1966, 1967, 59–67, 35–48.
[9] B. I. Plotkin. Varieties of group representations. Usp. Mat. Nauk 32 (5), 1977, 3–68. English translation: Russian Math. Surveys 32 (1977), no. 5, 1–72.
[10] B. I. Plotkin and A. S. Grinberg. On groups of varieties and varieties of pairs connected with group representations. Sib. Mat. Zh. 13 (4), 1972, 841–858.
[11] D. I. Suprunenko. Matrix groups, 1972. English translation: (Monographs 45.) American Mathematical Society, Providence, R.I., 1976.
[12] A. E. Zaleskiĭ and A. V. Mikhalev. Group rings. In: Contemporary Mathematics, 2. VINITI, Moscow, 1973, 5–118. English translation: J. Sov. Math. 4, 1–78, 1975.
[13] B. Banaschewski. On proving the absence of zero divisors for semigroup rings. Canad. Math. Bull. 4, 1961, 225–231.
[14] G. Bergman and J. Lewin. The semigroup of ideals of a fir is (usually) free. J. London Math. Soc. 11 (2), 1975, 21–31.
[15] S. Eilenberg. Automata, languages and machines, Vol. A, B. Academic Press, New York, London, 1974, 1976.
[16] K. W. Gruenberg. The residual nilpotence of certain presentations of finite groups. Arch. Math. 13, 1962, 408–417.
[17] K. W. Gruenberg and J. Roseblade. The augmentation terminal of certain locally finite groups. Can. J. Math. 24, 1972, 221–238.
[18] B. Hartley. Locally finite groups embedded in stability groups. J. Algebra 3, 1966, 187–205.
[19] B. Hartley. Augmentation powers of locally finite groups. Proc. London Math. Soc. 32, 1976, 1–24.
[20] D. Passman. The algebraic structure of group rings. Wiley-Interscience, New York, 1977.
[21] G.-C. Rota. Baxter algebras and combinatorial identities. Bull. Am. Math. Soc. 75, 1969, 325–334.
[22] R. Sandling. Note on the integral group ring problem. Math. Z. 124, 1972, 255–258.
Publications of the author on the theme of the dissertation
[23] U. Kaljulaid. On the absence of zero divisors in certain semigroup rings. Acta Comm. Univ. Tartuensis 281, 1971, 49–57. (see [K71a]).
[24] U. Kaljulaid. On the absence of zero divisors in some semigroup rings. In: All Union Colloquium of Algebra, Kishinev, 1971, 138–139. (see [K71c]).
[25] U. Kaljulaid. On the powers of the augmentation ring of the integral group ring for finite groups. Acta Comm. Univ. Tartuensis 281, 1971, 58–62. (see [K71b]).
[26] U. Kaljulaid. On the powers of the augmentation ideal. Proc. Estonian Acad. Sci. Phys. Math. 22, 1973, 3–21. (see [K73c]).
[27] U. Kaljulaid. On wreath type constructions for algebras. In: Abstracts of the Third All Union Symposium of Rings, Algebras and Modules, Tartu, 1976, 49–50. (see [K76]).
[28] U. Kaljulaid. Triangular products of representations of semigroups and associative algebras. Uspehi Mat. Nauk 32, no. 4 (196), 1977, 253–254. (see [K77a] and Sec. 2).
[29] U. Kaljulaid. Remarks on the varieties of semigroup representations and automata. Acta Comm. Univ. Tartuensis 431, 1977, 47–67. (see [K77b]).
[30] U. Kaljulaid. Remarks on the course on discrete mathematics. In: Proc. of the III Regional Conference-Seminar of Leading Departments and Leading Lecturers of Mathematics, Minsk, 1977, 50. (see [K77c]).
5. [K87a] Some remarks on Shevrin's problem
Edited with the help of K. Kaarli
5.1. Preliminary remarks
Let us begin by recalling some notions. A semigroup S with zero is said to be nilpotent if there exists n ∈ N such that all products with n or more factors in S are zero. The least among these numbers n is called the class of the nilpotent semigroup S. A semigroup S is said to be nil if for any element x in S there exists a number n(x) ∈ N such that x^{n(x)} = 0; the least of such numbers n(x) is called the nilpotency index of x. Clearly all subsemigroups of a nilpotent semigroup are also nilpotent. But the same is not obvious for subsemigroups of a nil semigroup. A non-nilpotent semigroup with all its proper subsemigroups nilpotent will be called critical. In [2] it is shown that among non-nil semigroups there are only few critical ones. But even now the question about the existence of critical nil semigroups remains open. This question was posed by L. Shevrin [2]. He answered it negatively for commutative nil semigroups. It is quite natural to look for noncommutative generalizations of these results of Shevrin. From known facts about nil ideals in [4] we obtain quite easily the following.
THEOREM 5.1. Critical nil semigroups cannot be finitely generated.
To give further results we need some more definitions. A semigroup S is said to be left duo if for all u, v ∈ S there exists v′ ∈ S such that uv = v′u. If, in addition, there exists u′ ∈ S with uv = vu′, then S is called a duo semigroup. A semigroup is said to be left subduo (subnilpotent) if all its proper subsemigroups are left duo (nilpotent). Examples of such semigroups can be found in [3]. In her diploma work [5] Riina Miljan answered Shevrin's question for locally duo semigroups. The announcement [1] shows a continuing interest in this line of research. So we present here a negative answer to Shevrin's question for left duo nil semigroups. As noticed already, for the nilpotency of a semigroup S it is clearly necessary for S to be subnilpotent.
Our result is that subnilpotency of a left subduo nil semigroup S is also sufficient for S to be nilpotent.
THEOREM 5.2. Every subnilpotent left subduo nil semigroup is nilpotent.
From this theorem the non-existence of commutative critical nil semigroups follows immediately. However, the proof of our Theorem 5.2 is nothing more than a noncommutative version of Shevrin's original argument in [2]. The interaction of Theorem 5.1 with the results of [1] and [3] shows, of course, a close interconnection of our results with those of Miljan and Katzman. Unfortunately, there exist no published versions of the results referred to, and so we prefer to give an independent presentation of this theme.
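The notions of nilpotency class and nilpotency index recalled in this section can be made concrete for a finite semigroup given by its multiplication. The following sketch is illustrative only: it assumes the cyclic nil semigroup {0, a, a², . . . , a^{h−1}} with a^h = 0, coded so that the integer i > 0 stands for a^i and 0 stands for the zero element; all function names are hypothetical.

```python
def make_cyclic_nil(h):
    """Cyclic nil semigroup {0, a, ..., a^(h-1)} with a^h = 0; i encodes a^i."""
    elems = list(range(h))

    def mul(x, y):
        if x == 0 or y == 0 or x + y >= h:
            return 0
        return x + y

    return elems, mul

def nilpotency_class(elems, mul, zero=0):
    """Least n with all products of n factors equal to zero, or None if none exists."""
    current = set(elems)  # the set S^n of all products of n factors, starting at n = 1
    n = 1
    while current != {zero}:
        nxt = {mul(x, y) for x in current for y in elems}
        if nxt == current:  # the chain S > S^2 > ... has stabilized above zero
            return None
        current = nxt
        n += 1
    return n

def nilpotency_index(x, mul, zero=0, bound=1000):
    """Least n with x^n = zero, or None if no such n <= bound is found."""
    p, n = x, 1
    while p != zero:
        if n >= bound:
            return None
        p = mul(p, x)
        n += 1
    return n
```

For h = 4 the computation gives class 4, with nilpotency index 4 for a and 2 for a², in line with the definitions above. The same functions can be applied to any finite multiplication supplied as a Python function.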
5.2. On the number of generators of critical nil semigroups
Let us suppose that there exists a critical nil semigroup S. It appears that to solve Shevrin's problem it is convenient to consider separately the following two a priori possible cases: (a) S is not finitely generated, (b) S has a finite system of generators. Our aim in this section is to show that (b), actually, is not possible, i.e. to prove Theorem 5.1.
PROOF OF THEOREM 5.1. Assume on the contrary that there exists a critical nil semigroup S generated by a1, . . . , an. Then [4, Lemma VIII 4.1] says that there exist bj, j = 1, . . . , n, all having the same common factor am ∈ {a1, . . . , an}, and such that the subsemigroup T = ⟨b1, . . . , bn⟩ is not nilpotent. Since S is critical, it follows that T = S. For any subset H ⊆ S let Ann H = {z ∈ S | ∀x ∈ H, xz = 0}. It is clear that Ann S ⊆ Ann T. For S critical it appears that Ann S ≠ Ann T.
Denote by h the nilpotency index of am and let h > 2. Then am^{h−1} ≠ 0 and so am^{h−2} ∉ Ann S. Also, every element of T is a finite product of elements bj = bj′am, and so am^{h−1} ∈ Ann T. Then it follows that bj am^{h−2} = bj′ am^{h−1} = 0 for all j ∈ {1, . . . , r}. Consequently, am^{h−2} ∈ Ann T. But this is a contradiction because T = S and am^{h−2} ∉ Ann S. In the case h = 2 it follows from bj am = bj′ am² = bj′ 0 = 0 that am ∈ Ann T. Supposing that am ∈ Ann S we obtain that all bj = bj′am = 0, i.e., T = 0, and this contradicts the fact that T is non-nilpotent. Therefore Ann S ≠ Ann T, but this is impossible because T = S.
5.3. Some lemmas about nil semigroups
The result in the previous section allows us to assume, in what follows, that our critical nil semigroup S is not finitely generated. For convenience of reference we state the following easy
LEMMA 5.3. Let u be a nonzero element in a nil semigroup S with nilpotency index h. Then ⟨u⟩ = {0, u, u², . . . , u^{h−1}}.
The following two lemmas are contained in [2].
LEMMA 5.4. A nonzero element of a nil semigroup cannot be a proper factor of itself.
LEMMA 5.5. If S is a critical semigroup then S = S².
LEMMA 5.6. A finitely generated left duo nil semigroup is nilpotent.
PROOF. Let T = ⟨t1, . . . , tm⟩ be a left duo nil semigroup. Then there exist ni ∈ N such that ti^{ni} = 0. Let us denote n = Σ_i ni and show that T^n = 0. Indeed, it is clear that every product s = s1 s2 · · · sn, si ∈ T, contains one of the generators ti = ti(s), i ∈ {1, . . . , m}, at least ni times, say k times. As T is left duo, we have s = (. . .)1 ti (. . .)2 . . . ti . . . = (. . .)1 (. . .)2 . . . ti^k = u ti^k for some u ∈ T, while from k ≥ ni it follows that ti^k = 0. So we have s = 0.
LEMMA 5.7. For any elements u and v in a left duo semigroup S and for any k ∈ N there exists wk ∈ S such that (uv)^k = wk · u^k.
PROOF. By repeated application of the left duo property of S it follows that there exist elements w0, w1, . . . in S such that

(uv)^k = uvuv . . . uv (k times) = w0 u² v . . . uv = w0 w1 u³ v . . . uv = · · · = w0 w1 . . . w_{k−1} u^k = wk u^k.

For the free semigroup Fm with free generators f1, . . . , fm we denote by F(m, n) the Rees factor semigroup Fm/Fm^n. The semigroup F(m, n) is called the free m-generated nilpotent semigroup. An easy combinatorial consideration shows that
LEMMA 5.8. The semigroup F(m, n) is finite.
Denote by f(m, n) the number of elements in F(m, n).
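The combinatorial consideration behind Lemma 5.8 can be sketched as follows. The sketch assumes that Fm has exactly m free generators and that Fm^n is the ideal of all words of length at least n, so that the non-zero elements of F(m, n) are the words of length 1, . . . , n − 1. Words are stored as tuples of generator indices and the zero element as None; the function names are hypothetical.

```python
from itertools import product

def nonzero_elements(m, n):
    """All words of length 1..n-1 over m generators, as tuples of indices."""
    return [w for k in range(1, n) for w in product(range(m), repeat=k)]

def mul(u, v, n):
    """Product in F(m, n): concatenation, collapsing to zero (None) at length >= n."""
    if u is None or v is None:
        return None
    w = u + v
    return w if len(w) < n else None

def f(m, n):
    """Number of elements of F(m, n): the non-zero words plus the zero element."""
    return len(nonzero_elements(m, n)) + 1

def f_closed_form(m, n):
    """The same count written as 1 + m + m^2 + ... + m^(n-1)."""
    return 1 + sum(m ** k for k in range(1, n))
```

Under these assumptions f(m, n) = 1 + m + m² + · · · + m^{n−1}; for instance, two generators and n = 3 give the seven elements 0, f1, f2, f1f1, f1f2, f2f1, f2f2.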
5.4. Left duo versions of Shevrin’s lemma

Let $S$ be a critical semigroup. Then Lemma 5.5 tells us that $S = S^2$, and so for any $x \in S$ there exist elements $a_1, a_2, \dots$ and $b_1, b_2, \dots$ in $S$ such that
$$x = a_1 b_1, \quad b_1 = a_2 b_2, \quad b_2 = a_3 b_3, \quad \dots, \quad b_{k-1} = a_k b_k, \quad \dots$$
Consequently, $x = a_1 b_1 = a_1 a_2 b_2 = \dots = a_1 a_2 \dots a_k b_k = \dots$, and so to each $x \in S$ we have associated an infinite sequence $\{a_k\}$ of its factors; in [2] such a sequence is called an $x$-sequence. In this section our aim is to prove the following lemma.

LEMMA 5.9. Let $S$ be a left subduo nil semigroup which is not finitely generated and such that $S = S^2$. Then for each nonzero element $x \in S$ there exists an $x$-sequence $\{a_k\}$ such that $a_1 \notin \langle a_2, a_3, \dots, a_m \rangle$ for all $m = 2, 3, \dots$.

PROOF. The proof runs by induction over $m$.

1. We begin with the case $m = 2$. Let $x = a_1 b_1$. We show that there exists a factorization $b_1 = a_2 b_2$ such that $a_1 \notin \langle a_2 \rangle$. Suppose that $b_1 = u_1 v_1$ and let $h$ be the nilpotency index of $u_1$. There are two possibilities. First, if $a_1 \notin \langle u_1 \rangle$, then take $a_2 = u_1$. Second, if $a_1 \in \langle u_1 \rangle$, then $a_1 = u_1^{k_1}$, and for a factorization $v_1 = u_2 v_2$ we consider the element $u_1 u_2$. Again, two possibilities can occur: $a_1 \notin \langle u_1 u_2 \rangle$ or $a_1 \in \langle u_1 u_2 \rangle$. In the first case take $a_2 = u_1 u_2$. In the second case we have $a_1 = (u_1 u_2)^{k_2}$ and for a factorization $v_2 = u_3 v_3$ we consider the element $u_1 u_2 u_3$. Continuing in this way, it can be shown that there exists an $r \in \mathbb{N}$ such that $a_1 \notin \langle u_1 u_2 \cdots u_r \rangle$. Suppose, on the contrary, that $a_1 \in \langle u_1 \rangle$, $a_1 \in \langle u_1 u_2 \rangle$, $\dots$, i.e. that
$$a_1 = u_1^{k_1} = (u_1 u_2)^{k_2} = \dots = (u_1 u_2 \dots u_r)^{k_r} = \dots.$$
CHAPTER I. REPRESENTATIONS OF SEMIGROUPS AND ALGEBRAS
All the subsemigroups $\langle u_1 u_2 \rangle, \langle u_1 u_2 u_3 \rangle, \dots, \langle u_1 u_2 \dots u_r \rangle$ are proper, because $S$ is not finitely generated, and so these subsemigroups are all left duo. It follows from Lemma 5.7 that there exist elements $w_2, \dots, w_r \in S$ such that
$$(u_1 u_2)^{k_2} = w_2 u_1^{k_2}, \quad (u_1 \cdot u_2 u_3)^{k_3} = w_3 u_1^{k_3}, \quad \dots, \quad (u_1 \cdot u_2 \dots u_r)^{k_r} = w_r u_1^{k_r}.$$
Note that here all $k_i < h$. Indeed, having $k_i \ge h$ for some $i$ implies that $a_1 = (u_1 \cdot u_2 \dots u_i)^{k_i} = 0$, but this contradicts $0 \ne x = a_1 b_1$. Lemma 5.3 shows that among the non-zero elements $u_1^{k_1}, u_1^{k_2}, \dots, u_1^{k_h}$ there are at least two which are equal to each other. Let $u_1^{k_i} = u_1^{k_j}$. Because $1 \le k_i, k_j \le h$, this gives $k_i = k_j$; without loss of generality we may assume that $i < j$. Denoting $c = u_1 u_2 \cdots u_i$ and $d = u_{i+1} \cdots u_j$, we obtain $a_1 = c^{k_i} = (c \cdot d)^{k_i}$. On the other hand, observe now that the previous calculations were done in the subsemigroup $P = \langle u_1, u_2, \dots, u_i, \dots, u_j \rangle$, which is proper in $S$, as $S$ is not finitely generated. So it follows that this subsemigroup is left duo. Two cases are possible: either $cd = c$ or there exists $\tilde{d} \in P$ such that $cd = \tilde{d} c$. In this last case there exists (by Lemma 5.7) an element $\tilde{w} \in P$ such that $c^{k_i} = (cd)^{k_i} = \tilde{w} c^{k_i}$. On the other hand, according to Lemma 5.4 the non-zero element $c$ (or $c^{k_i}$) cannot be its own proper factor, so we come to a contradiction in both cases. From the above considerations it follows that for some $r < h$ we have $a_1 \notin \langle u_1 u_2 \dots u_r \rangle$. We take $a_2 = u_1 u_2 \dots u_r$ and $b_2 = v_r$. The deduction in Case 1 is now complete.

2. Suppose now that there are elements $a_2, \dots, a_m$ in $S$ such that $x = a_1 a_2 \dots a_m b_m$ and $a_1 \notin \langle a_2, \dots, a_m \rangle$. Then we prove that there exists a factorization $b_m = a_{m+1} b_{m+1}$ with $a_1 \notin \langle a_2, \dots, a_m, a_{m+1} \rangle$. Let $b_m = y_1 z_1$. If $a_1 \notin \langle y_1, a_2, \dots, a_m \rangle$, take $a_{m+1} = y_1$. If $a_1 \in \langle y_1, a_2, \dots, a_m \rangle$, then $a_1$ may be written in the form
$$a_1 = y_1^{k_1^{(1)}} a_2^{k_2^{(1)}} \dots a_m^{k_m^{(1)}},$$
where clearly $k_1^{(1)} > 0$ and otherwise $k_i^{(1)} \ge 0$. We get such a representation for $a_1$ in the following way. Starting with $a_1 \in \langle y_1, a_2, \dots, a_m \rangle$, i.e. $a_1 = s_0(y_1, a_2, \dots, a_m)$, we utilize, recursively for $i = 0, 1, \dots, m - 3$, the left duo property of the subsemigroup $\langle y_1, a_2, \dots, a_{m-i} \rangle$ of $S$ to extract the whole power $a_{m-i}^{k_{m-i}^{(1)}}$ to the right,
$$s_i(y_1, a_2, \dots, a_{m-i}) = s_{i+1}(y_1, a_2, \dots, a_{m-i-1})\, a_{m-i}^{k_{m-i}^{(1)}}.$$
Observe also that
$$s_{m-2}(y_1, a_2) = s_{m-1}(y_1)\, a_2^{k_2^{(1)}} = y_1^{k_1^{(1)}} a_2^{k_2^{(1)}}.$$
This process of extracting all powers of a fixed generator $a_{m-i}$ to the right is finite, because $a_1 \ne 0$ in the nil semigroup $S$. Furthermore, take a factorization $z_1 = y_2 z_2$ and consider the element $y_1 y_2$. If $a_1 \notin \langle y_1 y_2, a_2, \dots, a_m \rangle$, take $a_{m+1} = y_1 y_2$. In the case $a_1 \in \langle y_1 y_2, a_2, \dots, a_m \rangle$ repeat the procedure described above, starting with $a_1 = s_0(y_1 y_2, a_2, \dots, a_m) \in \langle y_1 y_2, a_2, \dots, a_m \rangle$, and obtain
$$a_1 = (y_1 y_2)^{k_1^{(2)}} a_2^{k_2^{(2)}} \cdots a_m^{k_m^{(2)}}$$
with $k_1^{(2)} > 0$ and otherwise $k_i^{(2)} \ge 0$. Continuing in this way, we see that, after a finite number of such procedures, we obtain an element $y_1 y_2 \dots y_r$ such that $a_1 \notin \langle y_1 y_2 \dots y_r, a_2, \dots, a_m \rangle$. To prove this assertion observe that the subsemigroup $F = \langle y_1, a_2, \dots, a_m \rangle$ is nilpotent (by Lemma 5.6). So $F$ is an epimorphic image of $F(m, n)$, $n$ being the index of nilpotency of $F$, and it follows from Lemma 5.8 that $|F| \le f(m, n)$. Denote $s = f(m, n)$ and suppose that $a_1$ is contained in all subsemigroups $\langle y_1, a_2, \dots, a_m \rangle$, $\langle y_1 y_2, a_2, \dots, a_m \rangle$, $\dots$, $\langle y_1 y_2 \dots y_s, a_2, \dots, a_m \rangle$. Then we have
$$a_1 = y_1^{k_1^{(1)}} a_2^{k_2^{(1)}} \dots a_m^{k_m^{(1)}} = (y_1 y_2)^{k_1^{(2)}} a_2^{k_2^{(2)}} \dots a_m^{k_m^{(2)}} = \dots = (y_1 y_2 \dots y_s)^{k_1^{(s)}} a_2^{k_2^{(s)}} \dots a_m^{k_m^{(s)}},$$
with $k_1^{(j)} > 0$ for all $j = 1, 2, \dots, s$. The subsemigroup $Y = \langle y_1 y_2 \dots y_s \rangle$ is left duo. Therefore, by Lemma 5.7, $Y$ contains elements $w_{k_1^{(2)}}, \dots, w_{k_1^{(s)}}$ such that
$$a_1 = y_1^{k_1^{(1)}} a_2^{k_2^{(1)}} \dots a_m^{k_m^{(1)}} = w_{k_1^{(2)}}\, y_1^{k_1^{(2)}} a_2^{k_2^{(2)}} \dots a_m^{k_m^{(2)}} = \dots = w_{k_1^{(s)}}\, y_1^{k_1^{(s)}} a_2^{k_2^{(s)}} \dots a_m^{k_m^{(s)}}.$$
Since the semigroup $F$ is an epimorphic image of $F(m, n)$, it follows that among the non-zero elements $y_1^{k_1^{(t)}} a_2^{k_2^{(t)}} \cdots a_m^{k_m^{(t)}}$, $t = 1, 2, \dots, s$, in $F$ there are at least two which are equal as words in the alphabet $\{y_1, a_2, \dots, a_m\}$. From this it follows again that $k^{(i)} = k^{(j)}$ for some $i < j$ and so we obtain
$$a_1 = \dots = (y_1 y_2 \dots y_i)^{k_1^{(i)}} a_2^{k_2^{(i)}} \dots a_m^{k_m^{(i)}} = \dots = (y_1 y_2 \dots y_i y_{i+1} \dots y_j)^{k_1^{(i)}} a_2^{k_2^{(i)}} \dots a_m^{k_m^{(i)}}. \tag{32}$$
According to Lemma 5.4, $y_1 y_2 \dots y_i = y_1 y_2 \dots y_i \dots y_j$ is impossible. Therefore it follows from Lemma 5.7 that for some $\tilde{w}_{k_1^{(i)}} \in Y$ one has
$$(y_1 y_2 \dots y_i y_{i+1} \dots y_j)^{k_1^{(i)}} = \tilde{w}_{k_1^{(i)}}\, (y_1 y_2 \dots y_i)^{k_1^{(i)}}.$$
From (32) we now get
$$(y_1 y_2 \dots y_i)^{k_1^{(i)}} a_2^{k_2^{(i)}} \dots a_m^{k_m^{(i)}} = \tilde{w}_{k_1^{(i)}}\, (y_1 y_2 \dots y_i)^{k_1^{(i)}} a_2^{k_2^{(i)}} \dots a_m^{k_m^{(i)}}.$$
But this equality again contradicts Lemma 5.4. We deduce that for some $r \le s = f(m, n)$ one must have $a_1 \notin \langle y_1 y_2 \dots y_r, a_2, \dots, a_m \rangle$. Taking $a_{m+1} = y_1 y_2 \dots y_r$ and $b_{m+1} = z_r$, we get the desired result. The induction argument is completed and so Lemma 5.9 is proved.
5.5. Proof of Theorem 5.2

Suppose on the contrary that there exists a critical left subduo nil semigroup $S$. Then by Theorem 5.1 the semigroup $S$ is not finitely generated. From Lemma 5.5 it follows that $S = S^2$. We shall prove that $S$ must have a non-nilpotent proper subsemigroup. Take any nonzero element $x \in S$ and let $\{a_k\}$ be an $x$-sequence as in Lemma 5.9. Denote $S_m = \langle a_2, \dots, a_m \rangle$, $m = 2, 3, \dots$, and let $S_* = \cup_{m \ge 2} S_m$. Then clearly $S_*$ is a left subduo subsemigroup of $S$. Observe that from $x \ne 0$ it follows that $a_2 \dots a_m \ne 0$ for all $m \ge 2$, so $S_*$ is not nilpotent. In view of Lemma 5.9 we have $a_1 \notin S_m$ for all $m \ge 2$. Therefore the subsemigroup $S_*$ in $S$ is proper. Consequently, we have found a proper non-nilpotent subsemigroup $S_*$ in $S$, which contradicts the fact that $S$ is critical. Theorem 5.2 is proved.

References
[1] S. I. Katsman. On subgroups whose all proper subgroups are nilpotent. In: XVIII All Union Conference of Algebra, abstracts of talks, Part I, Kishinev, 1985.
[2] L. N. Shevrin. On subgroups whose all proper subgroups are nilpotent. Sib. Mat. Zh. 2 (6), 1961, 936–942.
[3] A. Cherubini and A. Varisco. On subgroups whose all proper subgroups are nilpotent. Czechoslovak Math. J. 34, 1984, 630–644.
[4] N. Jacobson. The structure of rings. Am. Math. Soc., Providence, RI, 1964. Revised edition.
[5] R. Miljan. Some structure theorems concerning subgroups. Diploma work at Tartu University, Tartu, 1973.
6. [K90] Transferable elements in group rings
This paper has two aims. First, we give a detailed presentation of some arguments in [13], so that it can serve as a basic reference in further publications of the author on semigroup rings. Second, we indicate some new applications of the notion of transferable elements of a ring. In particular, the description by S. V. Mihovski [14] of strictly regular group rings is here obtained as a consequence of Menal’s theorem [13]. For this it is important to emphasize that the answer to the question whether a given group ring $k[G]$ is a duoring or not depends not only on the group theoretic structure of $G$ and the characteristic of the field $k$, but also on other arithmetic and algebraic circumstances. Examples of this kind are not very numerous in the theory of group rings.
6.1. Preliminary results

1. Let $R$ be an associative ring with unity. An element $x \in R$ is called right (left) $R$-transferable if $Rx \subseteq xR$ ($xR \subseteq Rx$).²⁷ If all elements of $R$ are right (left) transferable, then $R$ is called a right (left) duoring (or a right (left) subcommutative ring). This notion was introduced by Feller [6] in 1958, and was subsequently studied by Barbilian, Koch, Kurter and others. Transferable elements of semigroups and rings are also studied by Cohn [5]. A major role in the study of the arithmetic of noncommutative rings is played by those transferable elements which are not zero divisors of $R$. Such elements are called invariant elements of $R$. They generate a semigroup denoted by $I(R)$.

2. We remark that in a right duoring all right ideals are two-sided. Indeed, let $I \le R$ be a right ideal in a right duoring $R$. Then for all $i \in I$, $r \in R$ there exists $r' \in R$ such that $ri = ir'$ and so $ri \in I$. This shows that $I$ is a left ideal in $R$. Analogously, in a left duoring each left ideal is two-sided. Consequently, in a duoring all ideals are two-sided. The converse is also true: if in a ring $R$ with unity each right (left) ideal is two-sided, then it is a right (left) duoring. Indeed, if, for example, each right ideal in $R$ is two-sided, then, in particular, the principal right ideal $xR$ is two-sided for each element $x \in R$, i.e. $R \cdot xR \subseteq xR$, which implies that $Rx \subseteq Rx \cdot R \subseteq xR$. Consequently, we have $Rx \subseteq xR$ for any $x \in R$. In the same way we can argue in the “left” case. As a result we arrive at the conclusion that a right (left) duoring may be defined as an associative ring with unity in which every right (left) ideal is two-sided. Koch defines these rings in this way.

²⁷ Translator’s note. Here $Rx$ is the left principal ideal generated by $x \in R$; similarly $xR$ stands for the right principal ideal.
3. In the case when $R$ is the group ring of a group $G$ over a field $k$ let us mention two major points.

First, if $R = k[G]$ is a right duoring it is at the same time a left duoring and vice versa. For the proof, we shall use the anti-isomorphism $*$ of $R$, $(\sum_g x_g g)^* \overset{\text{def}}{=} \sum_g x_g g^{-1}$. For example, if $xR \subseteq Rx$ we have a relation of the type $xy = y' \cdot x$, which gives $y^* \cdot x^* = x^* \cdot (y')^*$. Taking into account that $y^*$ runs through the whole of $k[G]$, we conclude that $Rx^* \subseteq x^* R$. Analogously, we establish the implication $Rx \subseteq xR \implies x^* R \subseteq Rx^*$. Thus, for group rings the notions of right and left duorings do not differ from each other. At the same time there exist right duorings which are not left duorings, and vice versa; even among finite dimensional $k$-algebras one has such examples, as found by R. Kurter (1982).

Second, if for all finitely generated subgroups $H \le G$ the ring $k[H]$ is a duoring, then also $k[G]$ is duo. Indeed, take any two elements $x = \sum_{g \in G} x_g g$ and $y = \sum_h y_h h$ in $k[G]$. Consider the subgroup $H \overset{\text{def}}{=} \langle \operatorname{supp} x \cup \operatorname{supp} y \rangle$; it is finitely generated. In view of our assumption, $k[H]$ is subcommutative. Therefore, for its elements $x$ and $y$ there exists $y'$ such that $xy = y'x$. From this follows the subcommutativity of $k[G]$. Also the converse holds true: if $k[G]$ is subcommutative then also the subrings $k[H]$ for all finitely generated subgroups $H \le G$ are subcommutative. In order to prove this statement, we fix a complete system of representatives $T = \{t_i \mid i \in I\}$ in the decomposition $G = \cup_{t \in T} tH$ of $G$ into cosets with respect to $H$; we assume that $t_0 = 1$. This makes it possible to view $k[G]$ as a right $k[H]$-module with basis $T$; therefore each element $x$ in $k[G]$ may be represented uniquely in the form $\sum_i t_i z_i$, $z_i \in k[H]$. For arbitrary $x$ and $y$ from $k[H]$ there exists (as $k[G]$ is subcommutative) an element $y' = \sum_i t_i y_i$, $y_i \in k[H]$, such that
$$xy = y' \cdot x = \Bigl(\sum_i t_i y_i\Bigr) \cdot x = \sum_i t_i (y_i x).$$
We remark that $xy$ and all elements $y_i x$ lie in $k[H]$, while the elements $t_i$, $i \in I$, give a basis for the $k[H]$-module $k[G]$. Therefore $xy = \sum_i t_i (y_i x)$ implies that all coefficients $y_i x$ with $i \ne 0$ vanish. Hence, $xy = y_0 x$, $y_0 \in k[H]$. This argument shows that $k[H]$ is subcommutative.

4. Let us consider the group ring $R = k[G]$ of the non-Abelian group $G$ over the field $k$, making the assumption that $R$ is a duoring. In a duoring all ideals are two-sided. Consequently, this is also true for all (right) ideals in $R$ of the form $\omega H$, $H \le G$, generated by all elements $h - 1$, $h \in H$. This implies that all subgroups $H$ in $G$ are invariant:
$$\forall g \in G,\ h \in H: \quad 1 - g^{-1} h g = g^{-1}(1 - h) g \in \omega H \implies g^{-1} h g \in H.$$
Non-Abelian groups in which all subgroups are invariant are called Hamiltonian; their structure is well-known: $G$ is the direct product of a group $V$ of quaternions of order 8, an Abelian group of exponent 2, and an Abelian group $A_1$, all of whose elements have odd order; [7, p. 190 (213)].²⁸

²⁸ Translator’s note. Page references in [3], [4], [7], [16], etc. are to the English original, with those in the Russian translation used by the author within parentheses.
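The smallest Hamiltonian group is the quaternion group $V$ itself, and the defining property can be checked by brute force. The following is a minimal sketch, assuming the standard model of $V$ as the eight units $\pm 1, \pm i, \pm j, \pm k$; the encoding as (sign, index) pairs and all function names are illustrative choices, not notation from the text.

```python
from itertools import product

# Products of the basis units 1, i, j, k (indices 0..3), stored as (sign, index).
BASIS_PRODUCT = {
    (0, 0): (1, 0), (0, 1): (1, 1), (0, 2): (1, 2), (0, 3): (1, 3),
    (1, 0): (1, 1), (1, 1): (-1, 0), (1, 2): (1, 3), (1, 3): (-1, 2),
    (2, 0): (1, 2), (2, 1): (-1, 3), (2, 2): (-1, 0), (2, 3): (1, 1),
    (3, 0): (1, 3), (3, 1): (1, 2), (3, 2): (-1, 1), (3, 3): (-1, 0),
}

Q8 = [(s, a) for s in (1, -1) for a in range(4)]
ONE = (1, 0)


def mul(x, y):
    """Multiply two signed units."""
    sign, idx = BASIS_PRODUCT[(x[1], y[1])]
    return (x[0] * y[0] * sign, idx)


def inverse(x):
    """Inverse of a unit, found by search in the finite group."""
    return next(y for y in Q8 if mul(x, y) == ONE)


def generated(gens):
    """Subgroup generated by gens (closure under products suffices in a finite group)."""
    elems = {ONE, *gens}
    while True:
        new = {mul(x, y) for x in elems for y in elems} - elems
        if not new:
            return frozenset(elems)
        elems |= new


# Every subgroup of this group is generated by at most two elements.
subgroups = {generated(pair) for pair in product(Q8, repeat=2)}


def is_normal(H):
    """Check g^{-1} h g in H for all g in the group and h in H."""
    return all(mul(mul(inverse(g), h), g) in H for g in Q8 for h in H)


all_normal = all(is_normal(H) for H in subgroups)
non_abelian = any(mul(x, y) != mul(y, x) for x in Q8 for y in Q8)
```

With $a = i$ and $b = j$ the elements also satisfy the relations $a^4 = 1$, $a^2 = b^2$, $ba = a^{-1}b$ used for $V$ in the next subsection.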
Consequently we can assume that $G = A \times V$, where $A$ is the product of the Abelian factors and $V$ the group of quaternions,
$$V \overset{\text{def}}{=} \langle a, b \mid a^4 = 1,\ a^2 = b^2,\ ba = a^{-1} b \rangle.$$

5. It turns out that in the case $\operatorname{char} k \ne 2$, $k[V]$ is the direct composition of two subrings, $k[V] = P(k) \oplus V(k)$, where $P(k) = k \oplus k \oplus k \oplus k$ is the direct composition of four fields isomorphic to $k$, and $V(k)$ is the algebra of quaternions with respect to the pair $(-1, -1)$ over the field $k$; [3, p. (300)]. It follows at once from the definitions that $P \oplus V(k)$ is subcommutative precisely when $V(k)$ is subcommutative, because $P$ is commutative as a direct sum of fields. At the same time one has the following alternative: if $\operatorname{char} k \ne 2$, the algebra $V(k)$ is either an sfield (for this it is necessary and sufficient that the form $x_0^2 + x_1^2 + x_2^2 + x_3^2$ does not represent zero non-trivially in the field $k$) or $V(k) \cong M_2(k)$²⁹; [3, p. (267)]. We remark that in an arbitrary sfield and for arbitrary nonzero elements $x$ and $y$ one has $x \cdot y = y \cdot (y^{-1} x y)$ and $xy = (x y x^{-1}) \cdot x$, and that in the algebra $M_2(k)$ all matrices of the form $\begin{pmatrix} a & 0 \\ b & 0 \end{pmatrix}$ form a left ideal, but not a right ideal. This argument shows that the sfield is subcommutative, but $M_2(k)$ is not.

6. It follows from the previous subsection that the abundance of transferable elements in a group ring $k[G]$ is influenced not only by the structure of $G$ and $\operatorname{char} k$ but also by some other arithmetic and algebraic circumstances for $k$ and $G$. Relying on the alternative indicated in Subsection 5 we shall give yet another example. It is a well-known fact that $V(k)$ is an sfield for $k = \mathbb{Q}$. However, already $k = \mathbb{Q}(i)$ admits a non-trivial representation of zero, as $0 = 1^2 + i^2 + 0^2 + 0^2$ in $\mathbb{Q}(i)$. Hence $V(\mathbb{Q}(i))$ is not an sfield. But then in view of our alternative $V(\mathbb{Q}(i)) \cong M_2(\mathbb{Q}(i))$, i.e. $V(\mathbb{Q}(i))$ is not a duoring.

7. A field $k$, $\operatorname{char} k = p > 2$, contains the prime subfield $\mathbb{Z}_p$. If $k[G]$ is a duoring, then in view of the relation $k[G] \cong k \otimes_{\mathbb{Z}_p} \mathbb{Z}_p[G]$ the ring $\mathbb{Z}_p[G]$ must be duo too. This again implies the subcommutativity of $V(\mathbb{Z}_p)$, in view of the formula $\mathbb{Z}_p[G] \cong P(\mathbb{Z}_p) \oplus V(\mathbb{Z}_p)$.
Hence, an obstruction to the subcommutativity of the ring $k[G]$ is the fact that $V(\mathbb{Z}_p)$ is not duo. We observed above that for $p \ne 2$ such a condition appears as the representability of zero in $\mathbb{Z}_p$ by the form $x_0^2 + x_1^2 + x_2^2 + x_3^2$. It follows from a well-known theorem of Lagrange in Number Theory that each (prime) number $p > 2$ admits a representation in integers $p = c_0^2 + c_1^2 + c_2^2 + c_3^2$; moreover, it is known that not all of the relations $c_i \equiv 0 \pmod{p}$, $i \in \{0, 1, 2, 3\}$, are fulfilled. In other words, in $\mathbb{Z}_p$ the element $\bar{0}$ is represented by the form $x_0^2 + x_1^2 + x_2^2 + x_3^2$ in the following way: $\bar{0} = \bar{c}_0^2 + \bar{c}_1^2 + \bar{c}_2^2 + \bar{c}_3^2$, and there is an index $i$ such that $\bar{c}_i \ne 0$. As a result, we see that the group ring $k[G]$ for $\operatorname{char} k \notin \{0, 2\}$ cannot be a duoring.
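The representability of zero by the form $x_0^2 + x_1^2 + x_2^2 + x_3^2$ in $\mathbb{Z}_p$ is easy to confirm by exhaustive search for small primes. The sketch below is only an illustration of the statement and plays no role in the argument; the function name `nontrivial_zero` is hypothetical.

```python
from itertools import product


def nontrivial_zero(p):
    """Return (c0, c1, c2, c3), not all zero mod p, with c0^2+c1^2+c2^2+c3^2 = 0 mod p.

    Returns None if no such tuple exists. The search is exhaustive over Z_p^4,
    so it is only meant for small p.
    """
    for c in product(range(p), repeat=4):
        if any(c) and sum(x * x for x in c) % p == 0:
            return c
    return None
```

For every odd prime the search succeeds, in agreement with the conclusion of Subsection 7, whereas over $\mathbb{Q}$ a sum of four squares vanishes only trivially.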
29Translator’s note. The algebra of 2 × 2 matrices with entries in k.
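The remark in Subsection 5 about $M_2(k)$ can be tested on a finite sample. The sketch below uses $2 \times 2$ matrices with entries in $\{-1, 0, 1\}$, multiplied as integer matrices; this sample is an assumption made for illustration and does not replace the general one-line computation. It checks that the matrices with vanishing second column are stable under left multiplication and exhibits a right multiplication that leaves the set.

```python
from itertools import product


def matmul(A, B):
    """Product of two 2x2 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2))
        for i in range(2)
    )


def in_L(M):
    """True if the second column of M is zero."""
    return M[0][1] == 0 and M[1][1] == 0


values = range(-1, 2)
all_matrices = [((a, b), (c, d)) for a, b, c, d in product(values, repeat=4)]
L = [M for M in all_matrices if in_L(M)]

# Left multiplication keeps the second column zero.
left_closed = all(in_L(matmul(R, M)) for R in all_matrices for M in L)

# Right multiplication does not: pick the first pair (M, R) with M*R outside L.
right_counterexample = next(
    (M, R) for M in L for R in all_matrices if not in_L(matmul(M, R))
)
```

So the set is a left ideal that is not two-sided, which by Subsection 2 is exactly what rules out $M_2(k)$ as a duoring.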
120
C HAPTER I. REPRESENTATIONS OF SEMIGROUPS AND ALGEBRAS
6.2. Menal’s theorem, the case of characteristic 0

8. From Menal’s results [13] it is possible to obtain a description of the subcommutative group rings. This can be formulated as follows.

THEOREM 6.1. Let $k$ be a field and $G$ a non-Abelian group. The group ring $k[G]$ is a duoring if and only if one of the following two conditions is fulfilled:
(1) $\operatorname{char} k = 0$, $G$ is a Hamiltonian group, $G = E \times A_1 \times V$, and for each odd $n$ which is the order of some $x \in A_1$, the quaternion algebra $V(k(\xi_n))$ is an sfield; here $\xi_n$ is a primitive $n$-th root of unity over $k$;
(2) $\operatorname{char} k = 2$ and $G$ is a Hamiltonian group of the form $G = A_1 \times V$, where $A_1$ is an Abelian group, with all of its elements having odd order, while $k$ and all fields $k(\xi_n)$ have no primitive cubic roots of unity; here $\xi_n$ is a primitive $n$-th root of unity over $k$, for $n = o(x)$ for the element $x \in A_1$.

PROOF. A sufficiently detailed and as self-contained as possible proof of this theorem will be given below for fields of characteristic 0. The case of fields of characteristic 2 will be given in a subsequent paper by the author³⁰. Theorem 6.1 was first proved in [13].

9. NECESSITY. Let $\operatorname{char} k = 0$. Assume that $k[G]$ is subcommutative. Then $G$ is Hamiltonian and for each finitely generated subgroup $H \le G$ the ring $k[H]$ is subcommutative. Consider now subgroups of the form $H = \langle x \rangle \times V$, $x \in A_1$; let us denote $n = o(x)$. In view of [work] by Deskins and others (cf. [17, p. (48)])³¹,
$$k[\langle x \rangle] \cong \bigoplus_{d \mid n} k(\xi_d),$$
where $\xi_d$ is a primitive $d$-th root of unity. Therefore, we have
$$k[\langle x \rangle \times V] \cong k[\langle x \rangle] \otimes_k k[V] \cong \Bigl[\bigoplus_{d \mid n} k(\xi_d)\Bigr] \otimes_k \bigl[(k \oplus k \oplus k \oplus k) \oplus V(k)\bigr] = \dots \oplus (k(\xi_d) \otimes_k V(k)) \oplus \dots = \dots \oplus V(k(\xi_d)) \oplus \dots$$
The subcommutativity of the ring $k[\langle x \rangle \times V]$ implies the subcommutativity of the factors $V(k(\xi_d))$ in the direct composition for all $d$, $d \mid o(x)$, $x \in A_1$. In particular, this means that all the algebras $V(k(\xi_n))$, $n = o(x)$, $x \in A_1$, are sfields.

³⁰ Translator’s note. A promise that, apparently, was not fulfilled.
³¹ Translator’s note. Probably, W. E. Deskins, cf. the book [18].

10. Before passing to the proof of the sufficiency of condition (1), we recall a fact on group rings of Abelian groups that is needed in what follows. Namely, let $k$ be a field of characteristic $\ne 2$, and let $C_2 = \langle t \mid t^2 = 1 \rangle$ be a cyclic group of order 2. The following relations hold true:
(1) $k[C_2] \cong k \oplus k$;
(2) $k[C_2 \times C_2] \cong k \oplus k \oplus k \oplus k$, and, in general, for an elementary Abelian 2-group $E^* = C_2 \times \dots \times C_2$ ($n$ factors) it holds that $k[E^*] \cong \underbrace{k \oplus \dots \oplus k}_{2^n \text{ times}}$.

Indeed,
(1) in the group ring $k[C_2]$ the elements $\tfrac{1}{2}(1 + t)$ and $\tfrac{1}{2}(1 - t)$ are orthogonal idempotents. Hence, $k[C_2] \cong \tfrac{1}{2}(1 + t) \cdot k[C_2] \oplus \tfrac{1}{2}(1 - t) \cdot k[C_2]$. It remains to remark that the maps corresponding to the first and the second factor in this direct decomposition over $k$, given by the formulae $\tfrac{1}{2}(1 + t)(\alpha + \beta t) \mapsto \alpha + \beta$ and $\tfrac{1}{2}(1 - t)(\alpha + \beta t) \mapsto \alpha - \beta$, respectively, give isomorphisms (of rings).
(2) We remark that the reasoning in (1) also yields $k[C_2 \times C_2] = k[C_2][C_2] \cong k[C_2] \oplus k[C_2] \cong (k \oplus k) \oplus (k \oplus k) \cong k \oplus k \oplus k \oplus k$. Continuing with similar reasoning, we arrive at the general case.

11. SUFFICIENCY. Let $\operatorname{char} k = 0$. Assume that $k$ and $G$ satisfy condition (1). This means that $G$ is a Hamiltonian group, that $G = E \times A_1 \times V$ and that all algebras $V(k(\xi_n))$, $n = o(x)$, $x \in A_1$, are sfields. Let us show that $k[G]$ is a duoring. To this end, it suffices to verify that the group rings (over $k$) of all non-Abelian finitely generated subgroups in $G$ are duorings. But each such subgroup of $G$ is finite and has the form $\langle V, c_1, \dots, c_k \rangle$ with elements $c_1, \dots, c_k$ from the centralizer of the subgroup $V$. Such a subgroup can be presented in the same form as $G$ itself, $\langle V, c_1, \dots, c_k \rangle = E^* \times A_1^* \times V$, where $E^*$ is an elementary Abelian 2-group, while $A_1^*$ is a finite Abelian group of odd order; [7, p. (215)]. Let us further remark that for the direct sum of commutative rings $K = K_1 \oplus K_2$ and an arbitrary group $G$ we have $K[G] \cong K_1[G] \oplus K_2[G]$, because the map
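$$\sum_i (\tau_1^{(i)}, \tau_2^{(i)})\, g_i \longmapsto \Bigl(\sum_i \tau_1^{(i)} g_i,\ \sum_i \tau_2^{(i)} g_i\Bigr)$$
yields the desired isomorphism. We have
$$k[E^* \times A_1^* \times V] \cong k[E^*][A_1^* \times V] \cong \bigl(\underbrace{k \oplus \dots \oplus k}_{2^m \text{ times}}\bigr)[A_1^* \times V] \cong \underbrace{k[A_1^* \times V] \oplus \dots \oplus k[A_1^* \times V]}_{2^m \text{ times}}.$$
Consequently, the subcommutativity of the ring $k[E^* \times A_1^* \times V]$ follows from the subcommutativity of the direct factor $k[A_1^* \times V]$. We have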
$$\begin{aligned}
k[A_1^* \times V] &\cong k[A_1^*] \otimes_k k[V] \cong k[A_1^*] \otimes_k (k \oplus k \oplus k \oplus k \oplus V(k)) \\
&\cong (k[A_1^*] \otimes_k k) \oplus \dots \oplus (k[A_1^*] \otimes_k k) \oplus (k[A_1^*] \otimes_k V(k)) \\
&\cong k[A_1^*] \oplus \dots \oplus k[A_1^*] \oplus \Bigl(\bigl(\bigoplus_d m_d\, k(\xi_d)\bigr) \otimes_k V(k)\Bigr) \\
&\cong k[A_1^*] \oplus \dots \oplus k[A_1^*] \oplus \dots \oplus V(k(\xi_d)) \oplus \dots.
\end{aligned}$$
In these calculations we have employed the result of Deskins and others (cf. [17, p. (48)]): there is a ring isomorphism $k[A_1^*] \cong \bigoplus_d m_d\, k(\xi_d)$; in this formula $\xi_d$ is a primitive $d$-th root of unity, the number $m_d \cdot [k(\xi_d) : k]$ counts the number of elements of order $d$ in $A_1^*$, while $m_d\, k(\xi_d)$ means the direct composition of $m_d$ copies of the ring $k(\xi_d)$. The first four factors in the composition of rings obtained in the course of these computations are commutative, while the last factors $V(k(\xi_d))$ are sfields, as the conditions (1) of Theorem 6.1
are fulfilled; at the same time, they are duorings. But then also the whole direct composition k[E ∗ × A∗1 × V ] is a duoring, from which, in view of the second remark in Subsection 6.1.3, follows the subcommutativity of the entire ring k[G]. These reasonings conclude the proof of Theorem 6.1 for fields k of characteristic zero.
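The decomposition $k[C_2] \cong k \oplus k$ from Subsection 10 can be checked numerically. The sketch below takes $k = \mathbb{Q}$ and represents $\alpha + \beta t$ as the pair $(\alpha, \beta)$; this representation and the names `mul` and `components` are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction


def mul(x, y):
    """Multiply (a + b t)(c + d t) in Q[C_2], using t^2 = 1."""
    a, b = x
    c, d = y
    return (a * c + b * d, a * d + b * c)


def add(x, y):
    """Add two elements of Q[C_2]."""
    return (x[0] + y[0], x[1] + y[1])


half = Fraction(1, 2)
e_plus = (half, half)    # (1 + t)/2
e_minus = (half, -half)  # (1 - t)/2
one = (Fraction(1), Fraction(0))
zero = (Fraction(0), Fraction(0))


def components(x):
    """Image of alpha + beta t in k (+) k, namely (alpha + beta, alpha - beta)."""
    a, b = x
    return (a + b, a - b)
```

The map `components` turns multiplication in the group ring into componentwise multiplication of pairs, which is the ring isomorphism described in item (1) of Subsection 10; the division by 2 is where the hypothesis $\operatorname{char} k \ne 2$ enters.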
6.3. Transferable elements in regular rings

12. Let us recall that an associative ring $R$ is called strictly regular if for all $a \in R$ there exists an element $x \in R$ such that $a = a^2 x$. Characterizations of such rings have been given by Andrunakievich (1964), Luh (1964), Shain (1966), Lajos–Szász (1970), and others. It is easy to see that a strictly regular ring does not contain nilpotent elements, and is regular. That these two last conditions are equivalent to the strict regularity of the ring is the content of the theorem of Forsythe–McCoy: a regular ring without nilpotent elements is strictly regular. Let us add that in view of Lemma 6.2 given below, it is sufficient, for the proof of this theorem, to show that a regular ring $R$ without nilpotent elements is a duoring.

The previous statement can be proved as follows. Let $a \in R$; there exists $b \in R$ such that $a = aba$. Let us denote $e = ab$; one verifies immediately that for each $r \in R$ one has $0 = (er - ere)^2$, which yields $er = ere$. In an analogous way one shows that $re = ere$. On the other hand, one can show that $aR = eR$ and $Ra = Re$. Consequently, $aR = Ra$ for each $a \in R$.

13. LEMMA 6.2. Strictly regular rings are regular duorings and vice versa.

PROOF. If the ring $R$ is strictly regular, then (in view of [1, Theorems 3.2 and 3.4]) $R$ is regular and subcommutative. Conversely, if $R$ is a regular duoring, then for each $a \in R$ there exist $y \in R$ such that $a = aya$, and $x \in R$ such that $ya = ax$. Now we have $a = a(ya) = a(ax) = a^2 x$.

14. In [12] Lajos and Szász proved the theorem: an associative ring is strictly regular if and only if its multiplicative semigroup is a many-structured group. It is rather easy to prove (following [12]) that an associative ring $R$, whose multiplicative semigroup $S$ is many-structured, is strictly regular. The converse statement is proved in [12] with the help of a rather lengthy chain of checks and calculations. We shall show that this result follows from well-known facts on semigroups.
Let $R$ be strictly regular. It follows from the definitions and the above lemma that the multiplicative semigroup $S$ of the ring $R$ is a regular and subcommutative semigroup. In a subcommutative semigroup any two of its idempotents commute. Indeed, given two idempotents $e, f \in S$ there exist in $S$ elements $f'$ and $f''$ such that $ef = f'e$ and $fe = ef''$, which again yields $ef = f'e = efe$ and $efe = ef'' = fe$. We deduce that $ef = efe = fe$. Hence, $S$ is a regular semigroup with commuting idempotents. Such a semigroup $S$ is, however, inverse ([4, p. (50), Theorem 1.17]). Moreover, in a subcommutative semigroup $S$ every one-sided ideal is two-sided. The fact that these two last conditions are simultaneously valid implies that $S$ is a many-structured group (cf. [4, Vol. 1, p. (173), Exercise 2]).
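The definition of strict regularity from Subsection 12 is easy to test in the commutative rings $\mathbb{Z}/n\mathbb{Z}$, which are trivially duorings. The following sketch is only an illustration chosen here, not an example from the text; the function names are hypothetical.

```python
def is_strictly_regular_mod(n):
    """True if for every a in Z/nZ there is x in Z/nZ with a = a^2 x."""
    return all(
        any((a * a * x - a) % n == 0 for x in range(n))
        for a in range(n)
    )


def nilpotents_mod(n):
    """Nonzero nilpotent elements of Z/nZ."""
    return [a for a in range(1, n) if any(pow(a, k, n) == 0 for k in range(1, n + 1))]
```

For instance, $\mathbb{Z}/6\mathbb{Z}$ passes the test and has no nonzero nilpotent elements, while $\mathbb{Z}/4\mathbb{Z}$ fails because of the nilpotent element 2, in line with the remark that a strictly regular ring contains no nilpotent elements.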
15. In [14], Mihovski gave the following description of strictly regular group rings: the group ring $k[G]$ is strictly regular if and only if condition (1) of Theorem 6.1 is fulfilled. We indicate a new path to this result, based on the lemma formulated above and Menal’s theorem.

The case $\operatorname{char} k = 0$. If $k[G]$ is a strictly regular group ring, then it is subcommutative. Condition (1) follows from Theorem 6.1. Conversely, let condition (1) of Theorem 6.1 hold. Then $k[G]$ is a duoring. According to the criterion of regularity (a group ring $k[G]$ is regular exactly when $k$ is regular, $G$ is locally finite and the orders of all elements of $G$ are invertible in $k$; cf. [16, p. (141), Theorem 18]) $k[G]$ is also a regular ring. Indeed, this criterion is applicable, because $G$, being Hamiltonian, is locally finite and the orders of all of its elements are invertible in the field $k$. By Lemma 6.2, $k[G]$ is then strictly regular.

The case $\operatorname{char} k = 2$. We argue by contradiction, and show that in the case at hand the ring $k[G]$ is not strictly regular. Suppose it is. Lemma 6.2 gives that this ring is subcommutative and regular. The first property implies (in view of condition (2) of Theorem 6.1) that $G$ is presentable as the direct product of an Abelian group $A$, all of whose elements have odd order, and a quaternion group $V$. The second conclusion of the lemma says that $k[G] = k[A \times V] \cong k[A][V]$ is regular. By the regularity criterion just quoted $k[A]$ must be regular (which in the case at hand indeed is the case), and the orders of all elements in $V$ must be invertible in $k[A]$. However, the last condition is not fulfilled: 2 and 4 are not invertible in $k[A]$ as $\operatorname{char} k = 2$. Contradiction.

In the case $\operatorname{char} k > 2$ we do not have a strictly regular group ring either, because by Menal’s theorem in this case $k[G]$ is not a duoring. This argument proves Mihovski’s theorem.
6.4. On a ring without non-trivial transferable elements

16. So far we have considered rings rich in transferable elements, namely the duorings. Let us now consider the opposite case, rings without non-trivial transferable elements. Let $k$ be a field, and $F$ the free monoid with a countable set $X = \{x_1, x_2, \dots\}$ as a system of free generators. The semigroup ring $k[F]$ does not contain zero divisors, and so all transferable elements in $k[F]$ are invariant. However, in Bergman–Lewin [2] it is observed that the semigroup ring $k[F]$ is a left and right FI-ring without non-trivial right invariant elements. For this ring $k[F]$ the following holds.

THEOREM 6.3. The semigroup of proper special ideals of $k[F]$ is free.

PROOF. Write $R = k[F]$. According to Theorem 5 in [2], the semigroup $\mathcal{R}$ of all non-zero two-sided ideals in the ring $R$ is free with the set of all indecomposable proper ideals in $R$ as a system of free generators. Further, we remark that the product of proper special ideals in $R$ is again a proper special ideal in $R$, and so these ideals form a subsemigroup $\mathcal{S}$ in $\mathcal{R}$. The theorem is proved if for all ideals $A$ and $B$ in $\mathcal{R}$ such that $AB \in \mathcal{S}$ we prove that $A \in \mathcal{S}$ and $B \in \mathcal{S}$. This will be proved in what follows. We remark that from the unique factorization of an ideal of $\mathcal{S}$ into indecomposable factors follows the invariance of its factors with respect to each special automorphism of $R$. By the word “special” we refer to those automorphisms (endomorphisms) which are induced by automorphisms (endomorphisms) of the monoid $F$.

Furthermore, it will be expedient to introduce the following notion. An endomorphism of $R$ is called singular if it is induced by an endomorphism $\eta$ of the monoid $F$ such
that $X \subset X^\eta$. Let us show that for an arbitrary singular endomorphism $\eta : R \to R$ one has $A^\eta \supset A$. Indeed, let $u$ be an arbitrary element in $A$, and $S = \{x_1, \dots, x_n\} \subset X$ such that $u \in k[x_1, \dots, x_n]$. As $\eta$ is singular, there exist elements $x_i' \in X$ such that $x_i'^{\,\eta} = x_i$, $i = 1, \dots, n$. Let us consider a permutation $\gamma$ of $X$ such that $x_i^\gamma = x_i'$, $i = 1, \dots, n$, and extend it to an automorphism of $R$. It is clear that $\gamma$ is a special automorphism of $R$ and that by the remark just made $A^\gamma = A$. By our construction $u = u^{\gamma\eta}$; as a consequence, $u = u^{\gamma\eta} \in A^{\gamma\eta} = A^\eta$. Hence, we have proved that $A \subset A^\eta$.

Next, let us complete the proof of our main statement. As $\eta$ is singular and $AB$ a special ideal we have $AB \supset (AB)^\eta = A^\eta B^\eta \supset A^\eta B \supset AB$, and by the same token the relation $AB = A^\eta B$. As $\eta$ is singular we have also the relation $A = A^\eta$ (cancel $B$ in the free semigroup $\mathcal{R}$). Moreover, let $\mu$ be an arbitrary special endomorphism of $R$. For any $u \in A$ we can construct a singular endomorphism $\eta : R \to R$, coinciding with $\mu$ on the element $u$. Indeed, for each $x_i \in S$ we put $x_i^\eta = x_i^\mu$, while on the complement $X \setminus S$ we define $\eta$ as an arbitrary surjection $X \setminus S \to X$. The map $\eta : X \to F$ thus defined is then extended to a special endomorphism $\eta : R \to R$, which will be singular by construction. We have $u^\mu = u^\eta \in A^\eta = A$, showing that $A$ is a special ideal. In an analogous manner, one shows that $B$ is special. Theorem 6.3 is proved.

We add that Theorem 6.3 also admits another formulation, as a statement about the freeness of the semigroup of varieties of representations (over $k$), and in this form it was established in [11] using the technique of triangular products.

17. The argument set forth above leads to the problem: describe all subcommutative group rings³² kG. Likewise the simpler problem of describing crossed subcommutative group rings³³ is of interest; see the definition in [15]. In this connection the answer to the following problem might be of interest: What is the criterion for a twisted group ring K G for a commutative ring K, with G a group?
May we take the risk to ask if there is something (and namely what?) in the role of the alternative mentioned in Subsection 6.1.5 for k V; for k t [V]? We also raise the question of subcommutative semigroup rings $k[S]$. The author will turn to this issue in a future publication³⁴. We remark that subcommutativity is preserved under epimorphisms of rings. Thus for all non-Abelian groups $G$ the group ring $\mathbb{Z}[G]$ is not subcommutative, since there exist epimorphisms onto the group rings $\mathbb{Z}_p[G]$, $p \ne 2$. Therefore there arises the natural problem of describing the semigroup of transferable elements in $\mathbb{Z}[G]$, and more generally in group rings $K[G]$ with $G$ an arbitrary group. Let us add that such a problem has been posed already for $V(\mathbb{Z})$; cf. [5, p. 155]. Namely, of interest here is the question whether the subcommutativity of $K[G]$ depends on other arithmetical and algebraic circumstances, besides the existence of “bad reductions” $K \to k$, where $k$ is a field with $\operatorname{char} k \ne 0, 2$.

³² [in English in the original] skew group rings (the twisting is trivial)
³³ [in English in the original] twisted group rings (the action is trivial)
³⁴ Translator’s note. Again, this never materialized.
Missing: [8], [9], [10]
References
[1] R. Arens and I. Kaplansky. Topological representations of algebras. Trans. Am. Math. Soc. 63, 1948, 457–481.
[2] G. Bergman and J. Lewin. The semigroup of ideals of a fir is (usually) free. J. London Math. Soc. 11 (2), 1975, 21–31.
[3] N. Bourbaki. Elements of Mathematics, Algebra I, Chapters 1–3; 4–7. Springer-Verlag, Berlin, 1998. Russian translation: Fizmatgiz, Moscow, 1962.
[4] A. H. Clifford and G. B. Preston. The algebraic theory of semigroups. Vol. I–II. Mathematical Surveys, No. 7. American Mathematical Society, Providence, R.I., 1961; 1967. Russian translation: Algebraic theory of semigroups, 1–2, Mir, Moscow, 1972.
[5] P. Cohn. Free rings and their relations. Academic Press, London, 1971.
[6] E. H. Feller. Properties of primary non-commutative rings. Trans. Am. Math. Soc. 89, 1958, 79–91.
[7] M. Hall. The theory of groups. MacMillan, New York, 1959. Russian translation: Mir, Moscow, 1962.
[8] U. Kaljulaid. On two results on strongly regular rings. In: Proc. of the Conference “Theoretical and applied questions of mathematics”, Abstracts of talks, Tartu, 1985, 67–69. (see [K85c]).
[9] U. Kaljulaid. On the freedom of the semigroup of special ideals. In: Abstracts of the conference “Methods of algebra and analysis”, Tartu, 1983, 10–12. (see [K83e]).
[10] U. Kaljulaid. Remarks on subcommutant rings. In: XVIII All Union Algebraic Conference, Abstracts of talks, Kishinev, 1985, 227. (see [K85b]).
[11] U. Kaljulaid. Triangular products of representations of semigroups and associative algebras. Uspehi Mat. Nauk 32 (4/196), 1977, 253–254. (see [K77a]).
[12] S. Lajos and F. Szasz. Characterisations of strongly regular rings, II. Proc. Japan Acad. 46, 1970, 287–289.
[13] P. Menal. Group rings in which every left ideal is a right ideal. Proc. Am. Math. Soc. 76, 1979, 204–208.
[14] S. V. Mihovski. Strictly regular group rings. Bull. de l’Inst. Math., Acad. Sci. Bulgare 14, 1971, 67–71.
[15] D. Passman. Group rings, crossed products and Galois theory. In: CBMS Regional Conf. Ser. in Math., 64. Am. Math. Soc., Providence, RI, 1986.
[16] P. Ribenboim. Rings and modules. Interscience Publ., New York, London, 1969.
[17] S. Sehgal. Topics in ring theory. Marcel Dekker, New York, 1978.
[18] M. Weinstein (ed.). Between nilpotent and solvable. Polygonal Publishing House, Passaic, New Jersey, 1962.
7. [K00] Ω-rings and their flat representations
Coauthor O. Sokratova
Abstract. Ω-rings are a natural generalization of rings, semirings, distributive lattices, and semigroups. Here we consider localizations of Ω-rings, tensor products of representations of Ω-rings (acts over Ω-rings), and a few associated concepts of flatness for acts.
7.1. Introduction
An Ω-ring is a universal algebra equipped with a binary associative multiplication connected with the operations in Ω by two-sided distributivity. Ω-rings provide a natural common generalization of rings, semirings, distributive lattices, and semigroups. Under the name of distributive Ω-semigroups they appeared in the investigations of B. I. Plotkin [18] of representations of groups by automorphisms of Ω-algebras. The study of Ω-rings in their own right began with Jaak Hion [8] and was continued by L. N. Shevrin [20] through an investigation of dense immersions of ideals of Ω-rings, and by L. A. Skornyakov [21], who studied radicals of Ω-rings.
Many constructions in ring theory can be transferred to monoids or semirings. So it is interesting to consider analogous constructions in the more general context of Ω-ring theory. Here some new constructions and results on Ω-rings and their acts are presented. For a commutative Ω-ring whose underlying Ω-algebra is Hamiltonian it is proved that there exists an immersion of such an Ω-ring into an Ω-ring in which every element is either a unit or a zero divisor.
Tensor products for different autonomous (commutative) varieties of algebras have been considered in the general context by many mathematicians at different times. The tensor product bifunctor in the general context of autonomous varieties of algebras was explicitly described by Y. Katsov [13]. Related categorical constructions appeared in [1, 11]. We consider tensor products of acts over Ω-rings, a case that is not included in the papers cited above. We prove a generalization of the Govorov–Lazard and the Stenström theorems: an act over an Ω-ring is strongly flat if and only if it is a direct limit of finitely generated free acts. Localizations of Ω-rings are also considered. Namely, we introduce the notion of an Ore Ω-ring of fractions and prove that there exists a unique immersion of an Ω-ring, approximated by an inverse system of congruences, into the inverse limit of the corresponding factor-Ω-rings.
Several proofs are omitted, especially those parallel to corresponding proofs in ring theory. Details can be found in [22].
7.1.1. Notions and examples
Let Ω be a signature³⁵. By an Ω-ring we mean an Ω-algebra R equipped with a multiplication such that
(R1) (R, ·) is a monoid with identity element 1;
(R2) the formulae r(s1 . . . sn ω) = (rs1) . . . (rsn)ω and (s1 . . . sn ω)r = (s1 r) . . . (sn r)ω hold for every n-ary (n ≥ 0) operation ω ∈ Ω and all elements r, s1, . . . , sn ∈ R.
Let an Ω-ring R be given. By a left unitary R-act is meant an Ω-algebra A with an action (r, a) → ra ∈ A such that the following conditions hold:
(A1) (rs)a = r(sa);
(A2) (r1 . . . rn ω)a = (r1 a) . . . (rn a)ω and r(a1 . . . an ω) = (ra1) . . . (ran)ω;
(A3) 1a = a;
for all elements a, a1, . . . , an in A, r, s, r1, . . . , rn in R and every n-ary (n ≥ 0) operation ω in Ω. Right R-acts are defined analogously.
Note that all nullary operations in Ω, if they exist, fix in an Ω-ring R (in an R-act A) one and the same element, which will be denoted by 0 (0A).
Recall that an Ω-algebra A is called commutative if any two operations in Ω are permutable on it, i.e.
(a11 . . . a1n ω)(a21 . . . a2n ω) . . . (am1 . . . amn ω)τ = (a11 . . . am1 τ)(a12 . . . am2 τ) . . . (a1n . . . amn τ)ω
for arbitrary ω ∈ Ωn, τ ∈ Ωm and a11, . . . , amn ∈ A.
Throughout the paper we consider Ω-rings whose Ω-algebras belong to some given variety 𝒜 of commutative Ω-algebras.³⁶ For such an Ω-ring R, the class of all left (right) R-acts whose Ω-algebras belong to 𝒜 is a variety _R𝒜 (𝒜_R) with signature {Ω, ·R} ({Ω, R·}). So, we can use for acts over an Ω-ring the usual notions of universal algebra. In particular, by injective (free) left R-acts we mean injective (free) algebras in the variety _R𝒜.
³⁵ Editors’ Note. The word “signature” means here a set of operators. For example, a ring is a universal algebra with signature {+, ·, −, 0, 1}.
³⁶ Editors’ Note. An Ω-algebra is any universal algebra with the set of operators Ω. For a fixed set Ω, an Ω-ring is any algebra whose operations are all the operations in Ω, plus binary multiplication (plus 1 if necessary). The Ω-algebra of an Ω-ring R is the same set R if we "forget" about multiplication. If an Ω-ring R is an ordinary ring, its Ω-algebra is the same set R but considered only as an Abelian group (with respect to +). If an Ω-ring R is a monoid, its Ω-algebra is just the same set R with no operations.
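Conditions (R1) and (R2) can be checked mechanically on a small finite model. The following is a minimal sketch, assuming a toy Ω-ring that is not taken from the paper: R = Z_4 with Ω consisting of addition (binary) and zero (nullary), and multiplication modulo 4. The names `OMEGA`, `BAD_OMEGA`, `is_monoid` and `is_distributive` are hypothetical.

```python
from itertools import product

# Toy model (an assumption for illustration): R = Z_4 with
# Omega = {add (binary), zero (nullary)} and multiplication mod 4.
R = range(4)

def mul(r, s):
    return (r * s) % 4

# Each operation is stored as (arity, function).
OMEGA = {
    "add": (2, lambda a, b: (a + b) % 4),
    "zero": (0, lambda: 0),
}

def is_monoid():
    """(R1): multiplication is associative and 1 is a two-sided identity."""
    assoc = all(mul(mul(a, b), c) == mul(a, mul(b, c))
                for a, b, c in product(R, repeat=3))
    unit = all(mul(1, a) == a == mul(a, 1) for a in R)
    return assoc and unit

def is_distributive(omega):
    """(R2): r(s1...sn w) = (r s1)...(r sn)w, and symmetrically on the right."""
    for arity, op in omega.values():
        for r in R:
            for args in product(R, repeat=arity):
                value = op(*args)
                if mul(r, value) != op(*[mul(r, s) for s in args]):
                    return False
                if mul(value, r) != op(*[mul(s, r) for s in args]):
                    return False
    return True

# A unary "successor" operation is not compatible with multiplication.
BAD_OMEGA = {"succ": (1, lambda a: (a + 1) % 4)}

print(is_monoid(), is_distributive(OMEGA), is_distributive(BAD_OMEGA))
```

For the nullary operation the check reduces to r · 0 = 0 = 0 · r, in line with the remark that nullary operations fix the element 0. The successor operation fails (R2), so Z_4 with that signature would not be an Ω-ring.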
As usual, an Ω-ring R is called commutative if the monoid (R, ·) is commutative. By a (left) ideal of an Ω-ring R is meant an Ω-subalgebra closed under (left) multiplication by elements of R. The set of all units of R is denoted by U(R). If U(R) = R\{0}, then R is called a division Ω-ring. A commutative division Ω-ring is called an Ω-field. The set of all cancellative elements of (R, ·) is denoted by C(R). For a nonempty subset S of R, the following congruences generated by S are considered: the congruence Θ(S) of the Ω-ring R, the congruence Θ_Ω(S) of the Ω-algebra of R, and the congruence Θ_R(S) (_RΘ(S)) of the right R-act R_R (the left R-act _RR).
Special attention is given to Ω-rings whose Ω-algebra is Hamiltonian. Recall that an algebra is called Hamiltonian if every subalgebra is a class of a suitable congruence.
Let us consider the main examples of Ω-rings and acts over them.
EXAMPLE 7.1. If 𝒜 is the class of all sets, i.e. Ω is empty, then the notion of Ω-ring coincides with the notion of monoid.
EXAMPLE 7.2. In the case when 𝒜 is the variety of (additive) Abelian groups we get the notion of an (ordinary) ring and of a module over it.
EXAMPLE 7.3. In the case when 𝒜 is the variety of commutative semigroups, an Ω-ring turns out to be a semiring and R-acts are semimodules over it.
EXAMPLE 7.4. If, in addition to the conditions in Example 7.3, the identity x + x = x holds in 𝒜 and R is a bounded distributive lattice, we obtain acts over a distributive lattice.
7.2. The semigroup of Ω-rings
In this section we introduce the construction of a semigroup Ω-ring. Let Γ be a semigroup. Given an element γ ∈ Γ, denote by Rγ an Ω-algebra isomorphic to (R, Ω). Let (RΓ, Ω) be a coproduct of the algebras Rγ in the variety 𝒜,
\[ (R\Gamma, \Omega) = \coprod_{\gamma \in \Gamma} R_\gamma. \tag{33} \]
Then the multiplication on RΓ is defined by extending rx^i · sx^j = rsx^{i+j} by the distributive law. That is, given arbitrary elements p(r1γ1, . . . , rnγn) and q(s1γ′1, . . . , smγ′m) in RΓ, take
\[ p(r_1\gamma_1, \dots, r_n\gamma_n) \cdot q(s_1\gamma'_1, \dots, s_m\gamma'_m) = p\bigl(q(r_1 s_1\gamma_1\gamma'_1, \dots, r_1 s_m\gamma_1\gamma'_m), \dots, q(r_n s_1\gamma_n\gamma'_1, \dots, r_n s_m\gamma_n\gamma'_m)\bigr). \tag{34} \]
The associativity of the multiplication in RΓ follows from that in R. Thus we get the following
PROPOSITION 7.1. The Ω-algebra RΓ defined by (33) is an Ω-ring with respect to the multiplication (34).
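In the semiring case of Example 7.3 the coproduct (33) is a direct sum, so an element of RΓ is a finitely supported family of coefficients, and formula (34) becomes the familiar convolution product. The following is a minimal sketch under that assumption, with R = (N, +, ·) and Γ = (N, +), so that RΓ is the polynomial semiring N[x]. The function `multiply` and its parameters are hypothetical names.

```python
def multiply(p, q, add, mul, gamma_op):
    """Multiply two elements of R-Gamma given as dicts {gamma: coefficient}.

    Each pair of terms is multiplied as (r g1)(s g2) = (r s)(g1 g2), and the
    results are combined with the Omega-operation `add` (distributive law).
    """
    out = {}
    for g1, r in p.items():
        for g2, s in q.items():
            g = gamma_op(g1, g2)   # product in the semigroup Gamma
            term = mul(r, s)       # product in R
            out[g] = add(out[g], term) if g in out else term
    return out

def int_add(a, b):
    return a + b

def int_mul(a, b):
    return a * b

# 1 + 2x and 3 + x in N[x]; exponents play the role of Gamma = (N, +).
p = {0: 1, 1: 2}
q = {0: 3, 1: 1}
t = {1: 1, 3: 2}

pq = multiply(p, q, int_add, int_mul, int_add)
print(pq)  # coefficients of 3 + 7x + 2x^2
```

Associativity, as claimed before Proposition 7.1, can be spot-checked on these samples by comparing `multiply(pq, t, ...)` with `multiply(p, multiply(q, t, ...), ...)`.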
Let us call RΓ the semigroup Ω-ring of Γ over R. Setting r → r1 (r ∈ R) we get an embedding of R into the semigroup Ω-ring RΓ.
Semigroup R-acts are defined analogously. For a left R-act _RA take
\[ (A\Gamma, \Omega) = \coprod_{\gamma \in \Gamma} A_\gamma, \]
where by Aγ (γ ∈ Γ) are denoted Ω-algebras isomorphic to A. The R-action on AΓ is defined by
r p(a1γ1, . . . , anγn) = p(ra1γ1, . . . , ranγn),
for all r ∈ R and p(a1γ1, . . . , anγn) ∈ AΓ.
In the case when Γ is a free monoid X* we obtain the polynomial Ω-ring, which will be denoted by R[X]. Another way to obtain polynomial Ω-rings is to use the general construction of a polynomial algebra. Let R be an algebra in a variety K, and let F be the free algebra in K over a set X. Then the coproduct R ∐ F in K is called the polynomial algebra over X with coefficients in R. Polynomial Ω-rings can be obtained as a special case of this construction.
Using the above construction of polynomial Ω-rings we show further (see Theorem 7.6 below) that every commutative Ω-ring with its Ω-algebra Hamiltonian can be embedded into a commutative Ω-ring such that every non-unit is a zero-divisor. The following lemma can be proved using Mal’cev’s description [16] of principal congruences.
LEMMA 7.2. If I is a left ideal of an Ω-ring R, then Θ_Ω(I) = _RΘ(I). If I is an ideal of R, then Θ_Ω(I) = Θ(I).
PROPOSITION 7.3. Every proper congruence of an Ω-ring R with zero is contained in a maximal proper congruence.
PROPOSITION 7.4. Every commutative congruence-free Ω-ring with its Ω-algebra Hamiltonian is an Ω-field.
Let us prove a preliminary result.
LEMMA 7.5. Let R be a commutative Ω-ring with zero such that (R, Ω) is Hamiltonian and let a be a non-unit in R. Then R can be embedded into a commutative Ω-ring R′ with (R′, Ω) Hamiltonian so that a is a zero-divisor in R′ and U(R′) ⊆ U(R).
PROOF. Since aR ≠ R and (R, Ω) is Hamiltonian, the congruence Θ_Ω(aR) is proper. According to Lemma 7.2, Θ_Ω(aR) = Θ(aR). By Proposition 7.3 this congruence is contained in a maximal congruence ρ0 of R. For every i ∈ N0, let φi : R[x] → R be the unique homomorphism extending the R-homomorphisms defined by rx^j → δij r; here δij is the Kronecker symbol. Define the relation ρ on R[x] by setting uρv if and only if φ0(u) = φ0(v) and ⟨φi(u), φi(v)⟩ ∈ ρ0 for all i ≥ 1. One can show that ρ is a congruence of the Ω-ring R[x]. Denote R[x]/ρ by R′. Note that R is embedded into the Ω-ring R′. We identify R with its image in R′. Observe that the element a is a zero-divisor in R′.
It remains to show that U(R′) ⊆ U(R). Assume that u/ρ · v/ρ = 1, where u, v ∈ R[x]. If u ∈ R, then φ0(uv) = uφ0(v), and u/ρ · v/ρ = 1 implies that uφ0(v) = 1. Hence, u ∈ U(R). Thus, it is sufficient to show that the class u/ρ contains an element of R.
To prove this, suppose that u ∉ R. Then one can assume that ⟨φi(u), 0⟩ ∉ ρ0 and ⟨φi(v), 0⟩ ∉ ρ0 whenever φi(u) ≠ 0, respectively φi(v) ≠ 0, for all i ≥ 1. Let i (respectively j) be the maximal index for the element u (for v) such that φi(u) ≠ 0 (φj(v) ≠ 0). Furthermore, consider two cases.
First, i = 0. Then ⟨u, φ0(u)⟩ ∈ ρ, and hence one can suppose that u ∈ R.
Second, i ≥ 1. Then i + j ≥ 1 and from ⟨uv, 1⟩ ∈ ρ it follows that ⟨φi+j(uv), 0⟩ ∈ ρ0. One can check that φi+j(uv) = φi(u)φj(v). Since ρ0 is maximal, it follows from Proposition 7.4 that either ⟨φi(u), 0⟩ ∈ ρ0 or ⟨φj(v), 0⟩ ∈ ρ0, contradicting the assumption. Hence, one can suppose that the elements u and v belong to R and so to U(R) as well.
Using the preceding lemma we can prove the following
THEOREM 7.6. Every commutative Ω-ring R with zero such that (R, Ω) is Hamiltonian can be embedded into a commutative Ω-ring R̄ with (R̄, Ω) Hamiltonian and such that every non-unit element of R̄ is a zero-divisor.
PROOF. Let the set J(R) = {aα | α ∈ I} of all non-units in R be indexed by a well-ordered set I. Using Lemma 7.5 we can build inductively an increasing chain (Rα, α ∈ I) of Ω-rings such that the following conditions are satisfied: aα is a zero-divisor in Rα+1 for every α ∈ I, U(Rα) ⊆ U(R), (Rα, Ω) ∈ 𝒜, and Rα = ∪β<α Rβ for any limit ordinal α.
Now, put R(0) = R and define R(1) = ∪α∈I Rα. It is clear that (R(1), Ω) ∈ 𝒜, U(R(1)) ⊆ U(R) and that all non-units in R are zero-divisors in R(1). Analogously, the Ω-ring R(1) can be embedded into a commutative Ω-ring R(2), and so on. Put
\[ \bar{R} = \bigcup_{i=0}^{\infty} R^{(i)}. \]
7.3. Ω-rings of fractions
The classical construction of fractions can be considered in the context of Ω-rings. Any submonoid B of C(R) satisfying the following conditions:
(O1) bR ∩ rB ≠ ∅ for all b ∈ B and r ∈ R;
(O2) if b and br belong to B, then r ∈ B
can serve as a (right) Ore set of elements for R.
LEMMA 7.7. Given elements b1, . . . , bn in B, there exist elements u1, . . . , un ∈ B such that b1u1 = b2u2 = · · · = bnun.
PROOF by induction over n.
For an Ore set B ⊆ R define the (equivalence) relation ∼ on R × B by setting
(r, b) ∼ (r′, b′) ⇐⇒ ∃c, c′ ∈ B such that bc = b′c′ and rc = r′c′.
Denote the set (R × B)/∼ by RB⁻¹ and the equivalence class of a pair (r, b) by r/b. Define multiplication in RB⁻¹ by
r1/b1 · r2/b2 = r1r/b2b,
where the elements b ∈ B and r ∈ R are chosen so that r2b = b1r, as provided by (O1). The operations from Ω in RB⁻¹ are defined by
(r1/b1) . . . (rn/bn)ω = (r1u1) . . . (rnun)ω/b1u1,
where the elements u1, . . . , un are such that b1u1 = b2u2 = · · · = bnun, and are provided by Lemma 7.7.
THEOREM 7.8. The set RB⁻¹ with the operations defined above is an Ω-ring such that (RB⁻¹, Ω) ∈ 𝒜. Moreover, r → r/1 (r ∈ R) gives an embedding R → RB⁻¹.
The Ω-ring RB⁻¹ is called an Ω-ring of fractions for R.
Let R be an Ω-ring with an Ore system B. Given an R-act A_R, a construction of an RB⁻¹-act of fractions AB⁻¹ appears. First, on the set A × B we define an (equivalence) relation:
(a, r) ∼ (c, s) ⇐⇒ ∃r′, s′ ∈ R, as′ = cr′, rs′ = sr′,
and consider the ∼-classes as the elements a/r of AB⁻¹. Then the operations from Ω on AB⁻¹ are defined as follows:
(a1/r1) . . . (an/rn)ω = (a1u1) . . . (anun)ω/r1u1,
where the elements u1, . . . , un with r1u1 = r2u2 = · · · = rnun are provided by Lemma 7.7. The RB⁻¹-act structure on AB⁻¹ is given by
a/r · s/t = as′/tr′,
where the elements s′ and r′ such that rs′ = sr′ exist due to (O1).
Next, let us consider extensions of homomorphisms of Ω-rings (acts) to their Ω-rings (acts) of fractions.
PROPOSITION 7.9. Let S and R be Ω-rings with Ore systems B(S) and B(R), respectively. Given a homomorphism f : S → RB(R)⁻¹ such that f(B(S)) ⊆ B(R), there exists a unique extension f̃ : SB(S)⁻¹ → RB(R)⁻¹. If, in addition, f(B(S)) = B(R) and f is injective, then f̃ is injective, too.
PROOF. Define f̃ by f̃(s/b) = r/(f(b)c), where s ∈ S, b ∈ B(S) and f(s) = r/c for some r ∈ R and c ∈ B(R).
PROPOSITION 7.10. Let R be an Ω-ring with an Ore system B. Let A_R and C_R be R-acts. Every homomorphism f : A_R → C_R can be uniquely extended to a corresponding homomorphism f̄ of acts of fractions. Moreover, if f is injective, then f̄ is also injective.
PROOF. Define f̄ : AB⁻¹ → CB⁻¹ as f̄(a/r) = f(a)/r for any a/r ∈ AB⁻¹.
Let {ρλ | λ ∈ I} be a family of congruences of an Ω-ring R such that the following conditions hold: ρλ ≤ ρμ whenever λ ≥ μ, and ∩λ∈I ρλ = ΔR. Then R is said to be approximated by the inverse system of congruences {ρλ | λ ∈ I}.
THEOREM 7.11. Let an Ω-ring R be approximated by an inverse system of congruences {ρλ | λ ∈ I}. Assume that each R/ρλ has an Ore system Bλ and that Bλ/(ρμ/ρλ) ⊆ Bμ holds whenever λ ≥ μ. Then there exists an embedding of R into the inverse limit of the Ω-rings of fractions of R/ρλ.
PROOF. According to the assumption each R/ρλ has an Ore system Bλ, and by Theorem 7.8, the Ω-ring R/ρλ can be embedded into the Ω-ring of fractions Kλ := (R/ρλ)Bλ⁻¹ using certain injections hλ. Let fλμ : R/ρλ → R/ρμ (λ ≥ μ) be the natural surjections. According to the assumption, fλμ(Bλ) ⊆ Bμ and, by Proposition 7.9, the homomorphism fλμ can be extended to f̄λμ : Kλ → Kμ. Thus we obtain an inverse system of Ω-rings of fractions (Kλ, f̄λμ, I). Let K∞ = lim← Kλ be its inverse limit. Denote by pλ the restriction of the λ-th projection of ∏λ∈I Kλ to K∞. Then f̄λμ pλ = pμ (λ ≥ μ). Using the natural surjections fλ : R → R/ρλ define gλ = hλ fλ. We have f̄λμ hλ = hμ fλμ. Then gμ = f̄λμ gλ, and by the universal property of inverse limits there exists a unique homomorphism g : R → K∞ such that pλ g = gλ (λ ∈ I). The homomorphism g is injective due to the assumption ∩λ∈I ρλ = ΔR.
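The construction of RB⁻¹ from the beginning of this section can be tried out in the most familiar commutative case. The following is a minimal sketch, assuming R = Z with Ω = {+} and B the positive integers, which satisfy (O1) and (O2); the witnesses c, c′, b, r, u1, u2 below are the obvious commutative choices rather than a general search, and the helper names are hypothetical. Python's `fractions.Fraction` is used only as an independent reference.

```python
from fractions import Fraction

# Pairs (r, b) with r in R = Z and b in B = positive integers.

def equivalent(x, y):
    """(r, b) ~ (r', b'): exhibit c, c' in B with b c = b' c' and r c = r' c'."""
    (r, b), (r2, b2) = x, y
    c, c2 = b2, b  # in a commutative R this choice always gives b*c == b2*c2
    return b * c == b2 * c2 and r * c == r2 * c2

def mul(x, y):
    """r1/b1 * r2/b2 = (r1 r)/(b2 b), where r2 b = b1 r as in (O1)."""
    (r1, b1), (r2, b2) = x, y
    b, r = b1, r2  # commutative choice: r2*b1 == b1*r2
    return (r1 * r, b2 * b)

def add(x, y):
    """(r1/b1)(r2/b2)w for w = +, using u1, u2 with b1 u1 = b2 u2 (Lemma 7.7)."""
    (r1, b1), (r2, b2) = x, y
    u1, u2 = b2, b1
    return (r1 * u1 + r2 * u2, b1 * u1)

def to_fraction(x):
    return Fraction(x[0], x[1])

half, third = (1, 2), (1, 3)
print(equivalent((1, 2), (2, 4)), add(half, third), mul(half, (3, 4)))
```

Because Z is commutative and cancellative, testing the single witness pair c = b′, c′ = b is enough to decide the relation ∼; in a noncommutative Ω-ring the witnesses would have to be found using (O1).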
7.4. Flatness of acts over Ω-rings
7.4.1. Flat acts
Throughout this section, an Ω-ring R is fixed. Let A_R ∈ 𝒜_R and _RB ∈ _R𝒜 be some acts and let C be an Ω-algebra in 𝒜. A map Φ : A × B → C is called bilinear if
Φ(a1 . . . an ω, b) = Φ(a1, b) . . . Φ(an, b)ω,
Φ(a, b1 . . . bn ω) = Φ(a, b1) . . . Φ(a, bn)ω,
Φ(ar, b) = Φ(a, rb)
for arbitrary elements a, a1, . . . , an in A, b, b1, . . . , bn in B, r in R, and every operation ω ∈ Ωn (n ≥ 0).
By a tensor product of A_R and _RB is meant an Ω-algebra G in 𝒜 together with a bilinear map Φ : A × B → G such that for every algebra C ∈ 𝒜 and every bilinear map ψ : A × B → C there exists a unique homomorphism ψ̄ : G → C such that ψ̄Φ = ψ.
The (unique) tensor product A ⊗ B of A_R and _RB is constructed as the factor algebra (A ⊠ B)/∼, where − ⊠ − : 𝒜 × 𝒜 → 𝒜 is the internal tensor product bifunctor (see [13, Section 2]) and ∼ is the congruence generated by the pairs (ar ⊠ b, a ⊠ rb), a ∈ A, b ∈ B, r ∈ R. Note that the algebra A ⊗ B is generated by the elements Φ(a, b) = a ⊗ b (= (a ⊠ b)/∼).
Tensoring with a (left) R-act _RC is a functor − ⊗ C : 𝒜_R → 𝒜. Given a homomorphism κ : A_R → B_R of right R-acts, define the homomorphism κ ⊗ 1_C : A ⊗ C → B ⊗ C as the extension of the bilinear map (a, c) → κ(a) ⊗ c, a ∈ A, c ∈ C. Thus,
\[ (\kappa \otimes 1_C)(a \otimes c) = \kappa(a) \otimes c, \qquad a \in A,\ c \in C. \tag{35} \]
A left R-act C is called flat if the functor − ⊗ C preserves injective homomorphisms; i.e., if for any injective homomorphism κ the induced homomorphism κ ⊗ 1_C is injective, too.
Repeating the routine arguments of the classical case, we get
LEMMA 7.12. Given a right R-act A_R, there exists a natural isomorphism of Ω-algebras A ≅ A ⊗ R. Moreover, in the tensor product A ⊗ R we have a ⊗ r = b ⊗ s if and only if ar = bs in A.
COROLLARY 7.13. Every one-generated free R-act is flat.
Given a family {Bi | i ∈ I} of left R-acts, one can form a priori two different coproducts. Let ∐_R Bi be a coproduct of the given family of R-acts in the variety _R𝒜 and let ∐ Bi be a coproduct of the Ω-algebras of the Bi in the variety 𝒜. It turns out that the Ω-algebra ∐ Bi is a left R-act with respect to the natural action on it. Furthermore,
LEMMA 7.14. There exists a natural isomorphism of R-acts ∐_R Bi ≅ ∐ Bi.
As usual, one can prove that the functor Hom(C, −) is a right adjoint to the functor − ⊗ C.
An Ω-algebra D ∈ 𝒜 is called a cogenerator if for any distinct homomorphisms α, β : A → B (A, B ∈ 𝒜) there exists a homomorphism φ : B → D such that φα ≠ φβ. Injective cogenerators exist and are known in many varieties of algebras. E.g., Q/Z is an injective cogenerator for the variety of all Abelian groups; the smallest injective cogenerator for the class of all sets is the two-element set [9]; the two-element lattice 2 serves as an injective cogenerator for the variety of all semilattices with zero [10]; and Q/Z × 2 is an injective cogenerator for the variety of commutative inverse monoids [13]. For all these examples appropriate R-acts of characters have been considered in the literature. However, the variety of all commutative monoids does not contain any nonzero injective objects.
In Theorems 7.15–7.18 below it is assumed that the variety 𝒜 contains an injective cogenerator D.
For a left R-act C, the homomorphisms C → D form a right R-act, which is called the act of characters of the R-act _RC and is denoted by C*_R. The following theorem is of a folklore character and can be proved by repeating, for example, the arguments in [11, Theorem 2].
THEOREM 7.15. A left R-act _RC is flat if and only if its R-act of characters C*_R is injective.
The preceding results lead to the following examples of R-acts.
THEOREM 7.16. Every free R-act is flat.
PROOF. Note that every free R-act is isomorphic to a coproduct of some copies of the R-act R. So, let _RF = ∐ Ri (all Ri ≅ _RR) be a free R-act. Then
\[ F^*_R = \mathrm{Hom}\Bigl(\coprod R_i, D\Bigr) \cong \prod \mathrm{Hom}(R_i, D) = \prod R_i^*. \]
Hence, F*_R is injective as a direct product of injective R-acts, and _RF is flat by Theorem 7.15.
THEOREM 7.17. Every projective R-act is flat.
PROOF. Let _RP be a projective R-act generated by {pi | i ∈ I}. Take the appropriate free R-act _RF = ∐ Ri. Then there exist homomorphisms φ : _RF → _RP and ψ : _RP → _RF such that φψ = 1_P. One can show that P*_R is injective. Hence _RP is flat.
Immediately from Theorem 7.15 it follows that
THEOREM 7.18. If all right R-acts are injective, then all left R-acts are flat.
Given a right R-act M and a direct system (Ci, hij, I) of left R-acts, the Ω-algebras {M ⊗ Ci | i ∈ I} form the direct system (M ⊗ Ci, 1_M ⊗ hij, I).
LEMMA 7.19.
\[ \varinjlim (M \otimes C_i) \cong M \otimes \varinjlim C_i. \]
Using Lemma 7.19 we can prove that
PROPOSITION 7.20. Every direct limit of flat acts is flat.
PROPOSITION 7.21. Let R be an Ω-ring with an Ore system B. For any R-act A_R, there exists a natural isomorphism A ⊗ RB⁻¹ ≅ AB⁻¹.
PROOF. The desired isomorphism is the homomorphic extension of a bilinear map φ : A × RB⁻¹ → AB⁻¹ defined as φ(a, r/s) = ar/s, for a ∈ A and r/s ∈ RB⁻¹.
THEOREM 7.22. RB⁻¹ is a flat R-act.
The proof follows from Propositions 7.10 and 7.21.
REMARK 7.23. Without requiring an injective cogenerator in 𝒜 we need a special assumption in order to make a free R-act flat.
PROPOSITION 7.24. A free R-act over a set {xi | i ∈ I} is flat if and only if for any inclusion A_R → B_R of R-acts, the induced homomorphism ∐_I A → ∐_I B is also injective.
PROOF. The proof follows from Corollary 7.13 and the fact that tensor products and coproducts commute.
In particular, the condition of Proposition 7.24 holds if the variety 𝒜 satisfies the hereditary condition for coproducts (see [17]): Let B = ∐i∈I Bi be any coproduct of algebras in 𝒜. Then, for any family of subalgebras Ai ⊂ Bi (i ∈ I), the subalgebra of B generated by all Ai is isomorphic to their free product. In this way, Theorems 7.16 and 7.17 remain true for semimodules over semirings as well. Theorem 7.18 holds for any R-acts without any special assumptions.
It is well known that all left (right) modules over a ring are flat if and only if the ring is regular. In the case of monoids, flatness of all S-acts implies regularity of the monoid S. The converse is not true [9]. Note also that all semimodules over a semiring are flat if and only if this semiring is a regular ring [24]. These results can be extended to the case of Ω-rings.
An Ω-ring R is called regular if its semigroup (R, ·) is (von Neumann) regular, i.e., for every r ∈ R there exists s ∈ R such that r = rsr.
THEOREM 7.25. Let R be an Ω-ring with (R, Ω) Hamiltonian. If all left R-acts are flat, then R is regular.
PROOF. Given an arbitrary element r ∈ R, assume that rR ≠ R and Rr ≠ R. Let us consider the congruences θ = Θ_Ω(Rr) and θ′ = Θ_Ω(rRr) on (R, Ω). Let us denote R/θ by R/Rr and R/θ′ by R/rRr. By Lemma 7.2, R/Rr is a left R-act. By Lemma 7.12, r · 1/θ = r · r/θ implies that r ⊗ 1/θ = r ⊗ r/θ holds in the tensor product R ⊗ R/Rr. Since R/Rr is a flat act, the last equality holds in rR ⊗ R/Rr, too.
Define a homomorphism φ : rR ⊗ R/Rr → R/rRr as the extension of a bilinear map φ : rR × R/Rr → R/rRr such that φ(ru, v/θ) = (ruv)/θ′. Then
r/θ′ = φ(r1, 1/θ) = φ(r ⊗ 1/θ) = φ(r ⊗ r/θ) = φ(r1, r/θ) = r²/θ′.
It follows that r ∈ rRr.
It is known that if all (left) modules over a commutative ring R are flat, then every element of R is either a unit or a zero-divisor. For monoids this statement does not hold, as the following example shows.
EXAMPLE 7.5. Let S = {0, e, 1} be a monoid with zero and with e = e². Obviously, S is regular. Moreover, since S is a commutative monoid with all its ideals principal, we can see that all S-acts are flat [9]. However, the element e is neither a unit nor a zero-divisor.
An Ω-ring R with zero is called regular at zero if every congruence of R is uniquely defined by its class containing zero.
THEOREM 7.26. Let R be a commutative regular Ω-ring that is regular at zero. Then every element of R is either a unit or a zero-divisor.
PROOF. Given an arbitrary element r ∈ R, we have r = rxr for some x ∈ R. Denote by e the idempotent xr. Then Θ(e, 1) = σe, where the congruence σe is defined by ⟨x, y⟩ ∈ σe if and only if ex = ey, x, y ∈ R. One has σe = σr. Now let K be the zero class of the congruence Θ(e, 1) = σe = σr. If there exists a non-zero element a ∈ K, then ra = r0 = 0 implies that r is a zero-divisor. If, instead, K = {0}, then the congruence Θ(e, 1) is trivial and so r is a unit.
7.4.2. Strongly flat acts
A pair (X, K) is a presentation of an R-act A if A ≅ F/Θ_R(K), where F is the free R-act over the set X. As usual, an R-act is called finitely presented if it has a presentation (X, K) such that the sets X and K are finite.
A surjective homomorphism of R-acts φ : _RB → _RA is called pure if for any finitely presented R-act _RC and any homomorphism η : _RC → _RA there exists a homomorphism μ : _RC → _RB such that φμ = η.
An R-act A is called strongly flat if there exists a pure surjective homomorphism φ : _RF → _RA for some free R-act F. Note that in the case of (ordinary) rings, the notions of flat and strongly flat modules coincide. They are different for acts over monoids.
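The three-element monoid of Example 7.5 is small enough to be checked by brute force. The following is a minimal sketch; the string encoding of the elements and the helper names are assumptions made for illustration. It confirms that S is regular and that e is neither a unit nor a zero-divisor; it does not verify the flatness claim, which rests on [9].

```python
from itertools import product

# The monoid S = {0, e, 1} of Example 7.5, with elements encoded as strings.
S = ["0", "e", "1"]

def mul(a, b):
    if a == "0" or b == "0":
        return "0"
    if a == "1":
        return b
    if b == "1":
        return a
    return "e"  # e * e = e

def is_associative():
    return all(mul(mul(a, b), c) == mul(a, mul(b, c))
               for a, b, c in product(S, repeat=3))

def is_regular():
    """Von Neumann regularity: every r admits s with r = r s r."""
    return all(any(mul(mul(r, s), r) == r for s in S) for r in S)

def units():
    return [r for r in S if any(mul(r, s) == "1" == mul(s, r) for s in S)]

def zero_divisors():
    """Non-zero elements annihilated by some non-zero element."""
    return [r for r in S
            if r != "0" and any(s != "0" and mul(r, s) == "0" for s in S)]

print(is_associative(), is_regular(), units(), zero_divisors())
```

Here the only unit is 1 and there are no non-zero zero-divisors, so e falls outside both classes, as stated in the example.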
Examples of strongly flat R-acts are given in the following propositions.
PROPOSITION 7.27. Every free R-act is strongly flat.
PROOF. For a projective R-act P there exist a surjective homomorphism φ of some free R-act F onto P and a homomorphism ψ : P → F such that φψ = 1_P. The homomorphism φ is pure, since for every homomorphism η : C → P we have φ(ψη) = η.
PROPOSITION 7.28. An R-act A is strongly flat if and only if for every free R-act F every surjective homomorphism φ : F → A is pure.
Let K be a variety of algebras and let A be an algebra in K. Following R. Shannon [19], we say that A has the “Killing Interpolation Property” (KIP) if for any n-ary polynomials p and q, and an element a ∈ Aⁿ, the equality p(a) = q(a) implies the existence of m-ary polynomials t1, . . . , tn and of an element c ∈ Aᵐ such that ti(c) = (a)i (i = 1, . . . , n) and p(t1, . . . , tn) = q(t1, . . . , tn) is an identity in K.
The following theorem was proved for acts over monoids by B. Stenström [23], and was stated for arbitrary algebras by S. Bulman-Fleming and K. McDowell in [3]. Y. Katsov [14] extended this theorem to functor categories. We give a sketch of a proof for acts over Ω-rings.
THEOREM 7.29. The following properties of an R-act _RA are equivalent:
(1) _RA is strongly flat;
(2) _RA is a direct limit of finitely generated free R-acts;
(3) _RA has the KIP with respect to the variety _R𝒜.
PROOF. (1) =⇒ (2): Assume that _RA is a strongly flat act. Denote by E the Cartesian product A × N and let F = ∐_E R be the free R-act with the set of generators indexed by E. Define the homomorphism φ : F → A by
φ(r(x, n)) = rx, r ∈ R, x ∈ A, n ∈ N.
Let the R-act A be presented by (E, K) with K being the subset of F × F that generates the congruence Ker φ. Then _RA is isomorphic to the limit of a direct system (Ai, hij, I) of finitely presented R-acts. Moreover, one can choose I to be the set of all pairs (E′, K′) such that E′ ⊆ E and K′ ⊆ Θ(K) ∩ (F′ × F′) are finite; here we denote by F′ the free R-act ∐_E′ R. It remains to show that all those indices i for which Ai is a free R-act form a cofinal subset of I.
Given any R-act Ai, with i = (E′, K′), in the direct system, there exists a homomorphism μ : Ai → F such that φμ = hi. Since Ai is finitely generated, the image μ(Ai) is contained in some finitely generated free subact F″ ⊆ F. Let F″ be generated by a set E″ = {(x1, n1), . . . , (xk, nk)} of free generators. Choose n′1, . . . , n′k ∈ N in such a way that n′1 ≠ n1, . . . , n′k ≠ nk, and the sets Ê = {(x1, n′1), . . . , (xk, n′k)} and E′ are disjoint. Note that the R-acts F″ and F̂ := ∐_Ê R are isomorphic. Let α : F″ → F̂ be the corresponding isomorphism. Denote F̄ = F′ ∐ F̂. Take the surjective homomorphism β : F̄ → F″ so that the restriction of β to F′ equals μφ′ (here φ′ is the natural surjection from F′ onto Ai) and so that the restriction of β to F̂ is the map inverse to α. Then the congruence Ker β on F̄ is generated by the finite set K″ of all pairs ⟨u, αβ(u)⟩ with u ∈ E′. Thus, F″ is presented by (E′ ∪ Ê, K″). Moreover, K′ ⊆ Ker β = Θ(K″) and K″ ⊆ Θ(K).
(2) =⇒ (1): Let us assume now that A = lim→ Fi, where Fi (i ∈ I) are free finitely generated R-acts. We have to show that an arbitrary surjective homomorphism φ : F → A is pure. Indeed, any homomorphism η : C → A of a finitely presented R-act C factorizes through some component Fk; i.e., there exists a homomorphism μ : C → Fk such that η = hkμ. Then one can find a homomorphism κ : Fk → F such that φκ = hk. We have φ(κμ) = hkμ = η.
(2) ⇐⇒ (3) is proved for arbitrary varieties in [19].
COROLLARY 7.30. If the variety 𝒜 has an injective cogenerator or satisfies the hereditary condition for coproducts, then every strongly flat act is flat.
The proof follows from Theorem 7.16, Propositions 7.24 and 7.20, and Theorem 7.29.
An act A satisfying condition (2) of Theorem 7.29 is called L-flat [12]. For modules over (ordinary) rings the notions of flatness and L-flatness coincide (the Govorov and Lazard Theorem [6, 7, 15]). The same result holds also for acts over finite Boolean algebras (Y. Katsov [12]).
In the case of monoids, an act A is strongly flat if and only if the functor − ⊗ A preserves pullbacks [2]. It seems interesting to investigate conditions on Ω-rings for strong flatness and pullback-flatness to coincide.
The authors are grateful to Yefim Katsov for the literature he gave them, and for very useful discussions on the subject.
Comments. Informally, Ω-rings can be viewed as rings with several (not necessarily binary) additions. In this way, Ω-rings are a common generalization of rings and semigroups. Due to their nature, they require interesting techniques for investigation: a combination of techniques used for rings, semigroups, universal algebras, and category theory. Some readers can probably see a formal parallel with Ω-groups, which explore the common features of rings and groups. In spite of the similarity of definition (Ω-groups are defined as rings with several multiplications), these objects are of a different nature.

Ω-rings have a long history in Tartu. While they appeared implicitly in the works of Boris Plotkin and Lev Skornyakov, they were formally introduced by the Tartu mathematician Jaak Hion [8]. At that time Ω-rings and their acts were studied by Hion and his students, particularly by Vladimir Fleischer. Later Fleischer suggested flatness of acts over Ω-rings as a topic for my Master's thesis. Uno Kaljulaid was very enthusiastic about this subject. He proposed several new directions in the investigation of Ω-rings, such as semigroup Ω-rings, their localization, etc. This became a part of my Ph.D. thesis “Ω-rings, their flat and projective acts with some applications” (Diss. Math. Univ. Tartuensis 24, Univ. of Tartu, 2000), supervised by Uno Kaljulaid. The present paper was written for the conference AAA-58 in Vienna, and it is a short version of our joint reprint [22].

Olga Sokratova
[4], [5]
References
[1] B. Banaschewski and E. Nelson. Tensor products and bimorphisms. Canad. Math. Bull. 19, 1976, 385–402.
[2] S. Bulman-Fleming. Pullback flat acts are strongly flat. Canad. Math. Bull. 34, 1991, 456–461.
[3] S. Bulman-Fleming and K. McDowell. Flatness in varieties of normal bands. Semigroup Forum 19, 1980, 139–149.
[4] P. M. Cohn. On the embedding of rings and skew fields. Proc. London Math. Soc. 3, 1961, 511–530.
[5] V. Fleischer. Ω-rings over which all acts are n-free. Acta Comm. Univ. Tartuensis 390, 1975, 56–83.
[6] V. Govorov. Rings over which all flat modules are free. Dokl. Akad. Nauk 144, 1962, 965–968.
[7] V. Govorov. On flat modules. Sib. Mat. Zh. 6, 1965, 300–304.
[8] J. Hion. Ω-ringoids, Ω-rings and their representations. Transactions of the Moscow Math. Soc. 14, 1965, 3–47.
[9] M. Kilp. On flat acts. Acta Comm. Univ. Tartuensis 253, 1970, 66–72.
[10] V. Kornienko. On flat acts over distributive lattices. Ordered Sets and Lattices 4, 1977, 69–85.
[11] Y. Katsov. The tensor product of functors. Sib. Mat. Zh. 19, 1978, 222–229.
[12] Y. Katsov. The Govorov-Lazard theorem for modules over finite Boolean algebras. Mathematika 4 (2), 1986, 8–14.
[13] Y. Katsov. Tensor products and injective envelopes of semimodules over additively regular algebras. Algebra Colloquium 4 (2), 1997, 121–131.
[14] Y. Katsov. Note on flatness: categorical-algebraic approaches. In: Kurosh Algebraic Conference ’98, Abstracts of talks, Moscow, 1998, 67–68.
[15] D. Lazard. Sur les modules plats. Comptes Rendus Acad. Sci. Paris, Series I 258, 1964, 6313–6316.
[16] A. I. Mal’cev. On the general theory of algebraic systems. Matem. Sb. 35 (77), 1954, 3–20.
[17] A. I. Mal’cev. Algebraic systems. Springer-Verlag, New York, Heidelberg, 1973.
[18] B. I. Plotkin. Ω-semigroups, Ω-rings and representations. Dokl. Akad. Nauk 149, 1963, 1037–1040.
[19] R. Shannon. Lazard’s theorem in algebraic categories. Algebra Universalis 4, 1974, 226–228.
[20] L. N. Shevrin. On the dense embedded ideals of algebras. Matem. Sb. 88 (130) (2), 1972, 218–226.
[21] L. A. Skornyakov. Radicals of Ω-rings. In: Selected issues of algebra and logic, Novosibirsk, 1972, 283–299.
[22] U. Kaljulaid and O. Sokratova. Flatness and localizations of Ω-semigroups. Technical Report CS96/98. Institute of Cybernetics, 1998. (see [K98a]).
[23] B. Stenström. Flatness and localization over monoids. Math. Nachr. 48, 1971, 315–334.
[24] L. Tukavkin. Commutative semirings with flat modules. Vestnik Mosk. Univ. 5, 1978, 60–62.
CHAPTER II Automata theory
1. Preamble by the Editors
Besides representation theory, automata theory was another major field of interest of Uno Kaljulaid, particularly during the second half of his life. His favorite topic was the algebraic composition theory of automata and its applications in biology, cryptology and image processing. Uno made extensive use of the category-theoretic representation of automata. His main goal was to develop a composition operation for automata that is as general as possible. Uno Kaljulaid believed that this generalized composition can be given as a wreath product of automata and that all practically meaningful compositions can be obtained by specializing this product.

In the years 1994–1997 Uno Kaljulaid also drafted a lecture course on classical and modern automata theory that included his own research results. The ambition of this attempt can be judged from his plan of contents of the lecture notes below. The main part of this subject was given as a series of colloquium presentations at Lund University (Automata, Languages and Rationality) in 1994 and at Stockholm University (On the Languages and Rationality of Formal Series Using Order and Topology) in 1996 and 1997. Jaak Peetre wrote down part of these lectures using Kaljulaid’s slides and handwritten notes in Estonian and English, occasionally also Russian. The lecture notes published in this volume were edited by J. Penjam. In the table of contents of the planned lectures these parts have been supplemented by the corresponding section numbers of this Chapter in parentheses. The lectures on generalized automata (Sections 3.1–3.4) were reprinted from technical report no. 97 of the Estonian Institute of Cybernetics, 1997. A star * in front of a section means that Kaljulaid did not finish and never presented this part of the lectures.
The planned contents of the lectures on Automata Theory

Part A: Automata, Languages and Rationality
* Introduction

Chapter I. Automata and their decomposition (Sec. 2)
1.1. Definition of automata
1.2. Preliminary motivation and more notions
1.3. Semigroup automata
1.4. Cyclic automata
1.5. Wreath products of actions
1.6. Kaluzhnin-Krasner type theorem
1.7. Cascades and wreath products of automata; their interconnections
1.8. Linear automata
1.9. Triangular products and decomposition of linear automata
1.10. Decomposition of linear automata and image compression

Chapter II. Rationality
2.1. Recalling well-known things: formal series
2.2. Rational series
2.3. Recognizable series
2.4. Rational (regular) languages

Chapter III. Generalized automata (Sec. 3)
3.1. Preliminary motivation and a very brief introduction to categories
3.2. Cascades once more – their intersections with wreath products
3.3. Wreath products of general automata – covariant case
3.4. Wreath products of presheaves – contravariant case
*3.5. Properties of the wreath product construction and the key result
*3.6. Groupoids, symmetries and the Van Kampen Theorem
*3.7. Wreath products of species and their many combinatorics

Part B: On the Languages and Rationality of Formal Series Using Order and Topology

Chapter IV. Remarkable functions (zeta!)
4.1. General remarks, motivation
4.2. Some more definitions, concerning languages and automata
4.3. The Berstel-Reutenauer Theorem
4.4. Other remarks on rationality
4.5. Rationality – on the notion itself
*4.6. Wreath products of actions
4.7. Gert Almkvist’s results on periodic Boolean sequences revisited
4.8. Supplement. Radar codes

Chapter V. Rationality
5.1. How the idea of continuity first appears in Language Theory
5.2. How to generalize this context
5.3. The leading example – Björner topology
5.4. The key object of study – the algorithm C
5.5. New interpretation of C. Further possibilities
*5.6. Grothendieck topologies and formal languages
*5.7. Grothendieck topologies and RO-groups and R-groups

The Chapter ends with Section 4, Uno Kaljulaid’s abstract of his presentation at the Kurosh Algebraic Conference, which presents a plan of further research and so is, in its way, a kind of scientific testament. Together with Enn Tamme, Uno Kaljulaid wrote a popular scientific paper on his interests in automata theory, which is also published in this volume (Section 1 of Chapter VI). Another paper, published here in English for the first time (Section 9 of Chapter VI), is closely related to automata; in it Uno Kaljulaid, in his attractive style, introduces applications of formal language theory in mathematics, computer science and biology.
2. Automata and their decomposition
Lecture notes by Uno Kaljulaid (compiled with the assistance of Jaak Peetre)
Let no one say that I have said nothing new: the arrangement of the material is new; when we play tennis, it is one and the same ball that both players use, but one of them places it better. – BLAISE PASCAL.
2.1. Definition of automata

By an automaton we mean a quintuple A = (A, X, Y; λ, δ) where
– A is the set of states of A;
– X is the input alphabet;
– Y is the output alphabet;
– λ : A × X → A is the (state) transition function;
– δ : A × X → Y is the output function.

This notion was introduced in the 30’s by S. Kleene, A. Markov and A. Turing. Later on, it will be convenient to use the notation λ(a, x) = a ∘ x and δ(a, x) = a ∗ x. Correspondingly, we will also use the alternative notation A = (A, X, Y; ∘, ∗) for automata, where it is appropriate. Furthermore, for brevity of notation, we sometimes even skip ∘ and ∗ and write simply A = (A, X, Y) rather than A = (A, X, Y; ∘, ∗), if there is no ambiguity.

Quite often, it is supposed that A, X and Y are finite. Also, a state s ∈ A (called the start (or initial) state for the automaton A) and, furthermore, a subset A_fin ⊆ A (of final states for A) are fixed. Such automata are called deterministic finite state machines or Mealy machines. Initially, λ : A × X → A could have been a partial map. However, λ can always be made a total function using additional (dummy) states.

Given A = {a_1, ..., a_m} and X = {x_1, ..., x_n} we can implement λ as an m × n matrix Λ with its elements defined by the rule Λ(i, j) = k if and only if λ(a_i, x_j) = a_k. In this way one can represent a semi-automaton Ǎ = (A, X; λ) in a computer. Moreover, any such (semi-)automaton can be represented as a labelled digraph.
EXAMPLE 2.1. Let A = {s = a1, a2, a3, a4}, where a3 is the final state and a4 is a dummy state, X = {0, 1} and λ defined by the following table:

  λ    s = a1   a2   a3   a4
  0    a4       a1   a3   a4
  1    a2       a3   a4   a4
The labelled digraph representing this (semi-)automaton Ǎ is given in Figure 1.

[Fig. 1: The digraph G(A); nodes a1, a2, a3, a4, arcs labelled by the inputs in X]
Let us develop this example to represent the automaton A where, in addition, Y = X and the output function δ : A × X → Y is given by the following table:

  δ    a1   a2   a3   a4
  0    1    0    1    0
  1    0    1    0    1
In this case, the labels for the above digraph may be taken to be of the type i/o (where i stands for input, o for output), and we get the picture in Figure 2.

[Fig. 2: The automaton G(A); nodes a1, a2, a3, a4, arcs labelled i/o]
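The matrix Λ of Section 2.1 and the i/o-labelled arcs of Figure 2 can be sketched in a few lines of Python. This is only an illustration of the representation described above: the tables are those of Example 2.1, while the names STATES, LAMBDA, DELTA, transition_matrix and labelled_arcs are hypothetical and do not come from the lecture notes.

```python
# States a1..a4 and inputs 0, 1 of Example 2.1 (names are illustrative).
STATES = ["a1", "a2", "a3", "a4"]
INPUTS = ["0", "1"]

# Transition function λ, copied from the first table of Example 2.1.
LAMBDA = {
    ("a1", "0"): "a4", ("a1", "1"): "a2",
    ("a2", "0"): "a1", ("a2", "1"): "a3",
    ("a3", "0"): "a3", ("a3", "1"): "a4",
    ("a4", "0"): "a4", ("a4", "1"): "a4",
}

# Output function δ, copied from the second table of Example 2.1.
DELTA = {
    ("a1", "0"): "1", ("a1", "1"): "0",
    ("a2", "0"): "0", ("a2", "1"): "1",
    ("a3", "0"): "1", ("a3", "1"): "0",
    ("a4", "0"): "0", ("a4", "1"): "1",
}


def transition_matrix(states, inputs, lam):
    """Return Λ with Λ[i][j] = k iff λ(a_i, x_j) = a_k (k counted from 1)."""
    index = {a: k for k, a in enumerate(states, start=1)}
    return [[index[lam[(a, x)]] for x in inputs] for a in states]


def labelled_arcs(states, inputs, lam, delta):
    """Arcs (source, 'i/o', target) of the labelled digraph G(A)."""
    return [(a, f"{x}/{delta[(a, x)]}", lam[(a, x)])
            for a in states for x in inputs]


MATRIX = transition_matrix(STATES, INPUTS, LAMBDA)
ARCS = labelled_arcs(STATES, INPUTS, LAMBDA, DELTA)
```

Each row of MATRIX corresponds to a state and each column to an input symbol, so a lookup replaces the evaluation of λ.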
Let us extend the transition function λ to λ : A × X* → A by the rules λ(a, ε) = a and λ(a, ux) = λ(λ(a, u), x) for all a ∈ A, x ∈ X, u ∈ X*. Here ε stands for the empty string in X*. Define, quite generally, the language accepted by an automaton A as the set

  L(A) =def {w | w ∈ X*, λ(s, w) ∈ A_fin}.

The language accepted by the automaton in Figure 1 is (10)*11(0)*.¹ Languages in X* that are accepted by some Mealy automaton are called regular. Note also that there exist algorithms for finding a deterministic finite state automaton which accepts a given regular language L with a minimum number of states.

Returning to the general case, let us notice that the maps λ and δ admit natural extensions λ_k : A × X^k → A^k and δ_k : A × X^k → Y^k defined as follows. Let a ∈ A and x̄ = (x_1, ..., x_k) ∈ X^k be given. We first define recursively states a_1, a_2, ..., a_k by requiring a_i = λ(a_{i−1}, x_i) for i = 1, ..., k, putting also a_0 = a, and then the corresponding outputs y_1, y_2, ..., y_k by setting y_i = δ(a_{i−1}, x_i) for i = 1, ..., k. Thereafter we put

  λ_k(a, x_1, ..., x_k) = (a_1, a_2, ..., a_k)  and  δ_k(a, x_1, ..., x_k) = (y_1, y_2, ..., y_k).

It is clear that λ_1 = λ, δ_1 = δ. The figure below illustrates the above construction if k = 3.

  a --x_1/y_1--> a_1 --x_2/y_2--> a_2 --x_3/y_3--> a_3
Intuitively, this has the following interpretation. The automaton, initially in state a, responds to the input x_1 with the output y_1 and then passes to the state a_1. To the input x_2 it responds with the output y_2 and passes to the state a_2. Finally, to the input x_3 it responds with the output y_3 and passes to the state a_3.

If we have an automaton A = (A, X, Y; λ, δ) and we want to emphasize that the input and the output alphabets are respectively just X and Y, we say that A is an (X, Y)-automaton. If we have another (X, Y)-automaton Ā = (Ā, X, Y; λ̄, δ̄), we say that Ā covers A (and write then Ā ≻ A or A ≺ Ā) if there exists a mapping f : A → Ā such that

  δ_k(a, x) = δ̄_k(f(a), x)  for all a ∈ A and all x ∈ X^k, k ≥ 1.
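The extensions λ_k and δ_k, and the claim about the accepted language, can be checked mechanically on Example 2.1. The following Python sketch is only an illustration: the tables are those of Example 2.1 (with start state a1 and final state a3), and the helper names lambda_k, delta_k, accepts and agrees_with_regex are hypothetical.

```python
import re
from itertools import product

# λ and δ of Example 2.1; start state a1, final state a3.
LAMBDA = {
    ("a1", "0"): "a4", ("a1", "1"): "a2",
    ("a2", "0"): "a1", ("a2", "1"): "a3",
    ("a3", "0"): "a3", ("a3", "1"): "a4",
    ("a4", "0"): "a4", ("a4", "1"): "a4",
}
DELTA = {
    ("a1", "0"): "1", ("a1", "1"): "0",
    ("a2", "0"): "0", ("a2", "1"): "1",
    ("a3", "0"): "1", ("a3", "1"): "0",
    ("a4", "0"): "0", ("a4", "1"): "1",
}
START, FINAL = "a1", {"a3"}


def lambda_k(a, word):
    """Return (a_1, ..., a_k), the states visited while reading word from a."""
    visited = []
    for x in word:
        a = LAMBDA[(a, x)]
        visited.append(a)
    return visited


def delta_k(a, word):
    """Return the output word y_1 ... y_k with y_i = δ(a_{i-1}, x_i)."""
    out = []
    for x in word:
        out.append(DELTA[(a, x)])
        a = LAMBDA[(a, x)]
    return "".join(out)


def accepts(word):
    """w is in L(A) iff λ(s, w) is a final state."""
    state = START
    for x in word:
        state = LAMBDA[(state, x)]
    return state in FINAL


PATTERN = re.compile(r"(10)*11(0)*")


def agrees_with_regex(max_len):
    """Compare L(A) with (10)*11(0)* on all binary words up to max_len."""
    return all(
        accepts("".join(w)) == bool(PATTERN.fullmatch("".join(w)))
        for n in range(max_len + 1)
        for w in product("01", repeat=n)
    )
```

The exhaustive comparison in agrees_with_regex is a finite sanity check up to a chosen length, not a proof of equality of the two languages.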
In the general case, let us make the following definition.
¹ Here the asterisk * is used to denote a generic power (a non-negative integer). Thus a typical word in the language may be 101011000 = (10)(10)11(0)(0)(0) (the group of characters 10 appears twice, the group 0 three times).
DEFINITION 2.1. A homomorphism of automata

  σ : A = (A, X, Y; λ, δ) → Ā = (Ā, X̄, Ȳ; λ̄, δ̄)

is a triple σ = (σ_1, σ_2, σ_3) of maps, where σ_1 : A → Ā, σ_2 : X → X̄, σ_3 : Y → Ȳ, such that

  σ_1(λ(a, x)) = λ̄(σ_1(a), σ_2(x)),
  σ_3(δ(a, x)) = δ̄(σ_1(a), σ_2(x))

for all a ∈ A and x ∈ X.

In particular, an epimorphism of two (X, Y)-automata A = (A, X, Y; λ, δ) and Ā = (Ā, X, Y; λ̄, δ̄) is provided by a triple of mappings of the type (σ, id_X, id_Y) with σ : A → Ā surjective such that

  [λ(a, x)]^σ = λ̄(a^σ, x)  and  [δ(a, x)]^σ = δ̄(a^σ, x)

for all a ∈ A and x ∈ X. (Here we have written σ : a ↦ a^σ; on Y the triple acts as the identity, so the second condition reads δ(a, x) = δ̄(a^σ, x).)

PROPOSITION 2.2. If there exists an epimorphism A → Ā, then Ā ≻ A.
PROOF. Define f(a) = a^σ for a ∈ A. Take any element (x_1, ..., x_k) ∈ X^k and let a ∈ A be given. Define, as before, a_1, a_2, ..., a_k recursively, starting with a_0 = a and postulating that a_i = λ(a_{i−1}, x_i) for 1 ≤ i ≤ k. In a similar way, define ā_1, ..., ā_k starting with ā_0 = a^σ ∈ Ā. Note that

  ā_0 = a^σ = a_0^σ.

Suppose that we have already proved that ā_{i−1} = a_{i−1}^σ for some i ≥ 1. Then we get

  a_i^σ = [λ(a_{i−1}, x_i)]^σ = λ̄(a_{i−1}^σ, x_i) = λ̄(ā_{i−1}, x_i) = ā_i.

Hence, by induction we see that ā_i = a_i^σ holds for all i ≤ k. Now it is easy to prove that

  δ_k(a, x_1, ..., x_k) = δ̄_k(a^σ, x_1, ..., x_k).

Indeed, as δ_1 = δ and δ̄_1 = δ̄ we find

  δ̄_1(a^σ, x_1) = δ̄(a^σ, x_1) = δ(a, x_1) = δ_1(a, x_1).

Suppose that we have already proved, for some i < k, that

  δ_i(a, x_1, ..., x_i) = δ̄_i(a^σ, x_1, ..., x_i).

Then it follows that

  δ_{i+1}(a, x_1, ..., x_{i+1}) = (δ_i(a, x_1, ..., x_i), δ(a_i, x_{i+1}))
    = (δ̄_i(a^σ, x_1, ..., x_i), δ̄(a_i^σ, x_{i+1}))
    = (δ̄_i(a^σ, x_1, ..., x_i), δ̄(ā_i, x_{i+1}))
    = δ̄_{i+1}(a^σ, x_1, ..., x_{i+1}).
Thus, in view of our definition of f, we have established that for any x̄ ∈ X^k

  δ_k(a, x̄) = δ̄_k(f(a), x̄).

DEFINITION 2.3. We say that two automata A and Ā are equivalent if and only if Ā ≻ A and A ≻ Ā. It is easy to see that this relation is an equivalence relation on (X, Y)-automata. Let us agree to denote it by A ∼ Ā.

Consider the semigroup X^+ of all (non-empty) words in the alphabet X, that is, w = x_1 x_2 ... x_k where x_1, x_2, ..., x_k are any elements of X. We can define recursively the operation ∘ on X^+ by stipulating that

  a ∘ x_1 ... x_k = (a ∘ x_1 ... x_{k−1}) ∘ x_k  (k > 1).

Note that ∘ is a semigroup action of X^+ on A in the sense that (a ∘ w) ∘ w′ = a ∘ (ww′) for all a ∈ A and all w, w′ ∈ X^+. Similarly, we can define ∗ on X^+ by requiring that

  a ∗ x_1 ... x_k = (a ∘ x_1 ... x_{k−1}) ∗ x_k  (k > 1).

REMARK 2.4. We see that a ∘ x_1 ... x_k is nothing but the k-th component of λ_k(a, x_1, ..., x_k) and, similarly, that a ∗ x_1 ... x_k is nothing but the k-th component of δ_k(a, x_1, ..., x_k).

For any element a ∈ A consider the function δ^a : X^+ → Y^+ given by the rule

  δ^a(x_1 x_2 ··· x_k) = δ_k(a, x_1, ..., x_k)

with x_1, ..., x_k ∈ X, k ≥ 1. Such a function will be called the line of behavior² of A beginning at the state a. Introduce the set

  Δ(A) = {δ^a : X^+ → Y^+ | a ∈ A}.

If the correspondence a ↦ δ^a is a bijection between the set of states of A and the set Δ(A), then the automaton A is called reduced.

Fix again a ∈ A and consider the map f^a : X^+ → Y, f^a(w) = a ∗ w. From the algorithmic point of view it is more convenient to consider the set

  Φ(A) = {f^a : X^+ → Y | a ∈ A}

rather than the previous set Δ(A). We have the following useful result.

² Editors’ Note. Sometimes also called a trace of A.
PROPOSITION 2.5. For any automaton A one has δ^a = δ^b ⇐⇒ f^a = f^b.

PROOF. Suppose that δ^a = δ^b for some a, b ∈ A. Take any word w = x_1 x_2 ··· x_k ∈ X^k. Then

  f^a(w) = a ∗ (x_1 x_2 ··· x_k)
    = (a ∘ x_1 ··· x_{k−1}) ∗ x_k
    = pr_k(δ^a(w)) = pr_k(δ^b(w)) = ··· = f^b(w),

where pr_k stands for projection to the k-th component. Therefore, f^a = f^b.

On the other hand, assume that f^a = f^b for some a, b ∈ A. Then we obtain

  δ^a(w) = (a ∗ x_1, a ∗ x_1 x_2, ..., a ∗ x_1 x_2 ··· x_k)
    = (f^a(x_1), f^a(x_1 x_2), ..., f^a(x_1 x_2 ··· x_k))
    = (f^b(x_1), f^b(x_1 x_2), ..., f^b(x_1 x_2 ··· x_k)) = ··· = δ^b(w).

This shows that δ^a = δ^b.
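For a finite automaton, whether two states a and b satisfy f^a = f^b can be decided without enumerating all words. The sketch below uses partition refinement (the standard Moore-style state minimization technique, which is not taken from these lecture notes): states are first grouped by their one-letter outputs a ∗ x, and the grouping is refined by the groups of the successor states until it is stable. The function names are hypothetical.

```python
def _number(signature):
    """Replace each distinct signature by a small integer block id."""
    ids = {}
    return {a: ids.setdefault(sig, len(ids)) for a, sig in signature.items()}


def behaviour_classes(states, inputs, lam, delta):
    """Map each state to a block id; a, b share a block iff f^a = f^b."""
    # Round 0: separate states by the outputs a * x on single letters.
    block = _number({a: tuple(delta[(a, x)] for x in inputs) for a in states})
    while True:
        # Refine: keep the old block and add the blocks of the successors.
        refined = _number({
            a: (block[a],) + tuple(block[lam[(a, x)]] for x in inputs)
            for a in states
        })
        if len(set(refined.values())) == len(set(block.values())):
            return refined          # no block was split: partition is stable
        block = refined


def is_reduced(states, inputs, lam, delta):
    """An automaton is reduced iff distinct states lie in distinct blocks."""
    blocks = behaviour_classes(states, inputs, lam, delta)
    return len(set(blocks.values())) == len(states)


# The automaton of Example 2.1 (tables copied from the text).
STATES = ["a1", "a2", "a3", "a4"]
INPUTS = ["0", "1"]
LAMBDA = {
    ("a1", "0"): "a4", ("a1", "1"): "a2",
    ("a2", "0"): "a1", ("a2", "1"): "a3",
    ("a3", "0"): "a3", ("a3", "1"): "a4",
    ("a4", "0"): "a4", ("a4", "1"): "a4",
}
DELTA = {
    ("a1", "0"): "1", ("a1", "1"): "0",
    ("a2", "0"): "0", ("a2", "1"): "1",
    ("a3", "0"): "1", ("a3", "1"): "0",
    ("a4", "0"): "0", ("a4", "1"): "1",
}
EXAMPLE_IS_REDUCED = is_reduced(STATES, INPUTS, LAMBDA, DELTA)
```

Working with the functions f^a (last output only) rather than with the full lines of behavior δ^a is exactly what Proposition 2.5 permits.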
It follows from Proposition 2.5 that for any automaton A = (A, X, Y; λ, δ) the correspondence a ↦ δ^a (a ∈ A) is one-to-one if and only if the same is true for the correspondence a ↦ f^a (a ∈ A). Therefore, for an automaton A to be reduced it is necessary and sufficient that one has a one-to-one correspondence between A and Φ(A).

PROPOSITION 2.6. Let A and Ā be any two reduced (X, Y)-automata. Then the following conditions are equivalent:
(i) A and Ā are isomorphic;
(ii) A and Ā are equivalent;
(iii) Δ(A) = Δ(Ā);
(iv) Φ(A) = Φ(Ā).

PROOF. (i) ⇒ (ii). Suppose that A and Ā are two isomorphic (X, Y)-automata. Then one has, in particular, epimorphisms μ : A → Ā and μ^{−1} : Ā → A. Hence, it follows from Proposition 2.2 that Ā ≻ A and A ≻ Ā. Therefore, A ∼ Ā.

(ii) ⇒ (iii). Suppose that A ∼ Ā for some reduced (X, Y)-automata A and Ā. By definition, we then have Ā ≻ A and A ≻ Ā. Therefore, according to Proposition 2.2 there exist epimorphisms μ : A → Ā, μ = (f, id_X, id_Y), and ν : Ā → A, ν = (g, id_X, id_Y), such that for the first components of these maps (i.e., for the corresponding “epimorphisms of states” f : A → Ā and g : Ā → A) one has

  δ(a, x) = δ̄(f(a), x)  and  δ̄(ā, x) = δ(g(ā), x),

where a ∈ A, ā ∈ Ā and x ∈ X.
It follows that, for any word w = x_1 ··· x_k ∈ X^+,

  δ^a(w) = δ_k(a, x_1 ··· x_k)
    = (δ(a, x_1), δ(a_1, x_2), ..., δ(a_{k−1}, x_k))
    = (δ̄(f(a), x_1), δ̄(f(a_1), x_2), ..., δ̄(f(a_{k−1}), x_k))
    = (δ̄(f(a), x_1), δ̄(f(a)_1, x_2), ..., δ̄(f(a)_{k−1}, x_k))
    = δ̄_k(f(a), x_1 ··· x_k) = δ̄^{f(a)}(w).

Here a_i = a ∘ x_1 ··· x_i and we have used the relations

  f(a_i) = f(a ∘ x_1 ··· x_i) = f(a) ∘ x_1 ··· x_i = f(a)_i  (i ≥ 2),

which hold true since μ = (f, id_X, id_Y) is a homomorphism of automata. As a result we get δ^a = δ̄^{f(a)}, which implies that Δ(A) ⊆ Δ(Ā). In the same way, we obtain δ̄^ā = δ^{g(ā)} and so also Δ(Ā) ⊆ Δ(A).

(iii) ⇒ (iv). The equality Δ(A) = Δ(Ā) implies that for any function δ^a ∈ Δ(A) there exists a function δ̄^ā (where ā ∈ Ā) such that δ^a = δ̄^ā as functions on X^+. It follows that

  Φ(A) = {δ^a_k | a ∈ A, k ∈ N} ⊆ {δ̄^ā_k | ā ∈ Ā, k ∈ N} = Φ(Ā),

where the subscript k indicates the k-th component of the function in question. Interchanging the rôle of A and Ā gives Φ(Ā) ⊆ Φ(A) also.

(iv) ⇒ (i). The equality Φ(A) = Φ(Ā) implies that for any f̄^ā there exists some f^a ∈ Φ(A) such that f^a = f̄^ā as functions on X^+. At the same time, there is no other state a_1 ∈ A such that f^{a_1} = f̄^ā; otherwise one would have f^a = f^{a_1} (with a ≠ a_1), contradicting the fact that A is a reduced automaton. It follows that the correspondence m : a ↦ ā thus obtained is a bijection. Let us now prove that μ = (m, id_X, id_Y) : A → Ā is an isomorphism of automata. Even more is true. Namely, for any a ∈ A, v ∈ X^+, we have

  (a ∗ v)^μ = (a ∗ v)^{id_Y} = f^a(v) = f̄^ā(v) = ā ∗ v = a^m ∗ v^{id_{X^+}} = a^μ ∗ v^μ.

Moreover, (a ∘ v)^μ = a^μ ∘ v^μ. Indeed, as Ā is reduced, it is enough to show that f̄^{(a∘v)^m} = f̄^{a^m ∘ v}. To this end, take any w ∈ X^+; then

  f̄^{(a∘v)^m}(w) = f^{a∘v}(w) = (a ∘ v) ∗ w = a ∗ (vw) = f^a(vw)
    = f̄^{a^m}(vw) = a^m ∗ vw = (a^m ∘ v) ∗ w = f̄^{a^m ∘ v}(w).
2.2. Preliminary motivation and more notions

Finite automata can serve as a bridge also between such topics in theoretical computer science as classification of formal languages and the investigation of their structure, or cryptanalysis and systems theory. Automata theory provides several valuable concepts for the analysis and modelling of cryptographic devices. For instance, it often happens that a cipher device is assembled from a set of simpler components (such as shift registers), and so additional attacks may become feasible if the given machine can be decomposed in a way different from its original construction. However, it appears that cascades of clock-controlled shift registers can only be decomposed in the way they were constructed [7].

In order to show that the completion of a semi-automaton with an output function δ is not a mere formality for achieving coherence with the notions in the classics, let us see how it can be used in coding. Namely, any Mealy automaton that satisfies the condition

  ∀a ∈ A:  δ^a(u) = δ^a(v) ⇒ u = v in X*   (36)

is called a Mealy coding machine. This means that, for such an automaton A, Δ(A) consists of injective functions. In terms of the digraph representation for A, condition (36) means that there are no two arcs leaving any given state with the same output label. At the same time, as λ is a function, it is clear that no node of the digraph representing A has two arcs leaving it with the same input label. This property makes it possible to construct a new automaton A^{−1} = ((s; A), X, Y; λ^{−1}, δ^{−1}) by defining, for any a ∈ A,

  λ^{−1}(a, o) =def b  if and only if  λ(a, i) = b,  and
  δ^{−1}(a, o) =def i  if and only if  δ(a, i) = o.   (37)
Here i is an input symbol and o the corresponding output symbol, defined by δ(a, i) = o. It follows that δ^s(v) = w ⇒ (δ^{−1})^s(w) = v; this can be shown by induction on the length |v| = |w|. For instance, for the Mealy automaton given in Figure 2, taking w = 101100 as an input word in X* we get δ^s(w) = δ^{a1}(101100) = 000111. Note also that s ∘ w = a1 ∘ 101100 = a3.

Given a digraph G(A) representing a Mealy coding machine A, as in the case of Figure 2, it is easy to get G(A^{−1}): just change the labels i/o to o/i. For instance, for the automaton A in Figure 2 this means that G(A^{−1}) is as in Figure 3. In this case one finds that δ^{−1}(s, 000111) = 101100 and λ^{−1}(s, 000111) = a3.

Let us go one step further. Consider a shift register R_n^# functioning according to the scheme in Figure 4. During any fixed time interval the content (B_0, B_1, ..., B_{n−1}) of its binary storage elements B̂_i (i = 0, 1, ..., n − 1) is represented by a (0, 1)-vector called a state of R_n^#.
[Fig. 3: The reversed automaton G(A^{−1}); the digraph of Figure 2 with every label i/o replaced by o/i]
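Rule (37) and the relabelling i/o to o/i can be tried out on the automaton of Figure 2. The Python sketch below is an illustration under the assumption that λ and δ are stored as dictionaries keyed by (state, symbol); the names invert and run are hypothetical. The inversion fails with an error when two arcs leaving one state carry the same output label, that is, when the machine is not a Mealy coding machine.

```python
# λ and δ of the automaton in Figure 2 (tables of Example 2.1).
LAMBDA = {
    ("a1", "0"): "a4", ("a1", "1"): "a2",
    ("a2", "0"): "a1", ("a2", "1"): "a3",
    ("a3", "0"): "a3", ("a3", "1"): "a4",
    ("a4", "0"): "a4", ("a4", "1"): "a4",
}
DELTA = {
    ("a1", "0"): "1", ("a1", "1"): "0",
    ("a2", "0"): "0", ("a2", "1"): "1",
    ("a3", "0"): "1", ("a3", "1"): "0",
    ("a4", "0"): "0", ("a4", "1"): "1",
}


def invert(lam, delta):
    """Build λ^{-1}, δ^{-1} by rule (37): swap the roles of i and o."""
    lam_inv, delta_inv = {}, {}
    for (a, i), b in lam.items():
        o = delta[(a, i)]
        if (a, o) in lam_inv:
            raise ValueError(f"state {a}: two arcs with output label {o}")
        lam_inv[(a, o)] = b      # λ^{-1}(a, o) = b  iff  λ(a, i) = b
        delta_inv[(a, o)] = i    # δ^{-1}(a, o) = i  iff  δ(a, i) = o
    return lam_inv, delta_inv


def run(lam, delta, a, word):
    """Return (output word, final state) when reading word from state a."""
    out = []
    for x in word:
        out.append(delta[(a, x)])
        a = lam[(a, x)]
    return "".join(out), a


LAMBDA_INV, DELTA_INV = invert(LAMBDA, DELTA)
ENCODED = run(LAMBDA, DELTA, "a1", "101100")
DECODED = run(LAMBDA_INV, DELTA_INV, "a1", ENCODED[0])
```

Both runs pass through the same sequence of states, which is why the decoder ends in the same state as the encoder.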
[Fig. 4: A shift register R_n^#. The contents of the storage elements B̂_0 → B̂_1 → ··· → B̂_{n−1} feed the function B̂ = R(B̂_0, B̂_1, ..., B̂_{n−1}), whose value is combined with the input.]
During this same time interval the value B of the function R(B_0, B_1, ..., B_{n−1}) is calculated. Using this value and the input i, the new value B_0′ for B_0 is found; often, this is done using the rule

  B_0′ =def B · ī ∨ B̄ · i  (“exclusive or”).

Usually, this value B_0′ is considered also as the output of R_n^# during this time interval. Further, at the same time the transfer of the content of the storage elements is made according to the rule

  B̂_k → B̂_{k+1}  (k = 0, 1, ..., n − 2).

Thus, during this time interval the transfer

  (B_0, B_1, B_2, ..., B_{n−1}) → (B_0′, B_0, B_1, ..., B_{n−2})

takes place. A concrete example (for n = 3) of R_3^# is given by the following table; we take X = {0, 1} = Y, B = R(B_0, B_1, B_2) =def B_0 · B̄_1 ∨ B_2 and B_0′ =def B̄ · i ∨ B · ī.
  State  B0 B1 B2  B = B0·B̄1 ∨ B2   i   B0′ = B̄·i ∨ B·ī      New B0 B1 B2  New state  Output
  no.                                                                        no.
  0      0  0  0   0·0̄ ∨ 0 = 0      0   0̄·0 ∨ 0·0̄ = 0       0  0  0       0          0
                                     1   0̄·1 ∨ 0·1̄ = 1       1  0  0       4          1
  1      0  0  1   0·0̄ ∨ 1 = 1      0   1̄·0 ∨ 1·0̄ = 1       1  0  0       4          1
                                     1   1̄·1 ∨ 1·1̄ = 0       0  0  0       0          0
  2      0  1  0   0·1̄ ∨ 0 = 0      0   0̄·0 ∨ 0·0̄ = 0       0  0  1       1          0
                                     1   0̄·1 ∨ 0·1̄ = 1       1  0  1       5          1
  3      0  1  1   0·1̄ ∨ 1 = 1      0   1̄·0 ∨ 1·0̄ = 1       1  0  1       5          1
                                     1   1̄·1 ∨ 1·1̄ = 0       0  0  1       1          0
  4      1  0  0   1·0̄ ∨ 0 = 1      0   1̄·0 ∨ 1·0̄ = 1       1  1  0       6          1
                                     1   1̄·1 ∨ 1·1̄ = 0       0  1  0       2          0
  5      1  0  1   1·0̄ ∨ 1 = 1      0   1̄·0 ∨ 1·0̄ = 1       1  1  0       6          1
                                     1   1̄·1 ∨ 1·1̄ = 0       0  1  0       2          0
  6      1  1  0   1·1̄ ∨ 0 = 0      0   0̄·0 ∨ 0·0̄ = 0       0  1  1       3          0
                                     1   0̄·1 ∨ 0·1̄ = 1       1  1  1       7          1
  7      1  1  1   1·1̄ ∨ 1 = 1      0   1̄·0 ∨ 1·0̄ = 1       1  1  1       7          1
                                     1   1̄·1 ∨ 1·1̄ = 0       0  1  1       3          0
Corresponding to this table the Mealy automaton A(R_3^#) can be represented by the following labelled digraph (see Figure 5). There, a_i =def (i)_2 are the states for G(A(R_3^#)). Here (i)_2 stands for the binary representation of the number i.

[Fig. 5: The automaton of the shift register G(A(R_3^#)); nodes a0 = 000, a1 = 001, a2 = 010, a3 = 011, a4 = 100, a5 = 101, a6 = 110, a7 = 111, arcs labelled i/o as in the table above]
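The table for R_3^# can be regenerated from the two rules B = B_0 · B̄_1 ∨ B_2 and B_0′ = B̄ · i ∨ B · ī. The following Python sketch does this under the conventions of the table (state number 4·B_0 + 2·B_1 + B_2, output B_0′); the function names are hypothetical. It also includes a decoder step obtained by exchanging the roles of i and B_0′ in the exclusive or, which is one way to read the label reversal of Section 2.2 for this register.

```python
def feedback(b0, b1, b2):
    """B = R(B0, B1, B2) = B0·(not B1) ∨ B2, on bits 0/1."""
    return (b0 & (1 - b1)) | b2


def encode_step(state, i):
    """One clock tick of R_3^#: return (new state, output B0')."""
    b0, b1, b2 = state
    new_b0 = feedback(b0, b1, b2) ^ i       # B0' = B xor i
    return (new_b0, b0, b1), new_b0          # shift: (B0', B0, B1)


def decode_step(state, o):
    """Inverse tick: the received bit o plays the role of B0'."""
    b0, b1, b2 = state
    i = feedback(b0, b1, b2) ^ o             # recover i = B xor B0'
    return (o, b0, b1), i


def state_number(state):
    b0, b1, b2 = state
    return 4 * b0 + 2 * b1 + b2


def transition_table():
    """Rows (state no., input, new state no., output), in table order."""
    rows = []
    for n in range(8):
        state = ((n >> 2) & 1, (n >> 1) & 1, n & 1)
        for i in (0, 1):
            new_state, out = encode_step(state, i)
            rows.append((n, i, state_number(new_state), out))
    return rows


def run(step, state, bits):
    """Apply step to a list of bits; return (output bits, final state)."""
    out = []
    for b in bits:
        state, y = step(state, b)
        out.append(y)
    return out, state


TABLE = transition_table()
```

Encoding a bit sequence with run(encode_step, ...) and decoding the result with run(decode_step, ...) from the same initial state returns the original sequence, because both runs pass through the same states.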
To get the digraph G(R+) corresponding to the decoder for the Mealy coding machine given by the digraph G(R_n^#) it remains to reverse the i/o-labels to o/i-labels. The Mealy automaton obtained in this fashion corresponds to the shift register where R(B_0, B_1, ..., B_{n−1}) and B̂_k → B̂_{k+1} (k = 0, 1, ..., n − 2) are calculated as before but B_0′ is taken to be the input. The output is the “exclusive or” of B and B_0′ (for further information on cipher systems cf. [1]).

DEFINITION 2.7. By a congruence on an automaton A = (A, X, Y; λ, δ) is meant a triple of equivalence relations ρ = (ρ_1, ρ_2, ρ_3), on A, X and Y respectively, such that if a ρ_1 b in A and u ρ_2 v in X then we have also

  (a ∘ u) ρ_1 (b ∘ v) in A  and  (a ∗ u) ρ_3 (b ∗ v) in Y.
Every congruence ρ on an automaton A gives rise to a factor-automaton A/ρ = (A/ρ_1, X/ρ_2, Y/ρ_3; ∘, ∗) where

  [a]_{ρ_1} ∘ [x]_{ρ_2} =def [a ∘ x]_{ρ_1}  and  [a]_{ρ_1} ∗ [x]_{ρ_2} =def [a ∗ x]_{ρ_3}.

It is easy to formulate and to prove the analogues of the well-known first and second homomorphism theorems (of E. Noether) in this context. This is left as an exercise to the Reader. Here we will only require the following special case.

THEOREM 2.8. Let there be given an epimorphism of automata ψ : A → B. Then there exists an isomorphism ϕ : A/ker ψ → B that makes the diagram

         ψ
    A ------> B
     \       ^
    τ \     / ϕ
       v   /
     A/ker ψ

commutative.

PROOF. The proof is standard.

Here ker ψ = (ker ψ_1, ker ψ_2, ker ψ_3) is a congruence, while τ is the natural epimorphism of A onto the factor-automaton A/ker ψ. (More precisely, for instance the equivalence of a and b under ker ψ_1 simply means that these elements have the same ψ_1-image, that is, ψ_1(a) = ψ_1(b).)

To every automaton A = (A, X, Y) there corresponds a reduced automaton (A/ρ, X, Y) where ρ is a congruence such that a ρ a′ if and only if a ∗ u = a′ ∗ u for all u ∈ X^+.
2.3. Semigroup automata

Given an automaton A = (A, X, Y; ∘, ∗) we have already extended (Sec. 2.1) the operations ∘ and ∗ to cover the case of the free semigroup X^+ generated by X. In a similar way all other notions can be extended to automata of the type A = (A, X^+, Y; ∘, ∗). More generally, we can consider, instead of X^+, any semigroup Γ. We can then define a semigroup automaton as a quintuple A = (A, Γ, Y; ∘, ∗) obeying the rules

  a ∘ γ_1γ_2 = (a ∘ γ_1) ∘ γ_2  and  a ∗ γ_1γ_2 = (a ∘ γ_1) ∗ γ_2

for all a ∈ A and all γ_1, γ_2 ∈ Γ. In particular, every semigroup automaton A may be considered as a heterogeneous algebra (in the sense of G. Birkhoff)³. This point of view helps one to find quickly the notions decisive for the decomposition of automata. In this connection we give the following definition.

³ Editors’ Note. Also called a many-sorted algebra.
DEFINITION 2.9. A homomorphism of semigroup automata

  A = (A, Γ, Y; ∘, ∗) → A′ = (A′, Γ′, Y′; ∘′, ∗′)

is a triple ψ = (ψ_1, ψ_2, ψ_3) of mappings, ψ_1 : A → A′, ψ_2 : Γ → Γ′ and ψ_3 : Y → Y′, where, in addition, ψ_2 is a semigroup homomorphism and we have

  ψ_1(a ∘ u) = ψ_1(a) ∘′ ψ_2(u)  and  ψ_3(a ∗ u) = ψ_1(a) ∗′ ψ_2(u)

for all a ∈ A and u ∈ Γ. The notions of isomorphism, epimorphism, monomorphism etc. are defined in the usual way.

Notice that we obtain a category with semigroup automata as its objects and their homomorphisms as its morphisms. This point of view is also very helpful in finding patterns for decomposition of automata.

In the special case when Γ has a unit element ε ∈ Γ it is assumed that a ∘ ε = a for all a ∈ A. This implies that

  a ∗ γ = a ∗ γε = (a ∘ γ) ∗ ε.

Therefore, introducing a map μ : A → Y by the rule μ(a) = a ∗ ε, it follows that

  a ∗ γ = μ(a ∘ γ)

for all a ∈ A and γ ∈ Γ. Conversely, consider any action (A, Γ; ∘) of a semigroup Γ on a set A and any map μ : A → Y. We obtain an automaton A = (A, Γ, Y; ∘, ∗) defining ∗ by the rule a ∗ γ = μ(a ∘ γ). Indeed, we have

  a ∗ γ_1γ_2 = μ(a ∘ (γ_1γ_2)) = μ((a ∘ γ_1) ∘ γ_2) = (a ∘ γ_1) ∗ γ_2.

We say then that A is a Moore automaton.

Let us now change our notation, writing F(X) for the free semigroup over X (previously written X^+) and, similarly, F*(X) for the free monoid over X. Given a semigroup automaton A = (A, F(X), Y; ∘, ∗), define on F(X) a binary relation ρ by stipulating that

  u ρ v ⇐⇒ a ∘ u = a ∘ v  for all a ∈ A.

It follows at once from the properties of the action (A, F(X); ∘) that ρ is an equivalence on F(X) which is two-sided stable under multiplication by elements of F(X), i.e. ρ is a congruence on F(X). The factor-semigroup Γ_A = F(X)/ρ is called the semigroup of the automaton A. Note that ρ = ker(A, F(X)), that is, ρ is the kernel of the action (A, F(X); ∘). Furthermore, the induced action (A, Γ_A; ∘) is faithful. This means that any two elements in Γ_A acting in the same way on A necessarily coincide (i.e. a ∘ σ = a ∘ τ for all a ∈ A implies that σ = τ). Notice also that if |A| < ∞ then |Γ_A| < ∞. In words: finiteness of the set of states of an automaton forces its semigroup to be finite also.
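For a finite automaton the semigroup Γ_A can be computed directly: each word u acts on A as a transformation a ↦ a ∘ u, two words are ρ-equivalent exactly when they induce the same transformation, and Γ_A is generated by the transformations of the letters. The Python sketch below closes the generators under composition by a breadth-first search; it is an illustration on the semi-automaton of Example 2.1, and the names automaton_semigroup and act are hypothetical.

```python
from collections import deque

STATES = ["a1", "a2", "a3", "a4"]
INPUTS = ["0", "1"]
# Transition function λ of Example 2.1.
LAMBDA = {
    ("a1", "0"): "a4", ("a1", "1"): "a2",
    ("a2", "0"): "a1", ("a2", "1"): "a3",
    ("a3", "0"): "a3", ("a3", "1"): "a4",
    ("a4", "0"): "a4", ("a4", "1"): "a4",
}


def act(word):
    """Transformation a -> a∘word, as a tuple of state indices."""
    image = []
    for a in STATES:
        for x in word:
            a = LAMBDA[(a, x)]
        image.append(STATES.index(a))
    return tuple(image)


def automaton_semigroup(inputs):
    """Return {transformation: a shortest word inducing it} for Γ_A."""
    gens = {x: act(x) for x in inputs}
    seen = {}
    queue = deque()
    for x, t in gens.items():
        if t not in seen:
            seen[t] = x
            queue.append(t)
    while queue:
        t = queue.popleft()
        for x, g in gens.items():
            # Act by t first, then by the letter x.
            product = tuple(g[k] for k in t)
            if product not in seen:
                seen[product] = seen[t] + x
                queue.append(product)
    return seen


SEMIGROUP = automaton_semigroup(INPUTS)
```

The keys of SEMIGROUP are the elements of Γ_A, and the stored words are representatives of the corresponding ρ-classes; the search terminates because there are at most |A|^|A| transformations of a finite set A.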
Finally, let us take a semigroup automaton A = (A, Γ, Y; ∘, ∗) and define a binary relation ρ̂ on Γ by the rule

  γ_1 ∼ γ_2 (mod ρ̂) ⇐⇒ a ∘ γ_1 = a ∘ γ_2 and a ∗ γ_1 = a ∗ γ_2 for all a ∈ A.

Then ρ̂ is two-sided stable on Γ, both for ∘ and ∗. So, it is clear that always ρ̂ ⊆ ker(A, Γ). For a Moore automaton one has equality, ρ̂ = ker(A, Γ).
2.4. Cyclic automata

Let us call an action (A, Γ; ∘) cyclic if there exists an element a ∈ A such that A = {a ∘ γ | γ ∈ Γ}. Consider now the map ψ : (F*(X), X) → (A, X), of the regular action (F*(X), X) into a cyclic automaton (A, X), by the rule

  ψ(u) = a ∘ u,  u ∈ F*(X).

It is easy to see that

  (F*(X)/ρ, X) ≅ (A, X)

with ρ as in Sec. 2.3. More can be said in this context. Namely, let (A, X, Y; ∘, ∗) be any cyclic automaton, i.e. A coincides with the set {a ∘ x | x ∈ X} for some a ∈ A. Consider the automaton (F*(X), X, Y; ∘′, ∗′) with the basic operations given by

  u ∘′ x = ux  and  u ∗′ x = a ∗ ux  for x ∈ X and u ∈ F*(X);

here ux means that we take the product of u and x in F*(X). The map ψ : F*(X) → A given by u^ψ =def a ∘ u yields an epimorphism of automata

  ψ : (F*(X), X, Y) → (A, X, Y)

and so we obtain a natural isomorphism

  (F*(X)/ρ, X, Y) ≅ (A, X, Y)

with ρ as above.

Now consider the special case when A = (A, X, Y) is a reduced cyclic Moore automaton. Let A be generated by the element a ∈ A. Consider the map f = f_a : F*(X) → Y, given by

  f(u) = a ∗ u  for all u ∈ F*(X).

It is readily seen that knowledge of f allows us to restore ρ. Indeed, take on F*(X) the binary relation ρ∗ defined as follows:

  u ∼ u′ (mod ρ∗) ⇐⇒def f(vuw) = f(vu′w)  for all v, w ∈ F*(X).

It is easy to check that ρ = ρ∗. Indeed, take any two words u and u′ in F*(X) such that u ρ u′. Then, for any words v and w in F*(X) it is true that (vu) ρ (vu′) and a ∘ vu = a ∘ vu′.
Therefore, $a * (vuw) = (a \circ vu) * w = (a \circ vu') * w = a * (vu'w)$, which implies $f(vuw) = f(vu'w)$. This shows that $u \,\rho^*\, u'$, which proves $\rho \subseteq \rho^*$.
In the other direction, suppose that $u \,\rho^*\, u'$. Then, for any two words $v$ and $w$ in $F^*(X)$ we have $a * (vuw) = f(vuw) = f(vu'w) = a * (vu'w)$. Taking here $v = \varepsilon$ gives $(a \circ u) * w = a * uw = a * u'w = (a \circ u') * w$. Therefore, $f_{a \circ u} = f_{a \circ u'}$ and, as $\mathcal{A}$ is reduced by hypothesis, it follows that $a \circ u = a \circ u'$. The same argument with an arbitrary $v$ in place of $\varepsilon$ gives $(a \circ v) \circ u = (a \circ v) \circ u'$; since $\mathcal{A}$ is cyclic, every state has the form $a \circ v$, and so also $u \,\rho\, u'$.
This reasoning leads us to the following construction: given (arbitrary) sets $X$ and $Y$ and a map $f : F^*(X) \to Y$, define on $F^*(X)$ the binary relation $\rho^*$ as was just done above. So we get an action $(F^*(X)/\rho^*, X)$. As $\rho^* \subseteq \ker f$ (by definition!) the map $f : F^*(X) \to Y$ induces a map $\mu : F^*(X)/\rho^* \to Y$. Thus, having the action $(F^*(X)/\rho^*, X)$ (a semi-automaton) and the map $\mu$, we get a Moore automaton $(F^*(X)/\rho^*, X, Y)$, which we denote by $\mathcal{A}(f)$. A straightforward verification shows that $\mathcal{A}(f)$ is a reduced cyclic Moore automaton. Therefore, the following result is true.
PROPOSITION 2.10. The automaton $\mathcal{A}(f)$ is a reduced cyclic Moore automaton. Conversely, every reduced cyclic Moore automaton can be obtained in this way.
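2.5. Wreath products of actions
Let there be given two semigroup actions $(A, \Phi)$ and $(B, \Sigma)$. Take the set of all functions $\hat\varphi : B \to \Phi$ (that is, $\hat\varphi \in \Phi^B = \mathrm{Fun}(B, \Phi)$) and make it a semigroup by defining multiplication of such functions pointwise:
\[ (\hat\varphi_1 \cdot \hat\varphi_2)(b) = \hat\varphi_1(b) \cdot \hat\varphi_2(b) \quad\text{for } b \in B. \]
Furthermore, let us consider the semi-direct product
\[ \Gamma = \Phi^B \rtimes \Sigma = \{(\hat\varphi, \sigma) \mid \hat\varphi \in \Phi^B,\ \sigma \in \Sigma\}, \]
defining multiplication of pairs $(\hat\varphi, \sigma)$ by the formula
\[ (\hat\varphi_1, \sigma_1) \cdot (\hat\varphi_2, \sigma_2) = (\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2,\ \sigma_1\sigma_2), \]
where ${}^{\sigma_1}\hat\varphi_2$ is given by
\[ {}^{\sigma_1}\hat\varphi_2(b) = \hat\varphi_2(b \circ \sigma_1) \quad\text{for } b \in B. \]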
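(To check the associativity, it suffices to note that $(\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2) \cdot {}^{\sigma_1\sigma_2}\hat\varphi_3 = \hat\varphi_1 \cdot {}^{\sigma_1}(\hat\varphi_2 \cdot {}^{\sigma_2}\hat\varphi_3)$.) Lastly, let us take the Cartesian product $G = A \times B = \{(a, b) \mid a \in A,\ b \in B\}$, giving an action of the semigroup $\Gamma$ on $G$ by the rule
\[ (a, b) \circ (\hat\varphi, \sigma) := (a \circ \hat\varphi(b),\ b \circ \sigma). \]
In this way we obtain an action $(G, \Gamma) = (A, \Phi)\,\mathrm{wr}\,(B, \Sigma)$, called the wreath product of the given pairs $(A, \Phi)$ and $(B, \Sigma)$.
PROPOSITION 2.11. Let $\rho = \ker(G, \Gamma)$, $\rho_1 = \ker(A, \Phi)$ and $\rho_2 = \ker(B, \Sigma)$. Then
\[ (\hat\varphi_1, \sigma_1) \sim (\hat\varphi_2, \sigma_2) \pmod{\rho} \]
if and only if
\[ \sigma_1 \sim \sigma_2 \pmod{\rho_2} \quad\text{and}\quad \hat\varphi_1(b) \sim \hat\varphi_2(b) \pmod{\rho_1} \ \text{ for all } b \in B. \]
PROOF. To obtain the right hand side, suppose that $(\hat\varphi_1, \sigma_1) \sim (\hat\varphi_2, \sigma_2) \pmod{\rho}$. Then for any $(a, b) \in G$ we have
\[ (a \circ \hat\varphi_1(b),\ b \circ \sigma_1) = (a, b) \circ (\hat\varphi_1, \sigma_1) = (a, b) \circ (\hat\varphi_2, \sigma_2) = (a \circ \hat\varphi_2(b),\ b \circ \sigma_2). \]
Therefore, $b \circ \sigma_1 = b \circ \sigma_2$ for all $b \in B$, which gives $\sigma_1 \sim \sigma_2 \pmod{\rho_2}$. Also, as $a \circ \hat\varphi_1(b) = a \circ \hat\varphi_2(b)$ for all $a \in A$, this implies that $\hat\varphi_1(b) \sim \hat\varphi_2(b) \pmod{\rho_1}$ for all $b \in B$. On the other hand, if $a \circ \hat\varphi_1(b) = a \circ \hat\varphi_2(b)$ and $b \circ \sigma_1 = b \circ \sigma_2$ for all $a \in A$ and all $b \in B$, we get $(a, b) \circ (\hat\varphi_1, \sigma_1) = (a, b) \circ (\hat\varphi_2, \sigma_2)$ for all $(a, b) \in G$ and therefore $(\hat\varphi_1, \sigma_1) \sim (\hat\varphi_2, \sigma_2) \pmod{\rho}$.
COROLLARY 2.12. If both the actions $(A, \Phi)$ and $(B, \Sigma)$ are faithful, then the same is true for $(G, \Gamma)$ also.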
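PROOF. Indeed, suppose that $(a, b) \circ (\hat\varphi_1, \sigma_1) = (a, b) \circ (\hat\varphi_2, \sigma_2)$ holds for all $(a, b) \in G$. Then, according to Proposition 2.11, it follows that $\sigma_1 \sim \sigma_2 \pmod{\rho_2}$ and $\hat\varphi_1(b) \sim \hat\varphi_2(b) \pmod{\rho_1}$ for all $b \in B$. The faithfulness of $(B, \Sigma)$ implies that
\[ \sigma_1 \sim \sigma_2 \pmod{\rho_2} \implies \sigma_1 = \sigma_2. \]
Also, the faithfulness of $(A, \Phi)$ gives
\[ \hat\varphi_1(b) \sim \hat\varphi_2(b) \pmod{\rho_1} \implies \hat\varphi_1(b) = \hat\varphi_2(b) \quad\text{for all } b \in B. \]
This means that $\hat\varphi_1 = \hat\varphi_2$ and so we get $(\hat\varphi_1, \sigma_1) = (\hat\varphi_2, \sigma_2)$.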
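The wreath product of actions and Corollary 2.12 can be checked by brute force on a small instance. The sketch below is only an illustration under stated assumptions: it takes $(A, \Phi)$ and $(B, \Sigma)$ to be the cyclic groups $\mathbb{Z}_2$ and $\mathbb{Z}_3$ acting on themselves by addition, and all Python names (`wr_mul`, `wr_act`, and so on) are hypothetical. A function $\hat\varphi : B \to \Phi$ is stored as a tuple indexed by $B$.

```python
from itertools import product

# Hypothetical small example: Z_2 and Z_3 acting on themselves by addition.
A, Phi = range(2), range(2)
B, Sigma = range(3), range(3)


def act_A(a, phi):          # a ◦ phi
    return (a + phi) % 2


def act_B(b, s):            # b ◦ sigma
    return (b + s) % 3


def wr_mul(g1, g2):
    """(phi1, s1)(phi2, s2) = (phi1 · ^{s1}phi2, s1 s2), with ^{s1}phi2(b) = phi2(b ◦ s1)."""
    (phi1, s1), (phi2, s2) = g1, g2
    twisted = tuple(phi2[act_B(b, s1)] for b in B)
    return (tuple((p + q) % 2 for p, q in zip(phi1, twisted)), (s1 + s2) % 3)


def wr_act(point, g):
    """(a, b) ◦ (phi, s) = (a ◦ phi(b), b ◦ s)."""
    (a, b), (phi, s) = point, g
    return (act_A(a, phi[b]), act_B(b, s))


Gamma = [(phi, s) for phi in product(Phi, repeat=len(B)) for s in Sigma]
G = list(product(A, B))

# Action law: x ◦ (g1 g2) = (x ◦ g1) ◦ g2 for all points and all pairs.
action_law = all(
    wr_act(x, wr_mul(g1, g2)) == wr_act(wr_act(x, g1), g2)
    for x in G for g1 in Gamma for g2 in Gamma
)
# Faithfulness: distinct elements of Gamma induce distinct transformations of G.
faithful = len({tuple(wr_act(x, g) for x in G) for g in Gamma}) == len(Gamma)
```

Both regular actions in this example are faithful, so by Corollary 2.12 the 24 elements of $\Gamma$ should induce 24 distinct transformations of $G$; the flag `faithful` records exactly that comparison.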
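PROPOSITION 2.13. Let there be given two homomorphisms of pairs $\alpha : (A, \Phi) \to (\bar A, \bar\Phi)$ and $\beta : (B, \Sigma) \to (B, \bar\Sigma)$, where the first component of $\beta$ is the identity map on $B$. Then there exists a homomorphism
\[ (A, \Phi)\,\mathrm{wr}\,(B, \Sigma) \to (\bar A, \bar\Phi)\,\mathrm{wr}\,(B, \bar\Sigma) \]
extending the given mappings. In particular, if $(A, \Phi) \twoheadrightarrow (A, \bar\Phi)$ and $(B, \Sigma) \twoheadrightarrow (B, \bar\Sigma)$ are the natural epimorphisms onto the corresponding faithful pairs, then there exists an epimorphism
\[ (A, \Phi)\,\mathrm{wr}\,(B, \Sigma) \twoheadrightarrow (A, \bar\Phi)\,\mathrm{wr}\,(B, \bar\Sigma) \]
with the right hand side faithful also.
PROOF. Define $\mu : G \to \bar G$, where $\bar G = \bar A \times B$, by $(a, b)^\mu = (a^\alpha, b)$ for any $(a, b) \in G$. Also, let $\nu : \Phi^B \to \bar\Phi^B$ be given by the rule $\hat\varphi^\nu(b) = [\hat\varphi(b)]^\alpha$. Thereafter, extend these two maps to
\[ \mu : \Phi^B \rtimes \Sigma \to \bar\Phi^B \rtimes \bar\Sigma =: \bar\Gamma, \qquad (\hat\varphi, \sigma)^\mu = (\hat\varphi^\nu, \sigma^\beta). \]
Let us prove that a homomorphism $\mu : \Gamma \to \bar\Gamma$ appears in this way. Indeed, on the one hand we have
\[ [(\hat\varphi_1, \sigma_1) \cdot (\hat\varphi_2, \sigma_2)]^\mu = (\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2,\ \sigma_1\sigma_2)^\mu = \bigl((\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2)^\nu,\ (\sigma_1\sigma_2)^\beta\bigr) \]
and on the other hand
\[ (\hat\varphi_1, \sigma_1)^\mu \cdot (\hat\varphi_2, \sigma_2)^\mu = (\hat\varphi_1^\nu, \sigma_1^\beta) \cdot (\hat\varphi_2^\nu, \sigma_2^\beta) = \bigl(\hat\varphi_1^\nu \cdot {}^{\sigma_1^\beta}\hat\varphi_2^\nu,\ \sigma_1^\beta \cdot \sigma_2^\beta\bigr). \]
As $\beta$ is a homomorphism of actions, we certainly have $(\sigma_1\sigma_2)^\beta = \sigma_1^\beta \cdot \sigma_2^\beta$.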
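So, it remains to prove that
\[ (\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2)^\nu = \hat\varphi_1^\nu \cdot {}^{\sigma_1^\beta}\hat\varphi_2^\nu. \tag{38} \]
Take any $b \in B$. As $\beta$ is a homomorphism of actions whose first component is the identity on $B$, it follows that $b \circ \sigma_1 = (b \circ \sigma_1)^\beta = b^\beta \circ \sigma_1^\beta = b \circ \sigma_1^\beta$. Hence,
\[ (\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2)^\nu(b) = \bigl[(\hat\varphi_1 \cdot {}^{\sigma_1}\hat\varphi_2)(b)\bigr]^\alpha = \bigl[\hat\varphi_1(b) \cdot \hat\varphi_2(b \circ \sigma_1)\bigr]^\alpha = \hat\varphi_1(b)^\alpha \cdot \hat\varphi_2(b \circ \sigma_1)^\alpha = \hat\varphi_1^\nu(b) \cdot \hat\varphi_2^\nu(b \circ \sigma_1^\beta) = \bigl(\hat\varphi_1^\nu \cdot {}^{\sigma_1^\beta}(\hat\varphi_2^\nu)\bigr)(b). \]
This holds for all $b \in B$ and so implies (38). It remains to prove that $\mu$ is a homomorphism of actions, i.e. that
\[ \bigl((a, b) \circ (\hat\varphi, \sigma)\bigr)^\mu = (a, b)^\mu \circ (\hat\varphi, \sigma)^\mu. \tag{39} \]
Indeed, (39) follows from the properties of the homomorphisms $\alpha$ and $\beta$ along with the following series of equalities:
\[ \bigl((a, b) \circ (\hat\varphi, \sigma)\bigr)^\mu = (a \circ \hat\varphi(b),\ b \circ \sigma)^\mu = \bigl((a \circ \hat\varphi(b))^\alpha,\ b \circ \sigma\bigr) = \bigl(a^\alpha \circ \hat\varphi(b)^\alpha,\ b \circ \sigma^\beta\bigr) = (a^\alpha, b) \circ (\hat\varphi^\nu, \sigma^\beta) = (a, b)^\mu \circ (\hat\varphi, \sigma)^\mu. \]
Now suppose that $\alpha$ and $\beta$ are natural epimorphisms and let us prove that the homomorphism
\[ \mu : (A, \Phi)\,\mathrm{wr}\,(B, \Sigma) \to (A, \bar\Phi)\,\mathrm{wr}\,(B, \bar\Sigma), \]
defined above by $(\hat\varphi, \sigma)^\mu = (\hat\varphi^\nu, \sigma^\beta)$ and $\hat\varphi^\nu(b) = [\hat\varphi(b)]^\alpha$ for all $b \in B$, is an epimorphism. In view of what has been done above it remains to show that $\mu$ is surjective.
Take any element $(\bar\psi, \bar\sigma) \in \bar\Phi^B \rtimes \bar\Sigma$. We must verify that there exists an element $(\hat\varphi, \sigma)$ in $\Phi^B \rtimes \Sigma$ such that $(\hat\varphi, \sigma)^\mu = (\bar\psi, \bar\sigma)$. Here, as $\beta$ is surjective, it follows immediately that there exists $\sigma \in \Sigma$ such that $\sigma^\beta = \bar\sigma$. Therefore, it remains to prove that there exists a function $\hat\varphi : B \to \Phi$ such that $\hat\varphi^\nu = \bar\psi$. As $\alpha$ is likewise surjective, it follows that for every element in $\bar\Phi$ of the form $\bar\psi(b)$ with $b \in B$ there exists an element $\varphi_b \in \Phi$ such that $(\varphi_b)^\alpha = \bar\psi(b)$. Take the function $\hat\varphi : B \to \Phi$ given by the rule $\hat\varphi(b) = \varphi_b$ for all $b \in B$. It is easy to check that $\hat\varphi^\nu = \bar\psi$: for every $b \in B$ one has
\[ \hat\varphi^\nu(b) = \hat\varphi(b)^\alpha = [\varphi_b]^\alpha = \bar\psi(b). \]
Consequently, $\hat\varphi^\nu = \bar\psi$ and, hence, the map $\nu : \Phi^B \to \bar\Phi^B$ is surjective. It follows that the same is true for $\mu : \Phi^B \rtimes \Sigma \to \bar\Phi^B \rtimes \bar\Sigma$. As $\mu$ is the identity map on $A \times B$, we can take $\mu$ as the epimorphism requested.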
Note that there exists a natural analogue of Theorem 2.8 for semi-automata. This shows that $\ker\mu = (0_{A \times B}, \pi)$, where $0_{A \times B}$ is the least equivalence on $A \times B$ (with one-element subsets as its classes) and $\pi$ is the kernel for the pair $(G, \Gamma)$.
2.6. Kaluzhnin-Krasner type theorem
Let there be given a group action $(A, \Gamma)$, that is, a semigroup action such that $\Gamma$ is a group. Assume that $\rho$ is a congruence of the action $(A, \Gamma)$ with the property that $(A/\rho, \Gamma)$ is transitive, i.e. for any two classes $[a]$ and $[b]$ in $A/\rho$ there exists $\gamma \in \Gamma$ such that $[a] \circ \gamma = [b]$. Fix (arbitrarily) an element $a \in A$ and consider the stabilizer of $[a]$ in $\Gamma$, i.e. the subgroup
\[ \Sigma := \mathrm{St}_\Gamma([a]) = \{\gamma \in \Gamma \mid [a] \circ \gamma = [a]\}. \]
Notice that the choice of $[a]$ does not matter very much: if we take another class than $[a]$ we obtain a subgroup conjugate to $\Sigma$, due to the transitivity of $(A/\rho, \Gamma)$. Let us interpret $[a]$ as a subset of $A$, denoting it in this role by $B$. We thus obtain a subpair $(B, \Sigma)$ of $(A, \Gamma)$. Denote by $(A/\rho, \bar\Gamma)$ the faithful action corresponding to $(A/\rho, \Gamma)$.
THEOREM 2.14. There exists a monomorphism of semi-automata
\[ (A, \Gamma) \to (B, \Sigma)\,\mathrm{wr}\,(A/\rho, \bar\Gamma). \]
PROOF. Let us begin with the observation that any transitive action $(D, \Gamma)$ is isomorphic to some factor-action $(\Gamma/\Sigma, \Gamma)$ of the (right-)regular action $(\Gamma, \Gamma)$. Indeed, fix any element $e \in D$. Then for any $x \in D$ there exists $\gamma \in \Gamma$ such that $e \circ \gamma = x$. If $e \circ \gamma = e \circ \gamma_1$ for some elements $\gamma$ and $\gamma_1$ in $\Gamma$, then $\gamma\gamma_1^{-1} \in \mathrm{St}_\Gamma(e) =: \Sigma$. It follows that the map $\mu : x \mapsto \Sigma\gamma$ gives an isomorphism of the actions $(D, \Gamma)$ and $(\Gamma/\Sigma, \Gamma)$.
Let us apply this observation to the action $(A/\rho, \Gamma)$. Then we find that $(A/\rho, \Gamma)$ is isomorphic to $(\Gamma/\Sigma, \Gamma)$ where $\Sigma = \mathrm{St}_\Gamma([a]_\rho)$. Let $T$ be a full set of representatives for the $\Sigma$-cosets $\Sigma\gamma$ ($\gamma \in \Gamma$) in $\Gamma/\Sigma$. By the above we have a bijection $\nu : A/\rho \to T$. Moreover, we have $A/\rho = \{[a] \circ \tau \mid \tau \in T\}$, and this representation of elements of $A/\rho$ is unique. Thus,
\[ a \xrightarrow{\ \mu\ } [a] \circ \tau \xrightarrow{\ \nu\ } \tau, \]
i.e. $a^{\mu\nu} = \tau$. To prove the theorem we consider a pair of maps
\[ A \to B \times A/\rho \tag{40} \]
and
\[ \Gamma \to \Sigma^{A/\rho} \rtimes \bar\Gamma, \tag{41} \]
given as follows. To get (40), let us set
\[ a^\alpha := \bigl(a \circ (a^{\mu\nu})^{-1},\ a^\mu\bigr). \tag{42} \]
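To get (41), define first a collection of maps $f_\gamma$ ($\gamma \in \Gamma$), where each individual map $f_\gamma : A/\rho \to \Sigma$ is given by
\[ f_\gamma(y) = y^\nu \cdot \gamma \cdot \bigl((y \circ \gamma^\mu)^\nu\bigr)^{-1}. \tag{43} \]
Then the required map (41) is obtained by the rule
\[ \gamma^\alpha = (f_\gamma, \gamma^\mu). \tag{44} \]
It remains to verify that the pair of maps just constructed does the job. The map $\alpha : A \to B \times A/\rho$ given by (42) is injective. Indeed, let $a_1$ and $a_2$ be elements in $A$ such that
\[ \bigl(a_1 \circ (a_1^{\mu\nu})^{-1},\ a_1^\mu\bigr) = \bigl(a_2 \circ (a_2^{\mu\nu})^{-1},\ a_2^\mu\bigr). \]
It follows that $a_1^\mu = a_2^\mu$ which, in turn, implies that
\[ a_1 \circ (a_1^{\mu\nu})^{-1} = a_2 \circ (a_2^{\mu\nu})^{-1} = a_2 \circ \bigl((a_2^\mu)^\nu\bigr)^{-1} = a_2 \circ \bigl((a_1^\mu)^\nu\bigr)^{-1} = a_2 \circ (a_1^{\mu\nu})^{-1}. \]
This gives $a_1 = a_2$, showing that $\alpha$ is indeed an injection. Next, observe that the right hand side of (43) belongs to $\Sigma$. Indeed, take $y = [a] \circ \tau$, $\tau \in T$, so that $y^\nu = \tau$, and write $\tau' = (y \circ \gamma^\mu)^\nu$, so that $[a] \circ \tau' = y \circ \gamma^\mu = [a] \circ \tau\gamma$. Then
\[ [a] \circ f_\gamma(y) = [a] \circ \bigl(\tau \cdot \gamma \cdot (\tau')^{-1}\bigr) = ([a] \circ \tau\gamma) \circ (\tau')^{-1} = ([a] \circ \tau') \circ (\tau')^{-1} = [a]. \]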
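It follows that $f_\gamma(y) \in \mathrm{St}_\Gamma([a]) = \Sigma$. Let us prove that the map $\alpha : \Gamma \to \Sigma^{A/\rho} \rtimes \bar\Gamma$ given by (44) is a homomorphism of groups. Let us take any two elements $\gamma_1$ and $\gamma_2$ in $\Gamma$ and prove that $(\gamma_1\gamma_2)^\alpha = \gamma_1^\alpha \cdot \gamma_2^\alpha$. The left hand side is given by
\[ (\gamma_1\gamma_2)^\alpha = \bigl(f_{\gamma_1\gamma_2},\ (\gamma_1\gamma_2)^\mu\bigr) = \bigl(f_{\gamma_1\gamma_2},\ \gamma_1^\mu\gamma_2^\mu\bigr) \]
and the right hand side by
\[ \gamma_1^\alpha \cdot \gamma_2^\alpha = (f_{\gamma_1}, \gamma_1^\mu)(f_{\gamma_2}, \gamma_2^\mu) = \bigl(f_{\gamma_1} \cdot {}^{\gamma_1^\mu}f_{\gamma_2},\ \gamma_1^\mu \cdot \gamma_2^\mu\bigr). \]
So, it suffices to prove that
\[ f_{\gamma_1\gamma_2} = f_{\gamma_1} \cdot {}^{\gamma_1^\mu}f_{\gamma_2}. \tag{45} \]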
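Indeed, for any $y \in A/\rho$ we have
\[ \begin{aligned} f_{\gamma_1\gamma_2}(y) &= y^\nu \cdot \gamma_1\gamma_2 \cdot \bigl((y \circ (\gamma_1\gamma_2)^\mu)^\nu\bigr)^{-1} = y^\nu \cdot \gamma_1\gamma_2 \cdot \bigl((y \circ \gamma_1^\mu\gamma_2^\mu)^\nu\bigr)^{-1} \\ &= \Bigl[y^\nu \cdot \gamma_1 \cdot \bigl((y \circ \gamma_1^\mu)^\nu\bigr)^{-1}\Bigr] \cdot \Bigl[(y \circ \gamma_1^\mu)^\nu \cdot \gamma_2 \cdot \bigl(((y \circ \gamma_1^\mu) \circ \gamma_2^\mu)^\nu\bigr)^{-1}\Bigr] \\ &= f_{\gamma_1}(y) \cdot f_{\gamma_2}(y \circ \gamma_1^\mu) = \bigl(f_{\gamma_1} \cdot {}^{\gamma_1^\mu}f_{\gamma_2}\bigr)(y). \end{aligned} \]
This proves (45). It turns out that the homomorphism $\alpha$ is really a monomorphism. Suppose that $\gamma^\alpha = (f_\gamma, \gamma^\mu)$ is the identity element in $\Sigma^{A/\rho} \rtimes \bar\Gamma$. This implies that $\gamma^\mu$ is the identity element in $\bar\Gamma$ and, therefore, acts trivially on $A/\rho$. Also, it follows that $f_\gamma$ is the identity of $\Sigma^{A/\rho}$ and, consequently, for every $y \in A/\rho$ we have $f_\gamma(y) = \varepsilon \in \Sigma$. We obtain
\[ \varepsilon = f_\gamma(y) = y^\nu \cdot \gamma \cdot \bigl((y \circ \gamma^\mu)^\nu\bigr)^{-1} = y^\nu \cdot \gamma \cdot (y^\nu)^{-1}, \]
i.e. $\gamma = \varepsilon$, showing that the map $\alpha : \Gamma \to \Sigma^{A/\rho} \rtimes \bar\Gamma$ is injective.
Next, notice that the action of $\Gamma$ on $A$ can be extended to a corresponding action on $B \times A/\rho$. Indeed, for any two elements $a \in A$ and $\gamma \in \Gamma$,
\[ \begin{aligned} (a \circ \gamma)^\alpha &= \bigl((a \circ \gamma) \circ ((a \circ \gamma)^{\mu\nu})^{-1},\ (a \circ \gamma)^\mu\bigr) \\ &= \bigl(a \circ \bigl((a^{\mu\nu})^{-1} \cdot a^{\mu\nu} \cdot \gamma \cdot ((a \circ \gamma)^{\mu\nu})^{-1}\bigr),\ (a \circ \gamma)^\mu\bigr) \\ &= \bigl((a \circ (a^{\mu\nu})^{-1}) \circ \bigl(a^{\mu\nu} \cdot \gamma \cdot ((a \circ \gamma)^{\mu\nu})^{-1}\bigr),\ a^\mu \circ \gamma^\mu\bigr) \\ &= \bigl(a \circ (a^{\mu\nu})^{-1},\ a^\mu\bigr) \circ (f_\gamma, \gamma^\mu) = a^\alpha \circ \gamma^\alpha. \end{aligned} \]
Summing up, the above shows that we have a monomorphism of actions
\[ (A, \Gamma) \xrightarrow{\ \alpha\ } (B \times A/\rho,\ \Sigma^{A/\rho} \rtimes \bar\Gamma). \]
As a special case, let us take $(A, \Gamma)$ to be the regular action $(\Gamma, \Gamma)$ and let $\Sigma \le \Gamma$ be any subgroup. Moreover, take $\rho$ to be the partition of $\Gamma$ into right $\Sigma$-cosets. One of these cosets, the one containing $\varepsilon$, is $\Sigma$ and its stabilizer in $\Gamma$ is $\Sigma$ also. According to Theorem 2.14 there exists a monomorphism $\alpha : (\Gamma, \Gamma) \to (\Sigma, \Sigma)\,\mathrm{wr}\,(\Gamma/\Sigma, \bar\Gamma)$ and, thus, there exists a monomorphism
\[ \Gamma \to \Sigma^{\Gamma/\Sigma} \rtimes \bar\Gamma \]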
also.
Further specializing, let $\Sigma$ be an invariant subgroup in $\Gamma$. Then $\bar\Gamma = \Gamma/\Sigma$ and so there exists an immersion
\[ (\Gamma, \Gamma) \to (\Sigma, \Sigma)\,\mathrm{wr}\,(\Gamma/\Sigma, \Gamma/\Sigma). \]
The acting group of the wreath product of actions here coincides with the wreath product of groups $\Sigma\,\mathrm{wr}\,(\Gamma/\Sigma)$.
Let us now describe how this result can be used to obtain the Krohn-Rhodes Theorem. Suppose that $\Gamma$ is a finite group. Then by the Jordan-Hölder Theorem there exists a composition series
\[ \Gamma = \Gamma_0 \rhd \Gamma_1 \rhd \cdots \rhd \Gamma_m = (1), \]
with all its factors simple groups. Applying here Theorem 2.14 gives a monomorphism
\[ (\Gamma, \Gamma) \to (\Gamma_1, \Gamma_1)\,\mathrm{wr}\,(\Gamma_1^*, \Gamma_1^*), \]
where we have put $\Gamma_1^* = \Gamma/\Gamma_1$. Analogously, we find
\[ (\Gamma_1, \Gamma_1) \to (\Gamma_2, \Gamma_2)\,\mathrm{wr}\,(\Gamma_2^*, \Gamma_2^*), \]
where we have set $\Gamma_2^* = \Gamma_1/\Gamma_2$. Continuing this way gives a sequence of the form
\[ (\Gamma, \Gamma) \to (\Gamma_1, \Gamma_1)\,\mathrm{wr}\,(\Gamma_1^*, \Gamma_1^*) \to \bigl((\Gamma_2, \Gamma_2)\,\mathrm{wr}\,(\Gamma_2^*, \Gamma_2^*)\bigr)\,\mathrm{wr}\,(\Gamma_1^*, \Gamma_1^*) \to \cdots \]
Finally, using the associativity of the operation wr for semi-automata, we find that
\[ (\Gamma, \Gamma) \to (\Gamma_{m-1}^*, \Gamma_{m-1}^*)\,\mathrm{wr}\,(\Gamma_{m-2}^*, \Gamma_{m-2}^*)\,\mathrm{wr} \cdots \mathrm{wr}\,(\Gamma_1^*, \Gamma_1^*). \]
Here all $\Gamma_i^*$ ($i = 1, 2, \ldots, m - 1$) are finite simple groups and at the same time they are epimorphic images of groups contained in the (semi)group $\Gamma$, i.e. its "factors" in the sense of Krohn-Rhodes theory.
To conclude, let us add that the Krohn-Rhodes Theorem asserts that every finite (semi-)automaton can be modelled by a cascade of finite simple (semi-)automata that are "factors" and "triggers". The latter are defined as (semi-)automata $(A, X)$ with $A = \{a_1, a_2\}$ and $X = \{\varepsilon, x_1, x_2\}$ along with the rules $a \circ \varepsilon = a$, $a \circ x_i = a_i$ ($i = 1, 2$) for any $a \in A$. It remains to explain what a cascade of automata is; this will be done in the next section.
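The embedding of Theorem 2.14 can be traced by hand in the smallest non-trivial case of the regular action. The sketch below is an illustration under stated assumptions, not part of the original text: it takes $\Gamma = \mathbb{Z}_4$ written additively, $\Sigma = \{0, 2\}$ and the coset representatives $T = \{0, 1\}$, and all Python names are hypothetical. The functions `f` and `alpha` are the additive forms of formulas (43) and (44), with a function $A/\rho \to \Sigma$ stored as a tuple indexed by the two cosets.

```python
n = 4                      # Gamma = Z_4, written additively
Sigma = (0, 2)             # the subgroup Sigma
T = (0, 1)                 # coset representatives; coset Sigma + y is labelled y = gamma mod 2
cosets = (0, 1)


def nu(y):
    """nu : Gamma/Sigma -> T, the chosen representative of a coset."""
    return T[y]


def f(gamma):
    """Formula (43) in additive notation: f_gamma(y) = nu(y) + gamma - nu(y ◦ gamma)."""
    return tuple((nu(y) + gamma - nu((y + gamma) % 2)) % n for y in cosets)


def alpha(gamma):
    """Formula (44): gamma -> (f_gamma, image of gamma in Gamma/Sigma)."""
    return (f(gamma), gamma % 2)


def wr_mul(g1, g2):
    """Multiplication in Sigma^{Gamma/Sigma} ⋊ (Gamma/Sigma): (f1 · ^{s1}f2, s1 s2)."""
    (f1, s1), (f2, s2) = g1, g2
    return (tuple((f1[y] + f2[(y + s1) % 2]) % n for y in cosets), (s1 + s2) % 2)


images = [alpha(g) for g in range(n)]
values_in_sigma = all(v in Sigma for fg, _ in images for v in fg)
is_hom = all(
    alpha((g + h) % n) == wr_mul(alpha(g), alpha(h))
    for g in range(n) for h in range(n)
)
is_injective = len(set(images)) == n
```

Under these assumptions the generator $1$ is sent to the pair consisting of the function $(0, 2)$ and the non-trivial coset, which exhibits $\mathbb{Z}_4$ inside the wreath product of $\mathbb{Z}_2$ with $\mathbb{Z}_2$.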
2.7. Cascades and wreath products of automata; their interconnections
Let there be given two automata
\[ \mathcal{A}_1 = (A_1, X_1, Y_1) \quad\text{and}\quad \mathcal{A}_2 = (A_2, X_2, Y_2) \]
and, in addition, a set $X$ together with the following two maps:
\[ \alpha : X \times A_2 \to X_1 \quad\text{and}\quad \beta : X \to X_2. \]
We obtain a new automaton $\mathcal{A} = (A, X, Y)$ if we define the basic sets $A$ and $Y$ as
\[ A := A_1 \times A_2; \qquad Y := Y_1 \times Y_2 \]
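and give the basic operations $\circ$ and $*$ by the rules
\[ (a_1, a_2) \circ x := \bigl(a_1 \circ \alpha(x, a_2),\ a_2 \circ \beta(x)\bigr) \]
and
\[ (a_1, a_2) * x := \bigl(a_1 * \alpha(x, a_2),\ a_2 * \beta(x)\bigr). \]
This new automaton $\mathcal{A}$ is called the cascade of the two given automata determined by the joining maps $\alpha$ and $\beta$, and is denoted ${}_\alpha\mathcal{A}_1\,{}_\beta\mathcal{A}_2$. Figure 6 below illustrates this construction.
[Diagram with nodes $X$, $\mathcal{A}_2$, $\mathcal{A}_1$, $Y$ and arrows labelled $\beta$ and $\alpha$.]
Fig. 6: The cascade of automata $\mathcal{A} = {}_\alpha\mathcal{A}_1\,{}_\beta\mathcal{A}_2$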
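The two rules defining the cascade translate directly into code. The following sketch is only an illustration under stated assumptions: the helper `cascade` and the two one-bit component automata are hypothetical, chosen so that the joining map $\alpha$ passes a carry from $\mathcal{A}_2$ to $\mathcal{A}_1$ and the cascade behaves as a two-bit counter.

```python
def cascade(step1, out1, step2, out2, alpha, beta):
    """Return the transition and output functions of the cascade of two automata."""
    def step(state, x):
        a1, a2 = state
        # (a1, a2) ◦ x = (a1 ◦ alpha(x, a2), a2 ◦ beta(x))
        return (step1(a1, alpha(x, a2)), step2(a2, beta(x)))

    def out(state, x):
        a1, a2 = state
        # (a1, a2) * x = (a1 * alpha(x, a2), a2 * beta(x))
        return (out1(a1, alpha(x, a2)), out2(a2, beta(x)))

    return step, out


# Hypothetical components: two one-bit counters over the input alphabet {0, 1}.
def step_bit(a, x):
    return (a + x) % 2


def out_bit(a, x):
    return (a + x) % 2      # Mealy output: the state after reading x


# Joining maps: the high bit A1 receives 1 exactly when the external
# input is 1 and the low bit A2 is currently 1 (a carry).
def alpha(x, a2):
    return x & a2


def beta(x):
    return x


step, out = cascade(step_bit, out_bit, step_bit, out_bit, alpha, beta)

state = (0, 0)              # (high bit, low bit)
trace = []
for x in [1, 1, 1, 1]:
    state = step(state, x)
    trace.append(state)
```

Note that $\alpha$ reads the state of $\mathcal{A}_2$ before the transition, exactly as in the defining rule; this one-way dependence is what distinguishes a cascade from a parallel composition.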
Two special cases of this construction deserve special mention. First, the parallel composition of two automata $\mathcal{A}_1$ and $\mathcal{A}_2$ appears if we take
\[ X = X_1 \times X_2; \qquad \alpha : (X_1 \times X_2) \times A_2 \xrightarrow{\ \mathrm{pr}\ } X_1; \qquad \beta : X_1 \times X_2 \xrightarrow{\ \mathrm{pr}\ } X_2. \]
Second, the sequential composition of two automata $\mathcal{A}_1$ and $\mathcal{A}_2$ appears if we take
\[ X = X_2; \qquad \alpha(x, a_2) = (a_2 * x)^\psi \ \text{ with } \psi : Y_2 \to X_1 \text{ given}; \qquad \beta = \mathrm{id}_{X_2}. \]
Given any two "pure" automata $\mathcal{A}_1 = (A_1, X_1, Y_1)$ and $\mathcal{A}_2 = (A_2, X_2, Y_2)$, let us extend them to the corresponding semigroup automata $\mathcal{A}_1^\bullet = (A_1, F(X_1), Y_1)$ and $\mathcal{A}_2^\bullet = (A_2, F(X_2), Y_2)$. Let us further, for brevity, write $F_i = F(X_i)$, $i = 1, 2$. Then we have the two actions $(A_1, F_1)$ and $(A_2, F_2)$ and so also the action
\[ (A_1, F_1)\,\mathrm{wr}\,(A_2, F_2) = (A_1 \times A_2,\ \mathrm{Fun}(A_2, F_1) \rtimes F_2), \]
which we denote by $(A, \Phi)$.
Define now a map $f : X \to \Phi$. To this end, we first define $f_1 : X \to \mathrm{Fun}(A_2, F_1)$ by
\[ x^{f_1}(a_2) := \alpha(x, a_2) \in X_1 \hookrightarrow F_1. \]
Next, we define $f_2 : X \to F_2$ by the rule
\[ x^{f_2} := \beta(x) \in X_2 \hookrightarrow F_2. \]
With the aid of these two maps $f_1$ and $f_2$ we obtain $f$ by the rule
\[ x \mapsto x^f := (x^{f_1}, x^{f_2}). \]
A verification shows that
(a) $(a_1, a_2) \circ x^f = \bigl(a_1 \circ \alpha(x, a_2),\ a_2 \circ \beta(x)\bigr) = (a_1, a_2) \circ x$ in ${}_\alpha\mathcal{A}_1\,{}_\beta\mathcal{A}_2$ and,
(b) extending $f$ to $f_* : F := F(X) \to \Phi$, we have a morphism of actions $f_* : (A, F) \to (A_1, F_1)\,\mathrm{wr}\,(A_2, F_2) = (A, \Phi)$.
Passing to the corresponding faithful actions using the natural epimorphisms of the semigroups involved,
\[ F_1 \twoheadrightarrow \Sigma_1 := F_1/\rho_1, \qquad F_2 \twoheadrightarrow \Sigma_2 := F_2/\rho_2 \qquad\text{and}\qquad F \twoheadrightarrow \Gamma := F/\rho, \]
gives the diagram
\[ (A, F) \xrightarrow{\ f_*\ } (A_1, F_1)\,\mathrm{wr}\,(A_2, F_2) \xrightarrow{\ \Psi\ } (A_1, \Sigma_1)\,\mathrm{wr}\,(A_2, \Sigma_2), \]
with composite $f_* \cdot \Psi$. One sees that $\ker(A, F) = \ker(f_* \cdot \Psi)$. Therefore, there exists a monomorphism $(A, F/\ker(f_*\Psi)) \to (A_1, \Sigma_1)\,\mathrm{wr}\,(A_2, \Sigma_2)$. Moreover, we have $F/\ker(f_*\Psi) \cong \Gamma$. As $f_*\Psi$ is the identity map on $A$, we obtain
\[ (A, \Gamma) \cong (A, F)/\ker(f_*\Psi) \to (A_1, \Sigma_1)\,\mathrm{wr}\,(A_2, \Sigma_2). \]
To sum up, we have the following: given any two "pure" automata $\mathcal{A}_1 = (A_1, X_1, Y_1)$ and $\mathcal{A}_2 = (A_2, X_2, Y_2)$, together with joining maps $\alpha$ and $\beta$, we can form their cascade $\mathcal{A} = {}_\alpha\mathcal{A}_1\,{}_\beta\mathcal{A}_2$. Extending these three automata $\mathcal{A}$, $\mathcal{A}_1$ and $\mathcal{A}_2$ to the corresponding semigroup automata $(A, F(X), Y)$, $(A_1, F(X_1), Y_1)$ and $(A_2, F(X_2), Y_2)$, and thereafter taking for these extended automata the corresponding faithful actions (denote them $(A, \Gamma)$, $(A_1, \Sigma_1)$ and $(A_2, \Sigma_2)$, respectively), we get a natural immersion of the faithful cascade semi-automaton into the wreath product of the faithful semi-automata corresponding to the given automata. This leads to the following construction.
For any two automata $\mathcal{A}_1$ and $\mathcal{A}_2$, let us define a new automaton $\mathcal{A} = \mathcal{A}_1\,\mathrm{wr}\,\mathcal{A}_2 = (A, \Gamma, Y)$ by taking
\[ (A, \Gamma) := (A_1, \Sigma_1)\,\mathrm{wr}\,(A_2, \Sigma_2), \qquad Y := Y_1 \times Y_2 \]
and setting
\[ (a_1, a_2) * (\hat\varphi, \sigma) := \bigl(a_1 * \hat\varphi(a_2),\ a_2 * \sigma\bigr). \]
Here $(\hat\varphi, \sigma) \in \mathrm{Fun}(A_2, \Sigma_1) \rtimes \Sigma_2$ and the operation $(a_1, a_2) \circ (\hat\varphi, \sigma)$ is already given by the action $(A, \Gamma)$. A verification shows that in this way we obtain a new semigroup automaton, which is called the wreath product of the given semigroup automata $\mathcal{A}_1$ and $\mathcal{A}_2$. Note also that when both $\mathcal{A}_1$ and $\mathcal{A}_2$ are Moore automata, their wreath product $\mathcal{A}_1\,\mathrm{wr}\,\mathcal{A}_2$ is a Moore automaton too. From the above reasoning one sees also the role of this construction for cascade joins of automata. The interested reader will find additional comments in Section 3.
2.8. Linear automata
Let $\Lambda$ be any commutative ring. Consider the triple $\mathcal{A} = (A, X, B)$, where $A$ and $B$ are $\Lambda$-modules (of states and of output signals, respectively) and $X$ is the set of input signals. We assume also that there are given two maps
\[ \nu_1 : X \to \mathrm{End}_\Lambda A \quad\text{and}\quad \nu_2 : X \to \mathrm{Hom}_\Lambda(A, B). \]
In this situation we say that $\mathcal{A}$ is a linear (Mealy) automaton over $\Lambda$. Let us write $\circ$ for $\nu_1$ and $*$ for $\nu_2$, as we did above, that is, $\nu_1(x)a = a \circ x$ and $\nu_2(x)a = a * x$. Using this notation, we say that $\mathcal{A}$ is a linear Moore automaton provided there exists a $\Lambda$-linear map $\varepsilon : A \to B$ such that
\[ a * x = (a \circ x)^\varepsilon \quad\text{or, equivalently,}\quad \nu_2(x) = \varepsilon(\nu_1(x)). \]
The map $\nu_1$ gives an action of $X$ on $A$ which can be extended to an action of $F(X)$: for any word $u \in F(X)$ we set $u^{\nu_1} = (u_1x)^{\nu_1} = u_1^{\nu_1} \cdot x^{\nu_1}$ if $u = u_1x$ ($x \in X$). Similarly, for $\nu_2$ we define $u^{\nu_2} = u_1^{\nu_1} \cdot x^{\nu_2}$. As a result, we get a linear semigroup automaton $\mathcal{A}^+ = (A, F(X), B)$. If the original linear automaton is a Moore automaton, then we take, instead of $F(X)$, the monoid $F^*(X) = F(X) \cup \{1\}$ and require that $1^{\nu_2} = \varepsilon$. This yields
\[ x^{\nu_2} = (x1)^{\nu_2} = x^{\nu_1} \cdot 1^{\nu_2} = x^{\nu_1} \cdot \varepsilon, \qquad (u \cdot v)^{\nu_2} = u^{\nu_1} \cdot v^{\nu_2} \]
and
\[ v^{\nu_2} = (v1)^{\nu_2} = v^{\nu_1} \cdot 1^{\nu_2} = v^{\nu_1} \cdot \varepsilon. \]
Therefore, we are led to the linear Moore automaton $(A, F^*(X), B)$.
More generally, consider triples of the type $\mathcal{A} = (A, \Gamma, B)$, where $A$ and $B$ are $\Lambda$-modules and $\Gamma$ is any semigroup (of input signals), together with appropriate operations $\circ$ and $*$, $\Lambda$-linear in their first argument, satisfying the conditions
\[ a \circ (\gamma_1\gamma_2) = (a \circ \gamma_1) \circ \gamma_2 \quad\text{and}\quad a * (\gamma_1\gamma_2) = (a \circ \gamma_1) * \gamma_2 \]
valid for all $a \in A$, $\gamma_i \in \Gamma$. Note that giving the maps $\circ$ and $*$ is equivalent to giving linear representations
\[ \nu_1 : \Gamma \to \mathrm{End}_\Lambda A \quad\text{and}\quad \nu_2 : \Gamma \to \mathrm{Hom}^+_\Lambda(A, B), \]
respectively. If $\mathcal{A}' = (A', \Gamma', B'; \circ', *')$ is another linear automaton, then giving a homomorphism $\mu : \mathcal{A} \to \mathcal{A}'$ is the same as giving a triple of maps $\mu = (\mu_1, \mu_2, \mu_3)$, with $\mu_1 : A \to A'$ and $\mu_3 : B \to B'$ both $\Lambda$-linear and $\mu_2 : \Gamma \to \Gamma'$ a homomorphism of semigroups, subject to the conditions
\[ (a \circ \gamma)^{\mu_1} = a^{\mu_1} \circ' \gamma^{\mu_2} \quad\text{and}\quad (a * \gamma)^{\mu_3} = a^{\mu_1} *' \gamma^{\mu_2} \]
valid for all $a \in A$ and $\gamma \in \Gamma$.
For a linear Moore automaton $\mathcal{A} = (A, \Gamma, B; \circ, *)$ there are given a representation $\nu_1 : \Gamma \to \mathrm{End}_\Lambda A$ and a $\Lambda$-linear map $\varepsilon : A \to B$. It is easy to verify that these data completely define $\mathcal{A}$. Here the map $\varepsilon$ can be chosen arbitrarily, i.e. independently of $\nu_1$. This is also the appropriate place to note that not every linear semigroup automaton allows adjoining an external unity to $\Gamma$ to make it a monoid. This is however possible if there exists an element $\psi \in \mathrm{Hom}_\Lambda(A, B)$ such that one has $a * \gamma = (a \circ \gamma)^\psi$ for all $a \in A$ and $\gamma \in \Gamma$.
Let $\Lambda[\Gamma]$ be the semigroup algebra of a semigroup $\Gamma$ over the ring $\Lambda$. It is often useful to extend a linear semigroup automaton $(A, \Gamma, B; \circ, *)$ to the automaton $(A, \Lambda[\Gamma], B; \circ, *)$ by defining for all $a \in A$ and $u = u_1\gamma_1 + \cdots + u_n\gamma_n$ in $\Lambda[\Gamma]$
\[ a \circ u = a \circ (u_1\gamma_1 + \cdots + u_n\gamma_n) = u_1(a \circ \gamma_1) + \cdots + u_n(a \circ \gamma_n) \]
and
\[ a * u = a * (u_1\gamma_1 + \cdots + u_n\gamma_n) = u_1(a * \gamma_1) + \cdots + u_n(a * \gamma_n). \]
It is easy to see that the relations $a * (u + v) = a * u + a * v$ and $a * (uv) = (a \circ u) * v$ hold for all $u, v \in \Lambda[\Gamma]$. An important example of a linear automaton is provided by the linear regular automaton $\mathcal{A}$ with $A = B = \Lambda[\Gamma]$ and the maps $\nu_i$ given by the rule $u \circ v = u * v = u \cdot v$; here $u \cdot v$ means multiplication of $u$ and $v$ in $\Lambda[\Gamma]$.
Using the semigroup algebra extension of a linear semigroup automaton, it is easy to introduce the notion of a linear cyclic automaton. Namely, call a linear automaton $(A, \Gamma, B; \circ, *)$ cyclic if there exists $a \in A$ such that $A = a \circ \Lambda[\Gamma]$ and $B = a * \Lambda[\Gamma]$. An example of a cyclic automaton is supplied by the linear regular automaton $(\Lambda[\Gamma]/U, \Gamma, \Lambda[\Gamma]/V)$, where $U \subseteq \Lambda[\Gamma]$ is a right ideal and $V$ is any $\Lambda$-submodule in $\Lambda[\Gamma]$ such that $U \subseteq V$. A straightforward verification shows that any cyclic automaton is of this type, i.e. it can be obtained as an epimorphic image of a linear regular automaton.
Further, call a linear automaton $(A, \Gamma, B)$ reduced if the zero element in $A$ is the only element $a \in A$ such that $a * \gamma = 0$ for all $\gamma \in \Gamma$. Denoting, in general,
\[ D(\mathcal{A}) = \{a \in A \mid a * \gamma = 0 \ \text{ for all } \gamma \in \Gamma\}, \]
it is easy to verify that $D(\mathcal{A})$ is $\Gamma$-invariant and coincides, for a linear Moore automaton $(A, \Gamma, B)$, with the set $\{a \in A \mid (a \circ \gamma) * \varepsilon = 0 \text{ for all } \gamma \in \Gamma\}$. Note that $(A/D(\mathcal{A}), \Gamma, B)$ is a reduced automaton.
Concluding this section, let us consider other linearities for automata and their interconnections.
The case when $\Lambda$ is a field has been by far the most interesting for applications. Note also that stochastic automata [10] and stationary linear dynamical systems, as understood in [9], are just linear automata. Affine automata, which are just linear systems in the sense of Kalman, are also linear automata, however given invariantly. Take a commutative ring $\Lambda$ with identity and a set $X$ (the input alphabet), and let $A$ and $B$ be $\Lambda$-modules of states and outputs, respectively. Further, suppose that a $\Lambda$-module $C$ together with an encoding map $\tau : X \to C$ is given, along with four linear maps: $\alpha_1 \in \mathrm{End}_\Lambda A$; $\alpha_2 \in \mathrm{Hom}_\Lambda(C, A)$; $\beta_1 \in \mathrm{Hom}_\Lambda(A, B)$; $\beta_2 \in \mathrm{Hom}_\Lambda(C, B)$. These maps give rise to the following two operations:
\[ a \circ x = a^{\alpha_1} + (x^\tau)^{\alpha_2} \quad\text{and}\quad a * x = a^{\beta_1} + (x^\tau)^{\beta_2} \qquad\text{for } a \in A,\ x \in X. \]
Note that this time the operations $\circ$ and $*$ are not linear in the usual sense. Yet, we have here an automaton called an affine automaton; this is the main object of study in Kalman's theory of linear systems in [9].
Any study of linear systems aims to tie the linear structure to the dynamics. Give'on and Zalcstein [6] use a new compatibility relation between the linear operations of the system and concatenation in the monoid of transformations induced on the state space by the dynamical action of inputs. Let us indicate some details of their approach. Writing $X^* = \bigcup_{n \ge 0} X^n$, as usual, note that on $X^*$ there is a natural multiplication
\[ X^k \times X^l \to X^{k+l} : (u, v) \mapsto uv \qquad (k, l \ge 0) \]
for $u \in X^k$ and $v \in X^l$. As a generalization, one has the notion of a $\Lambda$-monoid, which is defined as follows (compare the notion of a graded algebra!).
DEFINITION 2.15. A $\Lambda$-monoid is a sequence $M = (M_n \mid n \ge 0)$ of $\Lambda$-modules together with a double sequence of maps $(\tau_{k,l} \mid k, l \ge 0)$,
\[ \tau_{k,l} : M_k \times M_l \to M_{k+l}, \qquad \tau_{k,l}(u, v) = uv, \]
such that:
(i) all maps $\tau_{k,l}$ are surjective and $\Lambda$-linear;
(ii) $M_0 = (\hat 0)$, where $\hat 0$ is the zero element with $\hat 0 v = v = v \hat 0$ for all $v \in M_l$;
(iii) $(uv)w = u(vw)$ for all $u \in M_k$, $v \in M_l$ and $w \in M_m$ ($k, l, m \ge 0$).
Writing (ii) and (iii) in terms of the maps $\tau_{k,l}$, one finds
\[ \tau_{0,l}(\hat 0, v) = v = \tau_{l,0}(v, \hat 0) \quad\text{for all } v \in M_l \text{ and } l \ge 0 \]
and
\[ \tau_{k+l,m}(\tau_{k,l}(u, v), w) = \tau_{k,l+m}(u, \tau_{l,m}(v, w)) \quad\text{for all } u \in M_k,\ v \in M_l \text{ and } w \in M_m. \]
We have a homomorphism $\psi : M \to M'$ of $\Lambda$-monoids if there is given a sequence $\psi = (\psi_n \mid n \ge 0)$ of $\Lambda$-linear maps $\psi_n : M_n \to M'_n$ such that the diagram of $\Lambda$-modules
\[ \begin{array}{ccc} M_k \times M_l & \xrightarrow{\ \tau_{k,l}\ } & M_{k+l} \\ {\scriptstyle(\psi_k, \psi_l)}\downarrow & & \downarrow{\scriptstyle\psi_{k+l}} \\ M'_k \times M'_l & \xrightarrow{\ \tau'_{k,l}\ } & M'_{k+l} \end{array} \]
is commutative for all $k, l \ge 0$. Thus we obtain the category $\mathcal{M}(\Lambda)$ of $\Lambda$-monoids.
In the present context the appropriate replacement for the classical $\Lambda$-linear action $(A, X^*; \circ)$ is the following notion. Consider the triple $(A, W; \lambda)$ where $A$ is a $\Lambda$-module of states, $W$ a $\Lambda$-monoid (of inputs) and $\lambda = (\lambda_n \mid n \ge 0)$ a sequence of $\Lambda$-linear maps
\[ \lambda_n : A \times W_n \to A : (a, w) \mapsto a \circ w, \]
these data satisfying the following conditions:
(i) $a \circ \hat 0 = a$ for all $a \in A$ and $W_0 = (\hat 0)$;
(ii) $a \circ (w_1w_2) = (a \circ w_1) \circ w_2$ for all $a \in A$ and $w_1, w_2 \in W$.
We say that $(A, W; \lambda)$ is a $\Lambda$-linear transition system. Here $W$ plays the role of the monoid of input words for usual automata. Consider quintuples of the type $L = (A, W, B; \lambda; \delta)$ where $A$, $W$ and $\lambda$ are as above, while $B$ is a $\Lambda$-module (of outputs) and $\delta : A \to B$ a $\Lambda$-linear output map. Such quintuples are called discrete time, time-invariant $\Lambda$-linear dynamical systems in [9].
Let us change this set-up as follows. We keep all objects as above but for $\delta$ we take a sequence of $\Lambda$-linear maps $\delta = (\delta_n \mid n \ge 0)$,
\[ \delta_n : A \times W_n \to B : (a, w) \mapsto a * w, \]
satisfying the condition $a * (w_1w_2) = (a \circ w_1) * w_2$ for all $a \in A$, $w_1, w_2 \in W$. We call the quintuple $L$ thus obtained a general linear system. An input-output map for such a system is a map $f : W \to B$ such that $f|_{W_n} : W_n \to B$ is $\Lambda$-linear for all $n \ge 0$ and $f(\hat 0 w) = f(w)$ for all $w \in W$, where $W_0 = (\hat 0)$.
However, we will not follow the traditional path for linear systems (Nerode and Myhill equivalences, canonical realizations for $f$, etc.) here. Instead, we note that, e.g., in the case of a field $\Lambda = K$ and finite dimensional $K$-vector spaces $A$ and $B$, the linear systems defined above are nothing else than finite automata. Yet, representing them as linear systems allows one to reduce their size considerably in comparison with finite automata in general. In algorithmic situations this is a not unimportant feature.
In Section 3 it will be shown that the above notions together with several other important concepts in computer science can be extended to a more general setting. This implies – and this is, probably, the most interesting thing here – a more powerful mathematical framework for the questions dealt with in this chapter.
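The linear Moore automata of this section admit a very small computational model. The sketch below is only an illustration under stated assumptions: it takes $\Lambda = \mathbb{Z}$, $A = \mathbb{Z}^2$ and $B = \mathbb{Z}^1$, represents each $\nu_1(x)$ by an integer matrix acting on row vectors, and uses hypothetical names (`nu1`, `eps`, `act`, `output`) and hypothetical matrices that do not come from the text.

```python
def vec_mat(v, m):
    """Row vector times matrix over the integers."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]


# Hypothetical data over Lambda = Z: states A = Z^2, outputs B = Z^1.
nu1 = {
    "x": [[1, 1], [0, 1]],   # endomorphism of A assigned to the letter x
    "y": [[0, 1], [1, 0]],   # endomorphism of A assigned to the letter y
}
eps = [[1], [0]]             # Lambda-linear map eps : A -> B


def act(a, word):
    """a ◦ u, applying the letters of the word u from left to right."""
    for letter in word:
        a = vec_mat(a, nu1[letter])
    return a


def output(a, word):
    """Moore output rule: a * u = (a ◦ u)^eps."""
    return vec_mat(act(a, word), eps)


a = [1, 2]
u, v = "xy", "yx"
```

With these definitions the identity $a * (uv) = (a \circ u) * v$ holds by construction, since `output(a, u + v)` and `output(act(a, u), v)` perform the same sequence of matrix products.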
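2.9. Triangular products and decomposition of linear automata
In this section attention is paid to the triangular product construction. This construction seems to be indispensable in the decomposition of linear automata. It is shown here how cascades of linear automata, which are widely used when decomposing them, can be reduced to triangular products of the corresponding components (Theorem 2.17).
Take any two linear semigroup automata
\[ \mathcal{A}' = (A', \Sigma', B'; \circ', *') \quad\text{and}\quad \mathcal{A}'' = (A'', \Sigma'', B''; \circ'', *''), \]
assuming that, at least, the first automaton is a Moore automaton. In particular, we have two representations $(A', \Sigma'; \circ')$ and $(A'', \Sigma''; \circ'')$ of the corresponding semigroups $\Sigma'$ and $\Sigma''$. So we can form their triangular product (see [8])
\[ (A, \hat\Gamma) := (A', \Sigma') \triangledown (A'', \Sigma''). \]
We recall that this means that $A = A' \oplus A''$ and $\hat\Gamma := \Phi \times \Sigma' \times \Sigma''$, where $\Phi$ is the additive (semi)group of the $\Lambda$-module $\mathrm{Hom}^+_\Lambda(A'', A')$; $\hat\Gamma$ is thus a set of triples,
\[ \hat\Gamma = \{(\varphi, \sigma', \sigma'') \mid \varphi \in \Phi,\ \sigma' \in \Sigma',\ \sigma'' \in \Sigma''\}. \]
Furthermore, one has natural actions $\Sigma'' \times \Phi \to \Phi$ and $\Phi \times \Sigma' \to \Phi$ given by the rules
\[ (a'')^{\sigma'' \cdot \varphi} := (a'' \circ \sigma'')^\varphi \quad\text{and}\quad (a'')^{\varphi \cdot \sigma'} := (a'')^\varphi \circ \sigma', \]
respectively, where the elements $a'' \in A''$, $\varphi \in \Phi$, $\sigma' \in \Sigma'$ and $\sigma'' \in \Sigma''$ are arbitrary. The multiplication on $\hat\Gamma$ is given by the formula
\[ (\varphi, \sigma', \sigma'') \cdot (\psi, \tau', \tau'') = (\sigma'' \cdot \psi + \varphi \cdot \tau',\ \sigma'\tau',\ \sigma''\tau''). \]
The associativity is easily verified by remarking that this multiplication rule can be interpreted as matrix multiplication:
\[ \begin{pmatrix} \sigma'' & \varphi \\ 0 & \sigma' \end{pmatrix} \cdot \begin{pmatrix} \tau'' & \psi \\ 0 & \tau' \end{pmatrix} = \begin{pmatrix} \sigma''\tau'' & \sigma'' \cdot \psi + \varphi \cdot \tau' \\ 0 & \sigma'\tau' \end{pmatrix}. \]
So, $\hat\Gamma$ is indeed a semigroup, also denoted by $\Phi \rtimes \Sigma$. Furthermore, there exists an action $A \times \hat\Gamma \to A$ given by the rule
\[ (a', a'') \circ (\varphi, \sigma', \sigma'') := \bigl((a'')^\varphi + a' \circ \sigma',\ a'' \circ \sigma''\bigr). \]
Therefore, we obtain the semi-automaton $(A, \hat\Gamma; \circ)$.
Finally, let $B = B' \oplus B''$. Then we can define the output function $* : A \times \hat\Gamma \to B$ as follows. For any two elements $a = (a', a'')$, with $a' \in A'$ and $a'' \in A''$, and $\gamma = (\varphi, \sigma', \sigma'')$, in $A$ and in $\hat\Gamma$ respectively, put
\[ a * \gamma = \bigl(a' * \sigma' + (a'')^\varphi * \varepsilon,\ a'' * \sigma''\bigr). \]
In what follows, we drop the marks $'$ and $''$ in $\circ'$, $\circ''$ and $*'$, $*''$, as the correct meaning of the corresponding operations can be understood unambiguously from the context.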
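The multiplication rule of the triangular product and the action law can be checked numerically in the simplest scalar case. The sketch below is only an illustration under stated assumptions: it takes $A' = A'' = \Lambda = \mathbb{Z}$, so that $\varphi$, $\sigma'$ and $\sigma''$ are integers acting by multiplication (a commutative special case), and the Python names `tri_mul` and `tri_act`, as well as the abbreviations `s1`, `s2` for $\sigma'$, $\sigma''$, are hypothetical.

```python
from itertools import product


def tri_mul(g, h):
    """(phi, s1, s2)(psi, t1, t2) = (s2·psi + phi·t1, s1 t1, s2 t2)."""
    phi, s1, s2 = g
    psi, t1, t2 = h
    return (s2 * psi + phi * t1, s1 * t1, s2 * t2)


def tri_act(a, g):
    """(a1, a2) ◦ (phi, s1, s2) = (a2·phi + a1·s1, a2·s2)."""
    a1, a2 = a
    phi, s1, s2 = g
    return (a2 * phi + a1 * s1, a2 * s2)


# All triples with entries in {-1, 0, 1}: 27 elements of the scalar model.
sample = list(product(range(-1, 2), repeat=3))

associative = all(
    tri_mul(tri_mul(g, h), k) == tri_mul(g, tri_mul(h, k))
    for g in sample for h in sample for k in sample
)
action_law = all(
    tri_act(tri_act(a, g), h) == tri_act(a, tri_mul(g, h))
    for a in [(1, 0), (0, 1), (2, -3)] for g in sample for h in sample
)
```

The two flags correspond to the associativity of the multiplication on $\hat\Gamma$ and to the identity $a \circ (\gamma_1\gamma_2) = (a \circ \gamma_1) \circ \gamma_2$; a finite check of this kind illustrates the matrix interpretation but of course does not replace the general argument.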
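PROPOSITION 2.16. For any $a = (a', a'')$ in $A$ and all elements $\gamma_i = (\varphi_i, \sigma_i', \sigma_i'')$ ($i = 1, 2$) in $\hat\Gamma$ one has
\[ a * (\gamma_1\gamma_2) = (a \circ \gamma_1) * \gamma_2. \]
PROOF. Indeed, from
\[ \gamma_1\gamma_2 = (\varphi_1, \sigma_1', \sigma_1'') \cdot (\varphi_2, \sigma_2', \sigma_2'') = (\sigma_1'' \cdot \varphi_2 + \varphi_1 \cdot \sigma_2',\ \sigma_1' \cdot \sigma_2',\ \sigma_1'' \cdot \sigma_2'') \]
we obtain, on the one hand,
\[ a * (\gamma_1\gamma_2) = \bigl(a' * (\sigma_1'\sigma_2') + (a'')^{\sigma_1'' \cdot \varphi_2 + \varphi_1 \cdot \sigma_2'} * \varepsilon,\ a'' * (\sigma_1'' \cdot \sigma_2'')\bigr) \]
and, on the other hand,
\[ \begin{aligned} (a \circ \gamma_1) * \gamma_2 &= \bigl((a'')^{\varphi_1} + a' \circ \sigma_1',\ a'' \circ \sigma_1''\bigr) * \gamma_2 \\ &= \bigl(((a'')^{\varphi_1} + a' \circ \sigma_1') * \sigma_2' + (a'' \circ \sigma_1'')^{\varphi_2} * \varepsilon,\ (a'' \circ \sigma_1'') * \sigma_2''\bigr) \\ &= \bigl((a'')^{\varphi_1} * (\sigma_2'\varepsilon) + (a' \circ \sigma_1') * \sigma_2' + (a'' \circ \sigma_1'')^{\varphi_2} * \varepsilon,\ a'' * (\sigma_1'' \cdot \sigma_2'')\bigr) \\ &= \bigl(a' * (\sigma_1'\sigma_2') + ((a'')^{\varphi_1} \circ \sigma_2' + (a'' \circ \sigma_1'')^{\varphi_2}) * \varepsilon,\ a'' * (\sigma_1'' \cdot \sigma_2'')\bigr) \\ &= \bigl(a' * (\sigma_1'\sigma_2') + (a'')^{\sigma_1'' \cdot \varphi_2 + \varphi_1 \cdot \sigma_2'} * \varepsilon,\ a'' * (\sigma_1'' \cdot \sigma_2'')\bigr). \end{aligned} \]
As a result, there appears a new linear (semigroup) automaton $\mathcal{A} = (A, \hat\Gamma, B; \circ, *)$, called the triangular product of the given automata $\mathcal{A}'$ and $\mathcal{A}''$ and denoted $\mathcal{A}' \triangledown \mathcal{A}''$.
In what follows we shall prove the main decomposition theorem for linear automata. An application to Image Compression will be given in Sec. 2.10.
Take $\Lambda = K$ (a field). Let there be given a linear semigroup Moore automaton $\mathcal{A} = (A, \Gamma, B; \circ, *)$ where it is assumed that the action $\mathcal{A} = (A, \Gamma, B)$ is faithful. Suppose that we have a $\Gamma$-invariant subspace $A' \le A$ and let $B' = A' * \varepsilon$, where $\varepsilon$ is the unit element in $\Gamma$. In order to obtain a decomposition for $\mathcal{A}$ let us argue as follows.
First. Let us prove that there exist subspaces $A'' \le A$ and $B'' \le B$, complementary to $A'$ in $A$ and to $B'$ in $B$, respectively, such that $A'' * \varepsilon \le B''$.
Set $B_1 := A * \varepsilon$. Then $B' = A' * \varepsilon \subseteq A * \varepsilon = B_1$. Denote $E = \{a \mid a \in A,\ a * \varepsilon = 0\}$ and introduce the subspace $A_1 := A' + E$ in $A$. Then we have $A_1 * \varepsilon = (A' + E) * \varepsilon = A' * \varepsilon = B'$. Therefore, the correspondence $\psi : a + A_1 \mapsto a * \varepsilon + B'$ is a map: if $u - v \in A_1$ for some $u, v \in A$ then $u * \varepsilon - v * \varepsilon = (u - v) * \varepsilon \in A_1 * \varepsilon = B'$, i.e., we have $u * \varepsilon + B' = v * \varepsilon + B'$.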
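It is clear that this map $\psi$ is a $K$-homomorphism:
\[ \begin{aligned} \psi\bigl(k(u + A_1) + l(v + A_1)\bigr) &= \psi\bigl((ku + lv) + A_1\bigr) = (ku + lv) * \varepsilon + B' \\ &= k(u * \varepsilon + B') + l(v * \varepsilon + B') = k\psi(u + A_1) + l\psi(v + A_1) \end{aligned} \]
for $k, l \in K$ and $u, v \in A$. Suppose that for some $u, v \in A$ we have $u * \varepsilon + B' = v * \varepsilon + B'$. Then $(u - v) * \varepsilon = a' * \varepsilon$ for some $a' \in A'$ and, hence, $u - v - a' \in E$. It follows that $u - v \in A_1$, i.e. that $u + A_1 = v + A_1$, showing that $\psi$ is an injection. The map $\psi$ is surjective also, and so the rule $a + A_1 \mapsto a * \varepsilon + B'$ gives an isomorphism of $K$-spaces $\psi : A/A_1 \xrightarrow{\ \sim\ } B_1/B'$.
Next, take any $K$-subspace $B_2$ in $B_1$ which is complementary to $B'$ and choose an arbitrary basis $\{b_\alpha \mid \alpha \in I\}$ in $B_2$. As $B_2 \le B_1 = A * \varepsilon$, we can choose elements $a_\alpha \in A$ such that $a_\alpha * \varepsilon = b_\alpha$. Take the $K$-subspace $A_2 := \langle a_\alpha \mid \alpha \in I \rangle_K$ ($K$-hull) and define the following two maps: $\tau : B_2 \to A_2$, by the rule $b_\alpha^\tau = a_\alpha$ ($\alpha \in I$), and $\hat\varepsilon : A_2 \to B_2$, by $\hat\varepsilon(a_\alpha) = b_\alpha$ ($\alpha \in I$). We get $b_\alpha^{\tau\hat\varepsilon} = (b_\alpha^\tau)^{\hat\varepsilon} = a_\alpha^{\hat\varepsilon} = b_\alpha$, showing that $\tau\hat\varepsilon$ is the identity on the basis chosen in $B_2$ and, consequently, on the entire space $B_2$, i.e., $\tau\hat\varepsilon = \mathrm{id}_{B_2}$. It follows that the map $\hat\varepsilon$ is a $K$-isomorphism. Note also that
\[ (A_1 + A_2) * \varepsilon = A_1 * \varepsilon + A_2 * \varepsilon = B' + B_2 = B_1 = A * \varepsilon \]
and that
\[ (A_1 \cap A_2) * \varepsilon \subseteq (A_1 * \varepsilon) \cap (A_2 * \varepsilon) = B' \cap B_2 = \{0\}. \]
If there existed a nonzero element $y$ in $A_1 \cap A_2$, then we could find nonzero scalars $k_\alpha \in K$ such that
\[ y = \sum_{\alpha \in I'} k_\alpha a_\alpha \]
for some finite subset $I' \subseteq I$. However, then it follows that $y * \varepsilon \in A_1 * \varepsilon = A' * \varepsilon = B'$ along with
\[ y * \varepsilon = \sum_{\alpha \in I'} k_\alpha \hat\varepsilon(a_\alpha) = \sum_{\alpha \in I'} k_\alpha b_\alpha \in B_2. \]
But we know that $B' \cap B_2 = \{0\}$. Therefore $\sum_{\alpha \in I'} k_\alpha b_\alpha = 0$ and so $k_\alpha = 0$ for all $\alpha$, contradicting the choice of $y$. Thus $y = 0$ and so also $A_1 \cap A_2 = \{0\}$.
Of course, $A_1 + A_2 \subseteq A$ and we have also $(A_1 + A_2) * \varepsilon = A * \varepsilon$. Consequently, for any $a \in A$ there exist elements $a_i \in A_i$ ($i = 1, 2$) such that $a * \varepsilon = a_1 * \varepsilon + a_2 * \varepsilon$. Therefore, $a - a_1 - a_2 = e \in E$ and we get $a = (a_1 + e) + a_2 \in A_1 + A_2$, showing that $A \subseteq A_1 + A_2$. As a result it follows that $A = A_1 \oplus A_2$.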
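From $A_1 = A' + E$ it follows that there exists a subspace $A_3 \le E$ so that $A_1 = A' \oplus A_3$ and, hence, also
$$A = A_1 \oplus A_2 = A' \oplus (A_3 \oplus A_2).$$
Denote by $A''$ the subspace $A_3 \oplus A_2 \subseteq A$. As $A_3 \le E$ we get $A'' * \varepsilon = (A_2 + A_3) * \varepsilon = A_2 * \varepsilon = B_2$. Again, as $B' \oplus B_2 = B_1 \le B$ we can extend the basis $\{b_\alpha \mid \alpha \in I\}$ for $B_2$ to get a subspace $B''$ complementary to $B'$ in $B$, i.e. $B = B' \oplus B''$ together with $A'' * \varepsilon = B_2 \subseteq B''$.

Second. Consider again the linear Moore semigroup automaton $\mathcal A = (A, \Gamma, B; \circ, *)$ but suppose now that the action $(A, \Gamma; \circ)$ is faithful. (Note that we have not used this assumption during the first part of our reasoning!) Then $\Gamma$ can be treated as a subsemigroup of $\mathrm{End}_K A$. For any $\gamma \in \Gamma$ denote by $\gamma^{\mu}$ and $\gamma^{\nu}$ the endomorphisms of $A'$ and $A/A'$, respectively, induced by $\gamma$. Let us take $\Gamma^{\mu} = \Sigma'$ and $\Gamma^{\nu} = \Sigma''$. As $A = A' \oplus A''$ we have the natural epimorphism $\alpha : A \twoheadrightarrow A/A'$ and the projection $\pi_{A''} : A \twoheadrightarrow A''$. The map $\alpha$ induces an isomorphism $A'' \to A/A'$ for the inverse of which we introduce the notation $\alpha^{-1}$; in particular, for any $a \in A$ we have the formula
$$(a^{\alpha})^{\alpha^{-1}} = a^{\pi_{A''}} .$$
The representation $(A/A', \Sigma^{\nu}; \bar\circ)$ and the map $\alpha$ produce the representation $(A'', \Sigma'') = (A'', \Sigma''; \circ'')$. Namely, for $a'' \in A''$ and $\gamma \in \Gamma$, $\gamma^{\nu} = \sigma'' \in \Sigma''$ define
$$a'' \circ'' \sigma'' \overset{\mathrm{def}}{=} (a'' \circ \gamma)^{\pi_{A''}} .$$
It is easy to see that $\circ''$ is, indeed, an action on $A''$. Note that
$$(a'' \circ \gamma)^{\pi_{A''}} = \bigl((a'' \circ \gamma)^{\alpha}\bigr)^{\alpha^{-1}} = \bigl((a'')^{\alpha} \,\bar\circ\, \gamma^{\nu}\bigr)^{\alpha^{-1}} = \bigl((a'')^{\alpha} \,\bar\circ\, \sigma''\bigr)^{\alpha^{-1}} .$$
It follows also that $(a'')^{\alpha} \,\bar\circ\, \gamma^{\nu} = (a'' \circ'' \sigma'')^{\alpha}$. Let us prove that the rule for $\circ''$ gives an action. Indeed:
$$a'' \circ'' (\sigma_1''\sigma_2'') = (a'' \circ \gamma_1\gamma_2)^{\pi_{A''}} = \bigl((a'' \circ \gamma_1) \circ \gamma_2\bigr)^{\pi_{A''}} = \bigl(((a'' \circ \gamma_1) \circ \gamma_2)^{\alpha}\bigr)^{\alpha^{-1}} =$$
$$= \bigl((a'' \circ \gamma_1)^{\alpha} \,\bar\circ\, \gamma_2^{\nu}\bigr)^{\alpha^{-1}} = \bigl(((a'')^{\alpha} \,\bar\circ\, \gamma_1^{\nu}) \,\bar\circ\, \gamma_2^{\nu}\bigr)^{\alpha^{-1}} = \bigl((a'' \circ'' \sigma_1'')^{\alpha} \,\bar\circ\, \gamma_2^{\nu}\bigr)^{\alpha^{-1}} =$$
$$= \bigl((a'' \circ'' \sigma_1'') \circ \gamma_2\bigr)^{\pi_{A''}} = (a'' \circ'' \sigma_1'') \circ'' \sigma_2'' .$$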
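Third. Let us show that the action $(A, \hat\Gamma; \circ)$ can be represented as the triangular product of the representations $(A', \Sigma'; \circ')$ and $(A'', \Sigma''; \circ'')$. This will be done using the matrix form of this construction. Using again $A = A' \oplus A''$, consider the immersions
$$\Sigma^{(i)} \to \tilde\Sigma^{(i)} \le \mathrm{End}_K A, \qquad i = {}' \text{ or } {}'',$$
given by the rules:
$$\sigma' \mapsto \tilde\sigma' = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma' \end{pmatrix} \ \text{ for } \sigma' \in \Sigma' \qquad\text{and}\qquad \sigma'' \mapsto \tilde\sigma'' = \begin{pmatrix} \sigma'' & 0 \\ 0 & \varepsilon \end{pmatrix} \ \text{ for } \sigma'' \in \Sigma'' .$$
Then for any $a \in A$, $a = a' + a''$ with $a' \in A'$, $a'' \in A''$, we have
$$a^{\tilde\sigma'} = (a' + a'')^{\tilde\sigma'} = a' \circ' \sigma' + a''$$
and
$$a^{\tilde\sigma''} = (a' + a'')^{\tilde\sigma''} = a' + a'' \circ'' \sigma'' .$$
Notice that it follows from
$$(a'')^{\tilde\sigma''} = a'' \circ'' \sigma'' = (a'' \circ \gamma)^{\pi_{A''}}$$
that $a'' \circ \gamma - a'' \circ'' \sigma'' \in A'$. Therefore, we may consider the map $\varphi$ given by the rule
$$(a'')^{\varphi} \overset{\mathrm{def}}{=} a'' \circ \gamma - a'' \circ'' \sigma'' .$$
It is easy to verify that $\varphi \in \mathrm{Hom}_K(A'', A')$. The semigroup $\mathrm{Hom}_K(A'', A')$ can be considered also as a subsemigroup $\tilde\Phi$ in $\mathrm{End}_K A$. To achieve this let us interpret any $\varphi \in \mathrm{Hom}_K(A'', A')$ as the endomorphism $\tilde\varphi \in \mathrm{End}_K A$,
$$\tilde\varphi = \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} .$$
Then $a^{\tilde\varphi} = (a' + a'')^{\tilde\varphi} = (a' + (a'')^{\varphi}) + a''$.

Take now the subsemigroup $\tilde\Gamma \overset{\mathrm{def}}{=} \tilde\Sigma' \cdot \tilde\Phi \cdot \tilde\Sigma''$ in $\mathrm{End}_K A$. The map
$$\omega : \tilde\sigma'\tilde\varphi\tilde\sigma'' \mapsto (\varphi, \sigma', \sigma'')$$
induces an isomorphism of $\tilde\Gamma$ to the semigroup $\hat\Gamma$ of triples,
$$\hat\Gamma = \{(\varphi, \sigma', \sigma'') \mid \varphi \in \Phi,\ \sigma^{(i)} \in \Sigma^{(i)},\ i = {}' \text{ or } {}''\}.$$
Indeed, using the relation
$$\tilde\sigma'\tilde\varphi\tilde\sigma'' = \begin{pmatrix} \varepsilon & 0 \\ 0 & \sigma' \end{pmatrix} \cdot \begin{pmatrix} \varepsilon & \varphi \\ 0 & \varepsilon \end{pmatrix} \cdot \begin{pmatrix} \sigma'' & 0 \\ 0 & \varepsilon \end{pmatrix} = \begin{pmatrix} \sigma'' & \varphi \\ 0 & \sigma' \end{pmatrix}$$
we get
$$[(\tilde\sigma'\tilde\varphi\tilde\sigma'') \cdot (\tilde\tau'\tilde\psi\tilde\tau'')]^{\omega} = \left[\begin{pmatrix} \sigma'' & \varphi \\ 0 & \sigma' \end{pmatrix} \cdot \begin{pmatrix} \tau'' & \psi \\ 0 & \tau' \end{pmatrix}\right]^{\omega} = \begin{pmatrix} \sigma''\tau'' & \sigma''\cdot\psi + \varphi\cdot\tau' \\ 0 & \sigma'\tau' \end{pmatrix}^{\omega} =$$
$$= (\sigma''\cdot\psi + \varphi\cdot\tau',\ \sigma'\tau',\ \sigma''\tau'') = (\varphi, \sigma', \sigma'') \cdot (\psi, \tau', \tau'') = (\tilde\sigma'\tilde\varphi\tilde\sigma'')^{\omega} \cdot (\tilde\tau'\tilde\psi\tilde\tau'')^{\omega} .$$
This calculation shows that $\omega$ is a homomorphism of semigroups. It is easy to verify also that $\omega$ is a bijection. This result together with the constructions above inside $\mathrm{End}_K A$ show that
$$(A, \hat\Gamma) = (A', \Sigma') \triangledown (A'', \Sigma'').$$

Fourth. More is true! The automaton $\mathcal A' = (A', \Sigma', B'; \circ', *')$ appears through the natural homomorphism $\mu : \Gamma \twoheadrightarrow \Sigma'$. Note that the output operation $*'$ is induced here by the map $\hat\varepsilon : A \to B$, $\hat\varepsilon(a) = a * \varepsilon$, with $\varepsilon$ the unit element in $\Gamma$. Analogously, the natural epimorphisms $\alpha : A \to A/A'$, $\nu : \Gamma \to \Sigma''$ and $\beta : B \to B/B'$ define the automaton
$$\tilde{\mathcal A} = (A/A', \Sigma'', B/B'; \bar\circ, \bar *).$$
Again, the output operation $\bar *$ is induced by the map $\hat\varepsilon : A \to B$, according to the rule: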
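$$a^{\alpha} \,\bar *\, \varepsilon \overset{\mathrm{def}}{=} (a * \varepsilon)^{\beta}$$
for any $a \in A$. Using the isomorphism of the representation $(A/A', \Sigma^{\nu}; \bar\circ)$ to $(A'', \Sigma''; \circ'')$, established in the second step of our argument, we obtain the automaton $\mathcal A'' = (A'', \Sigma'', B''; \circ'', *'')$. Indeed, let us define
$$a'' *'' \sigma'' \overset{\mathrm{def}}{=} (a'' * \gamma)^{\pi_{B''}} .$$
This gives what is needed:
$$a'' *'' (\sigma_1''\sigma_2'') = (a'' * (\gamma_1\gamma_2))^{\pi_{B''}} = ((a'' \circ \gamma_1) * \gamma_2)^{\pi_{B''}} = \bigl(((a'' \circ \gamma_1) * \gamma_2)^{\beta}\bigr)^{\beta^{-1}} =$$
$$= \bigl((a'' \circ \gamma_1)^{\alpha} \,\bar *\, \gamma_2^{\nu}\bigr)^{\beta^{-1}} = \bigl(((a'')^{\alpha} \,\bar\circ\, \gamma_1^{\nu}) \,\bar *\, \gamma_2^{\nu}\bigr)^{\beta^{-1}} = \bigl((a'' \circ'' \sigma_1'')^{\alpha} \,\bar *\, \gamma_2^{\nu}\bigr)^{\beta^{-1}} = (a'' \circ'' \sigma_1'') *'' \sigma_2'' .$$
Here we used the notation $\gamma_1^{\nu} = \sigma_1''$ and $\gamma_2^{\nu} = \sigma_2''$. Next, let us prove that the triple of maps $(\alpha, \mathrm{id}_{\Sigma''}, \beta)$ gives rise to an isomorphism of automata
$$\chi : (A'', \Sigma'', B'') \to (A/A', \Sigma'', B/B').$$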
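We have already shown that all three components of $\chi$ are bijections. Therefore, it remains to prove that
$$(a'' \circ'' \sigma'')^{\alpha} = (a'')^{\alpha} \,\bar\circ\, \sigma'' \qquad\text{and that}\qquad (a'' *'' \sigma'')^{\beta} = (a'')^{\alpha} \,\bar *\, \sigma'' .$$
To prove the first of these equalities, recall that
$$(a'' \circ \gamma)^{\pi_{A''}} = \bigl((a'')^{\alpha} \,\bar\circ\, \sigma''\bigr)^{\alpha^{-1}} .$$
It follows that
$$(a'' \circ'' \sigma'')^{\alpha} = \bigl((a'' \circ \gamma)^{\pi_{A''}}\bigr)^{\alpha} = \bigl(((a'')^{\alpha} \,\bar\circ\, \sigma'')^{\alpha^{-1}}\bigr)^{\alpha} = \bigl((a'')^{\alpha} \,\bar\circ\, \sigma''\bigr)^{\mathrm{id}_{A/A'}} = (a'')^{\alpha} \,\bar\circ\, \sigma'' .$$
To prove the second equality note that
$$a'' *'' \sigma'' = (a'' * \gamma)^{\pi_{B''}} = \bigl((a'' * \gamma)^{\beta}\bigr)^{\beta^{-1}} = \bigl(((a'' \circ \gamma) * \varepsilon)^{\beta}\bigr)^{\beta^{-1}} = \bigl((a'' \circ \gamma)^{\alpha} \,\bar *\, \varepsilon^{\nu}\bigr)^{\beta^{-1}} =$$
$$= \bigl(((a'')^{\alpha} \,\bar\circ\, \gamma^{\nu}) \,\bar *\, \varepsilon^{\nu}\bigr)^{\beta^{-1}} = \bigl((a'')^{\alpha} \,\bar *\, \gamma^{\nu}\bigr)^{\beta^{-1}} .$$
Hence
$$(a'' *'' \sigma'')^{\beta} = \bigl((a'' * \gamma)^{\pi_{B''}}\bigr)^{\beta} = \bigl(((a'')^{\alpha} \,\bar *\, \gamma^{\nu})^{\beta^{-1}}\bigr)^{\beta} = \bigl((a'')^{\alpha} \,\bar *\, \gamma^{\nu}\bigr)^{\mathrm{id}_{B/B'}} = (a'')^{\alpha} \,\bar *\, \sigma'' .$$

Let us prove that there exists an immersion of semigroups $\Gamma \to \tilde\Gamma$ with $\tilde\Gamma = \tilde\Gamma^{\mu} \cdot \tilde\Phi \cdot \tilde\Gamma^{\nu}$. To this end, using faithfulness of the action $(A, \Gamma; \circ)$ associated to the automaton $\mathcal A$, consider $\Gamma$ as a subsemigroup in $\mathrm{End}\, A$. For any element $\gamma \in \Gamma$ denote by $\tilde\gamma$ the image of this element in $\mathrm{End}\, A$ and consider the map $\delta = \omega_\Gamma : \tilde\gamma = \tilde\gamma^{\mu}\tilde\varphi\tilde\gamma^{\nu}$. Using what was said above about the map $\omega$ and the faithfulness of $(A, \Gamma)$ it follows that $\delta$ is a monomorphism. Consequently, we get the monomorphism of automata
$$\hat\delta : (A, \Gamma, B) \to (A, \tilde\Gamma, B),$$
where $\hat\delta \overset{\mathrm{def}}{=} (\mathrm{id}_A, \delta, \mathrm{id}_B)$. Further, note that in exactly the same way the isomorphism $\omega : \tilde\Gamma \to \hat\Gamma$ considered above induces the isomorphism of automata
$$\hat\omega : (A, \tilde\Gamma, B) \to (A, \hat\Gamma, B);$$
we take here $\hat\omega \overset{\mathrm{def}}{=} (\mathrm{id}_A, \omega, \mathrm{id}_B)$. To sum up, we have now established the following theorem.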
THEOREM 2.17. For any faithful linear semigroup Moore automaton $\mathcal A = (A, \Gamma, B; \circ, *)$ with $A$ possessing a $\Gamma$-invariant subspace $A' \le A$ there exists a monomorphism of automata $\mathcal A \to \mathcal A' \triangledown \mathcal A''$ with the automata $\mathcal A'$ and $\mathcal A''$ given as indicated above.
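The matrix form used in the third step of the argument can be checked numerically. The following is a minimal sketch, assuming finite-dimensional real spaces, row vectors ordered as $(a'', a')$, and NumPy arrays standing in for the endomorphisms; the names `triple_to_matrix` and `triple_product` are hypothetical and do not come from the text.

```python
import numpy as np

def triple_to_matrix(phi, s1, s2):
    """Block matrix [[s2, phi], [0, s1]] for a triple (phi, sigma', sigma'').

    Assumption: phi maps A'' to A' and is stored as a (dim A'') x (dim A')
    array acting on row vectors; s1 acts on A', s2 acts on A''.
    """
    n2, n1 = phi.shape
    top = np.hstack([s2, phi])
    bottom = np.hstack([np.zeros((n1, n2)), s1])
    return np.vstack([top, bottom])

def triple_product(g, h):
    """Product of triples: (phi1, s1', s1'') (phi2, s2', s2'')
    = (s1'' phi2 + phi1 s2', s1' s2', s1'' s2'')."""
    phi1, s1a, s2a = g
    phi2, s1b, s2b = h
    return (s2a @ phi2 + phi1 @ s1b, s1a @ s1b, s2a @ s2b)

# Small integer example with dim A'' = 2 and dim A' = 3.
rng = np.random.default_rng(0)
g = (rng.integers(-3, 4, (2, 3)), rng.integers(-3, 4, (3, 3)), rng.integers(-3, 4, (2, 2)))
h = (rng.integers(-3, 4, (2, 3)), rng.integers(-3, 4, (3, 3)), rng.integers(-3, 4, (2, 2)))

# The map from triples to block matrices respects multiplication.
lhs = triple_to_matrix(*triple_product(g, h))
rhs = triple_to_matrix(*g) @ triple_to_matrix(*h)
homomorphism_holds = np.array_equal(lhs, rhs)
```

The check only illustrates that the multiplication rule for triples agrees with multiplication of the corresponding block-triangular matrices; it does not touch the output operation.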
2.10. Decomposition of linear automata in image compression

In a series of papers, K. Čulik II has shown how to implement a great variety of linear operators in Image Compression using weighted finite automata (WFA), see for example [2]. Let us see how this approach can be "embedded" into the framework of decomposition of linear automata. First, let us briefly touch on some points of connection between Image Compression and WFA. Take for alphabet the set $T \overset{\mathrm{def}}{=} \{0, 1, 2, 3\}$, let $\Sigma = T^*$ be the monoid of words on $T$, and consider functions $f : T^* \to \mathbb R$. Such functions can be interpreted as multiresolution functions on $T$. Namely, let the unit square be divided into $2 \times 2$ pieces with addresses as shown in Figure 7.

[Fig. 7: the unit square divided into four quadrants with the addresses 0, 1, 2, 3.]
Let us continue this division indefinitely. A word in $\Sigma^n$ (the subset of words of length $n$ in $\Sigma$) then gives an address to a pixel in the $2^n \times 2^n$-subdivision. For instance, the pixel in the $2^3 \times 2^3$-subdivision corresponding to the word $w = 103$ is shown in Figure 8.
Fig. 8
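The addressing scheme can be sketched in code. This is only an illustration under an assumption: the quadrant layout is taken to be 1, 3 in the top row and 0, 2 in the bottom row, which is one common convention and may differ from the figure; the function name `address_to_pixel` is hypothetical.

```python
def address_to_pixel(word):
    """Map a word over {0,1,2,3} to (x, y) pixel coordinates.

    Assumed layout (not confirmed by the text): digit d selects the
    right half when d >= 2 and the upper half when d is odd.
    Coordinates count from the lower-left corner of the
    2^n x 2^n grid, where n = len(word).
    """
    x = y = 0
    for ch in word:
        d = int(ch)
        if d not in (0, 1, 2, 3):
            raise ValueError("letters must be 0, 1, 2 or 3")
        x = 2 * x + (d >> 1)   # horizontal bit: 0 = left, 1 = right
        y = 2 * y + (d & 1)    # vertical bit: 0 = bottom, 1 = top
    return x, y

pixel = address_to_pixel("103")
```

Each further letter refines both coordinates by one binary digit, which is why words of length $n$ address the $2^n \times 2^n$ grid.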
Quite often one considers the case when all values f (w) lie in the unit interval [0, 1] ⊆ R. This may be interpreted so that f (w) gives the intensity of the pixel with
address $w$, $w \in \Sigma$. Also, let us consider the extreme cases $f(w) = 0$ and $f(w) = 1$ as the pixel with address $w$ being white or black, respectively. Then a multiresolution function $f : T^* \to [0, 1]$ defines a sequence of grey-tone images with increasing resolution. The restriction of $f$ to $\Sigma^n$ defines an image in the $2^n \times 2^n$-resolution. Note that there are no difficulties in principle in considering the colored case: one has to take, instead of the previous function $f$, three such functions $r$, $g$ and $b : T^* \to [0, 1]$ to represent the intensities (red, green and blue). Alternatively, we could as well have used an alphabet $T$ with $|T| = 2^m$ to produce functions $[0, 1]^m \to [0, 1]$.

Let us say that a multiresolution function $f : T^* \to \mathbb R$ is average-preserving if the condition
$$\sum_{a \in T} f(wa) = 4 f(w)$$
holds for all $w \in T^*$. Observe that this condition makes images at various resolutions compatible. Note also that every function $f : T^* \to \mathbb R$ defines a collection of functions $f_v : T^* \to \mathbb R$ given by the formula
$$f_v(w) \overset{\mathrm{def}}{=} f(vw)$$
for all $w \in T^*$. This may be interpreted so that these functions $f_v$ give the image in the subsquare with the address $v$. The following theorem holds true.

THEOREM 2.18 (K. Čulik [2]). An average-preserving multiresolution function $f : T^* \to \mathbb R$ can be given by a weighted finite automaton, in a sense made explicit below, if and only if the $\mathbb R$-vector space generated by the set $\{f_v \mid v \in T^*\}$ is finite-dimensional. This dimension equals the minimal number of states of the WFA realizing $f$.

Following [2], let us state the definition of WFA. A weighted finite automaton is a quintuple $\mathcal A = (S, T, \{W_a \mid a \in T\}; I, F)$ with
• $S$, the set of states; we put $|S| = n$;
• $T$, a finite alphabet;
• $W_a : S \times S \to \mathbb R$, the weights of transition; a triple $(p, a, q) \in S \times T \times S$ is called a transition if $W_a(p, q) \ne 0$;
• $I : S \to \mathbb R$, the initial distribution (a row vector in $\mathbb R^{1 \times n}$);
• $F : S \to \mathbb R$, the final distribution (a column vector in $\mathbb R^{n \times 1}$).
Every weighted finite automaton $\mathcal A = (S, T, \{W_a \mid a \in T\}; I_{\mathcal A}, F_{\mathcal A})$ gives a multiresolution image by the rule
$$f_{\mathcal A}(a_1 a_2 \ldots a_k) \overset{\mathrm{def}}{=} I_{\mathcal A} W_{a_1} W_{a_2} \cdots W_{a_k} F_{\mathcal A} \qquad (a_i \in T).$$
In this formula we used matrix multiplication. A WFA is called average-preserving if we have
$$\sum_{a \in T} W_a F_{\mathcal A} = 4 F_{\mathcal A} .$$
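The definition can be made concrete with a small computation. The sketch below evaluates $f_{\mathcal A}$ by the matrix product above and tests the average-preserving condition; the two-state automaton, its weights, and the helper names `wfa_value` and `is_average_preserving` are illustrative assumptions, not taken from [2].

```python
import numpy as np

def wfa_value(I, W, F, word):
    """Compute f_A(word) = I * W[a1] * ... * W[ak] * F."""
    v = np.asarray(I, dtype=float)
    for a in word:
        v = v @ W[a]
    return float(v @ np.asarray(F, dtype=float))

def is_average_preserving(W, F):
    """Check that the sum over letters a of W[a] F equals 4 F."""
    F = np.asarray(F, dtype=float)
    total = sum(W[a] for a in W)
    return bool(np.allclose(total @ F, 4 * F))

# Hypothetical two-state WFA over the alphabet {0,1,2,3}:
# letters 0 and 1 share one weight matrix, letters 2 and 3 another.
left = np.array([[0.5, 0.0], [0.0, 1.0]])
right = np.array([[0.5, 0.5], [0.0, 1.0]])
W = {"0": left, "1": left, "2": right, "3": right}
I = np.array([1.0, 0.0])   # initial distribution (row vector)
F = np.array([0.5, 1.0])   # final distribution (column vector)

value_at_2 = wfa_value(I, W, F, "2")
children_sum = sum(wfa_value(I, W, F, "13" + a) for a in "0123")
parent_value = wfa_value(I, W, F, "13")
```

For this toy automaton the four values at the children of any address sum to four times the value at the address itself, as the average-preserving condition requires.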
It turns out that if a weighted finite automaton $\mathcal A$ is average-preserving, then the corresponding multiresolution function $f_{\mathcal A}$ is also average-preserving ([2], Lemma 2). Also, there exist good algorithms that make it possible to encode a given image (picture) by some weighted
finite automaton $\mathcal A$, and thereafter to compute from such an automaton the function $f_{\mathcal A}$ representing finite resolution approximations of the given image; see [3–5].

Let us see now how WFA and the corresponding multiresolution functions can be treated using linear automata. To this end, note that we can extend the correspondence $a \mapsto W_a \in M_n(\mathbb R)$ $(a \in T)$ to the whole monoid $T^*$, defining the map $\Delta : T^* \to M_n(\mathbb R)$ recursively by the rule
$$\Delta(wa) = \Delta(w) \cdot W_a \qquad \text{for } w \in \Sigma,\ a \in T,$$
with $\Delta$ of the empty word being the identity matrix. So we get for a word $w = a_1 \ldots a_k$ that $\Delta(w) = W_{a_1} \cdots W_{a_k}$. We have the relation $\Delta(vw) = \Delta(v)\Delta(w)$ and, hence, we get a matrix representation $\Delta : T^* \to M_n(\mathbb R)$. Next, fix (some) total order on the set $S$ and, thinking of $S$ as an ordered basis $(s_1, s_2, \ldots, s_n)$, consider the $\mathbb R$-space $A$ of all finite (formal) sums
$$A = A(\mathbb R, S) \overset{\mathrm{def}}{=} \Bigl\{ \sum_{i=1}^{n} \alpha_i s_i \ \Big|\ \alpha_i \in \mathbb R,\ s_i \in S \Bigr\},$$
denoting by $i$ and $f$ the vectors that correspond to the initial and the final distribution, respectively. Define the action of $\Sigma$ on $A$ as follows. For any word $w = a_1 \ldots a_k$ $(a_i \in T)$ and any element $z$ in $A$, $z = \alpha_1 s_1 + \cdots + \alpha_n s_n$ $(\alpha_i \in \mathbb R)$, define
$$z \circ w = \bar z \Delta(w) = \bar z (W_{a_1} \cdots W_{a_k}).$$
Here we write $\bar z = (\alpha_1, \ldots, \alpha_n)$, considering it as a vector in $\mathbb R^{1 \times n}$. For any two words $v$ and $w$ in $\Sigma$, we find
$$z \circ (vw) = \bar z \Delta(vw) = \bar z \bigl(\Delta(v)\Delta(w)\bigr) = \bigl(\bar z \Delta(v)\bigr) \Delta(w) = (z \circ v) \circ w,$$
which shows that we have a (linear) action of $\Sigma$ on $A$. Taking $z = i$ we obtain the vectors
$$i \circ w = \bar i \Delta(w) = I_{\mathcal A} (W_{a_1} \cdots W_{a_k}),$$
which are called multiresolution vectors for $\mathcal A$. Furthermore, define
$$z * w \overset{\mathrm{def}}{=} \bar z W_{a_1} \cdots W_{a_k} f .$$
It follows that $z * w = \bar z W_{a_1} \cdots W_{a_k} f = (z \circ w) * 1$. In particular, for $z = i$ we get
$$i * w = I_{\mathcal A} W_{a_1} \cdots W_{a_k} f = f_{\mathcal A}(w),$$
i.e. $i * w$ is the value of the multiresolution function at $w$, $w \in \Sigma$. Furthermore, one has
$$z * (w_1 w_2) = (z \circ (w_1 w_2)) * 1 = [(z \circ w_1) \circ w_2] * 1 = (z \circ w_1) * w_2$$
for all $w_1, w_2 \in \Sigma$. Thus, to sum up, we see that WFA and multiresolution functions may be considered in the language of linear automata.

References
[1] H. Beker and F. Piper. Cipher systems: the protection of communication. Northwood, London, 1982.
[2] K. Culik II and J. Karhumäki. Finite automata computing real functions. SIAM J. Comput. 23 (4), 1994, 789–814.
[3] K. Culik II and J. Kari. Computational fractal geometry with WFA. Acta Informatica 34 (2), 1994, 151–166.
[4] K. Culik II and J. Kari. Finite state transformation of images. Computers and Graphics 20, 1996, 125–135.
[5] K. Culik II and J. Kari. Finite state methods for image manipulation. In: Proc. of ICALP'95, Lect. Notes in Comp. Sci., Vol. 944, 1995, 51–62.
[6] Y. Give'on and Y. Zalcstein. Algebraic structures in linear system theory. J. Comput. Syst. Sci. 4 (6), 1970, 539–556.
[7] D. Gollman. Kaskadenschaltungen taktgesteuerter Schiebenregister als Pseudozufallsgeneratoren. Dissertation an der Universität Linz, Nr. 59. Verband der Wissenschaftlicher Gesellschaften Österreichs, 1986.
[8] U. Kaljulaid. Triangular products and stability of representations. Candidate dissertation, 1979. Russian, typescript; see [K79a], reprinted in this book as Sect. 4.
[9] R. E. Kalman, P. L. Falb, and M. A. Arbib. Topics in mathematical system theory. McGraw-Hill, New York, 1969.
[10] M. O. Rabin and A. Paz. Probabilistic automata. Information and Control 6, 1963, 230–245.
3. [K97] On two algebraic constructions for automata
Coauthor J. Penjam
ABSTRACT. Categories naturally appear when we attempt to find means useful both for describing attributed rewriting systems and for expressing features of process algebra models. Following an idea in Mumford's old paper [19] on Picard moduli, one can consider semiautomata as sheaves and automata with monotonous (as well as with all) homomorphisms as Grothendieck topologies. This idea naturally leads us to systems of the type $(\mathbf A \xrightarrow{A} \mathrm{Set}; \Box)$. Wreath products of these systems are introduced and investigated in this paper. Some earlier results, both on algebraic and attributed automata and on rewriting and transition systems, are reconsidered from this viewpoint.

Keywords: automata, category theory, Grothendieck topologies, fiber products and wreath products of automata
3.1. Introduction and Preliminary Motivation

The use of categories in theoretical computer science is an old enterprise. Thus G. Hotz [10] applied categories to clarify the syntax of a language generated by a given set of productions. D. Knuth [15] viewed productions as functions (i.e. morphisms between objects) in his investigations of the semantics of CF-languages. Recent applications of categories to programming languages and to models of parallel computation are surveyed in [1] and [25]. This line of reasoning reflects many important logical concepts in a way independent of their syntactic presentation. Categories naturally appear also when we attempt to develop executable specifications of programming languages (e.g. automata), distributed systems (e.g. process algebras), or imperative functional programming [9].

As a model for automata we use a quintuple $\mathcal A = (A, \Sigma, Y; \circ, *)$, where the set of states $A$, the semigroup $\Sigma$ and the output set $Y$ are given so that $(A, \Sigma; \circ)$ is a semigroup action (i.e., $a \circ (uv) = (a \circ u) \circ v$ holds) together with the operation $* : A \times \Sigma \to Y$ satisfying $a * (uv) = (a \circ u) * v$ for all $a \in A$ and $u, v$ in $\Sigma$. In this case $\mathcal A$ is called a semigroup automaton. Also, we consider automata $\mathcal A$ with $A$ and $Y$ ordered, together with $(A, \Sigma; \circ)$ being an ordered action and with $*$ being an increasing map; call them o-automata.

As a general model for automata a system $\mathcal A = (\mathbf A, A; \Box)$ is used. Here, $\mathbf A$ is a small category, $A$ is a functor from $\mathbf A$ to $\mathrm{Set}$ (or to $\mathrm{Set}_*$) and $\Box$ is a (partial) feedback operation
$$\Box : \Bigl( \bigcup_{a \in \mathrm{Obj}(\mathbf A)} a^{A} \times \mathrm{Mor}(\mathbf A) \Bigr) \to \mathrm{Mor}(\mathbf A), \qquad (46)$$
satisfying the condition
$$x \,\Box\, (f \cdot g) = f^{A}(x) \,\Box\, g \qquad (47)$$
for all $f \in \mathrm{Hom}_{\mathbf A}(a, a')$, $g \in \mathrm{Hom}_{\mathbf A}(a', a'')$ and $x \in a^{A}$ with $a$, $a'$ and $a''$ in $\mathrm{Obj}(\mathbf A)$. The sets and categories considered in Sect. 3.3 of this paper are supposed to be small.

Considering for each open set $U$ in an analytic space $X$ the set $F(U)$ of all analytic functions defined on $U$ we get a contravariant functor from the category of open sets to $\mathrm{Set}$; with some consistency conditions satisfied, a sheaf of analytic functions on $X$ appears. In the 1950s it was shown that sheaves on $X$ ("toposes") are the main objects to be studied in geometry [7, 23]. A category $\mathbf A$ of finite sets and bijections together with a covariant functor $A : \mathbf A \to \mathbf A$ is called a species; species are intensively used in combinatorics [20].

In Sec. 3.3, for given automata $\mathcal A = (\mathbf A, A; \Box)$ and $\mathcal B = (\mathbf B, B; )$ a new automaton $\mathcal A \,\mathrm{wr}\, \mathcal B = (\mathbf A \,\mathrm{wr}\, \mathbf B, A \,\mathrm{wr}\, B; )$ is defined and called the wreath product of the given two. Any transition system can be considered as a labelled category $(\mathbf A, A)$ and so their wreath products can be investigated. It is shown that these wreath products include as special cases the products and sums of transition systems important in [25]. It also appears that semi-Thue systems can be represented by wreath products of the above type. Using colored categories, i.e. categories $\mathbf A$ with $\mathrm{Mor}(\mathbf A)$ colored, one can consider questions concerning the languages $L(\mathcal A)$ accepted by such a general automaton $\mathcal A$. It seems to be important to investigate interrelations between the languages $L(\mathcal A)$, $L(\mathcal B)$ and $L(\mathcal A \,\mathrm{wr}\, \mathcal B)$.
3.2. Fiber Products of Automata and Grothendieck Pretopologies

In this section, let us discuss the idea that automata can be viewed as devices for giving nearness on the set $X^*$ of all words written in some alphabet $X$. An origin of this idea is the following context in mathematics. Giving a topology on a space $T$ means that some collection $\mathcal T$ of open subsets of $T$ is fixed. A. Grothendieck [7] proposed to supply $\mathcal T$ with some additional structure allowing one not to refer to $T$. Recall that a Grothendieck pretopology is defined as a category $G$ with a distinguished family of covers for its objects. A cover of an object $U$ is meant as a set of morphisms $\{U_i \to U\}$. Objects of $G$ are called open sets of the topology $G$. The following conditions must hold:

(1) For any objects $U$, $V$ and $S$ in $\mathrm{Obj}(G)$, there exists the fiber product $U \times_S V \in \mathrm{Obj}(G)$. The fiber product $U \times_S V$, also called pullback or universal cone of $U$ and $V$ over $S$, is defined by the rule: for any diagram
$$U \xrightarrow{\ f\ } S \xleftarrow{\ g\ } V$$
there must exist an object $W$ together with morphisms $u : W \to U$ and $v : W \to V$ such that for any other object $W'$ together with some morphisms $u' : W' \to U$ and $v' : W' \to V$ there exists a unique morphism $w : W' \to W$ making the diagram formed by $w$, $u$, $v$, $u'$, $v'$, $f$ and $g$ commutative.
(2) All isomorphisms $p : U' \to U$ are postulated to be among the covers. Also, if $\{f_i : U_i \to U\}$ and all $\{f_{i,j} : U_{i,j} \to U_i\}$ are covers, then $\{f_{i,j} \cdot f_i : U_{i,j} \to U\}$ is also a cover; intuitively it means that a refinement of a cover is also a cover.

(3) For any cover $\{f_i : U_i \to U\}$ and a morphism $V \to U$ the set $\{p_i : V \times_U U_i \to V\}$ is also a cover; here $p_i$ is the projection of $V \times_U U_i$ to $V$.

Any classical topology on a set $T$ is a Grothendieck topology as well. Rather unexpected is the fact that $G$ may not have, in general, the final object $T$.

These ideas are naturally related to automata and languages. First, recall that a presheaf for a Grothendieck topology $G$ is a contravariant functor $F$ from $G$ to $\mathrm{Set}$. To be a sheaf this functor must behave well on the covers of $G$. More precisely, a sheaf of sets for a Grothendieck topology $G$ is a contravariant functor $F : G \to \mathrm{Set}$ such that for any cover $\{p_i : U_i \to U\}$ of $G$ the following diagram of sets and their maps
$$F(U) \xrightarrow{\ (F(p_i))\ } \prod_i F(U_i) \ \overset{(F(\mathrm{pr}_i))}{\underset{(F(\mathrm{pr}_j))}{\rightrightarrows}}\ \prod_{i,j} F(U_i \times_U U_j)$$
is exact. Here, every component map $F(p_i) : F(U) \to F(U_i)$ is induced by the map $p_i : U_i \to U$ and every component map $F(\mathrm{pr}_i) : F(U_i) \to F(U_i \times_U U_j)$ is induced by the map $\mathrm{pr}_i : U_i \times_U U_j \to U_i$ (projection onto the first factor). The maps $F(\mathrm{pr}_j)$ are defined analogously, changing the roles of $i$ and $j$. As usual, a diagram of sets and their maps
$$A \xrightarrow{\ f\ } B \ \overset{g_1}{\underset{g_2}{\rightrightarrows}}\ C$$
is called exact if $f$ is injective and $f(A) = \{b \in B \mid g_1(b) = g_2(b)\}$ holds.

Note that $G$ was in effect given as a Grothendieck pretopology. For our purposes here, however, the resulting imprecision, namely that two different pretopologies may give exactly the same sheaves, is unimportant; so we do not use the notion of a sieve, etc. for $G$. These notions give a new interpretation of some facts important for automata theory. We shall prove the following.

THEOREM 3.1. The category of all semigroup o-automata with their (isotone) homomorphisms as its morphisms defines a Grothendieck pretopology.

Proof. For o-automata $\mathcal A_1 = (A_1, \Sigma_1, Y_1; \circ, *)$ and $\mathcal A_2 = (A_2, \Sigma_2, Y_2; \circ, *)$, by a homomorphism $m : \mathcal A_1 \to \mathcal A_2$ is meant a triple $m = (f, h, g)$ of (isotone) mappings $f : A_1 \to A_2$, $g : Y_1 \to Y_2$ and a homomorphism of semigroups $h : \Sigma_1 \to \Sigma_2$, compatible with the $\circ$- and $*$-actions. This means that
$$(a \circ u)^{f} = a^{f} \circ u^{h} \qquad\text{and}\qquad (a * u)^{g} = a^{f} * u^{h}$$
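for all $a \in A_1$ and $u \in \Sigma_1$. The identity endomorphism $\mathrm{id}_{\mathcal A} = (\mathrm{id}_A, \mathrm{id}_\Sigma, \mathrm{id}_Y)$ and the composition of isotone homomorphisms are isotone. Therefore a category $\mathbf A^{(o)}$ appears with o-automata as its objects and isotone homomorphisms as its morphisms.

To prove that fiber products exist in $\mathbf A^{(o)}$, let us suppose that o-automata $\mathcal A_1$ and $\mathcal A_2$ are given together with their homomorphisms $m_i = (f_i, h_i, g_i)$, $i = 1, 2$, into some o-automaton $\mathcal A = (A, \Sigma, Y; \circ, *)$. Define the subsets
$$\tilde A = A_1 \times_A A_2 = \{(a_1, a_2) \mid a_i \in A_i,\ a_1^{f_1} = a_2^{f_2};\ i = 1, 2\},$$
$$\tilde\Sigma = \Sigma_1 \times_\Sigma \Sigma_2 = \{(\sigma_1, \sigma_2) \mid \sigma_i \in \Sigma_i,\ \sigma_1^{h_1} = \sigma_2^{h_2};\ i = 1, 2\},$$
$$\tilde Y = Y_1 \times_Y Y_2 = \{(y_1, y_2) \mid y_i \in Y_i,\ y_1^{g_1} = y_2^{g_2};\ i = 1, 2\},$$
with $\tilde A$ and $\tilde Y$ ordered componentwise.⁴ Here $\tilde\Sigma$ is a semigroup. If both $\Sigma_1$ and $\Sigma_2$ are groups then it is true for $\tilde\Sigma$ also. Now, defining
$$(a_1, a_2) \bullet (\sigma_1, \sigma_2) = (a_1 \circ \sigma_1, a_2 \circ \sigma_2) \qquad\text{and}\qquad (a_1, a_2) \star (\sigma_1, \sigma_2) = (a_1 * \sigma_1, a_2 * \sigma_2)$$
for any $a_i \in A_i$ and $\sigma_i \in \Sigma_i$, $i = 1, 2$, we obtain for $(\sigma_1, \sigma_2) \in \tilde\Sigma$ and $(a_1, a_2) \in \tilde A$ that
$$(a_1 \circ \sigma_1)^{f_1} = a_1^{f_1} \circ \sigma_1^{h_1} = a_2^{f_2} \circ \sigma_2^{h_2} = (a_2 \circ \sigma_2)^{f_2}$$
and
$$(a_1 * \sigma_1)^{g_1} = a_1^{f_1} * \sigma_1^{h_1} = a_2^{f_2} * \sigma_2^{h_2} = (a_2 * \sigma_2)^{g_2} .$$
Therefore $(a_1 \circ \sigma_1, a_2 \circ \sigma_2) \in A_1 \times_A A_2$ and $(a_1 * \sigma_1, a_2 * \sigma_2) \in Y_1 \times_Y Y_2$. Also, one can observe that
$$(a_1, a_2) \bullet ((\sigma_1, \sigma_2) \cdot (\tau_1, \tau_2)) = (a_1, a_2) \bullet (\sigma_1\tau_1, \sigma_2\tau_2) = (a_1 \circ \sigma_1\tau_1, a_2 \circ \sigma_2\tau_2) =$$
$$= ((a_1 \circ \sigma_1) \circ \tau_1, (a_2 \circ \sigma_2) \circ \tau_2) = (a_1 \circ \sigma_1, a_2 \circ \sigma_2) \bullet (\tau_1, \tau_2) = ((a_1, a_2) \bullet (\sigma_1, \sigma_2)) \bullet (\tau_1, \tau_2),$$
and
$$(a_1, a_2) \star ((\sigma_1, \sigma_2) \cdot (\tau_1, \tau_2)) = (a_1, a_2) \star (\sigma_1\tau_1, \sigma_2\tau_2) = (a_1 * \sigma_1\tau_1, a_2 * \sigma_2\tau_2) =$$
$$= ((a_1 \circ \sigma_1) * \tau_1, (a_2 \circ \sigma_2) * \tau_2) = (a_1 \circ \sigma_1, a_2 \circ \sigma_2) \star (\tau_1, \tau_2) = ((a_1, a_2) \bullet (\sigma_1, \sigma_2)) \star (\tau_1, \tau_2).$$
Having $(a_1, a_2) \le (b_1, b_2)$ in $\tilde A$ means $a_i \le b_i$ in $A_i$ $(i = 1, 2)$. Therefore,
$$a_1 \circ \sigma_1 \le b_1 \circ \sigma_1 \qquad\text{and}\qquad a_2 \circ \sigma_2 \le b_2 \circ \sigma_2$$

⁴ In the middle of these equations and in the case of free monoids, where $\Sigma_1 = \Sigma_2 = S^*$ and $\Sigma = T^*$, the diagonal $\Delta_{E(h_1,h_2)}$ in $E^2(h_1, h_2)$ for the equality set $E(h_1, h_2) \subseteq S^{+}$ is contained in $\tilde\Sigma$. Equality sets are useful both for classical languages and for those obtained by splicing systems, see [13, 16].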
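hold and so also
$$(a_1, a_2) \bullet (\sigma_1, \sigma_2) = (a_1 \circ \sigma_1, a_2 \circ \sigma_2) \le (b_1 \circ \sigma_1, b_2 \circ \sigma_2) = (b_1, b_2) \bullet (\sigma_1, \sigma_2).$$
It follows that $(\tilde A, \tilde\Sigma; \bullet)$ is an o-action. Analogously, it is verified that $\star$ is isotone. As a result, a new o-automaton $\tilde{\mathcal A} = (\tilde A, \tilde\Sigma, \tilde Y; \bullet, \star)$ appears, which has the universal cone property. I.e., given any automaton $\mathcal B = (B, \Delta, Z; \diamond, \odot)$ and any morphisms $l_i : \mathcal B \to \mathcal A_i$ $(i = 1, 2)$ such that the square formed by $l_1 : \mathcal B \to \mathcal A_1$, $l_2 : \mathcal B \to \mathcal A_2$, $m_1 : \mathcal A_1 \to \mathcal A$ and $m_2 : \mathcal A_2 \to \mathcal A$ is commutative, there exists uniquely a homomorphism $\tilde m : \mathcal B \to \tilde{\mathcal A}$, $\tilde m = (\tilde f, \tilde h, \tilde g)$, that makes the diagram obtained by adding $\tilde m$ and the projections $\mathrm{pr}_1 : \tilde{\mathcal A} \to \mathcal A_1$, $\mathrm{pr}_2 : \tilde{\mathcal A} \to \mathcal A_2$ to this square commutative.

Existence of $\tilde m$. Given any morphisms $l_i : \mathcal B \to \mathcal A_i$ $(i = 1, 2)$ with $l_1 = (f', h', g')$ and $l_2 = (f'', h'', g'')$ one can define $\tilde m : \mathcal B \to \tilde{\mathcal A}$, where $\tilde m = (\tilde f, \tilde h, \tilde g)$, by the rules
$$b \mapsto (b^{f'}, b^{f''}), \qquad \delta \mapsto (\delta^{h'}, \delta^{h''}), \qquad z \mapsto (z^{g'}, z^{g''}).$$
Observe that the components $\tilde f$ and $\tilde g$ of $\tilde m$ are isotone. The map $\tilde h : \Delta \to \tilde\Sigma$, given by the rule $\delta \mapsto (\delta^{h'}, \delta^{h''})$, is a homomorphism of semigroups. From the commutativity of the last diagram it follows that
$$(b^{f'})^{f_1} = (b^{f''})^{f_2}, \qquad (\delta^{h'})^{h_1} = (\delta^{h''})^{h_2}, \qquad (z^{g'})^{g_1} = (z^{g''})^{g_2} .$$
Thus, $\tilde m$ is a map from $\mathcal B$ into the automaton $\tilde{\mathcal A}$. Furthermore,
$$(b \diamond \delta)^{\tilde f} = b^{\tilde f} \bullet \delta^{\tilde h} \qquad\text{and}\qquad (b \odot \delta)^{\tilde g} = b^{\tilde f} \star \delta^{\tilde h}$$
together with what was said above imply that $\tilde m$ is a homomorphism of o-automata.

Uniqueness of $\tilde m$. Suppose there exists another morphism $\tilde{\tilde m} : \mathcal B \to \tilde{\mathcal A}$ that makes the last diagram commutative. This means, in particular, that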
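$$b^{\tilde{\tilde m}} = (a_1', a_2') \in A_1 \times_A A_2, \qquad \delta^{\tilde{\tilde m}} = (\sigma_1', \sigma_2') \in \Sigma_1 \times_\Sigma \Sigma_2,$$
and
$$z^{\tilde{\tilde m}} = (y_1', y_2') \in Y_1 \times_Y Y_2 .$$
Suppose $\tilde{\tilde m} \ne \tilde m$, $\tilde{\tilde m} = (\tilde{\tilde f}, \tilde{\tilde h}, \tilde{\tilde g})$. On one hand, having $\tilde{\tilde m} \ne \tilde m$ means that there exists a triple $(b, \delta, z) \in B \times \Delta \times Z$ such that
$$(b, \delta, z)^{\tilde{\tilde m}} \ne (b, \delta, z)^{\tilde m} .$$
This is equivalent to the fact that
$$(b^{\tilde{\tilde f}} \ne b^{\tilde f}) \vee (\delta^{\tilde{\tilde h}} \ne \delta^{\tilde h}) \vee (z^{\tilde{\tilde g}} \ne z^{\tilde g})$$
holds. On the other hand, the commutativity of the last diagram for $\tilde{\tilde m}$ implies that $l_1 = \mathrm{pr}_1 \circ \tilde{\tilde m}$ and $l_2 = \mathrm{pr}_2 \circ \tilde{\tilde m}$. The first equality gives
$$a_1 = b^{f'} = b^{l_1} = b^{\mathrm{pr}_1 \circ \tilde{\tilde m}} = \mathrm{pr}_1(b^{\tilde{\tilde m}}) = \mathrm{pr}_1(a_1', a_2') = a_1',$$
$$\sigma_1 = \delta^{h'} = \delta^{l_1} = \delta^{\mathrm{pr}_1 \circ \tilde{\tilde m}} = \mathrm{pr}_1(\delta^{\tilde{\tilde h}}) = \mathrm{pr}_1(\sigma_1', \sigma_2') = \sigma_1'$$
and
$$y_1 = z^{g'} = z^{l_1} = z^{\mathrm{pr}_1 \circ \tilde{\tilde m}} = \mathrm{pr}_1(z^{\tilde{\tilde m}}) = \mathrm{pr}_1(z^{\tilde{\tilde g}}) = \mathrm{pr}_1(y_1', y_2') = y_1' .$$
Analogously, the second equality gives
$$a_2 = a_2', \qquad \sigma_2 = \sigma_2' \qquad\text{and}\qquad y_2 = y_2' .$$
These calculations show that
$$((a_1 = a_1') \wedge (a_2 = a_2')) \wedge ((\sigma_1 = \sigma_1') \wedge (\sigma_2 = \sigma_2')) \wedge ((y_1 = y_1') \wedge (y_2 = y_2')),$$
implying
$$(b^{\tilde{\tilde f}} = b^{\tilde f}) \wedge (\delta^{\tilde{\tilde h}} = \delta^{\tilde h}) \wedge (z^{\tilde{\tilde g}} = z^{\tilde g})$$
and, therefore, also
$$(b, \delta, z)^{\tilde{\tilde m}} = (b, \delta, z)^{\tilde m} .$$
This contradicts the choice of $(b, \delta, z)$ and so the uniqueness of $\tilde m$ follows.

Call this o-automaton $\tilde{\mathcal A}$ the fiber product of the given o-automata $\mathcal A_1$ and $\mathcal A_2$ over the o-automaton $\mathcal A$ and denote it $\mathcal A_1 \times_{\mathcal A} \mathcal A_2$.

To finish the proof, it remains to define covers. For a given o-automaton $\mathcal A = (A, \Sigma, Y; \circ, *)$ call a set of morphisms $\{m_\alpha : \mathcal A_\alpha \to \mathcal A\}$ a cover for $\mathcal A$ if it holds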
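$$A = \bigcup_\alpha f_\alpha(A_\alpha), \qquad \Sigma = \bigcup_\alpha h_\alpha(\Sigma_\alpha), \qquad Y = \bigcup_\alpha g_\alpha(Y_\alpha).$$
Taking a cover $\{m_\alpha : \mathcal A_\alpha \to \mathcal A\}$ together with some covers $\{m_{\alpha\beta} : \mathcal A_{\alpha\beta} \to \mathcal A_\alpha\}$ it is easy to understand that $\{m_{\alpha\beta} \cdot m_\alpha : \mathcal A_{\alpha\beta} \to \mathcal A\}$ is also a cover. Indeed, let $m_\alpha = (f_\alpha, h_\alpha, g_\alpha)$ and $m_{\alpha\beta} = (f_{\alpha\beta}, h_{\alpha\beta}, g_{\alpha\beta})$ be the corresponding triples with their first and third components isotone mappings and with $h_\alpha$ and $h_{\alpha\beta}$ being homomorphisms of semigroups. Then $m_{\alpha\beta} \cdot m_\alpha$ is the triple of mappings $(f_{\alpha\beta} \cdot f_\alpha, h_{\alpha\beta} \cdot h_\alpha, g_{\alpha\beta} \cdot g_\alpha)$ with its components having the same properties. Note that the triple $m_{\alpha\beta} \cdot m_\alpha$ is covering: e.g., we have
$$A = \bigcup_\alpha f_\alpha(A_\alpha) = \bigcup_\alpha f_\alpha\Bigl(\bigcup_\beta f_{\alpha\beta}(A_{\alpha\beta})\Bigr) = \bigcup_\alpha \bigcup_\beta f_\alpha(f_{\alpha\beta}(A_{\alpha\beta})) = \bigcup_{\alpha,\beta} (f_{\alpha\beta} \cdot f_\alpha)(A_{\alpha\beta}).$$
To see that the condition (3) in the definition of the Grothendieck topology holds it remains to realize that for the projections $q_\alpha : \mathcal B \times_{\mathcal A} \mathcal A_\alpha \to \mathcal B$,
$$\bigcup_\alpha q_\alpha(\mathcal B \times_{\mathcal A} \mathcal A_\alpha) = \bigcup_\alpha q_\alpha(B \times_A A_\alpha,\ \Gamma \times_\Sigma \Sigma_\alpha,\ Z \times_Y Y_\alpha) =$$
$$= \bigcup_\alpha \{(b, \gamma, z) \mid f(b) \in f_\alpha(A_\alpha),\ h(\gamma) \in h_\alpha(\Sigma_\alpha),\ g(z) \in g_\alpha(Y_\alpha)\} =$$
$$= \{(b, \gamma, z) \mid f(b) \in A,\ h(\gamma) \in \Sigma,\ g(z) \in Y\} = \{(b, \gamma, z) \mid b \in B,\ \gamma \in \Gamma,\ z \in Z\} = \mathcal B .$$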
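The fiber product construction used in the proof can be illustrated on a small finite example. The sketch below ignores the order and the output component and treats only the state sets and the acting semigroups; the toy data (cyclic groups acting on themselves, mapped to $\mathbb Z_2$) and the names `fiber_product` and `componentwise` are assumptions made for illustration.

```python
from itertools import product

def fiber_product(X1, X2, f1, f2):
    """Pairs (x1, x2) whose images under f1 and f2 coincide."""
    return {(x1, x2) for x1, x2 in product(X1, X2) if f1(x1) == f2(x2)}

def componentwise(op1, op2):
    """Componentwise operation on pairs, in the manner of the bullet action."""
    def op(pair, sigma):
        return (op1(pair[0], sigma[0]), op2(pair[1], sigma[1]))
    return op

# Hypothetical toy data: Z_4 and Z_6 act on themselves by addition,
# and both are mapped to Z_2 by reduction modulo 2.
A1 = S1 = range(4)
A2 = S2 = range(6)
mod2 = lambda x: x % 2

states = fiber_product(A1, A2, mod2, mod2)   # plays the role of A1 x_A A2
inputs = fiber_product(S1, S2, mod2, mod2)   # plays the role of S1 x_S S2
act = componentwise(lambda a, s: (a + s) % 4, lambda a, s: (a + s) % 6)

# The componentwise action does not leave the fiber product of states.
closed = all(act(p, s) in states for p in states for s in inputs)
```

Closure holds here because reduction modulo 2 is compatible with both actions, which mirrors the role of the compatibility conditions on $m_1$ and $m_2$ in the proof.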
This proof can be modified to give

THEOREM 3.2. The category of all semigroup ordered actions of the type $(A, \Sigma; \circ)$ with their isotone homomorphisms as its morphisms defines a Grothendieck pretopology.

Further specialization is possible: when fixing the semigroup $\Sigma$, the category of ordered $\Sigma$-sets with (isotone) homomorphisms as its morphisms gives again a Grothendieck pretopology. The case of trivial order, with all homomorphisms of $\Sigma$-sets as the morphisms, is considered in [21].

To strengthen the mathematical backbone and to prepare the ground for the next section, let us see how contravariant functors $A : \mathbf A \to \mathrm{Set}$ appear in the present context. Additional motivation comes from concurrency models in [25] and from associative memory investigations, see [3].

For a group $\Gamma$ denote it $\Gamma^*$ when considered as a regular $\Gamma$-set. Take all $\Gamma$-sets to be the open sets and all homomorphisms of $\Gamma$-sets to be the morphisms of the covering topology $T = T(\Gamma)$ on $\{\varepsilon\}$. Here, $\varepsilon$ is the identity element of $\Gamma$. Each element $\sigma \in \Gamma$ defines an automorphism $\sigma : \Gamma^* \to \Gamma^*$ by the rule $\gamma \mapsto \gamma\sigma$. These automorphisms act on the set $S = F(\Gamma^*)$ for any sheaf $F$ on $T(\Gamma)$. It appears that $S$ is a $\Gamma$-set that uniquely defines $F$. Indeed, let $F$ be any sheaf on $T$. As $S$ is a transitive $\Gamma$-set, there exists an epimorphism $f : \Gamma^* \to S$ giving a cover for $S$. The sequence
$$F(S) \xrightarrow{\ (F(f))\ } F(\Gamma^*) \ \overset{(F(\mathrm{pr}_1))}{\underset{(F(\mathrm{pr}_2))}{\rightrightarrows}}\ F(\Gamma^* \times_S \Gamma^*)$$
is exact. It means that $(F(S); F(f))$ is the kernel for the pair $(F(\mathrm{pr}_1), F(\mathrm{pr}_2))$ and $F(f)$ is a monomorphism of $F(S)$ into $F(\Gamma^*)$. Denote by $S_1$ the image $F(f)(F(S))$ in $S$. To describe the subset $S_1$ in $S$ means to find the cokernel for the diagram
$$\Gamma^* \times_S \Gamma^* \ \overset{\mathrm{pr}_1}{\underset{\mathrm{pr}_2}{\rightrightarrows}}\ \Gamma^* \xrightarrow{\ f\ } S .$$
This cokernel is given by $\Gamma^*/\pi$, where $\pi$ is the least $\Gamma$-equivalence on $\Gamma^* \times \Gamma^*$ that contains the set of all pairs $\{(\gamma_1, \gamma_2) \mid \gamma_1^{f} = \gamma_2^{f}\}$. It follows
$$\varepsilon^{f} \circ \gamma_1 = (\varepsilon\gamma_1)^{f} = \gamma_1^{f} = \gamma_2^{f} = (\varepsilon\gamma_2)^{f} = \varepsilon^{f} \circ \gamma_2 .$$
Therefore $\varepsilon^{f} \circ \gamma_1\gamma_2^{-1} = \varepsilon^{f}$, which means that $\gamma_1\gamma_2^{-1} \in \mathrm{Stab}(\varepsilon^{f}) \le \Gamma$. Considering the $\Gamma$-equivalence $\pi$ naturally corresponds to taking the normal closure $N$ of $\mathrm{Stab}(\varepsilon^{f})$. As $(\Gamma/N)^*$ is the cokernel for the pair $(\mathrm{pr}_1, \mathrm{pr}_2)$ we conclude that for any $(\gamma_1, \gamma_2) \in \Gamma^* \times_S \Gamma^*$ it holds that $\gamma_1^{-1}\gamma_1\gamma_2^{-1}\gamma_1 = \gamma_2^{-1}\gamma_1 \in N$. Denote $s = \gamma_1^{f}$; then
$$s \circ \gamma_2^{-1}\gamma_1 = \gamma_2^{f} \circ \gamma_2^{-1}\gamma_1 = (\gamma_2\gamma_2^{-1}\gamma_1)^{f} = \gamma_1^{f} = s .$$
Note that any element in $N$ can be written as a finite product of elements of the type $\gamma_2^{-1}\gamma_1$ with $\gamma_1^{f} = \gamma_2^{f}$. Therefore, summing up the above arguments, the elements $s \in S_1$ are just the fixed points for the normal closure of the stability subgroup $\mathrm{Stab}(\varepsilon^{f})$ in the group $\Gamma$ for $\varepsilon^{f} \in S$.

Any $\Gamma$-set $S$ carries a partition whose components are transitive $\Gamma$-sets $S_\alpha$, these being the $\Gamma$-orbits on $S$. Consequently, there exists a cover $\{i_\alpha : S_\alpha \to S\}$ with $i_\alpha$ the natural immersions. The definition of $F$ gives the existence of the following short exact sequence of $\Gamma$-sets and their homomorphisms
$$F(S) \xrightarrow{\ (F(i_\alpha))\ } \prod_\alpha F(S_\alpha) \ \overset{(F(\mathrm{pr}_\alpha))}{\underset{(F(\mathrm{pr}_\beta))}{\rightrightarrows}}\ \prod_{\alpha,\beta} F(S_\alpha \times_S S_\beta).$$
A verification shows that we have an isomorphism $F(S) \cong \prod_\alpha F(S_\alpha)$ here. Conversely, a $\Gamma$-set $S$ together with the above isomorphisms $F(S) \cong S_1$ and $F(S) \cong \prod_\alpha F(S_\alpha)$ gives a sheaf $F$ on $T(\Gamma)$. Summing up the above analysis, we obtain the following elementary, yet important result (see also [6], pp. 124–127).

THEOREM 3.3. Let $\Gamma$ be a group. To give a $\Gamma$-set is equivalent to giving a sheaf in the topology $T(\Gamma)$.

Note also that an analogous result holds for an ordered $\Gamma$-set for any right orderable group $\Gamma$; see [11]. We observe that an automaton appears here as a system $(T \xrightarrow{F} \mathrm{Set})$ [12]. This observation can serve as a source of new techniques both for automata theory and for models of parallel computation. With decomposition methods in mind, the wreath product construction for systems of the type $(\mathbf A \xrightarrow{A} \mathrm{Set})$ is considered in the next section.
3.3. Wreath Products of General Automata
From their very beginning the decomposition methods for automata have used the wreath product construction; at least, this is so for semigroup actions; see [5]. Later, several generalizations appeared; see [14, 21]. In this section the wreath product construction for categories is lifted to one for general automata. New links appear, so that the motivation for this approach increases from both the mathematical and the computer science points of view.
3. On two algebraic constructions for automata
Let $\mathbf{A}$ be a small category and denote $O = \mathrm{Obj}(\mathbf{A})$ for brevity. A collection of sets $M = \{M_a \mid a \in O\}$ is called a right $\mathbf{A}$-set if for any element $x \in M_b$ and any morphism $f \in \mathrm{Mor}_{\mathbf{A}}(a, b)$ an element $x \circ f$ in $M_a$ is preassigned so that (1) for any $a, b, c \in O$, $x \in M_c$, $f$ (as above), $g \in \mathrm{Mor}_{\mathbf{A}}(b, c)$, $x \circ (f \cdot g) = (x \circ g) \circ f$, and (2) $x \circ \mathrm{id}_c = x$ hold. In other words, a right $\mathbf{A}$-set is a collection of sets $M$ together with a right action $M \times \mathrm{Mor}(\mathbf{A}) \xrightarrow{\circ} M$. A family of mappings $\Phi = \{\Phi_a \mid \Phi_a : M_a \to N_a,\ a \in O\}$ is called a homomorphism of right $\mathbf{A}$-sets if the condition $(x \circ f)\Phi_a = x\Phi_b \circ f$ holds for all $a, b \in O$, $x \in M_b$ and $f \in \mathrm{Mor}_{\mathbf{A}}(a, b)$. We have used here the diagrammatic notation $f \cdot g$ for the composition of morphisms in $\mathbf{A}$. Note that the case $|\mathrm{Obj}(\mathbf{A})| = 1$ is classical: just the semigroup actions appear. It is easy to see that giving a right $\mathbf{A}$-set is equivalent to giving a contravariant functor $\mathbf{A} \to \mathrm{Set}$. Left $\mathbf{A}$-sets are defined analogously; giving such an $\mathbf{A}$-set is equivalent to giving a covariant functor $\mathbf{A} \to \mathrm{Set}$, see [8]. Considering this, the following construction is rather natural.
Covariant case. Given general automata $\mathcal{A} = ((\mathbf{A} \xrightarrow{A} \mathrm{Set}); \Box)$ and $\mathcal{B} = ((\mathbf{B} \xrightarrow{B} \mathrm{Set});\ \ )$ with $A$ and $B$ covariant functors, let us define a (new) automaton $\mathcal{V}$,
$$\mathcal{V} = \mathcal{A}\,\mathrm{wr}\,\mathcal{B} = ((\mathbf{A}\,\mathrm{wr}\,\mathbf{B} \xrightarrow{A\,\mathrm{wr}\,B} \mathrm{Set});\ \ ) \tag{48}$$
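in the following way. First, we define the category $\mathbf{V} = \mathbf{A}\,\mathrm{wr}\,\mathbf{B}$ with its set of objects given by $\mathrm{Obj}(\mathbf{V}) = \{(\alpha, b) \mid b \in \mathrm{Obj}(\mathbf{B}),\ \alpha : b^B \to \mathrm{Obj}(\mathbf{A})\}$. Here the map $\alpha$ assigns to each element $m \in b^B$ some object $\alpha(m)$ in $\mathrm{Obj}(\mathbf{A})$. For any objects $(\alpha, b)$ and $(\alpha', b')$ in $\mathbf{V}$, a morphism $(\alpha, b) \xrightarrow{(\Phi, f)} (\alpha', b')$ is a pair $(\Phi, f)$ with $f$ in $\mathrm{Mor}_{\mathbf{B}}(b, b')$ and $\Phi$ a collection of morphisms in $\mathrm{Mor}(\mathbf{A})$ such that
$$\forall m \in b^B, \quad \alpha(m) \xrightarrow{\Phi(m)} \alpha'(f^B(m)).$$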
Having also a morphism $(\alpha', b') \xrightarrow{(\Psi, g)} (\alpha'', b'')$, the composition is defined by the rule
$$(\Phi, f) \cdot (\Psi, g) = (\Phi * {}^{f}\Psi,\ f \cdot g), \tag{49}$$
where the collection $\Phi * {}^{f}\Psi$ of morphisms in $\mathbf{A}$ is defined by the rule $\forall m \in b^B$, $(\Phi * {}^{f}\Psi)(m) = \Phi(m) \cdot \Psi(f^B(m))$. To show that this construction yields a category the following must be verified: (i) for any object $(\alpha, b) \in \mathrm{Obj}(\mathbf{V})$ there exists the identity morphism $e_{(\alpha,b)}$; (ii) the identity morphisms $e_{(\alpha,b)}$ are such that for any other morphism $(\alpha, b) \xrightarrow{u} (\alpha', b')$ we have $e_{(\alpha,b)} \cdot u = u = u \cdot e_{(\alpha',b')}$; and (iii) the composition of morphisms in $\mathrm{Mor}(\mathbf{V})$ is associative, i.e.
$$((\Phi, f) \cdot (\Psi, g)) \cdot (X, h) = (\Phi, f) \cdot ((\Psi, g) \cdot (X, h))$$
holds for any three morphisms where it makes sense. Condition (i) is satisfied by taking $e_{(\alpha,b)} = (I, \mathrm{id}_b)$. Indeed, as functors preserve identity morphisms, the functor $B$ takes the identity $\mathrm{id}_b \in \mathrm{Mor}(b, b)$ into the identity map $b^B \to b^B$, i.e. $(\mathrm{id}_b)^B = \mathrm{id}_{b^B}$; here $\mathrm{id}_{b^B}$ is the identity in $\mathrm{Mor}_{\mathrm{Set}}(b^B, b^B)$. Therefore, $(\mathrm{id}_b)^B(m) = \mathrm{id}_{b^B}(m) = m$ for every element $m$ in the set $b^B$. It follows that $I$ gives the needed collection of morphisms $I(m) = \mathrm{id}_{\alpha(m)} : \alpha(m) \to \alpha(m) = \alpha((\mathrm{id}_b)^B(m))$. To prove (ii) take any morphism $(\alpha, b) \xrightarrow{(\Phi, f)} (\alpha', b')$ in $\mathbf{V}$. Then
$$(I, \mathrm{id}_b) \cdot (\Phi, f) = (I * {}^{\mathrm{id}_b}\Phi,\ \mathrm{id}_b \cdot f).$$
The identity $\mathrm{id}_b \cdot f = f$ is obvious, as it holds in the category $\mathbf{B}$. So, to prove (ii) it remains to verify that $I * {}^{\mathrm{id}_b}\Phi = \Phi$. Indeed, for any $m \in b^B$,
$$(I * {}^{\mathrm{id}_b}\Phi)(m) = I(m) \cdot \Phi((\mathrm{id}_b)^B(m)) = I(m) \cdot \Phi(\mathrm{id}_{b^B}(m)) = \mathrm{id}_{\alpha(m)} \cdot \Phi(m) = \Phi(m),$$
which gives the desired equality for the first components of $(I, \mathrm{id}_b) \cdot (\Phi, f)$ and $(\Phi, f)$. Analogously it is verified that $(\Phi, f) \cdot (I', \mathrm{id}_{b'}) = (\Phi, f)$. As a result, (ii) holds for $\mathbf{V}$. Let us prove (iii) now. On the one hand we have
$$[(\Phi, f) \cdot (\Psi, g)] \cdot (X, h) = (\Phi * {}^{f}\Psi,\ f \cdot g) \cdot (X, h) = ((\Phi * {}^{f}\Psi) * {}^{f \cdot g}X,\ (f \cdot g) \cdot h).$$
On the other hand,
$$(\Phi, f) \cdot [(\Psi, g) \cdot (X, h)] = (\Phi, f) \cdot (\Psi * {}^{g}X,\ g \cdot h) = (\Phi * {}^{f}(\Psi * {}^{g}X),\ f \cdot (g \cdot h)).$$
It follows from the definition of the category $\mathbf{B}$ that $(f \cdot g) \cdot h = f \cdot (g \cdot h)$. To see that the following formula is true,
$$(\Phi * {}^{f}\Psi) * {}^{f \cdot g}X = \Phi * {}^{f}(\Psi * {}^{g}X), \tag{50}$$
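take any element $m \in b^B$ and consider the corresponding morphisms in $\mathbf{A}$ on the left and on the right side. We get
$$\begin{aligned}
[(\Phi * {}^{f}\Psi) * {}^{f \cdot g}X](m) &= [(\Phi * {}^{f}\Psi)(m)] \cdot {}^{f \cdot g}X(m) \\
&= [\Phi(m) \cdot \Psi(f^B(m))] \cdot X((f \cdot g)^B(m)) \\
&= \Phi(m) \cdot [\Psi(f^B(m)) \cdot X(g^B(f^B(m)))] \\
&= \Phi(m) \cdot [\Psi * {}^{g}X](f^B(m)) \\
&= \Phi(m) \cdot {}^{f}[\Psi * {}^{g}X](m) \\
&= (\Phi * {}^{f}[\Psi * {}^{g}X])(m)
\end{aligned}$$
for any $m \in b^B$. This implies that (50) holds and, consequently, (iii) also. Thus we have proved that $\mathbf{V}$ is a category. Let us now define the functor $A\,\mathrm{wr}\,B : \mathbf{A}\,\mathrm{wr}\,\mathbf{B} \to \mathrm{Set}$, using the functors $A$ and $B$. The rule
$$(\alpha, b)^{A\,\mathrm{wr}\,B} = \{(l, m) \mid m \in b^B,\ l \in [\alpha(m)]^A\} \tag{51}$$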
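gives $A\,\mathrm{wr}\,B$ as a mapping $\mathrm{Obj}(\mathbf{V}) \to \mathrm{Obj}(\mathrm{Set})$. For any morphism $(\Phi, f) \in \mathrm{Mor}_{\mathbf{V}}((\alpha, b), (\alpha', b'))$ define $(\Phi, f)^{A\,\mathrm{wr}\,B}$ as the map $(\Phi^A, f^B)$ in $\mathrm{Mor}_{\mathrm{Set}}((\alpha, b)^{A\,\mathrm{wr}\,B}, (\alpha', b')^{A\,\mathrm{wr}\,B})$. Let us prove that $A\,\mathrm{wr}\,B$ preserves the identity morphisms in $\mathbf{V}$. Note that $(I, \mathrm{id}_b)^{A\,\mathrm{wr}\,B} = (I^A, (\mathrm{id}_b)^B) = (I^A, \mathrm{id}_{b^B})$, where for any $m \in b^B$ we have $I^A(m) = I(m)^A$ by definition. Therefore,
$$[\alpha(m)]^A \xrightarrow{I(m)^A} [\alpha((\mathrm{id}_b)^B(m))]^A = [\alpha(\mathrm{id}_{b^B}(m))]^A = [\alpha(m)]^A,$$
i.e. $I(m)^A \in \mathrm{Mor}_{\mathrm{Set}}([\alpha(m)]^A, [\alpha(m)]^A)$. As $I(m) = \mathrm{id}_{\alpha(m)}$ in $\mathrm{Mor}(\mathbf{A})$ and the functor $A$ preserves identity morphisms in $\mathbf{A}$, we get $I(m)^A = \mathrm{id}_{[\alpha(m)]^A}$. It remains to prove that $A\,\mathrm{wr}\,B$ is compatible with the composition in $\mathbf{V}$, i.e. that
$$[(\Phi, f) \cdot (\Psi, g)]^{A\,\mathrm{wr}\,B} = (\Phi, f)^{A\,\mathrm{wr}\,B} \cdot (\Psi, g)^{A\,\mathrm{wr}\,B} \tag{52}$$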
holds. Here, on the right side of (52) the symbol $\cdot$ stands for the composition of morphisms (i.e. mappings) in $\mathrm{Set}$. The left side of (52) can be written as
$$\begin{aligned}
[(\Phi, f) \cdot (\Psi, g)]^{A\,\mathrm{wr}\,B} &= (\Phi * {}^{f}\Psi,\ f \cdot g)^{A\,\mathrm{wr}\,B} \\
&= ((\Phi * {}^{f}\Psi)^A,\ (f \cdot g)^B) \\
&= (\Phi^A * ({}^{f}\Psi)^A,\ f^B \cdot g^B) \\
&= (\Phi^A, f^B) \cdot (\Psi^A, g^B) \\
&= (\Phi, f)^{A\,\mathrm{wr}\,B} \cdot (\Psi, g)^{A\,\mathrm{wr}\,B}.
\end{aligned}$$
Here, the composition of the pairs $(\Phi^A, f^B)$ and $(\Psi^A, g^B)$ with the collections $\Phi^A$ and $({}^{f}\Psi)^A$ as their first components goes shifted as indicated above. It follows that $A\,\mathrm{wr}\,B$ is
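a covariant functor and so $\mathbf{V} \xrightarrow{A\,\mathrm{wr}\,B} \mathrm{Set}$ is a labelled category. Note also that, given $\mathbf{A} \xrightarrow{A} \mathrm{Set}$, $\mathbf{B} \xrightarrow{B} \mathrm{Set}$ and $\mathbf{C} \xrightarrow{C} \mathrm{Set}$, it follows that
$$(\mathbf{A}\,\mathrm{wr}\,\mathbf{B})\,\mathrm{wr}\,\mathbf{C} \xrightarrow{(A\,\mathrm{wr}\,B)\,\mathrm{wr}\,C} \mathrm{Set} \quad\text{and}\quad \mathbf{A}\,\mathrm{wr}\,(\mathbf{B}\,\mathrm{wr}\,\mathbf{C}) \xrightarrow{A\,\mathrm{wr}\,(B\,\mathrm{wr}\,C)} \mathrm{Set}$$
are isomorphic, see [24]. Given any morphism $(\Phi, f) \in \mathrm{Mor}_{\mathbf{V}}((\alpha, b), (\alpha', b'))$ and any element $(l, m) \in [(\alpha, b)]^{A\,\mathrm{wr}\,B}$, let us define
$$(l, m)\ \ (\Phi, f) = (l \,\Box\, \Phi(m),\ m\ \ f). \tag{53}$$
To prove that $\mathcal{V} = (\mathbf{V} \xrightarrow{A\,\mathrm{wr}\,B} \mathrm{Set};\ \ )$ is a general automaton, it remains to verify that
$$(l, m)\ \ ((\Phi, f) \cdot (\Psi, g)) = (((\Phi, f)^{A\,\mathrm{wr}\,B})(l, m))\ \ (\Psi, g) \tag{54}$$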
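for all elements $(l, m)$ in $[(\alpha, b)]^{A\,\mathrm{wr}\,B}$. On the left side of (54),
$$\begin{aligned}
(l, m)\ \ ((\Phi, f) \cdot (\Psi, g)) &= (l, m)\ \ (\Phi * {}^{f}\Psi,\ f \cdot g) \\
&= (l \,\Box\, (\Phi * {}^{f}\Psi)(m),\ m\ \ (f \cdot g)) \\
&= (l \,\Box\, (\Phi(m) \cdot \Psi(f^B(m))),\ m\ \ (f \cdot g)) \\
&= (\Phi(m)^A(l) \,\Box\, \Psi(f^B(m)),\ f^B(m)\ \ g).
\end{aligned}$$
On the right side of (54),
$$\begin{aligned}
((\Phi, f)^{A\,\mathrm{wr}\,B}(l, m))\ \ (\Psi, g) &= ((\Phi^A, f^B)(l, m))\ \ (\Psi, g) \\
&= (\Phi(m)^A(l),\ f^B(m))\ \ (\Psi, g) \\
&= (\Phi(m)^A(l) \,\Box\, \Psi(f^B(m)),\ f^B(m)\ \ g).
\end{aligned}$$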
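The results on both sides are the same and so (54) follows. As a result, it is proved now that a new automaton $\mathcal{V}$ appears. Let us call $\mathcal{V}$ the wreath product of $\mathcal{A}$ and $\mathcal{B}$.
Contravariant case. Given two general automata $\mathcal{A}$ and $\mathcal{B}$ with $A$ and $B$ being contravariant functors this time, it is possible to modify the above ’covariant construction’ to be applicable here. This can be achieved by considering $A : \mathbf{A}^{\mathrm{op}} \to \mathrm{Set}$ (with $A^f = A^{f^{\mathrm{op}}}$) and $B : \mathbf{B}^{\mathrm{op}} \to \mathrm{Set}$ and thereafter taking
$$(\mathbf{A}^{\mathrm{op}}\,\mathrm{wr}\,\mathbf{B}^{\mathrm{op}})^{\mathrm{op}} \xrightarrow{A\,\mathrm{wr}\,B} \mathrm{Set}.$$
However, to avoid isomorphisms of the type $(\mathbf{A}^{\mathrm{op}})^{\mathrm{op}} \cong \mathbf{A}$, and likewise for the functors, it is useful to give a direct construction for $\mathcal{A}\,\mathrm{wr}\,\mathcal{B}$ in the contravariant case as well. The sets of objects for $\mathbf{V} = \mathbf{A}\,\mathrm{wr}\,\mathbf{B}$ with the functors $A$ and $B$ contravariant are the same as in the covariant case. Yet a morphism $(\alpha, b) \xrightarrow{(\Phi, f)} (\alpha', b')$ is now defined as a pair $(\Phi, f)$ with $f \in \mathrm{Mor}_{\mathbf{B}}(b, b')$ and $\Phi$ being some (fixed) collection of morphisms (in $\mathbf{A}$)
$$\forall m \in b'^{B}, \quad \alpha'(m) \xrightarrow{\Phi(m)} \alpha(f^B(m)).$$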
These morphisms are composed by the rule: having
$$(\alpha, b) \xrightarrow{(\Phi, f)} (\alpha', b') \xrightarrow{(\Psi, g)} (\alpha'', b'')$$
we define
$$(\Phi, f) \cdot (\Psi, g) = (\Psi * {}^{g}\Phi,\ f \cdot g), \tag{49'}$$
where
$$\forall m \in b''^{B}, \quad (\Psi * {}^{g}\Phi)(m) = \Psi(m) \cdot \Phi(g^B(m)).$$
The situation is illustrated by the following figure.
[Figure: a diagram over Obj B (objects b, b′, b″ with morphisms f and g) and Obj A (objects α″(m), α′(g^B(m)), α(f^B(g^B(m))) joined by the morphisms Ψ(m) and Φ), with the domains of attributes containing m, g^B(m) and (f g)^B(m) in between.]
Fig. 9: Composition of morphisms (Φ, f) and (Ψ, g).
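The definition of the functor $A\,\mathrm{wr}\,B : \mathbf{A}\,\mathrm{wr}\,\mathbf{B} \to \mathrm{Set}$ is just the same as in the covariant case, i.e. (51’) ≡ (51). Likewise, we define $(\Phi, f)^{A\,\mathrm{wr}\,B} = (\Phi^A, f^B)$. A verification shows that
$$[(\Phi, f) \cdot (\Psi, g)]^{A\,\mathrm{wr}\,B} = (\Psi, g)^{A\,\mathrm{wr}\,B} \cdot (\Phi, f)^{A\,\mathrm{wr}\,B}, \tag{52'}$$
i.e. $A\,\mathrm{wr}\,B$ is a contravariant functor as well. In the contravariant case the feedback operations $\Box$ and $\ \ $ (for $\mathcal{A}$ and $\mathcal{B}$, correspondingly) are defined by the conditions $x \,\Box\, (h \cdot k) = k^A(x) \,\Box\, h$ and $y\ \ (f \cdot g) = g^B(y)\ \ f$ for all elements $x \in [\mathrm{codom}\, k]^A$, $y \in [\mathrm{codom}\, g]^B$ and for all diagrams $a \xrightarrow{h} a' \xrightarrow{k} a''$ in $\mathbf{A}$ and $b \xrightarrow{f} b' \xrightarrow{g} b''$ in $\mathbf{B}$. Therefore, defining
$$\forall \tilde m \in b'^{B},\ \tilde l \in [\alpha'(\tilde m)]^A, \quad (\tilde l, \tilde m)\ \ (\Phi, f) = (\tilde l \,\Box\, \Phi(\tilde m),\ \tilde m\ \ f), \tag{53'}$$
it holds
$$\forall m \in b''^{B},\ l \in [\alpha''(m)]^A, \quad (l, m)\ \ ((\Phi, f) \cdot (\Psi, g)) = ((\Psi, g)^{A\,\mathrm{wr}\,B}(l, m))\ \ (\Phi, f).$$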
Summarizing the above modifications (49’)–(53’) we get the wreath product construction for the contravariant case. Note that in the contravariant case there is another possibility for introducing the wreath product construction, which will be treated elsewhere.
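3.4. Examples and Further Motivation
EXAMPLE 3.1. Consider the special case of an $\mathbf{A}$-set $M$ and a $\mathbf{B}$-set $N$ with $\mathrm{Obj}(\mathbf{A}) = \{a\}$ and $\mathrm{Obj}(\mathbf{B}) = \{b\}$; i.e., $|\mathrm{Obj}(\mathbf{A})| = 1 = |\mathrm{Obj}(\mathbf{B})|$. It follows that there is a single object $(\alpha, b)$ of $\mathbf{A}\,\mathrm{wr}\,\mathbf{B}$ as well, with $\alpha : N \to \{a\}$ being the constant function. The set $\mathrm{Mor}(\mathbf{A}\,\mathrm{wr}\,\mathbf{B})$ consists of the pairs $(\Phi, \sigma)$ with $\sigma \in \Sigma_2 = \mathrm{End}(b)$ and $\Phi(y) \in \Sigma_1 = \mathrm{End}(a)$ for every $y \in N$; their composition is given by the rule (49), therefore
$$(\Phi_1, \sigma_1) \cdot (\Phi_2, \sigma_2) = (\Phi_1 \cdot {}^{\sigma_1}\Phi_2,\ \sigma_1\sigma_2). \tag{55}$$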
Further, having the functors $A$ and $B$ given by $a^A = M$ and $b^B = N$, together with some mappings $M \to M$ and $N \to N$ replacing the endomorphisms of $a$ and $b$ correspondingly, it follows from (51) that $(\alpha, b)^{A\,\mathrm{wr}\,B} = M \times N$. The formula $(\Phi, \sigma)^{A\,\mathrm{wr}\,B} = (\Phi^A, \sigma^B)$ gives the action of $\mathrm{Mor}(\mathbf{A}\,\mathrm{wr}\,\mathbf{B})$ on $M \times N$ by the rule
$$(x, y) \circ (\Phi, \sigma) = (\Phi(y)^A(x),\ \sigma^B(y)) \tag{56}$$
for all $x \in M$ and $y \in N$. As a result we get the wreath product of the two monoid actions $(M, \Sigma_1)$ and $(N, \Sigma_2)$ as given by the following
DEFINITION 3.4. Let $(M, \Sigma_1)$ and $(N, \Sigma_2)$ be any two monoid actions. Take the set $\Sigma_1^N$ of all functions from $N$ to $\Sigma_1$ together with their pointwise multiplication. Thereafter, take the set $\Gamma = \Sigma_1^N \times \Sigma_2$ with the multiplication on it given by (55). So the monoid $\Gamma$ appears together with its action on $M \times N$ given by (56). This resulting action is called the wreath product of the given two actions and is denoted by $(M, \Sigma_1)\,\mathrm{wr}\,(N, \Sigma_2)$.
Going one step further, let us think of the feedback operations $\Box$ and $\ \ $ given by (46) and (47) for the systems $\mathcal{A}_1 = (\mathbf{A} \xrightarrow{A} \mathrm{Set}; \Box)$ and $\mathcal{A}_2 = (\mathbf{B} \xrightarrow{B} \mathrm{Set};\ \ )$ as “output operations” $*^{(i)}$, changing for this the codomain in (46) to some “output set” $Y_i$ for $*^{(i)}$ (i = or ). Then (53) and (54) imply that the wreath product $\mathcal{A}_1\,\mathrm{wr}\,\mathcal{A}_2$ in the sense of Section 3.3 for one-object categories $\mathbf{A}$ and $\mathbf{B}$ (with the functors $A$ and $B$ as above) gives us precisely the wreath product of the monoid automata $\mathcal{A}_1 = (M, \Sigma_1, Y_1; \circ, *)$ and $\mathcal{A}_2 = (N, \Sigma_2, Y_2; \circ, *)$ as defined by the following.
DEFINITION. Let $\mathcal{A}_1 = (M, \Sigma_1, Y_1; \circ, *)$ and $\mathcal{A}_2 = (N, \Sigma_2, Y_2; \circ, *)$ be any two semigroup automata. Take the wreath product $(M \times N, \Gamma; \bullet)$ of the actions $(M, \Sigma_1; \circ)$ and $(N, \Sigma_2; \circ)$, with $\Gamma$ being the semigroup $\Sigma_1^N \leftthreetimes \Sigma_2$. Define the “output action” of $\Gamma$ on $M \times N$ by the rule $(x, y) * (\Phi, \sigma) = (x * \Phi(y),\ y * \sigma)$. The condition $(x, y) * ((\Phi_1, \sigma_1) \cdot (\Phi_2, \sigma_2)) = ((x, y) \bullet (\Phi_1, \sigma_1)) * (\Phi_2, \sigma_2)$ is satisfied. So the semigroup automaton $\mathcal{A} = (M \times N, \Gamma, Y_1 \times Y_2; \bullet, *)$ appears, which is called the wreath product of the semigroup automata $\mathcal{A}_1$ and $\mathcal{A}_2$.
It is easy to see that for any monoid automata $\mathcal{A}_1$ and $\mathcal{A}_2$ their wreath product, as given by the latter definition, can be realized as a wreath product of suitable systems $\mathcal{A} = (\mathbf{A} \xrightarrow{A} \mathrm{Set}; \Box)$ with $|\mathrm{Obj}(\mathbf{A})| = 1$.
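The multiplication (55) and the action (56) can be checked mechanically in a small case. The sketch below is an illustration under stated assumptions, not code from the source: elements of $\Sigma_1$ and $\Sigma_2$ are modelled as transformations of M and N stored as tuples (so that $x \circ s$ is `s[x]`), composition is diagrammatic as in the text, and the helper names `compose`, `wr_mul` and `wr_act` are hypothetical.

```python
from itertools import product


def compose(s, t):
    """Diagrammatic composition of transformations stored as tuples: first s, then t."""
    return tuple(t[s[x]] for x in range(len(s)))


def wr_mul(p, q):
    """Multiplication (55): (Phi1, s1) . (Phi2, s2) = (Phi1 . s1Phi2, s1 s2),
    where (s1Phi2)(y) = Phi2(y o s1)."""
    phi1, s1 = p
    phi2, s2 = q
    phi = tuple(compose(phi1[y], phi2[s1[y]]) for y in range(len(s1)))
    return (phi, compose(s1, s2))


def wr_act(point, p):
    """Action (56): (x, y) o (Phi, s) = (x o Phi(y), y o s)."""
    x, y = point
    phi, s = p
    return (phi[y][x], s[y])


# Assumed toy data: M = N = {0, 1}, with all self-maps as the acting monoids.
M, N = range(2), range(2)
sigma1 = list(product(M, repeat=2))  # all maps M -> M
sigma2 = list(product(N, repeat=2))  # all maps N -> N
gamma = [(phi, s) for phi in product(sigma1, repeat=2) for s in sigma2]

# The action must be compatible with the multiplication of gamma.
compatible = all(
    wr_act((x, y), wr_mul(p, q)) == wr_act(wr_act((x, y), p), q)
    for p in gamma for q in gamma for x in M for y in N
)
print(len(gamma), compatible)  # 64 True
```

Storing $\Phi$ as a tuple indexed by $y \in N$ keeps the "shift" by $\sigma_1$ in (55) visible as the index `s1[y]`.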
EXAMPLE 3.2. According to [22], a rewriting system (also called a semi-Thue system) is a pair $R = (X; P)$, where $X$ is an alphabet and $P$ is a finite set of ordered pairs of words in $X^+$. The elements $(q, p) \in P$ are referred to as rewriting rules (productions); their components, together with an axiom $s \in X^+$, we call primitives. If words $u$ and $v$ in $X^+$ are such that $u = w_1 q w_2$ and $v = w_1 p w_2$, then it is said that $u$ yields $v$ directly. A derivation in $R$ is a finite sequence of words $u_0 = u, u_1, \dots, u_n = v$ such that $u_i$ yields $u_{i+1}$ directly, $0 \le i \le n - 1$; in such a case the derivation is denoted by $u \to v$ and it is said that $u$ yields $v$ in $R$.
A rewriting system $R$ generates several categories. First, the category $\mathbf{P}$ of primitives for $R$: its objects are all primitives, and $\mathrm{Mor}(\mathbf{P})$ consists of all primitive derivations, i.e. derivations of the type $p \to p'$ with $p$ and $p'$ primitives. When an axiom $s \in X^+$ is fixed, $s$ is considered a primitive also. In this case, taking for every primitive $p$ the set $D^*(p)$ of all primitive derivations from $s$ to $p$, we can define a functor $D^* : \mathbf{P} \to \mathrm{Set}$ as well. In this way the labelled category $(\mathbf{P} \xrightarrow{D^*} \mathrm{Set})$ appears. Second, the linguistic category $\mathbf{L}$ for $R$: the objects of $\mathbf{L}$ are all words in $X^*$, and its morphisms are all the derivations in $R$. Third, we get the syntactic category $\mathbf{B}$ for $R$ by taking $\mathrm{Obj}(\mathbf{B}) = \mathrm{Obj}(\mathbf{L})$ and assuming that $\mathrm{Mor}_{\mathbf{B}}(u, v) = \emptyset$ if there exists in $\mathbf{L}$ a derivation $s \to u$. To have a covariant functor $D : \mathbf{B} \to \mathrm{Set}$ one can define for every $u \in \mathrm{Obj}(\mathbf{L})$ the set $u^D$ as the set of all derivations $s \to u$. Also, for every derivation $u \xrightarrow{f} v$, it is possible to define the map $f^D : u^D \to v^D$ by the rule $h \mapsto h \cdot f$, for any $h \in u^D$. So the labelled category $(\mathbf{B} \xrightarrow{D} \mathrm{Set})$ appears. Fourth, the semantic pair $(\mathbf{B}, B)$ for $R$ appears; see [2]. For any letter $x \in X$ take a nonempty set $x^B$ so that these sets are pairwise disjoint and $(wx)^B = w^B \times x^B$ holds for every letter $x \in X$ and every word $w$ in $X^+$. Taking some maps $f^B : p^B \to q^B$ in all cases where $q$ yields $p$ directly, with $p$ and $q$ primitives, let us extend this action of $B$ to all productions $w_1 q w_2 \xrightarrow{f} w_1 p w_2$ by taking $f^B : w_1^B \times p^B \times w_2^B \to w_1^B \times q^B \times w_2^B$, where $f^B$ is considered to be the identity map on the corresponding contexts. Functions from $u^B$ to $s^B$ for any derivation $s \to u$ are called interpretations; their images in $s^B$ are called values of $u$. As a result, a contravariant “interpreting functor” $B : \mathbf{B} \to \mathrm{Set}$ appears.
Consider the full subcategory $(\mathbf{P}\,\mathrm{wr}\,\mathbf{B} \xrightarrow{D^*\,\mathrm{wr}\,D} \mathrm{Set})$ generated by the objects $(\alpha, u)$ in $\mathbf{P}\,\mathrm{wr}\,\mathbf{B}$, where $\alpha : u^D \to \mathrm{Obj}(\mathbf{P})$ is any constant function such that $\alpha(u^D) = p$ with $p$ primitive and $p$ a subword of $u$. Any such function $\alpha$ can be regarded as choosing some primitive subword $\alpha(u)$ in $u$. This means that there exists a context $(w_1, w_2) \in X^* \times X^*$ such that $u = w_1 \alpha(u) w_2$. Therefore, to take such an object $(\alpha, u)$ of $\mathbf{P}\,\mathrm{wr}\,\mathbf{B}$ means that we take a word $u \in X^*$ together with a chosen primitive $\alpha(u)$ in it. Consequently, having a morphism $(\alpha, u) \xrightarrow{(\Phi, f)} (\alpha', v)$ in this subcategory of $\mathbf{P}\,\mathrm{wr}\,\mathbf{B}$ means that there exists a derivation $u \xrightarrow{f} v$ in $R$ which begins with $u_0 = u = w_1 \alpha(u) w_2$ and ends with $u_n = v = \tilde w_1 \alpha'(v) \tilde w_2$; here $\alpha(u)$ and $\alpha'(v)$ are primitives and $w_i$, $\tilde w_i$ are some words in $X^*$. In other words, for any derivation $u \xrightarrow{f} v$ there is chosen a
companion primitive derivation $\alpha(u) \xrightarrow{\Phi(u)} \alpha'(v)$ also. This argument shows that any semi-Thue system $R$ together with its derivations can be naturally considered as a full subcategory in $\mathbf{P}\,\mathrm{wr}\,\mathbf{B}$. Treating semi-Thue systems in this way allows one to simplify, at least to some extent, the rather complicated notion of transition adopted in [2]. Though the idea of using categories as a proper framework for derivations is not original with us, it seems to have gone unnoticed that a rewriting system can be treated as a wreath product of suitable categories.
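The notions "u yields v directly" and "u yields v in R" from this example can be sketched in a few lines. The code below is a hedged illustration only: words are Python strings, a rule (q, p) replaces one occurrence of q by p, and the function names `direct_successors` and `yields`, as well as the step bound `max_steps`, are assumptions introduced here.

```python
def direct_successors(u, rules):
    """All words v such that u yields v directly, i.e. u = w1 q w2 and
    v = w1 p w2 for some rule (q, p)."""
    out = set()
    for q, p in rules:
        start = u.find(q)
        while start != -1:
            out.add(u[:start] + p + u[start + len(q):])
            start = u.find(q, start + 1)
    return out


def yields(u, v, rules, max_steps):
    """True if there is a derivation u -> v with at most max_steps direct steps."""
    frontier = {u}
    for _ in range(max_steps + 1):
        if v in frontier:
            return True
        frontier = set().union(*(direct_successors(w, rules) for w in frontier))
    return False


# Hypothetical one-rule system over the alphabet {a, b}: ab -> ba.
rules = [("ab", "ba")]
print(direct_successors("aab", rules))  # {'aba'}
print(yields("aab", "baa", rules, 2))   # True
```

A morphism of the linguistic category corresponds to one particular derivation, whereas `yields` only decides whether some derivation of bounded length exists.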
EXAMPLE 3.3. The transition system is a fundamental model of computation in Computer Science [25]. Such a system $J$ is a structure of the type $(S, i, L; T)$, where $S$ is a set of states with a fixed initial state $i$, $L$ is a set of labels and $T \subseteq S \times L \times S$ is the transition relation. Let us denote the transition $(s, l, s') \in T$ by $s \xrightarrow{l} s'$. Note that no two distinct transitions with the same pre- and post-states have the same label. Every transition system $J$ generates a category $\mathbf{T} = (S, i, L^*; T^*)$, quite analogously to the way a finite automaton generates the corresponding free semigroup automaton. Namely, define $(s, v, s') \in T^*$ if $v = l_1 \dots l_{n-1} \in L^*$ is such that there exists a path $s = s_1, s_2, \dots, s_n = s'$ with $(s_i, l_i, s_{i+1}) \in T$ for all $i = 1, \dots, n - 1$; call $s \xrightarrow{v} s'$ an extended transition from $s$ to $s'$. A category $\mathbf{T}$ appears: there is a natural (associative) composition of extended transitions, and the label $1 \in L^*$ (usually denoted $*$) provides the “idle” transitions $\mathrm{id}_s : s \xrightarrow{*} s$. A transition system in which there exist paths from the initial state to all other states is called a reachable transition system. For a reachable transition system $J$ the following covariant functor $T : \mathbf{T} \to \mathrm{Set}$ can be defined. Namely, the rule
$$T(s) = \{w \mid w \in L^*,\ \exists (i \xrightarrow{w} s) \in T^*\}$$
gives the functor on the objects of $\mathbf{T}$. Further, let us define the functor $T$ on $\mathrm{Mor}(\mathbf{T})$ as follows. For any given extended transition $s \xrightarrow{w} s'$, the mapping $T(s \xrightarrow{w} s')$ transforms any transition $(i \xrightarrow{v} s) \in T(s)$ into the transition $(i \xrightarrow{vw} s') \in T(s')$. A verification shows that in this way we get a covariant functor $T : \mathbf{T} \to \mathrm{Set}$. We call $(\mathbf{T} \xrightarrow{T} \mathrm{Set})$ a transition category. Let $J_j = (S_j, i_j, L_j^*; T_j)$ ($j = 1, 2$) be any two reachable transition systems and let $(\mathbf{T}_j \xrightarrow{T_j} \mathrm{Set})$ be the corresponding transition categories. Taking their wreath product $(\mathbf{T}_1\,\mathrm{wr}\,\mathbf{T}_2 \xrightarrow{T_1\,\mathrm{wr}\,T_2} \mathrm{Set})$ and thereafter the full subcategory $\mathbf{F}$ in it generated by the objects $(\alpha, b)$ such that $\alpha : b^{T_2} \to \mathrm{Obj}(\mathbf{T}_1) = S_1$ is a constant function, we get just the product $J_1 \times J_2$ of the transition systems as defined in [25]. It is instructive to reconsider from the point of view of wreath products the example of the product of the transition systems $S_1$ (with $L_1 = \{a\}$) and $S_2$ (with $L_2 = \{b\}$) as given in [25]; we drop the details. Note also that in the special case where $J_1 = (S, i, L'; T')$ and $J_2 = (S, i, L; T)$ are given together with some inclusion map $\lambda : L \to L'$, the full subcategory $\mathbf{F}$ supplies a natural interpretation for the restriction morphism $(\lambda^*, 1_S) : J_1|_L \to J_2$.
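The functor T of this example sends a state s to the set of label words along extended transitions from the initial state to s. The following sketch computes a finite approximation of these sets; it is an illustration only, with words represented as tuples of labels, the bound `max_len` on the word length being an assumption needed to keep the sets finite, and the name `words_to` being hypothetical.

```python
def words_to(trans, initial, max_len):
    """Map each reachable state s to the set of label words w (as tuples,
    of length <= max_len) with an extended transition initial --w--> s."""
    reach = {initial: {()}}          # the empty word labels the idle transition
    layer = {(initial, ())}
    for _ in range(max_len):
        nxt = set()
        for s, w in layer:
            for (p, l, q) in trans:
                if p == s:
                    nxt.add((q, w + (l,)))
        for q, w in nxt:
            reach.setdefault(q, set()).add(w)
        layer = nxt
    return reach


# Hypothetical system: i --a--> s, and a loop s --b--> s.
trans = {("i", "a", "s"), ("s", "b", "s")}
reach = words_to(trans, "i", 3)
print(sorted(reach["s"]))  # [('a',), ('a', 'b'), ('a', 'b', 'b')]
```

On morphisms the functor acts by concatenation: for the loop labelled b, every word v in T(s) is sent to v followed by b, which again lies in T(s) as long as the length bound is respected.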
3.5. Discussion
It is not surprising that the categories developed for automata also appear suitable for modelling transition systems. Transition systems generalize the automata model of computation by enabling the use of values of state variables from previously visited states. Such feedback information, together with input/output primitives attached to transitions, makes them a useful operational model of distributed and reactive systems. Conventional transition systems use common memory to store state variables. However, real memories are more complicated and often seem to be assembled from several “pieces”. Moreover, according to an observation by N. G. de Bruijn [3], there exists a kind of buffer keeping track of the most recent information, including information very recently retrieved from the memory. This buffer can be compared to what is called the active window in a text editing system. According to de Bruijn’s idea, there is no fixed window through which information flows, but rather a moving window that floats over information. Distributed memory is used by another generalization of state transition systems, called attributed automata [17]. Namely, an attributed automaton, as defined in [17], is a transition network $A = (S, L)$, where $S$ is a set of states together with its subsets $S_i$ (initial states) and $S_f$ (final states); to every state $s \in S$ a variable (an attribute $a_s$ with its domain $T_s$ of values) is attached; $L \subseteq S \times S$ is the set of transitions of $A$, and for every transition $(s, l, s') \in S \times L \times S$ a map $f_l : T_s \to T_{s'}$ and an “allowing” predicate $P_{(s \xrightarrow{l} s')} : T_s \to \mathrm{bool}$ are given. To illustrate this, we can represent an attributed automaton as a transition graph with nodes corresponding to states and arcs to transitions. The states are labelled by the associated attributes, the arcs by enabling predicates and transformation functions, as is done in Fig. 10.
[Figure: transition graph with the attributes x, y, z, t at the nodes and arcs labelled by the enabling predicates P1, …, P6 and the transformation functions f, g, h, m, o, s.]
Fig. 10: Transition graph for A
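A run of an attributed automaton, as described above, can be simulated with a short sketch. Everything concrete here is an assumption made for illustration: transitions are stored as tuples (source, target, predicate, function), the first enabled transition is always taken (a deterministic simplification that the definition in [17] does not require), and the name `run` and the countdown example are hypothetical.

```python
def run(transitions, state, value, final_states, max_steps=100):
    """Follow the first enabled transition from each state until a final
    state is reached, no transition is enabled, or max_steps is exhausted.
    Returns the list of (state, attribute value) pairs visited."""
    trace = [(state, value)]
    for _ in range(max_steps):
        if state in final_states:
            break
        for source, target, pred, func in transitions:
            if source == state and pred(value):
                state, value = target, func(value)
                trace.append((state, value))
                break
        else:
            break  # no transition enabled: the run is stuck
    return trace


# Hypothetical automaton: count the attribute down to 0 in state "x", then move to "z".
transitions = [
    ("x", "x", lambda v: v > 0, lambda v: v - 1),
    ("x", "z", lambda v: v == 0, lambda v: "done"),
]
print(run(transitions, "x", 2, {"z"}))
# [('x', 2), ('x', 1), ('x', 0), ('z', 'done')]
```

Each transition function maps the attribute domain of its source state to that of its target state, which is why the value may change type when the run enters "z".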
In the framework of the attributed automata model we can consider a traditional finite automaton as a special case. Automata of this kind have the same attribute domain (the finite input and output alphabets) for all states, with appropriate specific operations [17]. Moreover, a finite automaton can be presented as an attributed automaton with one
state [18]. On the other hand, an attributed automaton with finite domains of attributes can be considered as a finite automaton [12]. The notion of a transition category, when modified appropriately (bool instead of Set) and with the □-operation added, seems to cover some main features of attributed automata. Systems of this type are also close to the context in [4] concerning questions of modelling distributed systems. The idea of a “moving window” seems to have some link with the choice of the functions $\alpha : b^B \to \mathrm{Obj}(\mathbf{A})$ in the abstract scheme $(\mathbf{A}\,\mathrm{wr}\,\mathbf{B} \xrightarrow{A\,\mathrm{wr}\,B} \mathrm{Set}; \Box)$. Information technology requires models of software and hardware systems that allow one to specify the behavior of the systems and to synthesize, analyze and verify complex systems. Compositional techniques are expected to be the most natural to use here, including those for attributed automata. Systems of type $(\mathbf{A} \to \mathrm{Set}; \Box)$ and their wreath products seem to provide a good theoretical framework for further study of compositions of attributed models of computation.
References
[1] A. Asperti and G. Longo. Categories, Types and Structures: An introduction to Category Theory for the working computer scientist. M.I.T. Press, Cambridge, MA, 1991.
[2] D. Benson. Syntax and semantics: a categorical view. Information and Control 17, 1970, 145–160.
[3] N. G. de Bruijn. A model for information processing in human memory and consciousness. In: Nieuw archief voor wiskunde, ser. 4, Vol. 12, 1994, 35–48.
[4] K. M. Chandy and L. Lamport. Distributed snapshots: determining global states of distributed systems. ACM Transactions on Computer Systems 3 (1), 1985, 63–75.
[5] S. Eilenberg. Automata, Languages and Machines, vol. B. Academic Press, N.Y., 1976.
[6] S. I. Gelfand and Yu. I. Manin. Methods of Homological Algebra, vol. I. Nauka, Moscow, 1988.
[7] A. Grothendieck. The cohomology theory of abstract algebraic varieties. In: Proc. Internat. Congress Math. Oxford University Press, Edinburgh, 1958, 103–118.
[8] M. Hasse and L. Michler. Theorie der Kategorien. VEB Deutscher Verlag der Wissenschaften, Berlin, 1960.
[9] J. Hill and K. Clarke. An introduction to category theory, category theory monads, and their relationship to functional programming. Technical Report QMW-DCS-681. Department of Computer Science, Queen Mary & Westfield College, 1994.
[10] G. Hotz. Eindeutigkeit und Mehrdeutigkeit Formaler Sprachen. Elektronische Informationsverarbeitung und Kybernetik 2 (4), 1966, 235–247.
[11] U. Kaljulaid. Right ordered groups and their representations, 1996, 1–42, manuscript.
[12] U. Kaljulaid, M. Meriste, and J. Penjam. Algebraic theory of tape-controlled attributed automata. Technical Report CS59/93. Institute of Cybernetics, 1993. (see [K93a]).
[13] L. Kari. DNA computers, Tomorrow’s reality. In: Bulletin EATCS, Vol. 59, 1996, 256–266.
[14] G. M. Kelly. On clubs and doctrines. In: Lecture Notes in Mathematics, Vol. 420. Springer-Verlag, 1974, 181–256.
[15] D. Knuth. Semantics of context-free languages. Math. Systems Theory 2, 1968, 127–146.
[16] M. Lipponen and A. Salomaa. Simple words in equality sets. In: Bulletin EATCS, Vol. 60, 1996, 123–143.
[17] M. Meriste and J. Penjam. Attributed Models of Computing. Proc. Estonian Acad. Sci. Engin. 2 (1), 1995, 139–157.
[18] M. Meriste and V. Vene. Attributed Automata and Language Recognizers. In: Proc. of the Fourth Symposium on Programming Languages and Software Tools, Vol. 420, 1995, 114–121.
[19] D. Mumford. Picard groups and moduli problems. In: Arithmetical Algebraic Geometry (Proc. Conf. Purdue Univ., 1963). Harper & Row, N.Y., 1965, 33–81.
[20] O. Nava and G.-C. Rota. Plethysm, categories, and combinatorics. Advances in Math. 58, 1985, 61–88.
[21] B. I. Plotkin. Universal algebra, algebraic logic and data bases. Nauka, Moscow, 1991.
[22] G. Rozenberg and A. Salomaa. Cornerstones of Undecidability. Prentice-Hall, New York, London, 1994.
[23] J.-P. Serre. Faisceaux algébriques cohérents. Ann. of Math. 61, 1955, 197–278.
[24] C. Wells. A Krohn-Rhodes theorem for categories. J. of Algebra 64, 1980, 37–45.
[25] G. Winskel and M. Nielsen. Models for concurrency. In: Handbook of logic in computer science (vol. 4): semantic modelling. Oxford University Press, 1995, 1–148.
4. [K98c] Revisiting wreath products, with applications to representations and invariants
Comments by the Editors
Wreath product constructions (wrpc) for systems of the type $(\mathbf{A} \xrightarrow{A} U;\ \ )$ or $(\mathbf{A} \xrightarrow{A} U)$ are considered. Here, basically, $\mathbf{A}$ is a (small) category, $U = \mathrm{Set}$, $A$ is a (covariant) functor from $\mathbf{A}$ to $U$, and the second component of the system is a partial feedback operation defined on the morphisms and some local elements of $A$. The groupoid and monoid cases are also treated; for the last case $U = \mathrm{Map}(X)$, the (strict) monoidal category of mappings $X^m \to X^n$ ($n, m \in \mathbb{N}_0$). Wreath products of Grothendieck (pre)topologies $\mathbf{A}$ and wrpc for presheaves and sheaves on them are considered as well. Some general theorems are proved, initially motivated by the case of Aleksandrov and Björner topologies. As an application, a new look at Petri nets appears. In this way a unified picture appears for numerous results on classical wrpc, including those for acts, o-acts, actions on o-sets, for (linear) representations of (semi)groups and algebras, and also for distributive Ω-semigroups and for (attributed) algebraic automata. Some results close to this approach and concerning invariants of varieties of k-algebras (char k = 0) are presented. Using results of the author on the arithmetic of varieties of (associative) algebras and some properties of Grothendieck rings, both new proofs and extensions of some earlier results by E. Formanek and R. Stanley on noncommutative invariants and Hilbert functions of graded algebras are obtained. As the “arithmetical component” here is characteristic-free, this raises some hope for a future q-extension of this approach, using the Lusztig Conjecture instead of Grothendieck rings. [1], [2], [4], [5], [6]
Comments. This short note is about the last publication appearing in Uno Kaljulaid’s lifetime; it was written for the Kurosh Algebraic Conference in 1998. We do not know if he then knew that he was soon to die, but in a way it constitutes a kind of mathematical will, expressing his ideas in the realm of generalized automata. Some of it refers to the contents of [3], printed here for the first time. But the author also indicates applications to Petri nets. Some material about this can be found in his Heritage (see Preface and the chapter Bibliography of this Volume). Perhaps we will publish this separately on a future occasion. The Editors
References
[1] E. Formanek. Invariants and the ring of generic matrices. J. Algebra 89, 1984, 178–222.
[2] S. I. Gel’fand and Yu. I. Manin. Methods of homological algebra. Vol. 1, Introduction to the theory of cohomology, and derived categories. Nauka, Moscow, 1988. English translation: Springer Monographs in Mathematics, Second edition, Springer-Verlag, Berlin, 2003.
[3] U. Kaljulaid and J. Penjam. On two algebraic constructions for automata. Technical Report CS92/97. Institute of Cybernetics, 1997. (see [K97a]).
[4] V. Lychagin. Braided differential operators and quantization in ABC-categories. In: Comptes Rendus Acad. Sci., Ser. 1, Vol. 318, Paris, 1994, 857–862.
[5] O. Nava and G.-C. Rota. Plethysm, categories, and combinatorics. Adv. Math. 58, 1985, 61–88.
[6] I. R. Shafarevich. Basic notions of algebra. VINITI, Moscow, 1986. English translation: Springer, Berlin et al., 1997.
CHAPTER III Majorization
1. Generalized majorization
Fragment. Coauthor J. Peetre
Preamble (by J. Peetre). Sections 1.1–1.2, 1.4 constitute a fragment of a larger, unfinished joint paper. The loose Section 1.3 was added by me later (1994). It replaces a less complete attempt by U. Kaljulaid himself. Section 1.5 was written in connection with an application by us to the Crafoord Foundation in 1994. It gives perhaps a vague indication of what U. Kaljulaid originally had in mind. Many topics which he had meant to treat remain thus untouched. As a compensation we include here two previously unpublished reports by me.
Introduction. In 1923 I. Schur published a paper [33] where he developed a method for finding inequalities for characteristic values and diagonal elements of Hermitian matrices. These investigations were continued by A. Ostrowski [21] in 1952. Let us briefly recall the basic observation that was the starting point for Schur. Let $A = (a_{ij})$ be a Hermitian matrix with characteristic roots $\lambda_i$. Then there exists a unitary matrix $T = (t_{ij})$ such that $\mathrm{diag}(\lambda_1, \dots, \lambda_n) = TAT^{-1}$. Schur observed that this implies that
$$\begin{pmatrix} \lambda_1 \\ \vdots \\ \lambda_n \end{pmatrix} = \begin{pmatrix} |t_{11}|^2 & \cdots & |t_{1n}|^2 \\ \vdots & & \vdots \\ |t_{n1}|^2 & \cdots & |t_{nn}|^2 \end{pmatrix} \begin{pmatrix} a_{11} \\ \vdots \\ a_{nn} \end{pmatrix}.$$
Indeed, using $T^{-1} = \bar T^{t} = T^*$ we have
$$\lambda_k = \sum_l \Big( \sum_j t_{kj} a_{jl} \Big) t^*_{lk} = \sum_l \sum_j a_{jl}\, t_{kj}\, \bar t_{kl} = \sum_j \sum_l a_{jl}\, t_{kj}\, \bar t_{kj}\, \delta_{jl} = \sum_j a_{jj} |t_{kj}|^2.$$
Here all elements of the matrix $B = (|t_{ij}|^2)$ are non-negative and, furthermore, all row sums and all column sums equal unity. Schur called such matrices “averages”. Nowadays they are called doubly stochastic or bistochastic matrices¹; the set of all such $n \times n$ matrices is denoted $\Omega_n$. Many papers have been published on $\Omega_n$ since those remote days. Yet, it appears that until the 80’s the main interest in $\Omega_n$ was due to the influence of the following three “centers of attraction”:
• the Hardy-Littlewood-Pólya Theorem (1929; in brief: HLP);
• the Birkhoff-von Neumann Theorem (1946; BN);
• the Van der Waerden Conjecture (1926; VdWC).
Denote by $S_n$ the symmetric group on the set $n = \{1, \dots, n\}$ and, for $\sigma \in S_n$ and $\bar a = (a_1, \dots, a_n) \in \mathbb{R}^n_+$, set $\bar a^\sigma = (a_{\sigma(1)}, \dots, a_{\sigma(n)}) \in \mathbb{R}^n_+$. Further, for a set of vectors $\{\bar b^{(i)} \mid i \in I\}$ in $\mathbb{R}^n_+$ denote its convex envelope by $K\{\bar b^{(i)} \mid i \in I\}$.
¹ Editor’s Note. On even doubly stochastic matrices we refer to the work of Annela Kelly (née Rämmer), a student of Uno Kaljulaid’s, [27, 28].
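The two basic notions just introduced, membership in $\Omega_n$ and the permuted vector $\bar a^\sigma$, are easy to check by machine. The sketch below is an illustration only; exact rational arithmetic via `fractions.Fraction` is a choice made here to avoid rounding issues, permutations are written 0-based, and the names `is_doubly_stochastic` and `permute` are hypothetical.

```python
from fractions import Fraction


def is_doubly_stochastic(B):
    """True if B is a square matrix with non-negative entries whose row sums
    and column sums all equal 1."""
    n = len(B)
    if any(len(row) != n for row in B):
        return False
    nonneg = all(x >= 0 for row in B for x in row)
    rows = all(sum(row) == 1 for row in B)
    cols = all(sum(B[i][j] for i in range(n)) == 1 for j in range(n))
    return nonneg and rows and cols


def permute(a, sigma):
    """The vector a^sigma = (a_sigma(1), ..., a_sigma(n)), with sigma given 0-based."""
    return tuple(a[s] for s in sigma)


half = Fraction(1, 2)
B = [[half, half, 0],
     [half, 0, half],
     [0, half, half]]
print(is_doubly_stochastic(B))               # True
print(is_doubly_stochastic([[1, 0], [1, 0]]))  # False: column sums are 2 and 0
print(permute((10, 20, 30), (2, 0, 1)))      # (30, 10, 20)
```

The second matrix has unit row sums only, which shows that both conditions are needed.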
HLP can now be formulated as follows. Given any two vectors $\bar a, \bar b \in \mathbb{R}^n_+$, the following conditions are equivalent:
(1) there exists a matrix $B \in \Omega_n$ such that $\bar a = \bar b B$;
(2) $\bar a \in K(\bar b^{\sigma} \mid \sigma \in S_n)$;
(3) $\bar a \prec \bar b$,
where in (3) the sign $\prec$ means that we have the inequalities
\[
\sum_{i=1}^{k} a_{[i]} \le \sum_{i=1}^{k} b_{[i]}
\]
for all $k \in \{1, 2, \dots, n-1\}$, whereas for $k = n$ the corresponding equality holds:
\[
\sum_{i=1}^{n} a_{[i]} = \sum_{i=1}^{n} b_{[i]} ;
\]
$a_{[1]} \ge a_{[2]} \ge \dots \ge a_{[n]}$ and $b_{[1]} \ge b_{[2]} \ge \dots \ge b_{[n]}$ being the decreasing rearrangements of the vectors $\bar a$ and $\bar b$, respectively.

BN states that every doubly stochastic matrix in $\Omega_n$ is a convex combination of permutation matrices. This result may be interpreted in terms of a probabilistic model for fuzzy bijections $X \to Y$: we let each element $b_{ij}$ of $B$ give the transition probability for $x_i \to y_j$; the fact that the $i$-th row sum is 1 then means that the element $x_i$ is by necessity mapped into some element of $Y$, and the fact that the $j$-th column sum is 1 means that the element $y_j$ is obtained as the image of some element of $X$. In this language, BN means that every fuzzy bijection $X \to Y$ is a convex combination of ordinary bijections from $X$ to $Y$. The importance of BN for Discrete Mathematics appears to be due to the fact that BN is the matrix analogue of Ph. Hall's famous Matching Theorem (proved in the late 30's; see [10]). BN has been rediscovered, commented on, or given new proofs by several authors; see, e.g., [24] for an infinite generalization of BN.

Let us briefly recall one of the proofs of BN. It goes by induction on the number of positive elements of $X \in \Omega_n$ and uses as an important ingredient the Frobenius-Koenig criterion for the existence of a positive diagonal in a non-negative matrix². According to this criterion, every diagonal $(a_{1\sigma(1)}, \dots, a_{n\sigma(n)})$, $\sigma \in S_n$, of a non-negative $n \times n$ matrix $A = (a_{ij})$ contains a zero if and only if $A$ contains an $s \times t$ submatrix of zeros with $s + t \ge n + 1$.

The proof of BN now goes as follows. Let $X = (x_{ij}) \in \Omega_n$ be given. The proof is carried out by induction on the cardinality of the support $[X] = \operatorname{supp} X$ of $X$. If $\#[X] = n$, then $X$ is a permutation matrix and BN is trivially true for $X$. If $\#[X] > n$, then we use the Frobenius-Koenig criterion to produce a positive diagonal in $X$. Let $P = (p_{ij})$ denote the permutation matrix supported on this positive diagonal and set $\varepsilon = \min_{(i,j) \in \operatorname{supp} P} x_{ij}$. Consider the matrix $Y = X - \varepsilon P$. Then it is clear that the line sums of $Y$ equal $1 - \varepsilon$ and, further, that $\# \operatorname{supp} Y < \# \operatorname{supp} X$. Hence $\frac{1}{1-\varepsilon} Y \in \Omega_n$. Therefore the induction hypothesis implies that we can write $\frac{1}{1-\varepsilon} Y = \sum_{\sigma \in S_n} \lambda_\sigma P_\sigma$ with $\sum_{\sigma} \lambda_\sigma = 1$, $\lambda_\sigma \ge 0$. It follows that
\[
X = \varepsilon P + Y = \varepsilon P + \sum_{\sigma} (1 - \varepsilon) \lambda_\sigma P_\sigma
\]
with $\varepsilon + \sum_{\sigma} (1 - \varepsilon) \lambda_\sigma = 1$, $\varepsilon > 0$, $(1 - \varepsilon) \lambda_\sigma \ge 0$, as requested.

² Editor's Note, with the assistance of Laszo Filep. For this theorem, see Encyclopedia Applied Mathematics, Vol. 6. Frobenius established it for determinants in 1910. Denes König (Koenig) gave a simplified proof in 1915 and generalized it to matrices in 1931. "This seems to have lead to some hostility between the two men. After Nazi occupation of Hungary, König worked to help persecuted mathematicians. This lead to his death [by suicide?] a few days after the Hungarian National Socialist Party [the Arrow Cross] took over the country." (Ian Anderson, in MacTutor.)

VdWC was formulated by B. L. van der Waerden in 1926. It is the problem of minimizing the permanent over all doubly stochastic $n \times n$ matrices. It was suggested that the minimum is attained precisely for the matrix $J_n = \frac{1}{n} J$, where $J$ stands for the $n \times n$ matrix all of whose entries are 1. In terms of formulae:
\[
(X \in \Omega_n \text{ and } X \ne J_n) \implies (\operatorname{per} X > \operatorname{per} J_n).
\]
The desire to prove this was the stimulus for approximately 500 papers written on this topic until two papers, settling the question definitively, appeared simultaneously in 1981; these were the papers by D. I. Falikman and G. P. Egorychev; see [14, 22].

The aim of the present paper is to develop several group-theoretical variations of these main themes.
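The inductive proof of BN recalled above is effectively an algorithm, and a minimal sketch of it may be helpful. The function names `find_positive_diagonal` and `birkhoff_decomposition` are illustrative and do not come from the text. Two assumptions are made: the positive diagonal is found by a standard augmenting-path matching (its existence is what the Frobenius-Koenig criterion guarantees), and instead of rescaling $Y$ by $1/(1-\varepsilon)$ at each step the sketch simply keeps subtracting $\varepsilon P$ from the remainder, which yields the same convex combination.

```python
from fractions import Fraction


def find_positive_diagonal(X):
    """Return sigma with X[i][sigma[i]] > 0 for every row i, or None.

    Uses augmenting paths (Kuhn's bipartite matching) on the support of X.
    """
    n = len(X)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in range(n):
            if X[i][j] > 0 and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    sigma = [0] * n
    for j, i in enumerate(match_col):
        sigma[i] = j
    return tuple(sigma)


def birkhoff_decomposition(X):
    """Write a doubly stochastic X as a list of (weight, sigma) pairs.

    sigma is a tuple with sigma[i] = column of the 1 in row i of P_sigma.
    """
    X = [[Fraction(v) for v in row] for row in X]  # exact arithmetic
    n = len(X)
    terms = []
    while any(v > 0 for row in X for v in row):
        sigma = find_positive_diagonal(X)
        if sigma is None:
            raise ValueError("input is not a multiple of a doubly stochastic matrix")
        eps = min(X[i][sigma[i]] for i in range(n))
        for i in range(n):
            X[i][sigma[i]] -= eps  # at least one more entry becomes zero
        terms.append((eps, sigma))
    return terms
```

Each pass removes at least one positive entry, so the loop terminates after at most as many steps as there are positive entries, in line with the induction on $\#[X]$.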
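1.1. $\Omega_n(G)$ and its multiplicative structure. Let $G$ be a finite group acting faithfully on some non-empty finite set $X$ of cardinality $n \in \mathbb{N}$. Thus, identifying $X$ with $\mathbf{n} = \{1, 2, \dots, n\}$, we may view $G$ as a subgroup of the symmetric group $S_n$, $G \le S_n$. Let $\Omega_n(G)$ be the convex hull of all $G$-permutation matrices, i.e. of the matrices $P_\sigma$ with $\sigma \in G$. Its elements are called $G$-doubly stochastic matrices.

PROPOSITION 1.1. The product of any two $G$-doubly stochastic matrices is a $G$-doubly stochastic matrix. The multiplicative semigroup $\Omega_n(G)(\cdot)$ is a monoid whose invertible elements of finite order are precisely the $G$-permutation matrices.

PROOF. Let $A$ and $B$ be any two $G$-doubly stochastic matrices and write
\[
A = \sum_{\sigma \in G} \lambda_\sigma P_\sigma \quad \text{and} \quad B = \sum_{\tau \in G} \mu_\tau P_\tau ,
\]
where $\lambda_\sigma$ and $\mu_\tau$ are non-negative numbers such that
\[
\sum_{\sigma \in G} \lambda_\sigma = 1 \quad \text{and} \quad \sum_{\tau \in G} \mu_\tau = 1 .
\]
Then we have
\[
AB = \Bigl( \sum_{\sigma \in G} \lambda_\sigma P_\sigma \Bigr) \Bigl( \sum_{\tau \in G} \mu_\tau P_\tau \Bigr)
= \sum_{\sigma, \tau \in G} \lambda_\sigma \mu_\tau P_\sigma P_\tau
= \sum_{\rho \in G} \Bigl( \sum_{\sigma\tau = \rho} \lambda_\sigma \mu_\tau \Bigr) P_\rho
= \sum_{\rho \in G} \nu_\rho P_\rho
\]
with
\[
\nu_\rho = \sum_{\sigma\tau = \rho} \lambda_\sigma \mu_\tau .
\]
It is clear that $\nu_\rho \ge 0$. Moreover, we find
\[
\sum_{\rho} \nu_\rho = \sum_{\rho} \Bigl( \sum_{\sigma\tau = \rho} \lambda_\sigma \mu_\tau \Bigr)
= \sum_{\sigma, \tau \in G} \lambda_\sigma \mu_\tau
= \sum_{\sigma \in G} \lambda_\sigma \cdot \sum_{\tau \in G} \mu_\tau = 1 .
\]
This proves that $AB \in \Omega_n(G)$. That $I \in \Omega_n(G)$ is obvious. Therefore, $\Omega_n(G)$ is a monoid.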
Now, let $A$ be an invertible $G$-doubly stochastic matrix of finite order. Then we may assume that $A^m = I$ for some integer $m \ge 2$. Arguing by contradiction, suppose that $A$ is not a permutation matrix. Then some matrix element $a_{kl}$ of $A$ satisfies $0 < a_{kl} < 1$. Set $\max_{i \in \mathbf{n}} \{a_{il}\} = a_{pl}$. Then it is clear that $0 < a_{pl} < 1$, as $a_{pl}$ together with $a_{kl}$ and, possibly, other elements in the $l$-th column sums to unity. Notice that, in view of what has already been proved, the matrix $A^{m-1} = (a^{*}_{is})$ is doubly stochastic. Therefore, considering the elements $\sum_{s \in \mathbf{n}} a^{*}_{is} a_{sl}$ in the $l$-th column of the product of the doubly stochastic matrices $A^{m-1}$ and $A$, it follows that for each of them
\[
\sum_{s \in \mathbf{n}} a^{*}_{is} a_{sl} \le a_{pl} \sum_{s \in \mathbf{n}} a^{*}_{is} = a_{pl} < 1 .
\]
This contradicts the assumption that $A^m = I$.

Finally, let us prove that $A$ is a $G$-permutation matrix. Take a reduced representation for $A$, i.e. $A = \sum_{\sigma \in H} \lambda_\sigma P_\sigma$ with $H \subset G$ and all $\lambda_\sigma > 0$. By the above argument, we know that $\sum_{\sigma \in H} \lambda_\sigma P_\sigma = P$ for some permutation matrix $P$. Thus we have $P(i,j) = \sum_{\sigma \in H} \lambda_\sigma P_\sigma(i,j)$. If $P(i,j) = 0$, then $0 = \sum_{\sigma \in H} \lambda_\sigma P_\sigma(i,j)$ together with the fact that all $\lambda_\sigma > 0$ while $P_\sigma(i,j) \in \{0,1\}$ implies that $P_\sigma(i,j) = 0$ for all $\sigma \in H$. If, however, $P(i,j) = 1$, then $P_\sigma(i,j) = 1$ for all $\sigma \in H$. Indeed, if $P_\tau(i,j) = 0$ for some $\tau \in H$, it would follow from $1 = \sum_{\sigma \in H} \lambda_\sigma P_\sigma(i,j)$ that $\sum_{\sigma \in H} \lambda_\sigma = \sum_{\sigma \in H'} \lambda_\sigma$ with $H' = H \setminus \{\tau\}$. But as all $\lambda_\sigma > 0$ this is impossible. Thus we have shown that for all $\sigma \in H$
\[
P_\sigma(i,j) = \begin{cases} 0, & \text{if } P(i,j) = 0, \\ 1, & \text{if } P(i,j) = 1. \end{cases}
\]
As $P_\sigma$ and $P$ are permutation matrices, this gives $P_\sigma = P$ for all $\sigma \in H$. It follows that $A = P$ is a $G$-permutation matrix. □

Fix a convex subset $S \subseteq M_n(\mathbb{R})$. Linear transformations $g : M_n(\mathbb{R}) \to M_n(\mathbb{R})$ such that $g(S) = S$ are called symmetries for $S$. The set of all symmetries for $S$ is a group. Its subgroups are called symmetric groups for $S$. Denote the action of $g$ on matrices $Q \in S$ by $Q \circ g$. Take now $S = \Omega_n(G)$ and consider a finite group $L$ of symmetries for it. If $Q$ is a matrix such that $Q \circ \ell = Q$ for all $\ell \in L$, then $Q$ is called an $L$-fixed point.

PROPOSITION 1.2. Let $L$ be a finite group of linear transformations of the space of matrices $M_n(\mathbb{R})$ such that $\Omega_n(G)$ is $L$-invariant. Then the subset of $L$-fixed points in $\Omega_n(G)$ coincides with the convex hull of the set of all matrices of the form
\[
Q_\sigma = \frac{1}{|L|} \sum_{\ell \in L} P_\sigma \circ \ell , \qquad \sigma \in G .
\]

PROOF. Denote by $L^{*}$ the set of all $L$-fixed points in $\Omega_n(G)$. Given $A, B \in L^{*}$ consider the matrix $C = \lambda A + (1 - \lambda) B$, $\lambda \in [0,1]$; it is clear that $C \in L^{*}$, as, for any $\ell \in L$,
\[
C \circ \ell = \lambda (A \circ \ell) + (1 - \lambda)(B \circ \ell) = \lambda A + (1 - \lambda) B = C .
\]
For any matrix
\[
Q_\sigma = \frac{1}{|L|} \sum_{\ell \in L} P_\sigma \circ \ell \quad \text{with } \sigma \in G \text{ fixed}
\]
and any $\ell' \in L$ we find
\[
Q_\sigma \circ \ell' = \frac{1}{|L|} \sum_{\ell \in L} P_\sigma \circ \ell \ell' = \frac{1}{|L|} \sum_{\ell'' \in L} P_\sigma \circ \ell'' = Q_\sigma ;
\]
here we used the fact that if $\ell$ runs through all of $L$ the same is true for $\ell'' = \ell \ell'$ also. This shows that $Q_\sigma \in L^{*}$. Now, any element $D \in L^{*} \subseteq \Omega_n(G)$ can be written as
\[
D = \sum_{\pi \in G} \delta_\pi P_\pi \quad \text{with } \delta_\pi \ge 0, \ \sum_{\pi} \delta_\pi = 1 .
\]
Hence,
\[
D = \frac{1}{|L|} \sum_{\ell \in L} D \circ \ell
= \frac{1}{|L|} \sum_{\ell \in L} \sum_{\pi \in G} \delta_\pi (P_\pi \circ \ell)
= \sum_{\pi \in G} \delta_\pi \, \frac{1}{|L|} \sum_{\ell \in L} (P_\pi \circ \ell)
= \sum_{\pi \in G} \delta_\pi Q_\pi .
\]
This together with what was proved above finishes the proof. □
REMARK 1.3. Using the obvious relations $(A + B)^t = A^t + B^t$ and $P_\sigma^t = P_{\sigma^{-1}}$ we see that for the transpose of any matrix $X \in \Omega_n(G)$ it holds
\[
X^t = \Bigl( \sum_{\sigma \in G} \lambda_\sigma P_\sigma \Bigr)^{t} = \sum_{\sigma \in G} \lambda_\sigma P_\sigma^{t} = \sum_{\sigma \in G} \lambda_\sigma P_{\sigma^{-1}} ,
\]
which, as $\sigma^{-1} \in G$ if $\sigma \in G$, implies that $X^t \in \Omega_n(G)$. Therefore, $\Omega_n(G)$ is invariant under transposition of matrices. Taking $L$ to be the 2-element group $\langle t \mid t^2 = \mathrm{id}_{\Omega_n(G)} \rangle$, it follows that the set of all symmetric doubly stochastic $G$-matrices coincides with the convex hull of the set of matrices $\{\frac{1}{2}(P_\sigma + P_{\sigma^{-1}})\}$. Furthermore, specializing $G$ to be the group $S_n$ (the symmetric group of $\mathbf{n}$) we obtain the result: the set of all symmetric $n \times n$ doubly stochastic matrices is identical with the convex hull of the set of all matrices of the form $\frac{1}{2}(P + P^t)$, where $P$ is an $n \times n$ permutation matrix. This is a result by M. Katz [12]; see also [5].
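As a small illustration of Proposition 1.2 in the situation of Remark 1.3, the following sketch averages a permutation matrix over the two-element group generated by transposition. The helper names (`perm_matrix`, `transpose`, `inverse`, `symmetrize`) are hypothetical, and the convention $P_\sigma(i, \sigma(i)) = 1$ with 0-based indices is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import permutations


def perm_matrix(sigma):
    """Permutation matrix with a 1 in position (i, sigma[i])."""
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for j in range(n)] for i in range(n)]


def transpose(M):
    return [list(col) for col in zip(*M)]


def inverse(sigma):
    inv = [0] * len(sigma)
    for i, j in enumerate(sigma):
        inv[j] = i
    return tuple(inv)


def symmetrize(M):
    """Average of M over L = {identity, transposition}: Q = (M + M^t) / 2."""
    Mt = transpose(M)
    n = len(M)
    return [[Fraction(M[i][j] + Mt[i][j], 2) for j in range(n)] for i in range(n)]


# Check P_sigma^t = P_{sigma^{-1}} and that each Q_sigma is a symmetric
# doubly stochastic matrix, for every sigma in S_4.
for sigma in permutations(range(4)):
    P = perm_matrix(sigma)
    assert transpose(P) == perm_matrix(inverse(sigma))
    Q = symmetrize(P)
    assert Q == transpose(Q)
    assert all(sum(row) == 1 for row in Q)
```

Since each $Q_\sigma$ is symmetric, its column sums equal its row sums, so checking the rows suffices.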
Following [8], a semigroup $S$ with involution $a \mapsto a^{*}$ is called a special involution semigroup if and only if every finite nonempty subset $T \subseteq S$ has the property that there exists an element $a \in T$ such that if for some $b, c \in T$ we have $aa^{*} = bc^{*}$ then $b = c$.

PROPOSITION 1.4. The multiplicative semigroup of all doubly stochastic $G$-matrices (for any given finite group $G$, $G \le S_n$) is a special involution semigroup.

PROOF. Take any $X \in \Omega_n(G)$ and write it in the form $X = \sum_{\pi \in G} \lambda_\pi P_\pi$. Then we get $X^t = \sum_{\pi \in G} \lambda_\pi P_{\pi^{-1}}$. It follows that $X \mapsto X^t$ is an involution on the multiplicative semigroup $\Omega_n(G)(\cdot)$. Take any finite subset $T \subseteq \Omega_n(G)$ and choose $A \in T$ such that
\[
\operatorname{tr} AA^t = \max_{X \in T} \operatorname{tr} XX^t .
\]
Then, assuming that $AA^t = BC^t$ for some $B, C \in T$, we get $AA^t = (AA^t)^t = CB^t$. Hence,
\[
(B - C)(B - C)^t = BB^t + CC^t - 2AA^t .
\]
In view of our choice of $A$, this gives
\[
0 \le \operatorname{tr} (B - C)(B - C)^t = \operatorname{tr} BB^t + \operatorname{tr} CC^t - 2 \operatorname{tr} AA^t \le 0 .
\]
As obviously $\operatorname{tr} XX^t = 0$ if and only if $X = 0$, it follows that $B = C$, as required. □
REMARK 1.5. In [8, p. 96], it is noted that every periodic (and so also every finite) special involution semigroup is inverse. In our case it follows easily from Proposition 1.1 that every periodic submonoid in $\Omega_n(G)(\cdot)$ is indeed a group.
1.2. Inequalities for $\Omega_n(G)$. For a given matrix in $M_n(\mathbb{R})$ it is very easy to decide whether it belongs to $\Omega_n$ or not: just use the definition of a doubly stochastic matrix as a non-negative matrix all of whose rows and columns add up to unity. Several combinatorial problems are essentially optimization problems on some subsets of $\Omega_n$. One such problem is, e.g., the Travelling Salesman Problem: to minimize the function $f : M_n(\mathbb{R}) \to \mathbb{R}$, $f(X) = \sum_{i,j \in \mathbf{n}} c_{ij} x_{ij}$, for $X = (x_{ij}) \in \Omega_n$ subject to $\sum_{i \in S} \sum_{j \notin S} x_{ij} \ge 1$ for all non-empty proper subsets $S \subset \mathbf{n}$. Therefore, it is desirable to have a clear picture of the interplay between the multiplicative structure and the linear structure of $\Omega_n$. In general, however, such information is not available and the problem is by no means an easy one. Thus, finding "few and natural" inequalities describing the convex hull of the set $\{P_\sigma \mid \sigma \in S_n \setminus M\}$ is not trivial even in the special case when $M$ consists of the identity element only; the answer was given by Cruse [5] by a resourceful argument. Notice also that the travelling salesman polytope is nothing else than the convex hull of the set $\{P_\sigma \mid \sigma \in Z_n\}$, where $Z_n$ is the set of all (full) cycles of length $n$ in $S_n$. The symmetric travelling salesman polytope has a similar presentation. The list of problems can easily be enlarged; e.g., we could include the question of finding a basis with minimal weight for a (simple) matroid over a finite field, etc.; see [2], [13].

Denote by $A_n$ the subgroup of $S_n$ of all even permutations, i.e. the alternating group. Convex combinations of permutation matrices $P_\sigma$, $\sigma \in A_n$ (i.e. of even permutation matrices) are called even doubly stochastic matrices; the set of such matrices is denoted by $\Omega(A_n)$. A. J. Hoffman proposed in 1955 the problem of describing $\Omega(A_n)$ inside $\Omega_n$. With the aim of answering this question, L. Mirsky [19] established the following result.

THEOREM 1.6 (L. Mirsky, 1961). Let $D = (d_{ik})$ be an even $n \times n$ doubly stochastic matrix. Then the inequalities
\[
\sum_{k=1}^{n} d_{k,\pi(k)} - 3 d_{j,\pi(j)} \le n - 3 \tag{57}
\]
hold for all $j \in \mathbf{n}$ and $\pi \in A_n$.
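Unfortunately, these conditions are not sufficient for $D$ to belong to $\Omega(A_n)$. This was first noticed by J. von Below [2], who gave the example of the matrix $DB_4$, which satisfies (57) but is not in $\Omega(A_4)$:
\[
DB_4 = \tfrac{1}{2} P_{(12)} + \tfrac{1}{3} P_{(134)} + \tfrac{1}{6} P_{(243)}
= \frac{1}{6} \begin{pmatrix} 1 & 3 & 2 & 0 \\ 3 & 2 & 0 & 1 \\ 0 & 1 & 3 & 2 \\ 2 & 0 & 1 & 3 \end{pmatrix} .
\]
Such counterexamples exist for any $n \ge 5$:
\[
DB_n = \tfrac{1}{2} \bigl( P_{(12)} + P_{(134 \dots n)} \bigr)
\]
with (for $n = 5$)
\[
P_{(12)} = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}
\quad \text{and} \quad
P_{(1345)} = \begin{pmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix} .
\]
Four other necessary conditions for a doubly stochastic matrix to be even are described by R. Brualdi and B. Liu [7].

Let $G \subset S_n$. Denote by $i(\sigma)$ the number of fixed points of $\sigma \in G$ in the natural action $(\mathbf{n}, G)$ induced by $(\mathbf{n}, S_n)$. We call the set
\[
\operatorname{Spec} G = \{ i(\sigma) \mid \sigma \in G, \ \sigma \ne \varepsilon \} \subset \{0, 1, \dots, n\},
\]
where $\varepsilon$ denotes the identity permutation, the spectrum of the subgroup $G \subset S_n$.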
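The claim that von Below's matrix $DB_4$ passes Mirsky's test can be checked by brute force. The sketch below does this under the assumed convention $P_\sigma(i, \sigma(i)) = 1$, with 0-based indices; the names `is_even`, `satisfies_mirsky` and `perm_matrix` are illustrative. It verifies only the inequalities (57) and the stated decomposition of $DB_4$; it does not attempt to certify that $DB_4 \notin \Omega(A_4)$.

```python
from fractions import Fraction
from itertools import permutations


def is_even(sigma):
    """Parity of a permutation via its inversion count."""
    n = len(sigma)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if sigma[i] > sigma[j])
    return inversions % 2 == 0


def satisfies_mirsky(D):
    """Check inequality (57) for every index j and every even permutation pi."""
    n = len(D)
    for pi in permutations(range(n)):
        if not is_even(pi):
            continue
        total = sum(D[k][pi[k]] for k in range(n))
        if any(total - 3 * D[j][pi[j]] > n - 3 for j in range(n)):
            return False
    return True


def perm_matrix(sigma):
    n = len(sigma)
    return [[1 if sigma[i] == j else 0 for j in range(n)] for i in range(n)]


DB4 = [[Fraction(v, 6) for v in row]
       for row in [[1, 3, 2, 0], [3, 2, 0, 1], [0, 1, 3, 2], [2, 0, 1, 3]]]

# (12), (134), (243) written as 0-based image tuples.
parts = [(Fraction(1, 2), (1, 0, 2, 3)),
         (Fraction(1, 3), (2, 1, 3, 0)),
         (Fraction(1, 6), (0, 3, 1, 2))]
rebuilt = [[sum(w * perm_matrix(s)[i][j] for w, s in parts) for j in range(4)]
           for i in range(4)]
```

For comparison, the transposition matrix $P_{(12)}$ itself violates (57) (take $\pi$ the identity and $j = 1$), so the test is not vacuous.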
1.3. On the diagonals of $G$-doubly stochastic matrices. This section is motivated by the previous discussion. We are interested in describing the diagonals of $G$-doubly stochastic matrices. A first result in this direction is the following.

THEOREM 1.7. Let $G$ be a subgroup of $S_n$ with normalizer $N$ (that is, $\pi \in G$, $g \in N$ implies $g \pi g^{-1} \in G$). Assume that $G$ is transitive in the following sense:
(∗) If $A$, $B$ are any two subsets of $\mathbf{n} = \{1, 2, \dots, n\}$ with $|A| = |B| = i$, where $i$ is an integer belonging to $\operatorname{spec} G$, then there exists an element $g \in N$ such that $gA = B$.
Then the diagonals of $G$-doubly stochastic matrices form a convex subset of $\mathbb{R}^n_+$ which is $S_n$-invariant.

PROOF. As $\Omega_n(G) = \operatorname{conv}\{P_g\}_{g \in G}$, it is clear that
\[
\operatorname{diag} \Omega_n(G) = \operatorname{diag}(\operatorname{conv}\{P_g\}_{g \in G}) = \operatorname{conv}(\operatorname{diag}\{P_g\}_{g \in G}).
\]
Therefore it suffices to show that the set $\operatorname{diag}\{P_g\}_{g \in G}$ is $S_n$-invariant. An equivalent statement is: if $i$ is any index in $\operatorname{spec} G$ and if $u$ is a 0-1-vector with exactly $i$ components equal to 1, $|u| = i$, then $u \in \operatorname{diag}\{P_g\}_{g \in G}$.

To see this we observe first that if $a$ is any fixed point of $\pi \in G$ (or $\pi \in S_n$, for that matter) and if $g \in S_n$, then $ga$ is a fixed point of $\pi' = g \pi g^{-1}$. (This is proved by the series of equalities $\pi' g a = g \pi g^{-1} g a = g \pi a = g a$.) Denoting by $\operatorname{fix}(\pi)$ the fixed point set of $\pi$, we can state this as
\[
\operatorname{fix}(g \pi g^{-1}) = g \operatorname{fix}(\pi).
\]
Equivalently:
\[
P_g \operatorname{diag} P_\pi = \operatorname{diag}(P_g P_\pi P_g^{-1}) = \operatorname{diag} P_{g \pi g^{-1}} .
\]
(Note that $\operatorname{fix}(g) = \operatorname{supp} \operatorname{diag} P_g$.)

Let now $u$ be an arbitrary 0-1-vector with $|u| = i$, $i \in \operatorname{spec} G$. As $i \in \operatorname{spec} G$, there exists a vector $w$ with $w = \operatorname{diag} P_\pi$ for some $\pi \in G$ and $|w| = i$. But (∗), together with the observations in the preceding paragraph, shows that $u = P_g w$ for some $g \in N$. It follows now that
\[
u = P_g w = P_g \operatorname{diag} P_\pi = \operatorname{diag} P_{g \pi g^{-1}} = \operatorname{diag} P_{g'}
\]
with $g' = g \pi g^{-1} \in G$. Therefore $u \in \operatorname{diag}\{P_g\}_{g \in G}$. □
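The key identity $\operatorname{fix}(g \pi g^{-1}) = g \operatorname{fix}(\pi)$ used in the proof is easy to test exhaustively for small $n$. The following sketch does so for $S_4$ and also computes the set of fixed-point counts of $A_4$. Permutations are 0-based image tuples and all function names are illustrative. One labelled assumption: `spec` counts the identity as well, so that $n$ always belongs to the set (as in the hypothesis of Theorem 1.10); remove the identity first if the spectrum is meant to exclude it.

```python
from itertools import permutations


def compose(p, q):
    """(p o q)(i) = p(q(i)); permutations are tuples on {0, ..., n-1}."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    inv = [0] * len(p)
    for i, j in enumerate(p):
        inv[j] = i
    return tuple(inv)


def fixed_points(p):
    return {i for i, j in enumerate(p) if i == j}


def spec(G):
    """Set of fixed-point counts over G (identity included, by assumption)."""
    return {len(fixed_points(p)) for p in G}


def is_even(p):
    n = len(p)
    return sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n)) % 2 == 0


S4 = list(permutations(range(4)))
A4 = [p for p in S4 if is_even(p)]

# fix(g pi g^{-1}) = g fix(pi) for all g, pi in S_4.
for g in S4:
    for pi in S4:
        conj = compose(compose(g, pi), inverse(g))
        assert fixed_points(conj) == {g[a] for a in fixed_points(pi)}
```

For $A_4$ the identity fixes four points, the 3-cycles fix one, and the double transpositions fix none.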
So our problem is reduced to a purely geometric question: describing the structure of the convex hull of an $S_n$-invariant set $M$ of 0-1-vectors. We will answer this question only in a very special situation.

LEMMA 1.8. Let $n$ be a positive integer and let $f$ be an integer satisfying $0 \le f \le n$. Let $M_f$ be the set consisting of the vector $(1, 1, \dots, 1)$ together with all vectors in $\mathbb{R}^n$ whose components are either 0 or 1 and which have at most $n - f$ components equal to 1. (Thus $M_f$ also contains the vector $(0, 0, \dots, 0)$.) Then the convex hull of $M_f$ consists of all vectors $(x_1, x_2, \dots, x_n)$ which satisfy the conditions
\[
0 \le x_j \le 1 \quad \text{and} \quad n - f + f x_j \ge \sum_{k=1}^{n} x_k \quad \text{for } j = 1, 2, \dots, n. \tag{58}
\]
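Conditions (58) give a direct membership test for the convex hull of $M_f$. The sketch below, with the illustrative names `in_M_f` and `satisfies_58`, implements the test and checks one consequence of the lemma that can be verified by enumeration: among 0-1-vectors, (58) holds exactly for the elements of $M_f$.

```python
from fractions import Fraction
from itertools import product


def in_M_f(u, f):
    """u is a 0-1 vector: all ones, or at most n - f ones."""
    n = len(u)
    ones = sum(u)
    return ones == n or ones <= n - f


def satisfies_58(x, f):
    """Conditions (58): 0 <= x_j <= 1 and n - f + f*x_j >= sum(x) for all j."""
    n = len(x)
    total = sum(x)
    return all(0 <= xj <= 1 and n - f + f * xj >= total for xj in x)


# Among 0-1 vectors, (58) singles out exactly M_f.
for n in range(1, 6):
    for f in range(0, n + 1):
        for u in product((0, 1), repeat=n):
            assert satisfies_58(u, f) == in_M_f(u, f)
```

For instance, with $n = 3$ and $f = 2$ the vector $(1, 1, 0)$ fails (58) at $j = 3$, whereas $(\tfrac12, \tfrac12, \tfrac12)$, the midpoint of $(0,0,0)$ and $(1,1,1)$, satisfies it.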
PROOF. (After Michael Cwikel³) Let $F$ be the set of all vectors in $\mathbb{R}^n$ which satisfy all the conditions (58). It is clear that $F$ is a convex set containing every vector in $M_f$. So we have $\operatorname{conv}(M_f) \subset F$, where $\operatorname{conv}(E)$ denotes the convex hull of any set $E \subset \mathbb{R}^n$. It remains to show that $F \subset \operatorname{conv}(M_f)$. In fact it suffices to show instead merely that $F_* \subset \operatorname{conv}(M_f)$, where $F_*$ is the subset of $F$ consisting of those vectors $x = (x_1, x_2, \dots, x_n)$ which satisfy $1 \ge x_1 \ge x_2 \ge \dots \ge x_n \ge 0$. This reduction of the problem follows immediately from the fact that the set $F$ and also the set $\operatorname{conv}(M_f)$ are both invariant under permutations of the components of vectors. It will be convenient to use the notation $D$ for the larger set of all vectors $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ which satisfy $x_1 \ge x_2 \ge \dots \ge x_n \ge 0$.

REMARK 1.9. Let $x$ be any vector of the form $x = \sum_{j=1}^{k} \alpha_j w_j$ where $w_j \in M_f$ or merely $w_j \in \operatorname{conv}(M_f)$, and $\alpha_j \ge 0$ for $j = 1, 2, \dots, k$ and $\sum_{j=1}^{k} \alpha_j \le 1$. Then it is clear that $x \in \operatorname{conv}(\operatorname{conv}(M_f)) = \operatorname{conv}(M_f)$, since we can write $x$ in the form $x = \sum_{j=0}^{k} \alpha_j w_j$ where $w_0$ is the zero vector and $\alpha_0 = 1 - \sum_{j=1}^{k} \alpha_j$. □

Let us define the vectors $v_0, v_1, \dots, v_n$ in $\mathbb{R}^n$ by letting $v_0$ be the zero vector and, for $j = 1, 2, \dots, n$, letting $v_j$ be the vector whose first $j$ components equal 1 and all of whose remaining components (if there are any) equal 0.

³ Note by J. Peetre. This result is due to Uno Kaljulaid, but his proof was not quite complete. So in 1997 I gave another proof. Unfortunately, I have been unable to reconstruct it. Therefore we offer here yet a third proof, constructed on my request by friend Michael Cwikel. I am immensely grateful to him for this.

Suppose that $x = (x_1, x_2, \dots, x_n)$ is an arbitrary element of $F_*$. Then we can write $x$ in the form
\[
x = \sum_{j=1}^{n} \theta_j v_j , \tag{59}
\]
where each $\theta_j \in [0,1]$ and $\sum_{j=1}^{n} \theta_j \le 1$ and the $v_j$ are the special vectors just defined. In fact $\theta_j = x_j - x_{j+1}$ for $j = 1, 2, \dots, n$, where we define $x_{n+1} = 0$. By summing all components of all vectors in the sum for $x$ in (59) we see that
\[
\sum_{j=1}^{n} x_j = \sum_{j=1}^{n} \theta_j \, j . \tag{60}
\]
For $x \in F_*$ the $n$ conditions $n - f + f x_j \ge \sum_{k=1}^{n} x_k$ for $j = 1, 2, \dots, n$, which appear in (58), are equivalent to the single condition
\[
n - f + f x_n \ge \sum_{k=1}^{n} x_k . \tag{61}
\]
We will sometimes use the notation $\|x\| = \sum_{k=1}^{n} x_k$.

Suppose first that $f = 0$. Then $v_j \in M_f$ for every $j = 0, 1, 2, \dots, n$. Taking the first component of the vector equation (59) shows that $1 \ge x_1 \ge \sum_{j=1}^{n} \theta_j$. This inequality, combined with Remark 1.9 and (59), immediately gives us that $x \in \operatorname{conv}(M_f)$.

Let us next consider the case where $f = 1$. By definition $M_1 = M_0$, so we still have $v_j \in M_f$ for every $j = 0, 1, 2, \dots, n$. Exactly the same reasoning as for $f = 0$ shows that $x \in \operatorname{conv}(M_f)$. (Note that so far the only part of condition (58) that we have had to use is
\[
0 \le x_j \le 1 \quad \text{for } j = 1, 2, \dots, n. \tag{62}
\]
In fact, although we do not need it for this proof, it can be checked directly that condition (58) for $f = 0$ is equivalent to condition (58) for $f = 1$.)

Now suppose that $f = n$. Then condition (58) implies that the average value of the $n$ components $x_1, x_2, \dots, x_n$ is less than or equal to their minimum value. So all the components $x_j$ must be equal. Thus $x = \theta_n v_n$ where $\theta_n = x_n \in [0,1]$ is the common value of all the components. So, again by Remark 1.9, we have $x \in \operatorname{conv}(M_f)$.

It remains to consider the case where $1 < f < n$. In this case we have $v_j \in M_f$ for $j = 0, 1, 2, \dots, n - f$ and also for $j = n$. For each $j$ in the range $n - f < j < n$ we claim that
\[
\frac{n-f}{j} \, v_j \in \operatorname{conv}(M_f). \tag{63}
\]
This is fairly obvious intuitively, but let us check it anyway. Consider the subspace $V$ of $\mathbb{R}^n$ consisting of all vectors $y$ of the form $(y_1, y_2, \dots, y_j, 0, 0, \dots, 0)$, i.e. all components after the $j$-th component are 0. Now consider the cyclic permutation map $T$ acting on $V$ defined by $T(y_1, y_2, \dots, y_j, 0, 0, \dots, 0) = (y_j, y_1, y_2, \dots, y_{j-1}, 0, 0, \dots, 0)$. Since $v_{n-f}$ and all its permutations are in $M_f$, the vector
\[
w = \frac{1}{j} \bigl( v_{n-f} + T v_{n-f} + T^2 v_{n-f} + T^3 v_{n-f} + \dots + T^{j-1} v_{n-f} \bigr)
\]
must be in $\operatorname{conv}(M_f)$ and must be of the form $(a, a, a, \dots, a, 0, \dots, 0)$, i.e., the first $j$ elements all equal the average value $a = \frac{n-f}{j}$ of the first $j$ elements of $v_{n-f}$, and the remaining elements are 0. This proves (63).

It is convenient to divide the case where $1 < f < n$ into several subcases.

Subcase (i): Suppose that the arbitrary element $x \in F_*$ chosen above satisfies $x_j = 0$ for all $j > n - f$. Then we also have $\theta_j = 0$ for all $j > n - f$ in the representation (59). Since $v_j \in M_f$ whenever $\theta_j \ne 0$, we obtain that $x \in \operatorname{conv}(M_f)$ by exactly the same reasoning as was used in the case $f = 0$.

Subcase (ii): Suppose that $x$ is such that in its representation (59) we have $\theta_j = 0$ for all integers $j$ in the range $1 \le j \le n - f$ and also $\theta_n = 0$. This last condition is equivalent to $x_n = 0$. So, by (58) and (60), we have
\[
\sum_{j=n-f+1}^{n-1} \theta_j \, j = \sum_{j=1}^{n} \theta_j \, j \le n - f . \tag{64}
\]
Note that $\frac{n-f}{j} v_j = w_j \in \operatorname{conv}(M_f)$ for each $j$ in the range $n - f + 1 \le j \le n - 1$, by (63). We now see that
\[
x = \sum_{j=n-f+1}^{n-1} \theta_j v_j = \sum_{j=n-f+1}^{n-1} \alpha_j w_j , \quad \text{where } \alpha_j = \frac{j}{n-f} \, \theta_j .
\]
By (64) we have $\sum_{j=n-f+1}^{n-1} \alpha_j \le 1$. So, once again, Remark 1.9 applies to show that $x = \sum_{j=n-f+1}^{n-1} \alpha_j w_j$ is an element of $\operatorname{conv}(M_f)$.

Subcase (iii): Suppose that $x \in F_*$ is of the form $x = \sum_{j=1}^{n-f} \theta_j v_j + \theta_k v_k$ for some particular $k$ in the range $n - f + 1 \le k \le n - 1$. Now let $y = \sum_{j=1}^{n-f} \theta_j v_j$ and let $z = v_k$. So $y$ and $z$ are both in $D$. Let $y' = \frac{n-f}{\|y\|} y$ and $z' = \frac{n-f}{\|z\|} z = \frac{n-f}{k} v_k$. These are both elements of $F_*$ and furthermore, by subcases (i) and (ii), they are also in $\operatorname{conv}(M_f)$. We can now write $x = \alpha y' + \beta z'$ where, necessarily, $\alpha \frac{n-f}{\|y\|} = 1$ and $\beta \frac{n-f}{k} = \theta_k$. Then
\[
\alpha + \beta = \frac{\|y\|}{n-f} + \frac{\theta_k k}{n-f} = \frac{\|x\|}{n-f} \le 1 .
\]
So Remark 1.9 shows that $x \in \operatorname{conv}(M_f)$.

Subcase (iv): Suppose that $x \in F_*$ satisfies $x_n = 0$, or equivalently $\theta_n = 0$. So, as in (59), we have
\[
x = \sum_{j=1}^{n-1} \theta_j v_j
= \sum_{j=1}^{n-f} \theta_j v_j + \sum_{k=n-f+1}^{n-1} \theta_k v_k
= \sum_{k=n-f+1}^{n-1} \Biggl[ \frac{\theta_k}{\sum_{q=n-f+1}^{n-1} \theta_q} \sum_{j=1}^{n-f} \theta_j v_j + \theta_k v_k \Biggr]
= \sum_{k=n-f+1}^{n-1} y_k ,
\]
where
\[
y_k = \frac{\theta_k}{\sum_{q=n-f+1}^{n-1} \theta_q} \sum_{j=1}^{n-f} \theta_j v_j + \theta_k v_k .
\]
For each $k$ in the range $n - f + 1 \le k \le n - 1$ we can see that the vector $y_k' := \frac{n-f}{\|y_k\|} y_k$ is exactly an element of $F_*$ of the form treated in subcase (iii) and so it is in $\operatorname{conv}(M_f)$. Furthermore we clearly have $\sum_{k=n-f+1}^{n-1} \|y_k\| = \|x\| \le n - f$. So $x = \sum_{k=n-f+1}^{n-1} \alpha_k y_k'$ where $\alpha_k = \frac{\|y_k\|}{n-f}$, and so $\sum_{k=n-f+1}^{n-1} \alpha_k \le 1$. This shows that $x \in \operatorname{conv}(M_f)$.

Subcase (v): Finally we have to treat the last remaining subcase, where $x_n > 0$. We shall write $x$ in the form $x = y + z$ where $z = x_n v_n$ and so $y = x - x_n v_n = (x_1 - x_n, x_2 - x_n, \dots, x_{n-1} - x_n, 0)$. Let $w = (w_1, w_2, \dots, w_{n-1}, 0)$ be the vector $w = \frac{1}{1 - x_n} y$. We claim that $w \in F_*$. To check this first note that $0 \le w_j = \frac{x_j - x_n}{1 - x_n} \le \frac{x_1 - x_n}{1 - x_n} \le 1$ for each $j$. Then we have the following sequence of inequalities, where we shall use the fact that $x \in F_*$ in the second line:
\[
\begin{aligned}
\sum_{j=1}^{n} w_j = \sum_{j=1}^{n-1} w_j &= \frac{1}{1 - x_n} \sum_{j=1}^{n-1} (x_j - x_n) = \frac{1}{1 - x_n} \Bigl( \sum_{j=1}^{n-1} x_j - (n-1) x_n \Bigr) \\
&= \frac{1}{1 - x_n} \Bigl( \sum_{j=1}^{n} x_j - n x_n \Bigr) \le \frac{1}{1 - x_n} ( f x_n + n - f - n x_n ) \\
&= \frac{(n - f)(1 - x_n)}{1 - x_n} = n - f .
\end{aligned}
\]
Since $w_n = 0$ this is exactly what we need to show that $w \in F_*$. Also, again using the fact that $w_n = 0$, we see, from the previous case, that $w \in \operatorname{conv}(M_f)$. Finally we express $x$ as a convex combination $x = (1 - x_n) w + x_n v_n$. Since both $w$ and $v_n$ are in $\operatorname{conv}(M_f)$, so is $x$. □
Combining Theorem 1.7 with Lemma 1.8 we obtain at once the following result.

THEOREM 1.10. Assume that $G \subset S_n$ satisfies condition (∗) in Theorem 1.7 and, moreover, that $\operatorname{spec} G = \{0, 1, \dots, n - f, n\}$. Then a vector $x = (x_1, \dots, x_n)$ belongs to the convex hull of all diagonals of $G$-doubly stochastic matrices if and only if condition (58) in Lemma 1.8 is fulfilled.
1.4. $G$-variations on HLP. Fix $n \in \mathbb{N}$ and consider any subgroup $G \le S_n$. Take some vectors $a = (a_1, \dots, a_n)$ and $b = (b_1, \dots, b_n)$ with real non-negative components.

DEFINITION 1.11. The vector $a$ is said to be a $G$-average of $b$ if there exists a matrix $X \in \Omega_n(G)$ such that $a = bX$.

DEFINITION 1.12. The polynomial
\[
[a]_G := \frac{1}{|G|} \sum_{\sigma \in G} x_1^{a_{\sigma(1)}} \cdots x_n^{a_{\sigma(n)}}
\]
is called the symmetric $G$-mean of $a$.

The following two examples are well known:
\[
[(1, 0, \dots, 0)]_{S_n} = \frac{1}{n} (x_1 + \dots + x_n)
\]
and
\[
\Bigl[ \Bigl( \frac{1}{n}, \dots, \frac{1}{n} \Bigr) \Bigr]_{S_n} = \sqrt[n]{x_1 x_2 \cdots x_n} .
\]
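The symmetric $G$-mean is straightforward to evaluate numerically, which makes the two examples above easy to check. In the sketch below the function name `g_mean`, the 0-based tuple representation of permutations, and the sample points are all illustrative choices. The final lines also probe the inequality $[a]_G \le [b]_G$ for a vector $a$ taken from the convex hull of $\{b^\sigma \mid \sigma \in G\}$ with $G$ the cyclic group of order 3; this is a numerical spot check at a few points, not a proof.

```python
import math
from itertools import permutations


def g_mean(a, G, x):
    """Evaluate [a]_G at x: the average over sigma in G of prod_i x_i ** a[sigma(i)]."""
    return sum(
        math.prod(xi ** a[s[i]] for i, xi in enumerate(x)) for s in G
    ) / len(G)


S3 = list(permutations(range(3)))
x = (1.0, 2.0, 4.0)

arithmetic = g_mean((1, 0, 0), S3, x)           # (x1 + x2 + x3) / 3
geometric = g_mean((1 / 3, 1 / 3, 1 / 3), S3, x)  # (x1 * x2 * x3) ** (1/3)

# A vector a in the convex hull of {b^sigma : sigma in C3}.
C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
b = (3.0, 1.0, 0.0)
a = tuple(0.5 * b[i] + 0.5 * b[C3[1][i]] for i in range(3))
for point in [(1.0, 2.0, 4.0), (0.5, 3.0, 1.0), (2.0, 2.0, 2.0)]:
    assert g_mean(a, C3, point) <= g_mean(b, C3, point) + 1e-9
```

Here `a` is the midpoint of $b$ and one of its cyclic rearrangements, so it is a $G$-average of $b$ in the sense of Definition 1.11.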
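The following fact is true.

THEOREM 1.13. Let $a = (a_i)$ and $b = (b_i)$ be some vectors in $\mathbb{R}^n$ with non-negative components. The condition $[a]_G \le [b]_G$ holds for all real $x_i \ge 0$ if and only if $a$ is a $G$-average of $b$.

PROOF. SUFFICIENCY. We can modify the scheme employed in [10]. We use the notation
\[
y = (\ln x_1, \dots, \ln x_n); \qquad (c, z) = \sum_{i=1}^{n} c_i z_i
\]
for any vectors $c = (c_i)$ and $z = (z_i)$. Then we find
\[
\begin{aligned}
[a]_G &= |G|^{-1} \cdot \sum_{\sigma \in G} \prod_{i=1}^{n} x_i^{a_{\sigma(i)}}
= |G|^{-1} \cdot \sum_{\sigma \in G} \exp \Bigl( \sum_{i=1}^{n} a_{\sigma(i)} \ln x_i \Bigr) \\
&= |G|^{-1} \cdot \sum_{\sigma \in G} \exp \bigl( (a_{\sigma(1)}, \dots, a_{\sigma(n)}), y \bigr)
= |G|^{-1} \cdot \sum_{\sigma \in G} \exp (a P_\sigma, y) .
\end{aligned}
\]
As $a$ is a $G$-average of $b$ there exists a matrix $X \in \Omega_n(G)$ such that $a = bX$. Let
\[
X = \sum_{\pi \in G} \lambda_\pi P_\pi , \qquad \lambda_\pi \ge 0, \quad \sum_{\pi} \lambda_\pi = 1 .
\]
It follows that
\[
(a P_\sigma, y) = \Bigl( b \sum_{\pi \in G} \lambda_\pi P_\pi P_\sigma , \, y \Bigr) .
\]
Using the convexity of the exponential function we get
\[
\exp (a P_\sigma, y) = \exp \Bigl( \sum_{\pi \in G} \lambda_\pi (b P_{\pi\sigma}, y) \Bigr) \le \sum_{\pi \in G} \lambda_\pi \exp (b P_{\pi\sigma}, y) .
\]
Therefore, we obtain
\[
\begin{aligned}
|G| \cdot [a]_G = \sum_{\sigma \in G} \exp (a P_\sigma, y)
&\le \sum_{\sigma \in G} \sum_{\pi \in G} \lambda_\pi \exp (b P_{\pi\sigma}, y)
= \sum_{\pi \in G} \lambda_\pi \sum_{\sigma \in G} \exp (b P_{\pi\sigma}, y) \\
&= \sum_{\pi \in G} \lambda_\pi \sum_{\gamma \in G} \exp (b P_\gamma, y)
= \Bigl( \sum_{\pi} \lambda_\pi \Bigr) \cdot |G| \cdot [b]_G = |G| \cdot [b]_G .
\end{aligned}
\]
The needed inequality $[a]_G \le [b]_G$ follows. □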
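To prove the necessity part of the theorem we use the following result by R. Rado [26].

THEOREM 1.14 (Rado, 1952). For given vectors $a = (a_i)$ and $b = (b_i)$ in $\mathbb{R}^n$ with all their components non-negative and for any subgroup $G \le S_n$ it holds $[a]_G \le [b]_G$ if and only if $a$ belongs to the convex hull of the set $\{b^\sigma \mid \sigma \in G\}$.

Here the following notation is used: for $b = (b_1, \dots, b_n)$ one writes $b^\sigma = (b_{\sigma(1)}, \dots, b_{\sigma(n)}) \in \mathbb{R}^n$. Denote further by $K_G(b)$ the convex hull of the set $\{b^\pi \mid \pi \in G\}$. It remains to prove that $a \in K_G(b)$ is the same as saying that $a$ is a $G$-average of $b$. Indeed, if $a = bX$ for some matrix $X \in \Omega_n(G)$, then, representing $X$ as
\[
X = \sum_{\pi \in G} t_\pi P_\pi , \qquad t_\pi \ge 0, \quad \sum_{\pi} t_\pi = 1 ,
\]
we get
\[
a = b \Bigl( \sum_{\pi \in G} t_\pi P_\pi \Bigr) = \sum_{\pi \in G} t_\pi (b P_\pi) = \sum_{\pi \in G} t_\pi b^\pi \in K_G(b) .
\]
In the other direction, if $a \in K_G(b)$, then for some $\lambda_\sigma \ge 0$, $\sum_{\sigma} \lambda_\sigma = 1$, we have
\[
a = \sum_{\sigma \in G} \lambda_\sigma b^\sigma
\]
and therefore
\[
a = \sum_{\sigma \in G} \lambda_\sigma b^\sigma = \sum_{\sigma \in G} \lambda_\sigma (b P_\sigma) = b \Bigl( \sum_{\sigma \in G} \lambda_\sigma P_\sigma \Bigr) = bX , \qquad X = \sum_{\sigma \in G} \lambda_\sigma P_\sigma ,
\]
with $X$ belonging to $\Omega_n(G)$. This means that $a$ is a $G$-average of $b$.

PROOF OF THEOREM 1.14 ([26]). Let there hold $[a]_G \le [b]_G$ and, arguing by contradiction, suppose that $a \notin K_G(b)$. Then it follows (see [26]) that
\[
\exists \, u_i \in \mathbb{R} \ (i \in \mathbf{n}), \ \delta > 0 : \qquad \sum_{i} u_i (a_i - b_{\tau(i)}) \ge \delta \quad \text{for all } \tau \in G .
\]
Take any number $M > 1$ and set $x_i = M^{u_i}$. Then we have
\[
\begin{aligned}
|G| \cdot [b]_G = \sum_{\tau \in G} \prod_{i=1}^{n} x_i^{b_{\tau(i)}}
&= \sum_{\tau \in G} M^{\sum_i u_i b_{\tau(i)}}
\le \sum_{\tau \in G} M^{\sum_i u_i a_i - \delta} \\
&\le |G| \cdot M^{-\delta} \cdot \Bigl[ \prod_{i=1}^{n} (M^{u_i})^{a_i} + \sum_{\tau \ne \varepsilon} \prod_{i=1}^{n} (M^{u_i})^{a_{\tau(i)}} \Bigr]
= |G| \cdot M^{-\delta} \cdot |G| [a]_G .
\end{aligned}
\]
Taking here $\ln M > \frac{\ln |G|}{\delta}$ we get
\[
|G| \cdot M^{-\delta} \cdot |G| [a]_G < |G| [a]_G .
\]
Hence, it follows that $[b]_G < [a]_G$, which contradicts $[a]_G \le [b]_G$.

Suppose now that $a \in K_G(b)$. Then there exist real numbers $t_\pi \ge 0$, $\sum_\pi t_\pi = 1$, such that $a = \sum_\pi t_\pi b^\pi$. This implies that
\[
a_j = \sum_{\pi \in G} t_\pi b_{\pi(j)} .
\]
Then
\[
\begin{aligned}
[a]_G = \frac{1}{|G|} \sum_{\sigma \in G} \prod_{i=1}^{n} x_i^{a_{\sigma(i)}}
&= \frac{1}{|G|} \sum_{\sigma \in G} \prod_{i=1}^{n} x_i^{\sum_{\pi} t_\pi b_{\pi\sigma(i)}}
= \frac{1}{|G|} \sum_{\sigma \in G} \prod_{\pi \in G} \Bigl( \prod_{i=1}^{n} x_i^{b_{\pi\sigma(i)}} \Bigr)^{t_\pi} \\
&\le \frac{1}{|G|} \sum_{\sigma \in G} \sum_{\pi \in G} t_\pi \prod_{i=1}^{n} x_i^{b_{\pi\sigma(i)}}
= \sum_{\pi \in G} t_\pi \cdot [b]_G = [b]_G .
\end{aligned}
\]
The inequality in this calculation follows from the generalized version of the arithmetic-geometric mean inequality [9]: for $\alpha_i \ge 0$, $\sum_i \alpha_i = 1$ and $x_j$ non-negative it holds
\[
x_1^{\alpha_1} x_2^{\alpha_2} \cdots x_n^{\alpha_n} \le \alpha_1 x_1 + \alpha_2 x_2 + \dots + \alpha_n x_n .
\]
As a result we obtain
\[
[a]_G \le [b]_G ,
\]
as needed. □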
1.5. Appendix. Research plan of the project "Groups and inequalities with applications to combinatorics and optimization"

Introduction. History, motivation, examples.
A. History: I. Schur (1923); Hardy-Littlewood-Pólya (1929); G. Birkhoff–J. von Neumann (1946); A. Ostrowski and R. Rado (50's); L. Mirsky (60's); G. Egorychev et al. (1980); R. Brualdi (1970-1990).
B. Motivation:
(a) Majorization: Marshall-Olkin book, examples using the relation ≺ in combinatorics; T. Ando's lectures on majorization, preprint (1990, 'old' version) of the lecture notes & the new version (T. Ando, Lin. Alg. Appl. 1994).
(b) Generalized majorization. Peetre's preprint (1985).
(c) Discrete optimization on $S_n$ and its subsets. Marshall-Olkin's examples of combinatorial and discrete optimization through majorization theory methods. The results of H. Ryser et al. revisited. Minsk seminar (70's and 80's). Vershik - Barvinok (1990).
(d) Polytope algebra. Lattice-theoretic generalizations. Permutohedron and superconductivity. McMullen's polytope algebra I, II. Valuations: Geissinger, Rota, Lovász, etc. K. Fan and S. Sternberg's results on Bruhat order and superconductivity.
(e) H. Ryser's problem and M. Hall's problem on (0, 1)-matrices. Infinite extensions. Problems. H. Ryser's survey; L. Skornyakov on ∞ versions; lattice theoretic versions; L. Lovász et al.

Part I. $\Omega_n(G)$: group theoretic variations on 'bistochastic' themes.
A. Multiplicative structure of $\Omega_n(G)$
(a) Elements of finite order in the monoid $\Omega_n(G)(\cdot)$
(b) Subgroups of $\Omega_n(G)(\cdot)$ via D. Farkas' paper
(c) Units in the group rings $\mathbb{Z}S_3$, $\mathbb{Z}D_4$, . . . via Hughes-Pearson; . . .
(d) On the algebra structure of the monoid $\Omega_n(G)$. Eastwood & Munn ($B = S_n$)
B. Giving $\Omega_n(G)$ by a few inequalities
(a) The spectrum of a group $G$, $G \le S_n$. A theorem on the inequalities for $X \in \Omega_n(G)$. Corollaries: Results of L. Mirsky and A. Cruse. New counterexamples to Mirsky's conjecture.
(b) A criterion for the diagonals
C. On a problem on finite simple groups: the (second) main theorem (THEOREM 2) on the classification of groups (using FGC).
D. Infinite extensions of results on the diagonals of bistochastic $G$-matrices, $G \le S_\infty$ (substitutions displacing finitely many symbols only).
E. $G$-majorizations
(a) $G$-version of the HLP-theorem. $T_G$-transformations
(b) $\bar a \le \bar b$ for any $G \le S_n$; the Dirichlet polytope
(c) . . .
(d) . . .
Part II. G-permanents. A. A new solution of the van der Waerden inequality via L. Gårding’s inequality for hyperbolic polynomials. – Peetre’s preprint [22]. B. . . . (a) T HEOREM 3. G-extension of P. Hall’s marriage theorem. (b) T HEOREM 4. G-extension of the Frobenius-König theorem. (c) Egorychev-like proof of the (extended) G-version of the van der Waerden problem – via Peetre’s preprint (d) T HEOREM 5. Algebraic structure of the McMullen polytope algebra for Ωn (G). A (new?) Molien series for Ωn (G). Part III. Applications. A. Kaplansky-Riordan theory revisited: G-extension via Moebius inversion associated to the restriction matrix. B. Bruhat order on G & Sn and (possible permutahedron results for Ωn (G). Applications to superconductivity (K. Fan and S. Sternberg). On a problem of H. Ryser on (0, 1)-matrices. C. The M. Hall theorem on permanents of (0, 1)-matrices. G-extension. [1],[4],[3], [6],[11],[15],[16],[17], [20],[23],[25],[29],[30], [31],[32],[18],[34] References [1]
[2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16]
A. I. Barvinok and A. M. Vershik. Methods of representation theory in combinatorial optimization problems. Izv. Akad. Nauk SSSR, ser. Tekhn. Kibernet. 6, 1988, 64–71. English translation: Soviet J. Comput. Systems Sci. 27 (5), 1989, 1–7. J. von Below. On a theorem of L. Mirsky on even doubly stochastic matrices. Discrete Math. 55 (3), 1985, 311–312. N. Biggs. Finite groups of automorphisms. London Mathematical Society Lecture Notes Series, 6. Cambridge Univ. Press, 1971. T. Bonnesen and W. Fenchel. Theorie der konvexen Körper. In: Erg. Math. u. ihrer Grenzgebiete, 3, No. 1. Springer, Berlin, 1934. A. Cruse. A note on symmetric doubly stochastic matrices. Discrete Math. 13, 1976, 109–119. A. Cruse. On removing a vertex from the assignment polytope. Linear Algebra Appl. 26, 1979, 45–57. R. Brualdi and B. Liu. The polytope of even doubly stochastic matrices. J. Combin. Theory Ser. A 57 (2), 1991, 243–253. D. Eastwood and W.D. Munn. On semigroups with involution. Bull. Aust. Math. Soc. 48, 1993, 93–100. G. H. Hardy, G. Littlewood, and G. Pólya. Inequalities. Cambridge Univ. Press, Cambridge, 1934. L. Harper and G.-C. Rota. Matching theory, an introduction. Advances in Probability Theory 1, 1971, 169–215. A. Horn. Doubly stochastic matrices and the diagonal of a rotation matrix. Amer. J. Math. 76, 1954, 620–630. M. Katz. On the extreme points of a certain convex polytope. J. Comb. Theory 8, 1970, 417–423. A. W. J. Kolen and J. K. Lenstra. Combinatorics in operator research. In: Handbook of combinatorics, Chap. 35. Elsevier, Amsterdam, 1995. J. H. van Lint. The Van der Waerden Conjecture: two proofs in a year. Math. Intell. 4, 1982, 72–77. M. Marcus and H. Minc. A survey of matrix theory and matrix inequalities. Allyn and Bacon, Inc., Boston, 1964. A. W. Marshall and I. Olkin. Inequalities, theory of majorization and its applications. Aacdemic Press, New York, 1979.
[17] P. McMullen and G. C. Shephard. Convex polytopes and the upper bound conjecture. London Mathematical Society Lecture Notes Series, 3. Cambridge Univ. Press, 1971.
[18] H. Minc. Non negative matrices. London Mathematical Society Lecture Notes Series, 3. John Wiley, New York, 1988.
[19] L. Mirsky. Even doubly stochastic matrices. Math. Ann. 144, 1961, 418–421.
[20] L. Mirsky. Results and problems in the theory of doubly stochastic matrices. Z. Wahrscheinlichkeitstheorie 1, 1963, 319–334.
[21] A. Ostrowski. Sur quelques applications des fonctions convexes et concaves au sens de I. Schur. J. Math. Pures Appl., IX. Ser. 31, 1952, 253–292.
[22] J. Peetre. Van der Waerden’s conjecture and hyperbolicity. Technical Report LTH 1981:9. Lunds Universitet, Lund, 1981. Reprinted in this Volume.
[23] J. Peetre. On generalized majorization. Technical Report LTH 1985:2. Lunds Universitet, Lund, 1985. Reprinted in this Volume.
[24] L. I. Polotskiĭ, M. V. Saphir, and L. A. Skornyakov. Convex combinations of infinite permutation matrices. Acta Sci. Math. 51, 1987, 185–189.
[25] D. G. Poole. The stochastic group. Amer. Math. Monthly 102, 1995, 798–801.
[26] R. Rado. An inequality. J. London Math. Soc. 27, 1952, 1–6.
[27] A. Rämmer. On minimizing matrices. In: Proc. of the First Est. Conf. on Graphs and Appl. (Tartu–Kääriku). Tartu Univ. Press, Tartu, 1991, 121–134.
[28] A. Rämmer. On even doubly stochastic matrices with minimal even permanent. Acta Comm. Univ. Tartuensis 878, 1990, 103–114.
[29] J. V. Ryff. On the representation of doubly stochastic operators. Pac. J. Math. 13, 1963, 1379–1386.
[30] J. V. Ryff. Orbits of L1-functions under doubly stochastic transformations. Tr. Am. Math. Soc. 117, 1965, 92–100.
[31] J. V. Ryff. Majorized functions and measures. Nederl. Akad. Wetensch. Proc. Ser. A 71 = Indag. Math. 30, 1968, 431–437.
[32] J. V. Ryff. Extreme points of some convex subsets of L1(0, 1). Proc. Am. Math. Soc. 18, 1967, 1026–1034.
[33] I. Schur. Über eine Klasse von Mittelbildungen mit Anwendungen auf die Determinantentheorie. Sitzungsber. Berl. Math. Gesell. 22, 1923, 9–20.
[34] G. Ziegler. Lectures on polytopes. Graduate Texts in Mathematics, 152. Springer-Verlag, New York, 1995.
2. Van der Waerden’s conjecture and hyperbolicity
by J. Peetre⁴
Introduction. Van der Waerden’s conjecture says that, if $A = (a_{ik})$ is an $n \times n$ doubly stochastic matrix, then
\[ \operatorname{per} A \ge \frac{n!}{n^n}, \]
with equality if and only if $A = J \overset{\text{def}}{=} \left(\frac{1}{n}\right)$ (see definitions infra). It has been settled independently by two Soviet mathematicians, Egorychev [4] and Falikman [5]; the latter, however, proves only the inequality without discussing the case of equality. An analysis of Egorychev’s proof by van Lint [13] has also appeared.⁵ It is interesting that both Egorychev and Falikman at least implicitly invoke hyperbolic quadratic forms (Lorentz forms). In this note we wish to further clarify the role of hyperbolicity in this context, the main point being the simple observation that the permanent as a function of the rows (or columns) of the matrix is the complete polarization of a certain hyperbolic (in the sense of Gårding) polynomial, viz. the polynomial $P(x) = n!\, x_1 \cdots x_n$. Since Falikman’s proof at least has not yet appeared in translation we reproduce its main features below (Section 2.2). We also indicate a few minor simplifications of Egorychev’s proof based on Falikman’s lower bound (Section 2.3). Therefore we have, in fact, here a proof which is “almost self-contained”, that is, modulo the only remaining purely combinatorial element, the celebrated Frobenius-König theorem (see [12, Chapter 3]), and the circumstance that we have not bothered to reproduce some reasoning which we otherwise would have taken over verbatim from [5] or [13].
Notation. If $A = (a_{ik})$ is an $n \times n$ matrix then its permanent is defined as
\[ \operatorname{per} A = \sum_{\sigma} a_{1\sigma(1)} \cdots a_{n\sigma(n)} \]
(summation over all permutations $\sigma$ of $\{1, \dots, n\}$). If we consider it as a function of the rows $x_1, \dots, x_n$ of $A$ we write $\operatorname{per}(x_1, \dots, x_n)$. Notice that
\[ \operatorname{per}(e_{\sigma(1)}, \dots, e_{\sigma(n)}) = \begin{cases} 1 & \text{if } \sigma \text{ is a permutation,} \\ 0 & \text{if not,} \end{cases} \]
which relation essentially characterizes the permanent function.⁶ A matrix $A = (a_{ik})$ is called doubly stochastic if $\sum_k a_{ik} = 1 = \sum_i a_{ik}$, $a_{ik} \ge 0$. The set of all doubly stochastic matrices will be denoted $\Omega$ (it is a convex subset of $\mathbb{R}^{n^2}$, $\dim \Omega = (n-1)^2$). The “interior” of $\Omega$ will be denoted by $\Omega^*$ and its “boundary” by $\partial\Omega$ ($= \Omega \setminus \Omega^*$). Every permutation matrix is in $\partial\Omega$. Also $J = \left(\frac{1}{n}\right)$ is in $\Omega^*$.
Attention: Sometimes $x_i$ means the $i$-th component of the vector $x = (x_1, \dots, x_n)$ but sometimes it is the $i$-th member of the family of vectors $\{x_1, \dots, x_m\}$.
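The definitions above can be tried out on small matrices. The following is a minimal sketch, not part of the original report: the function names `permanent` and `unit_vector` are hypothetical, and the permanent is computed naively from the defining sum over permutations, which is only practical for small $n$.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def permanent(rows):
    """Permanent of a square matrix (list of rows): sum over all permutations."""
    n = len(rows)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= rows[i][sigma[i]]
        total += term
    return total


def unit_vector(k, n):
    """Standard basis vector e_k of length n (0-based index k)."""
    return [1 if j == k else 0 for j in range(n)]


n = 4
# per(e_{s(1)}, ..., e_{s(n)}) is 1 when s is a permutation ...
assert permanent([unit_vector(k, n) for k in (2, 0, 3, 1)]) == 1
# ... and 0 when some index is repeated.
assert permanent([unit_vector(k, n) for k in (2, 0, 2, 1)]) == 0

# J = (1/n) is doubly stochastic; exact arithmetic gives per J = n!/n^n.
J = [[Fraction(1, n)] * n for _ in range(n)]
per_J = permanent(J)
assert per_J == Fraction(factorial(n), n ** n)
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison with $n!/n^n$ is not blurred by rounding.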
⁴ Report LTH 1981:9, Lund, 1981. Reprint.
⁵ Egorychev’s paper [4] has not been available to us; we know of its contents only through van Lint [13].
⁶ The standard reference for permanent theory is Minc’s book [12].
2.1. Hyperbolicity
Hyperbolic polynomials were introduced by Gårding [7] in 1950 in the context of Cauchy’s problem for linear partial differential equations. Their main characteristics in purely algebraic terms are summarized in his beautiful paper [6] (see also Hörmander’s book [8, Chapter 5] and Beckenbach-Bellman [2, §§ 36–39]).
Let $P(x_1, \dots, x_m)$ be a real symmetric $m$-linear form in $\mathbb{R}^n$. If all the arguments are equal we write $P(x) = P(x, \dots, x)$. Thus $P(x)$ is a homogeneous polynomial of degree $m$ which uniquely determines $P(x_1, \dots, x_m)$. One says that $P(x_1, \dots, x_m)$ is the complete polarization of $P(x)$. Let $a$ be any non-zero vector in $\mathbb{R}^n$.
DEFINITION 2.1. $P$ is hyperbolic with respect to $a$ (or $a$ is hyperbolic with respect to $P$) if $P(a) > 0$ and if, further, for any $x$ in $\mathbb{R}^n$, the polynomial $P(sa + x)$ in one variable $s$ has $m$ distinct roots. That is, one has the factorization
\[ P(sa + x) = c \prod_{j} \bigl(s + \lambda_j(x, P)\bigr), \]
where $c > 0$ and $\lambda_1(x, P) < \lambda_2(x, P) < \cdots < \lambda_m(x, P)$.⁷
If $P$ is hyperbolic with respect to $a$, let us introduce the set
\[ C(a, P) \overset{\text{def}}{=} \{x \mid \forall j \ \lambda_j(x, P) > 0\}. \]
The main properties of hyperbolic polynomials can be summarized in the following theorem.
THEOREM 2.2. $C(a, P)$ is an open convex cone in $\mathbb{R}^n$, in fact, as a set equal to the connected component of $\{P \ne 0\}$ that contains the vector $a$. The vector $b$ is hyperbolic with respect to $P$ for any $b \in C(a, P)$; then, in particular, $C(b, P) = C(a, P)$.
For the proof we refer to Gårding’s paper [6]. Here we shall only need the following.
COROLLARY 2.3. If $b_1, \dots, b_k$ are any $k$ vectors in $C(a, P)$ ($0 < k < n$) then the “partial” polarization
\[ Q(x) \overset{\text{def}}{=} P(\underbrace{x, x, \dots, x}_{n-k \text{ times}}, b_1, \dots, b_k) \]
is hyperbolic with respect to any vector in $C(a, P)$.
PROOF. By induction it suffices to consider the case $k = 1$, $b_1 = a$. That is, we shall prove that
\[ Q(x) \overset{\text{def}}{=} P(\underbrace{x, x, \dots, x}_{n-1 \text{ times}}, a) \]
is hyperbolic throughout $C(a, P)$. We have the formula
\[ \frac{d}{ds} P(sa + x) = m P(sa + x, \dots, sa + x, a) = m Q(sa + x). \]
It follows from Rolle’s theorem that for any $x$ all the roots of $Q(sa + x)$ are real and, in fact, separated by the roots of $P(sa + x)$:
\[ \lambda_1(x, P) < \lambda_1(x, Q) < \lambda_2(x, P) < \lambda_2(x, Q) < \dots \tag{65} \]
⁷ Editors’ Note. It is advantageous to interpret the relation t = sa + x geometrically as a straight line in the t, x plane with direction vector a. Then we are dealing with the intersection of this line with the variety {P (t) = 0}. For hyperbolicity of non-homogeneous polynomials, see [7, 8].
Thus $Q$ is hyperbolic with respect to $a$. Also (65) shows that $C(a, Q) = C(a, P)$, so that, moreover, by Theorem 2.2 $Q$ is hyperbolic with respect to any element of $C(a, P)$. □
Let us consider some examples of hyperbolic polynomials.
EXAMPLE 2.1. $m = 2$, $P = x_1^2 - x_2^2 - \cdots - x_n^2$ (Lorentz form). This is the canonical example, because every other hyperbolic quadratic form can be written in this way after a linear change of variables. As is well-known this example is of fundamental importance for special relativity. Hyperbolic vectors are now called time-like, those on the conical surface $x_1^2 - x_2^2 - \cdots - x_n^2 = 0$ light-like, all other vectors ($\ne 0$) being termed space-like. □
EXAMPLE 2.2. $m = n$, $P = n!\, x_1 x_2 \cdots x_n$. Every positive vector is hyperbolic. As already mentioned in the Introduction, the complete polarization is now the permanent function
\[ \operatorname{per} A = \operatorname{per}(x_1, x_2, \dots, x_n) = P(x_1, x_2, \dots, x_n) \]
if $A$ is a matrix with rows $x_1, x_2, \dots, x_n$. □
REMARK 2.4. In relation to the permanent used in Example 2.2 let us put forward the following observation. Curiously enough the interpretation of the permanent as a multilinear form does not seem to be explicitly mentioned in [12]. On p. 103 there is reproduced Muir’s formula
\[ \operatorname{per} A \; \iota_1 \cdots \iota_n = \prod_{i=1}^{n} \sum_{k=1}^{n} a_{ik}\, \iota_k, \]
where the $\iota_k$ are generators of a commutative associative algebra such that $\iota_k^2 = 0$. Of course, a similar thing can be done with any polynomial (hyperbolic or not). If $P(x_1, \dots, x_m) = \sum a_{\alpha_1 \dots \alpha_m} x_{1\alpha_1} \cdots x_{m\alpha_m}$ then
\[ P(x_1, \dots, x_m)\, \iota_1 \cdots \iota_m = \prod_{i=1}^{m} \sum_{\alpha=1}^{n} x_{i\alpha}\, \iota_\alpha \]
with $\iota_{\alpha_1} \cdots \iota_{\alpha_m} = a_{\alpha_1 \dots \alpha_m}$. Cf. Dirac’s introduction of the Dirac matrices etc. □
EXAMPLE 2.3. Take $n = \frac{\nu(\nu + 1)}{2}$ and identify $\mathbb{R}^n$ with the set of all symmetric $\nu \times \nu$ matrices. Define $P(x) = \det(x_{ik})$. Then $P$ is hyperbolic with respect to any positive definite matrix. □
EXAMPLE 2.3 BIS. Analogous example with Hermitian matrices. □
An elementary fact about hyperbolic quadratic forms is the “reverse Cauchy inequality”:
\[ P(x, y) \ge \sqrt{P(x)}\, \sqrt{P(y)}, \tag{66} \]
valid for time-like vectors with equality if and only if $x = y$. In [6] Gårding generalized (66) by proving for hyperbolic polynomials $P$ an inequality of the type
\[ P(x_1, x_2, \dots, x_m) \ge (P(x_1))^{\frac{1}{m}} \cdots (P(x_m))^{\frac{1}{m}} \qquad \text{(Gårding’s inequality)} \tag{67} \]
The special case of (67) corresponding to Example 2.3 is due to Aleksandrov [1] (apparently rediscovered by Chern [3]). It is just Aleksandrov’s inequality that presumably is used in Egorychev’s paper [4] and which van Lint [13] manages to replace
by the more elementary inequality (66). In the case of the permanent (Example 2.2), however, (67) gives a trivial result, viz. the inequality
\[ \operatorname{per} A \ge n!\, (\Pi(A))^{\frac{1}{n}} \quad \text{with } \Pi(A) = \prod_{i,k} a_{ik}. \]
Now let us record for reference that the corollary to Theorem 2.2 gives the following result.
LEMMA 2.5. If $A$ is a positive $n \times n$ matrix, then fixing any $n - 2$ rows the permanent as a function of the remaining two rows is a hyperbolic quadratic form. In fact, the same conclusion remains in force if we only know that these $n - 2$ rows are positive.
One can also easily give a direct proof, as is done in [5] and [13]. So one can rightly ask if it is really worth while to make this detour via Gårding’s rather sophisticated theory. Our point is that we hope that, in putting Van der Waerden’s conjecture into this wider frame, ultimately perhaps something more will be revealed about its true nature (cf. Section 2.4).
Finally, likewise for reference, we state the following simple fact which, in fact, characterizes hyperbolic quadratic forms.
LEMMA 2.6. Let $P(x, y)$ be a hyperbolic quadratic form. If $x$ and $y$ are any two vectors such that $P(x) > 0$, $P(x, y) = 0$, then also $P(y) < 0$ unless $y = 0$.
PROOF. This follows most conveniently just upon applying (66). But conversely (66) can be obtained from the lemma. The direct proof goes as follows: Pick a basis such that $x = (1, 0, \dots, 0)$ and $P$ is “in normal form”, $P(x) = x_1^2 - x_2^2 - \cdots - x_n^2$. Then $P(x, y) = 0$ gives $y_1 = 0$ so that $P(y) = -y_2^2 - \cdots - y_n^2 < 0$ provided $y \ne 0$. □
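As a numerical illustration of Lemma 2.5 and of the reverse Cauchy inequality in squared form, $P(x, y)^2 \ge P(x) P(y)$, one can sample integer rows and evaluate the permanent exactly. This is only a sketch under stated assumptions: the helper names `permanent` and `mixed_form` are hypothetical, the $n - 2$ fixed rows and $x$ are taken positive, and $y$ is an arbitrary integer vector (for $P(y) \le 0$ the squared inequality holds trivially).

```python
import random
from itertools import permutations


def permanent(rows):
    """Permanent of a square matrix (list of rows): sum over all permutations."""
    n = len(rows)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= rows[i][sigma[i]]
        total += term
    return total


def mixed_form(x, y, fixed_rows):
    """P(x, y) = per(x, y, x_3, ..., x_n) with the rows x_3, ..., x_n held fixed."""
    return permanent([x, y] + fixed_rows)


rng = random.Random(1)
n = 4
violations = 0
for _ in range(200):
    fixed_rows = [[rng.randint(1, 9) for _ in range(n)] for _ in range(n - 2)]
    x = [rng.randint(1, 9) for _ in range(n)]     # positive row, hence P(x) > 0
    y = [rng.randint(-9, 9) for _ in range(n)]    # arbitrary integer row
    lhs = mixed_form(x, y, fixed_rows) ** 2
    rhs = mixed_form(x, x, fixed_rows) * mixed_form(y, y, fixed_rows)
    if lhs < rhs:
        violations += 1
```

Integer entries keep every comparison exact, so a reported violation could not be a rounding artifact.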
2.2. Analysis of Falikman’s proof
The main difficulty in Van der Waerden’s conjecture has, throughout the years, been the treatment of the “boundary points” ($A \in \partial\Omega$). For instance, in the fundamental paper of Marcus and Newman [10] (see Minc [12], notably Chapter 5, Section 1) it is shown that if $A$ is an “interior” minimizing matrix ($A \in \Omega^*$) then by necessity $A = J$. In the same paper it is also shown that if $A$ is any “interior” minimizing matrix, then $\operatorname{per} A_{ik} = \operatorname{per} A$ provided $a_{ik} > 0$, where $A_{ik}$ denotes the $(n - 1) \times (n - 1)$ matrix obtained by deleting the $i$-th row and $k$-th column. The proof is quite simple and is based on an application of Lagrange multipliers (cf. Section 2.3, infra).
Falikman’s proof [5] parallels, at the outset at least, this proof by Marcus and Newman, although the author himself does not refer to it. The basic new idea is the introduction of, as is customary in optimization theory, a penalty function, viz.
\[ f(A) = f_\varepsilon(A) \overset{\text{def}}{=} \operatorname{per} A + \frac{\varepsilon}{\Pi(A)}, \qquad (A \in \Omega^*) \]
where $\varepsilon$ is a parameter ($> 0$), and, as in the end of Section 2.1, $\Pi(A) = \prod_{i,k} a_{ik}$. As $\Pi(A) \to 0$ when $A$ approaches the boundary $\partial\Omega = \Omega \setminus \Omega^*$ it is manifest that $f$ takes on a “minimum” at an interior point. Let thus $A \in \Omega^*$ be a matrix such that the minimum is
assumed. Then using Lagrange multipliers, or by a direct computation, which everybody familiar with the rudiments of calculus can do for himself,⁸ one finds
\[ p_{ik} - \frac{c}{a_{ik}} = \lambda_i + \mu_k, \tag{68} \]
where $\lambda_i$ and $\mu_k$ are the Lagrange multipliers, and where we have put
\[ p_{ik} = \operatorname{per} A_{ik}, \qquad c = \frac{\varepsilon}{\Pi(A)}. \]
CLAIM 2.7. All the $\lambda_i$ and all the $\mu_k$ are equal.
PROOF. If we multiply both members of (68) by $a_{ik}$ and sum over $k$ we get
\[ \lambda_i = b - \sum_k a_{ik} \mu_k \tag{69} \]
with $b = p - nc$, $p = \operatorname{per} A = \sum_k a_{ik} p_{ik}$. Similarly, summing over $i$, we find
\[ \mu_k = b - \sum_i a_{ik} \lambda_i. \tag{70} \]
If we substitute the expression for $\mu_k$ as given by (70) into formula (69) we get a relation of the form $\lambda_i = \sum_j b_{ij} \lambda_j$, where $b_{ij} \ge 0$, $\sum_j b_{ij} = 1 = \sum_i b_{ij}$. It is easy to see that $\lambda_1 = \cdots = \lambda_n = \lambda$. Similarly, we find $\mu_1 = \cdots = \mu_n = \mu$. □
REMARK 2.8. The argument (omitted!) leading to the above conclusion is but a special case of the Perron(-Frobenius) theorem on positive matrices. What is really going on becomes somewhat more transparent if we use matrix notation. Then (69) and (70) can be written as $\lambda = b\mathbf{1} - A\mu$ and $\mu = b\mathbf{1} - A^*\lambda$ respectively (remember that $A\mathbf{1} = \mathbf{1}$, since $A$ is doubly stochastic), that is, $\lambda = B\lambda$ with $B = AA^*$ positive. This gives again $\lambda = \text{const} \cdot \mathbf{1}$ (Perron’s theorem).
Note that from (69) and (70) now follows $b = \lambda + \mu$. We have thus proved (see (68)) that if $A \in \Omega^*$ is a critical point for $f_\varepsilon$, $\varepsilon$ arbitrary, then
\[ p_{ik} = b + \frac{c}{a_{ik}}. \tag{71} \]
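The Perron-type step in Remark 2.8 can be illustrated numerically. The sketch below is an illustration only: the matrix $A$ (half of $J$ plus half of a cyclic permutation matrix) and the helper names are assumptions chosen for the example, not taken from [5]. It checks that $B = AA^*$ is positive and doubly stochastic, and that repeated application of $B$ flattens a vector towards a constant vector, so that only constant vectors can satisfy $\lambda = B\lambda$.

```python
from fractions import Fraction


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    bt = transpose(b)
    return [[sum(p * q for p, q in zip(row, col)) for col in bt] for row in a]


def matvec(m, v):
    """Product of a matrix with a column vector."""
    return [sum(p * q for p, q in zip(row, v)) for row in m]


n = 4
# Hypothetical positive doubly stochastic matrix: A = J/2 + C/2, C a cyclic shift.
A = [[Fraction(1, 2 * n) + (Fraction(1, 2) if k == (i + 1) % n else 0)
      for k in range(n)] for i in range(n)]
B = matmul(A, transpose(A))          # B = A A*, as in Remark 2.8

row_sums = [sum(row) for row in B]
col_sums = [sum(col) for col in zip(*B)]

# Repeated averaging by B shrinks the spread of any vector.
v = [Fraction(t) for t in (3, -1, 4, 1)]
for _ in range(20):
    v = matvec(B, v)
spread = max(v) - min(v)
```

The sum of the entries of `v` is preserved at every step because the columns of $B$ sum to one, so the limit is the constant vector whose entries equal the mean of the starting vector.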
The final step in the proof can now be condensed in the following lemma.
LEMMA 2.9. Assume that $A = (a_{ik}) \in \Omega^*$ with
\[ p_{ik} \overset{\text{def}}{=} \operatorname{per} A_{ik} = \varphi(a_{ik}) \tag{72} \]
where the function $\varphi$ is strictly decreasing or constant. Then by necessity $A = J = \left(\frac{1}{n}\right)$.
This lemma is thus, in particular, applicable if (71) holds with $c \ge 0$ (corresponding to $\varepsilon \ge 0$).
⁸ The “tangent space” of $\Omega$ (the “infinitesimal doubly stochastic matrices”) is generated by all matrices containing a submatrix of the type $\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$, all other entries being zero. This gives $f_{ik} - f_{i\ell} - f_{jk} + f_{j\ell} = 0$ (with $f_{ik} = \frac{\partial f}{\partial a_{ik}}$), whence readily $f_{ik} = \lambda_i + \mu_k$.
PROOF. It suffices to prove that any two rows, say, $x = x_1$ and $y = x_2$, are equal: $x = y$. We consider the quadratic form $P(x, y) = \operatorname{per}(x, y, x_3, \dots, x_n)$, which we know is hyperbolic (Lemma 2.5). Then by (72)
\[ x^i \overset{\text{def}}{=} P(x, e_i) = \varphi(y_i); \qquad y^i \overset{\text{def}}{=} P(y, e_i) = \varphi(x_i), \]
where $e_1, \dots, e_n$ is the standard basis in $\mathbb{R}^n$. Using a fancy language, the $x^i$ ($y^i$) are the contravariant coordinates of $x$ ($y$) and $P(x, y) = \sum x^i y_i = \sum x_i y^i$. Set $z = x - y$. Then similarly
\[ z^i \overset{\text{def}}{=} P(z, e_i) = \varphi(y_i) - \varphi(x_i). \]
Assuming first that $\varphi$ is strictly decreasing, we draw from this the important conclusion that $z^i \ge 0 \implies z_i \ge 0$ (and likewise with the inequalities reversed). Thus, in particular, $P(z) = \sum z^i z_i \ge 0$. Furthermore, since $x$ and $y$ are rows of a doubly stochastic matrix we have $\sum x_i = \sum y_i = 1$, whence $\sum z_i = 0$. It follows that we cannot have $z^i \ge 0$ for all $i$ unless $z = 0$. Therefore we can find a positive vector $c$ such that $\sum c_i z^i = 0$ or $P(c, z) = 0$. Also $P(c, c) > 0$. But for $z \ne 0$ this plainly contradicts the hyperbolicity (see Lemma 2.6).
The case $\varphi$ constant is even simpler. Now $z^i = 0$ for all $i$, which for $z \ne 0$ contradicts already the fact that $P$ is a non-degenerate quadratic form. □
REMARK 2.10. To make the above argument work it obviously suffices to know that the remaining rows (i.e. $x_3, \dots, x_n$) are positive, but not by necessity $x_1$ and $x_2$.
So now we know that if $A \in \Omega^*$ is any minimizing element for $f_\varepsilon$ ($\varepsilon \ge 0$) then $A = J$. In particular, thus
\[ \operatorname{per} A + \frac{\varepsilon}{\Pi(A)} \ge \operatorname{per} J + \frac{\varepsilon}{\Pi(J)}. \]
Passing to the limit ($\varepsilon \to 0$) we get $\operatorname{per} A \ge \operatorname{per} J$ for $A \in \Omega^*$ and by continuity also for $A \in \Omega$. We have established
THEOREM 2.11. $\operatorname{per} A \ge \operatorname{per} J$ for any $A \in \Omega$.
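Theorem 2.11 can be spot-checked on random doubly stochastic matrices. The sketch below is not from the report: it assumes the standard fact that every convex combination of permutation matrices is doubly stochastic, and the helper names (`permanent`, `random_doubly_stochastic`, `van_der_waerden_bound`) are hypothetical.

```python
import random
from fractions import Fraction
from itertools import permutations
from math import factorial


def permanent(rows):
    """Permanent of a square matrix (list of rows): sum over all permutations."""
    n = len(rows)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= rows[i][sigma[i]]
        total += term
    return total


def random_doubly_stochastic(n, terms, rng):
    """Convex combination of `terms` random permutation matrices, rational weights."""
    weights = [rng.randint(1, 10) for _ in range(terms)]
    total = sum(weights)
    matrix = [[Fraction(0)] * n for _ in range(n)]
    for w in weights:
        sigma = list(range(n))
        rng.shuffle(sigma)
        for i in range(n):
            matrix[i][sigma[i]] += Fraction(w, total)
    return matrix


def van_der_waerden_bound(n):
    """per J = n!/n^n, the lower bound of Theorem 2.11."""
    return Fraction(factorial(n), n ** n)


rng = random.Random(0)
n = 4
samples = [random_doubly_stochastic(n, 6, rng) for _ in range(50)]
bound_holds = all(permanent(A) >= van_der_waerden_bound(n) for A in samples)
```

Such a check is of course no substitute for the proof; it only exercises the statement on a finite sample.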
2.3. Comments on Egorychev’s proof
Using Falikman’s proof and result (Theorem 2.11) we can somewhat simplify Egorychev’s proof to the effect that $A = J$ is the only minimizing element in $\Omega$. In particular, we can eliminate all the partial results on which it depends (London’s theorem [9] etc.; cf. [13]). We are thus out for the proof of
THEOREM 2.12. If $A \in \Omega$ and $\operatorname{per} A = \operatorname{per} J$ then $A = J$.
PROOF. We do this in several steps. The idea is to prove directly that for a minimizing matrix $A \in \Omega$ we must have $p_{ik} = p$ (according to (71), with $\varepsilon = 0$). First we verify that this is indeed sufficient.
Step 1. If we have a minimizing matrix $A \in \Omega$ with $p_{ik} = p$ it is easy to produce another one, $A'$, say, with the same property, having one row, $x = x_1$, say, in common
with $A$ and all other rows positive. (This is achieved by successively forming mean values of rows; for details see [13].) We see that $x = \left(\frac{1}{n}\right)$. Since this row was an arbitrary row we infer that $A = J = \left(\frac{1}{n}\right)$.
Step 2. It suffices to prove that $p_{ik} \ge p$. For assume that $A \in \Omega$ is a minimizing matrix with this property. Then if $x$ and $y$ are any two rows of $A$ inequality (66) gives
\[ p^2 = P(x, y)^2 \ge P(x) P(y) = \sum x_i x^i \cdot \sum y_i y^i \ge \sum x_i p \cdot \sum y_i p = p^2. \]
(Here we use the notation of Lemma 2.9.) Thus we are in the case of equality for that inequality (but since we do not know yet that $P$ is (strictly) hyperbolic we cannot, at this stage, conclude that $x = y$), and it is plain that indeed $p_{ik} = p$ must hold.
Step 3. Next we conclude that a minimizing matrix, at any rate, must be fully indecomposable (cf. [12]). Indeed, assume that, contrariwise, $A$ is decomposable, which, $A$ being doubly stochastic, just means that the matrix after suitable permutations of the rows and columns can be put in the form $A' \oplus A''$, where $A'$ is an $r \times r$ matrix, and $A''$ an $(n - r) \times (n - r)$ matrix, with $0 < r < n$. Then Theorem 2.11 gives
\[ \frac{n!}{n^n} = \operatorname{per} A = \operatorname{per} A' \cdot \operatorname{per} A'' \ge \frac{r!}{r^r} \cdot \frac{(n - r)!}{(n - r)^{n - r}} \]
or
\[ \binom{n}{r} \left(\frac{r}{n}\right)^{r} \left(1 - \frac{r}{n}\right)^{n - r} \ge 1. \]
But this contradicts the elementary inequality
\[ \binom{n}{r} x^r (1 - x)^{n - r} < 1 \quad \text{if } 0 < r < n,\ 0 < x < 1. \]
Step 4. Now we are (in principle) in a position to carry out the details of the proof of the theorem of Marcus and Newman [10] already referred to at the beginning of Section 2.2: $p_{ik} = p$ if $a_{ik} > 0$. We add additional constraints of the form $a_{ik} = 0$, one for each zero matrix element, and proceed exactly as in Section 2.2. Since we know that $A$ is fully indecomposable the conclusion of Perron’s theorem is still applicable.
Step 5. There remains only one more step: London’s theorem [9] (see [12], Chapter 5, p. 85-86), to the effect that $p_{ik} \ge p$ holds without restriction for a minimizing matrix $A \in \Omega$. The proof is based on the inequality
\[ \sum_{s=1}^{n} p_{s\sigma(s)} \ge np, \tag{73} \]
valid for any permutation $\sigma$. (Proof by a straightforward variational argument. Consider the “deformation” $(1 - \theta)A + \theta P$, $0 < \theta < 1$, of $A$, where $P$ is the permutation matrix corresponding to $\sigma$. Remember that $\Omega$ is a convex set!) Having (73) at our disposal it suffices to remark that, $A$ being fully indecomposable (see Step 3), for any $i$ and $k$ we can find a permutation $\sigma$ such that $\sigma(i) = k$ and $a_{s\sigma(s)} > 0$ if $s \ne i$; by Step 4 this gives $p_{ik} \ge p$. This again is essentially the Frobenius-König theorem ([12, Chapter 3, see notably Theorem 2.2, p. 31 and Theorem 35, p. 38]). □
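The elementary inequality used in Step 3 says that a single term of the binomial expansion of $(x + (1 - x))^n = 1$ is strictly smaller than the whole sum. A small exact check at the relevant points $x = r/n$ might look as follows; this is a sketch only, and the function name `binomial_term` is hypothetical.

```python
from fractions import Fraction
from math import comb


def binomial_term(n, r, x):
    """The single term C(n, r) x^r (1 - x)^(n - r) of the expansion of (x + (1 - x))^n."""
    return comb(n, r) * x ** r * (1 - x) ** (n - r)


# Largest value of the term at x = r/n over all 0 < r < n, for n = 2, ..., 12.
worst = max(
    binomial_term(n, r, Fraction(r, n))
    for n in range(2, 13)
    for r in range(1, n)
)
```

Since `worst` stays below 1, the inequality $\binom{n}{r}(r/n)^r(1 - r/n)^{n - r} \ge 1$ derived from decomposability fails in every case tested, in line with the contradiction drawn in Step 3.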
2.4. Open questions
4a. Having uncovered the role of hyperbolicity in Van der Waerden’s conjecture there arises the question whether the conjecture perhaps is a special case of something more general. Here is, tentatively, a “non-commutative” version of Van der Waerden’s “conjecture”: to minimize the complete polarization $\operatorname{Per} A$ (“hyperpermanent”) of the hyperbolic polynomial $P(x_1, \dots, x_n) = n! \det x_1 \cdots \det x_n$, where each entry $x_i$ is a symmetric, say, $\nu \times \nu$ matrix (see Example 2.3), under the side conditions $\sum_k a_{ik} = 1 = \sum_i a_{ik}$, $a_{ik}$ being the “matrix elements” of $A$, each of them thus in turn a positive definite ($a_{ik} \ge 0$) matrix. Analogous problem with Hermitian matrices (Section 2.1, Example 2.3 BIS).
4b. In view of all the trouble one has had with the (non-existent!) minimizing boundary matrices one is tempted to ask if there is perhaps a more quantitative result than just the mere statement that there are no minimizing points on the boundary. In other words, what can be said about
\[ \inf_{A \in \partial\Omega} \operatorname{per} A\,? \]
4c. The more general conjecture of Marcus and Minc [11] (cf. [12], p. 91) to the effect that
\[ \operatorname{per} A \ge \operatorname{per} \frac{nJ - A}{n - 1}, \qquad A \in \partial\Omega, \]
is still unsettled [in 1981]. The latter is also meaningful in the context of Subsection 4a, supra (that is, for “hyperpermanents”).
References
[1] A. D. Aleksandrov. Zur Theorie der gemischten Volumina von konvexen Körpern IV: Die gemischten Diskriminanten und die gemischten Volumina. Mat. Sbornik 3 (2), 1938, 227–249. Russian with German summary.
[2] E. F. Beckenbach and R. Bellman. Inequalities. Ergebnisse der Mathematik und ihrer Grenzgebiete, 30. Springer-Verlag, Berlin, Göttingen, Heidelberg, 1961.
[3] S.-S. Chern. Integral formulas for hypersurfaces in Euclidean space and their application to uniqueness theorems. Indiana Univ. Math. J. 8, 1959, 947–966.
[4] G. P. Egorychev. The solution of van der Waerden’s problem for permanents. Advances in Math. 42, 1981, 299–305.
[5] D. I. Falikman. Proof of van der Waerden’s hypothesis on the permanent of doubly stochastic matrices. Mat. Zametki 19, 1981, 931–938, 957.
[6] L. Gårding. An inequality for hyperbolic polynomials. J. Math. Mech. 8 (6), 1959, 957–966.
[7] L. Gårding. Linear hyperbolic partial differential equations with constant coefficients. Acta Math. 85, 1951, 1–62.
[8] L. Hörmander. Linear partial differential operators. (Grundlehren 116.) Springer-Verlag, Berlin, Göttingen, Heidelberg, 1963.
[9] D. London. Some notes on the van der Waerden conjecture. Linear Algebra and Appl. 4, 1971, 155–160.
[10] M. Marcus and M. Newman. On the minimum of the permanent of a doubly stochastic matrix. Duke Math. J. 26, 1959, 61–72.
[11] M. Marcus and H. Minc. On a conjecture of B.L. van der Waerden. Proc. Cambridge Philos. Soc. 63, 1967, 305–309.
[12] H. Minc. Permanents. In: Encyclopedia of Mathematics and its applications, 6. Addison-Wesley, London etc., 1978.
[13] J. H. van Lint. Notes on Egorychev’s proof of the van der Waerden’s conjecture. Linear Algebra and Appl. 39, 1981, 1–8.
3. On generalized majorization
by J. Peetre⁹
For my friends
While visiting Haifa recently (Jan. 85) I discussed with Michael Cwikel the question of extending the theory of majorization, which is connected with the special pair $(L^1, L^\infty)$, to the case of other pairs. The problem is mentioned already in our joint paper [5] and even earlier in [12].
Schur, Ostrowski . . . Consider first the finite dimensional case, that is, the pair $(\ell^1_n, \ell^\infty_n)$ (that is, $L^p$ spaces based on a finite segment $(1, n)$). If $x = (x_1, \dots, x_n)$ and $y = (y_1, \dots, y_n)$ are positive vectors, which we for simplicity take to be decreasing too, we write $x \prec y$ if
\[ \begin{aligned} x_1 &\le y_1; \\ x_1 + x_2 &\le y_1 + y_2; \\ &\;\;\vdots \\ x_1 + x_2 + \cdots + x_n &\le y_1 + y_2 + \cdots + y_n. \end{aligned} \]
This is majorization, a term, in this context, apparently first used by Hardy and Littlewood. Given a function $f(x) = f(x_1, \dots, x_n)$, which is always assumed to be symmetric in its arguments, the problem is to decide when $x \prec y$ implies $f(x) \le f(y)$ (Schur-convexity).
THEOREM 3.1 (Schur). Assuming that $f$ is smooth, a necessary and sufficient condition for $f$ to be Schur convex is that
\[ (x_i - x_j) \cdot \left( \frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_j} \right) \ge 0. \]
Schur was interested in this because of applications to Hermitian matrices of the type of Hadamard’s inequality. For more applications and a comprehensive treatment see especially [11]. See further [1–3] (I owe these references to Jonathan Arazy). A brief synopsis of the theory can likewise be found in [4, pp. 30-33]. Some interesting material is also contained in [7, 8] (especially Chapter 14).
SHORT K-FUNCTIONAL PROOF OF SCHUR’S THEOREM. It is clear that $f(x)$ depends only on the K-functional of the vector $x = (x_1, \dots, x_n)$. In this case the K-functional is piecewise linear and the values at the $n$ “knots” are precisely $K_1 = x_1$, $K_2 = x_1 + x_2$, . . . , $K_n = x_1 + x_2 + \cdots + x_n$. So we may write $f = F(K)$ with $K = (K_1, \dots, K_n)$. Differentiating we get
\[ \frac{\partial f}{\partial x_i} = \sum_{j=1}^{n} \frac{\partial F}{\partial K_j} \cdot \frac{\partial K_j}{\partial x_i}. \]
But
\[ \frac{\partial K_j}{\partial x_i} = \begin{cases} 1 & \text{if } i \le j; \\ 0 & \text{otherwise.} \end{cases} \]
Therefore we find
\[ \frac{\partial f}{\partial x_i} - \frac{\partial f}{\partial x_{i+1}} = \frac{\partial F}{\partial K_i}. \]
This clearly is the embryo of Schur’s condition. □
⁹ Report LTH 1985:2, Lund, 1985. Reprint.
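The finite dimensional notions above are easy to experiment with. The sketch below is illustrative only: the function names and the sample vectors are assumptions, and $f(x) = \sum x_i^2$ is used as a standard example of a symmetric function satisfying Schur’s condition. It computes the knots $K_j$ as partial sums, tests $x \prec y$ as defined in the text, and checks both Schur’s condition and the resulting inequality $f(x) \le f(y)$.

```python
from itertools import accumulate


def partial_sums(x):
    """K_1, ..., K_n for a decreasing vector x, where K_j = x_1 + ... + x_j."""
    return list(accumulate(x))


def is_majorized(x, y):
    """x ≺ y in the sense of the text: each partial sum of x is at most that of y."""
    return all(kx <= ky for kx, ky in zip(partial_sums(x), partial_sums(y)))


def sum_of_squares(x):
    """Example function f(x) = x_1^2 + ... + x_n^2 (symmetric in its arguments)."""
    return sum(t * t for t in x)


def schur_condition_holds(x):
    """Schur's condition for f = sum of squares, whose gradient is (2 x_1, ..., 2 x_n)."""
    grad = [2 * t for t in x]
    return all(
        (x[i] - x[j]) * (grad[i] - grad[j]) >= 0
        for i in range(len(x))
        for j in range(len(x))
    )


y = [7, 4, 2, 1]
x = [6, 4, 3, 1]   # one unit moved from the largest entry of y to the third entry

assert partial_sums(y) == [7, 11, 13, 14]
assert is_majorized(x, y)
assert schur_condition_holds(x) and schur_condition_holds(y)
assert sum_of_squares(x) <= sum_of_squares(y)
```

Moving mass from a larger to a smaller coordinate, as in the passage from `y` to `x`, is the typical way of producing a majorized vector.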
We won’t elaborate more on the details. Instead we shall look at some more general cases, the point being that the argument just produced is quite general (Schur’s theorem is not too deep!). Every time we have sufficiently exact information about the K-functional the same proof can be carried over.
The case $(L^2, L^2(\lambda))$. In this case it is convenient to use $K_2$ in place of $K$. (In view of [10] this causes no essential change.)
\[ K_2^2(t, a) = \int_0^\infty \frac{1}{1 + \frac{1}{(\lambda t)^2}}\, |a(\lambda)|^2 \, d\lambda. \tag{74} \]
Let $f = F(K_2^2)$. Then formally (variational or Volterra derivatives)
\[ \frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta K_2^2(t, a)} \cdot \frac{\delta K_2^2(t, a)}{\delta a(\lambda)} \, dt. \]
But by (74) [for $a$ positive]
\[ \frac{\delta K_2^2(t, a)}{\delta a(\lambda)} = \frac{2 a(\lambda)}{1 + \frac{1}{(\lambda t)^2}}. \]
Thus substituting
\[ \frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta K_2^2(t, a)} \cdot \frac{2 a(\lambda)}{1 + \frac{1}{(\lambda t)^2}} \, dt. \]
Now if $F$ is monotone in $K_2$ then $\frac{\delta F}{\delta K_2^2} \ge 0$. Thus, apart from the factor $a(\lambda)$, the integral is of the form
\[ \int \frac{d\mu(t)}{1 + \frac{1}{(\lambda t)^2}} \]
with a positive measure $\mu$, and thus represents a Loewner (or Pick) function.
THEOREM 3.2. $f$ is K-monotone if and only if
\[ \frac{1}{a(\lambda)} \cdot \frac{\delta f}{\delta a(\lambda)} \]
is a Loewner function in $\lambda$.
EXAMPLE 3.1. Let $f$ be quadratic, $f = \int (a(\lambda))^2 (w(\lambda))^2 \, d\lambda$, $w$ a (positive) weight. Our condition for K-monotonicity then becomes the classical one of $(w(\lambda))^2$ being a Loewner function.
REMARK 3.3. The above also suggests that there might be a sort of “generalized Loewner theory”. As is well-known (for Loewner theory, see e.g. [6], cf. [13]) Loewner was concerned with the issue of “monotone operator functions”: for which (scalar) functions $\varphi$ is it true that $A \ge B$ in operator sense ($A$ and $B$ being s.a. operators in a Hilbert space $H$) implies $\varphi(A) \ge \varphi(B)$? Given a function $\Phi = \Phi(x, A)$ of two variables ($x \in H$, $A$ a s.a. operator in $H$) we may instead consider the more general inequality $\Phi(x, A) \ge \Phi(x, B)$. Thus $\Phi(x, A) = (\varphi(A)x, x)$ will correspond to the classical case.
The case $(L^p, L^p(\lambda))$, $1 \le p \le \infty$. Nothing essential happens if we pass to the case of general $p$. The condition formally becomes that
\[ \frac{1}{(a(\lambda))^{p-1}} \cdot \frac{\delta f}{\delta a(\lambda)} \]
should admit an analogous integral representation with the (convolution) kernel
\[ \frac{1}{1 + \frac{1}{t^2}} \quad \text{replaced by} \quad \frac{1}{(1 + t^{-q})^{\frac{1}{q}}}, \quad \text{where } \frac{1}{p} + \frac{1}{q} = 1. \]
(Compare again [13].)
The limiting case $p = 1$ is noteworthy. Then we have the kernel $\min(1, t)$ and by Sparr’s lemma [14] (see once more [13]) this is the same as to say that $\frac{\delta f}{\delta a(\lambda)}$ has to be a concave function of $\lambda$ (observation by M. Cwikel).
The case $(L^p, L^q)$, $p \ne q$. The case of different exponents is slightly more rewarding.¹⁰ We further find it convenient to use the L-functional now. Recall that
\[ L(t, a; L^p, L^q) = \int_0^\infty L(t, a(\lambda); \mathbb{R}, \mathbb{R}) \, d\lambda. \]
If $f = F(L)$ we have
\[ \frac{\delta f}{\delta a(\lambda)} = \int_0^\infty \frac{\delta F}{\delta L(t, a)} \cdot \frac{\delta L(t, a)}{\delta a(\lambda)} \, dt. \]
Therefore we will end up with a condition involving the kernel
\[ \frac{\delta L(t, a(\lambda); \mathbb{R}, \mathbb{R})}{\delta a(\lambda)}. \]
REMARK 3.4. The “scalar” L-functional $L(t, a) = L(t, a(\lambda); \mathbb{R}, \mathbb{R})$ has been investigated in [9] but not much seems to be known about the derivative $\frac{\delta L(t, a)}{\delta a}$.
Conclusion. The drawback of all this is that we have no applications at all, whereas in the primitive Schur case there are concrete applications (see especially [11]). Challenge to the Reader: find some!
¹⁰ Because of the Stein-Weiss trick [15] the weight can always be removed.
References
[1] P. Alberti and A. Uhlmann. Dissipative motion in state spaces. Teubner-Texte zur Mathematik, 33. Teubner, Leipzig, 1981.
[2] P. Alberti and A. Uhlmann. Stochasticity and partial order: doubly stochastic maps and unitary mixing. Mathematical monographs, 18. Deutscher Verlag der Wissenschaften, Berlin, 1981.
[3] T. Ando. Computationally secure information flow. Hokkaido University, Sapporo, 1982.
[4] E. F. Beckenbach and R. Bellman. Inequalities. Ergebnisse der Mathematik, 30. Springer Verlag, Berlin, Göttingen, Heidelberg, 1961.
[5] M. Cwikel and J. Peetre. Abstract K and J spaces. J. Math. Pures Appl. 60, 1981, 1–50.
[6] W. Donoghue. Monotone matrix functions and analytic continuation. Die Grundlehren der mathematischen Wissenschaften, 207. Springer Verlag, New York, Heidelberg, 1974.
[7] R. Farell. Multivariate calculation. Springer Series in statistics. Springer Verlag, New York, Heidelberg, Tokyo, 1985.
[8] I. Gohberg and M. Krein. Introduction to the theory of linear non-selfadjoint operators. Nauka, Moscow, 1965. English translation: Am. Math. Soc., Providence, 1988.
[9] M. Gustavsson and J. Peetre. Properties of the L function. Studia Math. 74, 1982, 106–121.
[10] T. Holmstedt and J. Peetre. On certain functionals arising in the theory of interpolation. Func. Anal. 4, 1969, 88–94.
[11] A. Marshall and I. Olkin. Inequalities: Theory of Majorization and Its Applications. Academic Press, New York, 1979.
[12] J. Peetre. On the connection between the theory of interpolation spaces and approximation theory. In: Proc. Conf. on Constructive Theory of Functions. Akadémiai Kiado, Budapest, 1969, 351–363.
[13] J. Peetre. On Asplund’s averaging method – the interpolation (function) way. In: Proc. Int. Conf. on Constructive Theory of Functions. Bulgarian Acad. Sci., Sofia, 1984, 664–671.
[14] G. Sparr. Interpolation of weighted Lp spaces. Studia Math. 62, 1978, 229–271.
[15] E. Stein and G. Weiss. Interpolation of operators with change of measure. Trans. Am. Math. Soc. 87, 1958, 159–172.
CHAPTER IV Combinatorics
1. [K88a] On Stirling and Lah numbers
Given a finite set S, |S| = n, let us consider the set of all possible functions f : S → X, |X| = x. Each such function f gives a certain equivalence relation Ker f , the kernel of f . Conversely, each equivalence relation π serves as the kernel of a function f : S → X, and the number of functions with a given kernel π equals to decreasing subfactorial (x)n(π) , where n(π) is the number of blocks of the partition on S corresponding to the equivalency π.1 Let Π(n) be the lattice of all equivalencies on S. We have the equation (x)n(π) = x(n) . (75) n∈Π(n)
As (75) is true for infinitely many natural numbers x, it is an equality of two polynomials in Q[x]. Based on (75) and using methods of linear algebra, G.-C. Rota derived ([7], in 1964) a series of properties of the numbers $B_n = |\Pi(n)|$, which, in particular, showed that $B_n$ is the n-th Bell number [2]; for details about this see [4]. In the Proceedings of the All Union Seminar on Combinatorial Analysis (Moscow University, Jan. 1980), the author suggested a similar approach also to Stirling and Lah numbers; cf. further [1]. In part, this was realized (for the derivation of the basic properties of the Stirling numbers of the second kind) in [2], and, more completely, in [4], where one considers from this point of view (but, this time, involving a suitable order relation on the blocks of a partition of S) also Stirling numbers of the first kind. So far the author knows of the papers [6] and [3], which show that the line of thought indicated deserves much attention. Here we give a new combinatorial foundation for some identities for Stirling and Lah numbers,2 illustrating the synthesis of the ideas of Pólya and Rota just mentioned.
The polynomials $p_u(x) \overset{\text{def}}{=} (x)_u$, u = 0, 1, 2, . . . form a basis of the vector space of polynomials Q[x], and so the formula $L_k(p_u(x)) = \delta_{u,k}$, k = 0, 1, 2, . . . defines uniquely a sequence of linear functionals $L_k : Q[x] \to Q$, k = 0, 1, 2, . . . . Next, we obtain from (75) for the numbers
$$S(n,k) \overset{\text{def}}{=} |\{\pi \in \Pi(n) \mid n(\pi) = k\}| \tag{76}$$
the “strange” definition
$$S(n,k) = \sum_{\{\pi\in\Pi(n) \,\mid\, n(\pi)=k\}} 1 = L_k(x^n). \tag{77}$$
On the basis of (77) all fundamental relations for the Stirling numbers of the second kind S(n, k) were derived in [4]. There it was also shown that the same approach works for a combinatorial foundation of some more complicated identities for the numbers S(n, k), as, for instance:
$$\binom{i+j}{i} S(n, i+j) = \sum_{k\ge 0} \binom{n}{k} S(k,i)\,S(n-k,j).$$
1 Translator’s Note. Quite generally, $(x)_n = x(x-1)\cdots(x-(n-1))$ for any integer n.
2 Translator’s Note. These numbers were, apparently, introduced by Lah in [5], noted in [2].
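The counting argument behind (75) to (77) can be tried out by brute force for small n. The following is only a minimal sketch: the helper names `falling`, `set_partitions`, `stirling2`, `check_75` and `check_product_identity` are illustrative and do not come from the text, and the enumeration is meant for small sets only.

```python
from math import comb

def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

def set_partitions(elements):
    """Yield all partitions of the list `elements` as lists of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # put `first` into one of the existing blocks ...
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]
        # ... or into a new block of its own
        yield [[first]] + partition

def stirling2(n, k):
    """S(n, k) as in (76): number of partitions of an n-set with k blocks."""
    return sum(1 for p in set_partitions(list(range(n))) if len(p) == k)

def check_75(n, x):
    """Check (75): the sum of (x)_{n(pi)} over all partitions equals x**n."""
    total = sum(falling(x, len(p)) for p in set_partitions(list(range(n))))
    return total == x ** n

def check_product_identity(n, i, j):
    """Check C(i+j, i) S(n, i+j) = sum_k C(n, k) S(k, i) S(n-k, j)."""
    lhs = comb(i + j, i) * stirling2(n, i + j)
    rhs = sum(comb(n, k) * stirling2(k, i) * stirling2(n - k, j)
              for k in range(n + 1))
    return lhs == rhs
```

For instance, `check_75(4, 3)` compares the sum over the partitions of a 4-set with the number of all functions from a 4-set to a 3-set, and `check_product_identity(5, 2, 1)` tests one instance of the product identity quoted above.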
Let $\Pi_n$ be the set of all partitions of a set n on whose blocks a cyclic structure is assumed to be given. On the one hand, we may look at a function $f : n \to X$, |X| = x, as a distribution of |n| = n ordered objects into x distinct and unordered baskets; the number of those distributions equals $x^{(n)} \overset{\text{def}}{=} x(x+1)\cdots(x+n-1)$. On the other hand, we may also look at the function $f : n \to X$ as a composition
$$n \xrightarrow{\ f'\ } n \xrightarrow{\ f''\ } X,$$
where f′ is bijective and f″ is an arbitrary function from n to X. Let us consider π = Ker f together with the structure that arises from the cycle decomposition of the bijection f′ : n → n. Then we arrive at the relation
$$x^{(n)} = \sum_{\pi\in\Pi_n} x^{n(\pi)}. \tag{78}$$
Introducing the numbers $c(n,k) \overset{\text{def}}{=} |\{\pi \in \Pi_n \mid n(\pi) = k\}|$ allows us to write (78) in the form $x^{(n)} = \sum_k c(n,k)\,x^k$. The previous is a polynomial relation, and so remains in force if we make the change x → −x, which gives
$$(x)_n = \sum_k s(n,k)\,x^k$$
with $s(n,k) \overset{\text{def}}{=} (-1)^{n+k} c(n,k)$. Applying to this the functionals $L_k : Q[x] \to Q$,
$$L_k(x^u) = \delta_{k,u}, \quad u = 0, 1, 2, \ldots$$
gives the relations $L_k((x)_n) = s(n,k)$. This “strange” definition of the numbers s(n, k) can serve as a foundation for the derivation of the properties of the numbers s(n, k), in particular, of the recurrence relation s(n + 1, k) = s(n, k − 1) − n s(n, k). Together with s(n, 0) = 0, s(1, 1) = 1, this shows that we are here dealing with Stirling numbers of the first kind. We remark that in the same way the definition $c(n,k) = L_k(x^{(n)})$ can serve as the basis of a derivation of the properties of the numbers c(n, k); cf. also [4]. Using the linear functionals $L_k$ one can give the recurrence relation indicated for the numbers s(n, k) the form $L_k((x-n)\cdot(x)_n) = L_{k-1}((x)_n) - n L_k((x)_n)$. We see that an analogous relation holds for an arbitrary polynomial p(x) ∈ Q[x]: $L_k((x-n)\cdot p(x)) = L_{k-1}(p(x)) - n L_k(p(x))$. It suffices to check the last statement on the basis sequence $\{x^u \mid u = 0, 1, 2, \ldots\}$ of the space Q[x], which is immediate and confirms the statement.
Let now $\Pi_n$ be the set of all partitions of n on whose blocks a structure of a chain is assumed to be given. We may regard each function $f : n \to X$, |X| = x, as a map on whose preimages $f^{-1}(y)$, y ∈ X, a structure of a chain is given. The number of such functions equals $x^{(n)}$. On the other hand, a function $f : n \to X$
may be viewed as a pair (π, f) consisting of an element π ∈ $\Pi_n$ and an injection f : n/π → X; the number of such pairs is $\sum_{\pi\in\Pi_n} (x)_{n(\pi)}$. We obtain the relation
$$x^{(n)} = \sum_{\pi\in\Pi_n} (x)_{n(\pi)}. \tag{79}$$
Applying to this relation the linear functionals $L_k : Q[x] \to Q$, where $L_k((x)_u) = \delta_{u,k}$ and u = 0, 1, 2, . . . , gives
$$L_k(x^{(n)}) = \sum_{\{\pi\in\Pi_n \,\mid\, n(\pi)=k\}} 1 \overset{\text{def}}{=} L(n,k). \tag{80}$$
Usually, the numbers L(n, k) arise as the coefficients of the expansion of the eigenpolynomials
$$\ell_n(x) = x e^{x} \frac{d^n}{dx^n}\left(e^{-x} x^{n-1}\right) \overset{\text{def}}{=} \sum_k L(n,k)(-x)^k$$
of the Laguerre operator
$$L : p(x) \mapsto -\int_0^\infty e^{-t}\, \frac{d}{dx}\, p(x+t)\, dt;$$
cf. [2, p. 111]. The “strange” definition (80) of these numbers allows us to derive all their properties, in particular, the recursive relation
$$L(n+1,k) = L(n,k-1) + (n+k)\,L(n,k), \tag{81}$$
which together with L(0, 0) = 1 and L(n, 0) = 0 for n > 0 shows that the L(n, k) are the Lah numbers, for which
$$L(n,k) = \frac{n!}{k!}\binom{n-1}{k-1},$$
cf. [2]. For example, let us indicate the deduction of (81). Since $x^{(n+1)} = (x+n)\,x^{(n)}$, with the aid of (80) we may write (81) in the form
$$L_k((x+n)\,x^{(n)}) = L_{k-1}(x^{(n)}) + (n+k)\,L_k(x^{(n)}). \tag{82}$$
It turns out that (82) remains valid with any p(x) ∈ Q[x] in place of $x^{(n)}$:
$$L_k((x+n)\,p(x)) = L_{k-1}(p(x)) + (n+k)\,L_k(p(x)). \tag{83}$$
It is sufficient to show (83) for the basis sequence $\{(x)_u,\ u = 0, 1, 2, \ldots\}$ in the space Q[x]:
$$L_k((x+n)(x)_u) = L_{k-1}((x)_u) + (n+k)\,L_k((x)_u),$$
which, with the aid of the representation x + n = (x − u) + (n + u), leads to the (now immediate) verification (for u = k; u = k − 1; or u ≠ k, k − 1) of the relation
$$\delta_{k,u+1} + (n+u)\,\delta_{u,k} = \delta_{k-1,u} + (n+k)\,\delta_{k,u}. \tag{84}$$
It turns out that (84) is true, which shows that, likewise, (82) is true, and along with it (81).
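The recurrence (81), the closed formula and the relation (79) can be compared numerically. The sketch below is only an illustration under the conventions of this section (rising factorial $x^{(n)}$, falling factorial $(x)_k$); the function names are hypothetical and not taken from the text.

```python
from math import comb, factorial

def falling(x, k):
    """Falling factorial (x)_k = x (x-1) ... (x-k+1)."""
    result = 1
    for i in range(k):
        result *= x - i
    return result

def rising(x, n):
    """Rising factorial x^(n) = x (x+1) ... (x+n-1)."""
    result = 1
    for i in range(n):
        result *= x + i
    return result

def lah_by_recurrence(n_max):
    """Table L[n][k] built from (81) with L(0,0) = 1 and L(n,0) = 0 for n > 0."""
    table = [[0] * (n_max + 2) for _ in range(n_max + 1)]
    table[0][0] = 1
    for n in range(n_max):
        for k in range(1, n + 2):
            table[n + 1][k] = table[n][k - 1] + (n + k) * table[n][k]
    return table

def lah_closed(n, k):
    """Closed form n!/k! * C(n-1, k-1), with the boundary values used above."""
    if n == 0:
        return 1 if k == 0 else 0
    if k == 0 or k > n:
        return 0
    return factorial(n) // factorial(k) * comb(n - 1, k - 1)

def check_79(n, x, table):
    """Check x^(n) = sum_k L(n,k) (x)_k, i.e. (79) grouped by block number."""
    return rising(x, n) == sum(table[n][k] * falling(x, k) for k in range(n + 1))
```

A call such as `check_79(3, 2, lah_by_recurrence(3))` compares 2·3·4 = 24 with 6·2 + 6·2 + 1·0, using L(3, 1) = 6, L(3, 2) = 6, L(3, 3) = 1.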
Perhaps it might be of some interest to carry over this approach to the case when one considers partitions of n on whose blocks a completely arbitrary structure is assumed to be given.

References
[1] U. Kaljulaid. A remark on Stirling numbers. Sb. “Komb. Analiz” 6, 1983, 98. (see [K83b]).
[2] M. Aigner. Combinatorial theory. Grundlehren der mathematischen Wissenschaften, 234. Springer Verlag, Berlin, Heidelberg, New York, 1979.
[3] S.-N. A. Joni, G.-C. Rota, and B. Sagan. From sets to functions: three elementary examples. Discrete Math. 37, 1981, 193–202.
[4] U. Kaljulaid. Elements of discrete mathematics. Tartu University Press, Tartu, 1983. (see [K83c]).
[5] I. Lah. Eine neue Art von Zahlen, ihre Eigenschaften und Anwendungen in der mathematischen Statistik. Mitteilungsblatt Math. Stat. 7, 1955, 203–212.
[6] G. Pólya. Partitions of a finite set into structured subsets. Math. Proc. Camb. Phil. Soc. 77, 1975, 453–458.
[7] G.-C. Rota. The number of partitions of a set. Am. Math. Monthly 71, 1964, 498–504.
Remark. The references [5, 7] were added by translator.
2.
Letter (or draft of letter) c. 1991 from Uno Kaljulaid to Torbjörn Tambour
Preamble (Note by Uno Kaljulaid to J. Peetre). This material and such a letter were sent to Professor Tambour in order to initiate anew our cooperation, which was interrupted in 1991 for reasons known to you (he returned to Sweden at the beginning of his trip).3

Dear Professor Tambour,

You asked me for some details. Though chaotic, here they are! I would like to add to the remarks on p. 299 that, of course, when finding Ω(P, Fm) it seems to be important also [to invoke] the width w(P) of P and the fact that order preserving maps P → Fm map chains “convexly” into chains of Fm. So there do not seem to remain so many possibilities when also taking into account a Dilworth partition of P (into chains with a minimal number of such blocks).

Sincerely,
Uno Kaljulaid

3 Note by J. Peetre. Gert Almkvist and Torbjörn Tambour were supposed to visit Tartu in the summer of 1991. However, in Moscow Tambour was attacked by a robber, so he decided to cancel his trip, and returned home.
3.
On Fibonacci numbers of graphs Unpublished manuscript c. 1991, edited by J. Peetre
My curiosity about this was aroused several years ago while reading Prodinger and Tichy [5]; at first it seemed to me to be a recreational hobby. Let me describe the set-up now. Given a (simple) graph G = G(V; E) with V, the set of vertices, and E, the set of edges, we define the Fibonacci number f(G) of the graph as the number of subsets S ⊆ V such that (a, b) ∉ E for all pairs {a, b} ⊆ S; let us call these subsets S acceptable. E.g., an easy induction shows that the (usual) Fibonacci number $F_{n+2}$ is the Fibonacci number of the n-chain $R_n$ (see Figure 1) and that the Lucas number $L_n$ is the Fibonacci number of the elementary n-cycle $C_n$ (see Figure 2).

[Fig. 1: The n-chain Rn (vertices 1, 2, 3, ..., n)]

[Fig. 2: The n-cycle Cn (vertices 1, 2, 3, ..., n−1, n)]
Furthermore, Prodinger-Tichy [5] prove some elementary lemmas and an (easy) theorem for an n-tree $T_n$: $F_{n+1} \le f(T_n) \le 2^{n-1} + 1$, and they pose some questions (not difficult to solve): e.g.,
(1) the Fibonacci number for the graph in Figure 3 is $3^n$.
(2) the Fibonacci number for the graph $R_n$ in Figure 4 is $f(R_n) = \frac{3+2\sqrt{3}}{3}(1+\sqrt{3})^n + \frac{3-2\sqrt{3}}{3}(1-\sqrt{3})^n$.
(3) the Fibonacci number for the graph $Q_n$ in Figure 5 is $f(Q_n) = \frac{1}{2}(1+\sqrt{2})^{n+1} + (1-\sqrt{2})^{n}$.
(4) the Fibonacci number for a 2n-cycle with opposite vertices joined, as depicted in Figure 6, is $f(Z_n) = (-1)^{n+1} + (1+\sqrt{2})^n + (1-\sqrt{2})^n$.
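For small graphs the Fibonacci number can be found by plain enumeration of vertex subsets. The sketch below assumes the reading of the definition in which an acceptable subset contains no edge of the graph, numbers vertices from 0, and uses illustrative helper names (`fibonacci_number`, `chain`, `cycle`, `dipoles`, `crossed_cycle`) that do not occur in the text; it covers the chain, the cycle, the forest of dipoles of question (1) and the graph of question (4) only.

```python
def fibonacci_number(n_vertices, edges):
    """Count the subsets of {0, ..., n_vertices-1} that contain no edge."""
    count = 0
    for mask in range(1 << n_vertices):
        # a subset is acceptable if no edge has both endpoints in it
        if all(not ((mask >> a) & 1 and (mask >> b) & 1) for a, b in edges):
            count += 1
    return count

def chain(n):
    """Edges of the n-chain 0 - 1 - ... - (n-1)."""
    return [(i, i + 1) for i in range(n - 1)]

def cycle(n):
    """Edges of the elementary n-cycle (n >= 3)."""
    return chain(n) + [(n - 1, 0)]

def dipoles(n):
    """Edges of n disjoint 'dipoles' on 2n vertices."""
    return [(i, n + i) for i in range(n)]

def crossed_cycle(n):
    """Edges of a 2n-cycle with opposite vertices joined (n >= 2)."""
    return sorted(set(cycle(2 * n) + [(i, i + n) for i in range(n)]))

def fib(m):
    """Usual Fibonacci numbers with F_1 = F_2 = 1."""
    a, b = 0, 1
    for _ in range(m):
        a, b = b, a + b
    return a

def lucas(m):
    """Lucas numbers with L_1 = 1, L_2 = 3."""
    a, b = 2, 1
    for _ in range(m):
        a, b = b, a + b
    return a
```

With these helpers, `fibonacci_number(n, chain(n))` can be compared with `fib(n + 2)`, and `fibonacci_number(n, cycle(n))` with `lucas(n)`, for as many small n as one likes.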
[Fig. 3: The forest of “dipoles” (vertices 1, ..., n and n+1, ..., 2n)]

[Fig. 4: The graph Rn (vertices 1, ..., n and n+1, ..., 2n)]

[Fig. 5: The graph Qn (vertices 1, ..., n and n+1, ..., 2n)]

[Fig. 6: The graph Zn (vertices 0, 1, 2, ..., 2n − 1)]
After several years I saw a note by A. Alameddine [1] (1983) on the Fibonacci numbers of outerplanar graphs; these are planar graphs whose vertices can be thought of as belonging to a single face.4 Maximal among outerplanar graphs are those to which no edge can be added without destroying outerplanarity. According to the main result of this paper the Fibonacci number $f(P_n)$ of a maximal outerplanar graph $P_n$ with n vertices satisfies the inequality $f(P_n) \le F_{n+1}$, and this result is the best possible. In the proof of this result in [1] there is a mistake: the author asserts that $f(P_{n-3} \cup \{v\}) = f(P_{n-3})$; yet, for n = 7 we have $f(P_4) = F_6 = 8$, but $f(P_4 \cup \{v\}) = 16$. Nevertheless, the assertion is true, as there exists a way to overcome the author’s difficulty. I have some additional remarks here.
1. Using a technique of A. Proskurowski, I can prove the following two theorems:
THEOREM 3.1. For a given maximal outerplanar graph G with n vertices, let us denote by $G^+$ the maximal outerplanar graph obtained by adding a new vertex, and denote by $G^-$ the outerplanar graph with n − 2 vertices which we get upon dropping from G some two of its vertices. Then it is true that $f(G) = f(G^+) - f(G^-)$.
THEOREM 3.2. The Fibonacci numbers of the maximal outerplanar graphs $M_n$, given for n odd in Figure 7, and for n even in Figure 8, are minimal among the Fibonacci numbers of maximal outerplanar graphs with n vertices.
[Fig. 7: The maximal outerplanar graph Mn for odd n (upper vertices 1, 3, ...; lower vertices 2, 4, ..., n − 1)]
This solves two questions posed by Alameddine in [1]. In addition, my reasoning to achieve the above seems to be such that there is a quite realistic hope of doing all the above for any planar graph: it seems that the needed lemmas exist already, and are contained in Chapter 11 of F. Harary’s book [4]. This seems to be one of the possible lines for extending the results on maximal outerplanar graphs. And, very probably, this extension will be useful for the ‘chip industry’.
4 Editor’s note. Equivalently, a graph is called outerplanar if it has an embedding in the plane such that the vertices lie on a fixed circle and the edges lie inside the disk of the circle and don’t intersect.
[Fig. 8: The maximal outerplanar graph Mn for even n (upper vertices 1, 3, ..., n − 1; lower vertices 2, 4, ...)]
2. The equation $f(M_n) = f(M_{n-1}) + f(M_{n-3})$ has the characteristic equation $x^3 - x^2 - 1 = 0$. Setting $x = y + \frac{1}{3}$ we get $y^3 - \frac{1}{3}y - \frac{29}{27} = 0$ with the roots
$$y_1 = u + v; \qquad y_2 = -\frac{u+v}{2} + i\,\frac{u-v}{2}\sqrt{3}; \qquad y_3 = -\frac{u+v}{2} - i\,\frac{u-v}{2}\sqrt{3},$$
where
$$u = \sqrt[3]{\frac{29}{54} + \sqrt{\frac{31}{108}}} \quad\text{and}\quad v = \sqrt[3]{\frac{29}{54} - \sqrt{\frac{31}{108}}}.$$
So we obtain the general solution $f(M_n) = a x_1^n + b x_2^n + c x_3^n$, where the approximate values of $x_i$ (i = 1, 2, 3) are
$$x_1 = 1.465572; \qquad x_2 = -0.232786 + i \cdot 0.792551; \qquad x_3 = -0.232786 - i \cdot 0.792551.$$
As $|f(M_n)| \le |a||x_1|^n + |b||x_2|^n + |c||x_3|^n$ and $|x_2| = |x_3| < 1$, for n → ∞ we have $|x_2|^n = |x_3|^n \to 0$, and so for large values of n we obtain
$$\frac{f(M_n)}{f(M_{n-1})} \approx |x_1| = 1.465572.$$
Experimenting a little with various n shows that the ratio $f(M_n)/f(M_{n-1})$ tends to this value 1.465572 well enough even for small n:

n:                    3    4    6    7    8    9
f(M_n):               4    6    13   19   28   41
f(M_{n+1})/f(M_n):    1.5  1.444  1.4461538  1.4475684  1.4464287
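The recurrence and the limiting ratio are easy to recompute. The following sketch takes the initial values f(M_3) = 4 and f(M_4) = 6 from the table; the value f(M_5) = 9 does not appear there and is an assumption, inferred from the recurrence and f(M_6) = 13. The function names are illustrative only.

```python
from math import sqrt

def f_M(n_max):
    """Values f(M_n), n = 3..n_max, from f(M_n) = f(M_{n-1}) + f(M_{n-3})."""
    values = {3: 4, 4: 6, 5: 9}  # f(M_5) = 9 is inferred, not listed in the table
    for n in range(6, n_max + 1):
        values[n] = values[n - 1] + values[n - 3]
    return values

def real_root(tol=1e-12):
    """Real root of x^3 - x^2 - 1 = 0, found by bisection on [1, 2]."""
    lo, hi = 1.0, 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid ** 3 - mid ** 2 - 1 > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def cardano_root():
    """The root x_1 = u + v + 1/3 with u, v as in the formulas above."""
    u = (29 / 54 + sqrt(31 / 108)) ** (1 / 3)
    v = (29 / 54 - sqrt(31 / 108)) ** (1 / 3)
    return u + v + 1 / 3

def ratios(n_max):
    """Successive ratios f(M_{n+1}) / f(M_n) for n = 3..n_max-1."""
    values = f_M(n_max)
    return {n: values[n + 1] / values[n] for n in range(3, n_max)}
```

Printing `ratios(12)` next to `real_root()` gives a direct way to compare the exact quotients with the ratios listed in the table.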
EDITOR’S REMARK. As
$$R_n = \frac{f(M_{n+1})}{f(M_n)} = \frac{a x_1^{n+1} + b x_2^{n+1} + c x_3^{n+1}}{a x_1^{n} + b x_2^{n} + c x_3^{n}} = x_1\,\frac{a + b\left(\frac{x_2}{x_1}\right)^{n+1} + c\left(\frac{x_3}{x_1}\right)^{n+1}}{a + b\left(\frac{x_2}{x_1}\right)^{n} + c\left(\frac{x_3}{x_1}\right)^{n}},$$
we see that the sought ratio $R_n$, indeed, tends to $x_1$ as n → ∞. Likewise, it is easy to see that we have
$$|R_n - x_1| \le K\left(\frac{|x_2|}{x_1}\right)^{n+1}$$
for a suitable constant K and all n. □
Note also that here the technique used by Pólya for solving recurrences appearing in connection with the enumeration of trees can be applied (when suitably extended), and this seems to be an interesting line of reasoning here.
3. To sum up, all the above probably deserves to be written down, and to be critically analyzed once more together; [this should be] interesting at least for people concerned with graphs and chip technology. I finish with some chaotic thoughts on these matters. First note that for mathematics the most interesting things seem to begin to appear only when we pose questions analogous to the graph-theoretic ones above for a finite poset P. For such a P it is natural to define the Fibonacci number f(P) of P as the number of all antichains in P. Let $\zeta_P$ be the zeta-function for the order relation in P; that is, $\zeta_P(x, y) = 1$ if and only if x ≥ y in P. So f(P) equals the number of all k × k zero-submatrices in the matrix $(\zeta_P(x, y))$; here x, y ∈ P (listing P) and k takes all values in {1, 2, . . . , |P|}. The role of antichains when investigating the structure of posets is, of course, well-known; e.g., the maximal size of antichains in P as the width of P, Dilworth’s theorem, . . . . I have several observations here. To be more concrete I shall describe two of them here.
4. When considering order preserving maps ϕ : P → P it seems natural to consider the kernel of ϕ, π = Ker ϕ, and then to define $\bar{x} \le \bar{y}$ on $\bar{P} = P/\operatorname{Ker}\varphi$ if and only if there exist x′, $x' \sim_\pi x$, and y′, $y' \sim_\pi y$, such that x′ ≤ y′ holds in P. This is a consistent definition, as in the case $\bar{x} < \bar{y}$, for any pair (x′, y′) with different components $x' \sim_\pi x$ and $y' \sim_\pi y$, if these components are comparable then we must have x′ < y′.
Note also that for a (finite) poset P, taking π ∈ Π(P) such that all π-classes are connected (as subsets of P), we can define $\bar{x} \le \bar{y}$ in P/π by the rule: $\bar{x} \le \bar{y}$ if and only if there exist x′ ∼ x and y′ ∼ y such that x′ ≤ y′ in P. Call such a π an acceptable equivalence. It follows from R. Stanley’s results that all acceptable equivalences form an Eulerian sublattice in Π(P). Returning to the main point, observe that for any order preserving map ϕ : P → P there exists a natural ◦-epimorphism $\psi : P \to \bar{P}$, $\psi(x) = \bar{x}$, x ∈ P, and so the usual “◦-diagram” appears:
$$P \xrightarrow{\ \psi\ \text{(epi)}\ } P/\operatorname{Ker}\varphi = \bar{P} \xrightarrow{\ \varepsilon\ \text{(iso)}\ } \operatorname{Im}\varphi \le P, \qquad \varphi = \varepsilon \circ \psi.$$
Now, finding the number Ω(P, P) of all order preserving maps P → P reduces to the enumeration of acceptable equivalences on P and of ◦-automorphisms of P/π for acceptable π. This seems to have some point of contact with the Sands conjecture, which I shall describe below. More generally, for any order preserving map ϕ : P → Q it holds that $\zeta_P(x, y) = 1 \implies \zeta_Q(\varphi(x), \varphi(y)) = 1$ for x ≥ y in P.
5. As above, denote by Ω(P, m) the number of order preserving maps P → m. Stanley has observed that Ω(P, m) = Z(Z(P), m), with Z(Q, n) denoting the number of multichains $y_1 \le y_2 \le \cdots \le y_n$ in Q, and this n-expression is called the zeta-polynomial of the poset Q. Ω(P, m) is called the order-polynomial of P and can be thought of as an m-polynomial of degree |P|. Stanley [6, Theorem 4.5.14] gives the following intriguing formula
$$\sum_{m\ge 0} \Omega(P, m)\,\lambda^m = \Bigl(\sum_{\pi\in L(P)} \lambda^{1+d(\pi)}\Bigr)(1-\lambda)^{-(p+1)},$$
where L(P) denotes the Jordan-Hölder set for P. My question is now: What will happen to this theory of Stanley if we take $F_m$ (a fence (zigzag): {1, 2, . . . , m} with the only inequalities 1 > 2, 2 < 3, 3 > 4, . . . , m − 1 < m) instead of the cochain m? Other posets P, instead of m, may be of interest also. Yet, $F_m$ is interesting in relation to the paper Currie-Visentin [2]. In this respect, at least Ω(P, $F_m$) deserves attention. In [2] the generating function of Ω($F_m$, $F_m$) is introduced. According to a conjecture of B. Sands the number Ω(P, P) is minimal for P = $F_m$. Let us further mention the paper Duffus-Rödl-Sands-Woodrow [3], although we have not seen it so far. Here we can make the conjecture that Ω(P, P) ≥ Ω(P, $F_m$) ≥ Ω($F_m$, $F_m$), for any poset P, |P| = m. It seems that the outerplanar graphs $M_n$ here somehow correspond to the “bichromatic” Jordan-Hölder sets for $F_m$, and so the role of f($M_n$) was, presumably, not just an accident? This expectation is supported by the observation that for any poset P its order polynomial depends only on its graph of comparability G(P), with (x, y) ∈ E(G) if and only if x < y or y < x. Also, Ω(m, $F_m$), Ω($F_m$, m), Ω($F_m$, $F_m$) and |Aut($F_m$)| are interesting, and so are the Eulerian (sec, tan)-numbers ...

References
[1] A. F. Alameddine. Centers of maximal planar graphs with two vertices of degree three. J. Combin. Inform. System Sci. 8, 1983, 90–96.
[2] J. D. Currie and T. I. Visentin. The number of order-preserving maps of fences and crowns. Order 8, 1991, 133–142.
[3] D. Duffus, V. Rödl, B. Sands, and R. Woodrow. Enumeration of order preserving maps. Order 9, 1992, 15–29.
[4] F. Harary. Graph theory. Addison-Wesley, Reading, MA, 1969. Russian translation: Mir, 1973.
[5] H. Prodinger and R. Tichy. Fibonacci numbers of graphs. Fibonacci Quarterly, 9, 1982, 16–21.
[6] R. Stanley. Enumerative combinatorics I. The Wadsworth & Brooks/Cole Mathematics Series. Wadsworth & Brooks/Cole Advanced Books & Software, Monterey, CA, 1986.
CHAPTER V History of Mathematics
1.
Th. Molien, an innovator of algebra Unpublished manuscript, c. 1985, translation from Estonian by J. Peetre
The life of Fedor Eduardovich Molin (1861-1941) was somewhat unusual. He was born in Riga and in 1883 received the scientific degree of a candidate in astronomy from the University of Dorpat/Tartu. In 1883-1885 he worked in Leipzig in the seminar of Felix Klein, on whose advice Molin began to research linear transformations of elliptic functions. When he returned to his alma mater, Molin was appointed a docent and during the following six years made contributions that earned him his place in the history of algebra. In 1892 he published his paper “On systems of higher complex numbers”. In modern language, in that paper by analogy with the notion of a simple group Molin defined simple algebras over the field of complex numbers, showed that they are algebras of matrices, and finally, discovered that the study of an arbitrary algebra over the field of complex numbers reduces to the case when the quotient by the radical is a direct sum of matrix algebras. In the short articles that followed that memoir, Molin applied these results to representation theory of finite groups. His research had much in common with works by Frobenius, Killing and Lie, and immediately brought him international acclaim and a gold medal from the Paris Academy of Sciences. Georg Frobenius in one of his letters to Molin said, in particular, that Molin “with one stroke completely solved the most important questions in this field”. Unfortunately, neither Moscow nor St-Petersburg universities had any influential people capable of giving Molin’s papers their due, and after receiving for them his doctorate degree, he had to take a full professor position (called “ordinary professor”) in mathematics in the newly opened Tomsk Technological Institute. There, the daily needs of organizing teaching, a library, and other tasks vital for an institution of higher education that was new and distant from the capital removed him for a long time from the stream of international mathematical life.
In 1917 Molin was appointed a professor of mathematics in the department of physics and mathematics in the newly opened Tomsk University. He became completely absorbed in organizing this department, and published from time to time articles of a general mathematical nature. For a long time, Tomsk had been the cultural capital of Siberia, and the current flourishing of Siberian mathematics is partially due to F.E. Molin.
Excerpt from A.I. Mal’cev, To history of algebra in the USSR for the first 25 years, Algebra i Logika 10, 1971, 102–118 (Russian). English Translation: Algebra Logic, pp. 68–81.
According to [1] the picture of the early history of the theory of group representations is distorted: the approach of Frobenius and Burnside is usually considered as fundamental, although in reality the innovative work was done by the little known T. Molien. However, the wider mathematical and historical context of Molien’s results and their connection with problems of contemporary mathematics has been little studied [2]. 125 years have elapsed since the birth (September 10, 1861) of Theodor Molien. He was born into a family of Swedish origin, which had moved from northern Estonia and settled in Riga. His father [Eduard Molien], a graduate of Tartu University, was a teacher at a private gymnasium in Riga. T. Molien received his basic education at the Government Gymnasium at Riga. There the foundations were laid for his ability for studies and his character, his intellectual interests and habits. At Tartu University Molien began to prepare himself for a career as an astronomer. His aptitude and diligence were noted, and his scientific tastes and ability developed. He graduated from the university (1883) and was sent to Leipzig (1883–1885), which gave him modern and deep knowledge. There, in the seminar of F. Klein, his scientific interests definitely turned to interior problems of mathematics. The future Docent at Tartu University (1885–1900) did not give up his connections with Leipzig (from 1886 on, the seminar was directed by Sophus Lie). Therefore, in the study of systems of hypercomplex numbers, he got stimulation and help from the activities which arose in Lie’s seminar from the results of Weierstrass and Dedekind on this theme, and in particular from Poincaré’s remark that the expression of the multiplication of hypercomplex numbers gives a Lie group. As a result [3] the results of Molien obtained in 1887–92 began the structure theory of algebras [5]. The fact, known to Molien, that group algebras, in special cases already known to A. Cayley, constitute a bridge between representation theory and the theory of algebras, led him to fundamental notions and facts in the theory of group representations [4]. In this way the “hypercomplex aspect” of the theory was born.
Let $\mathcal{G}$ be a finite subgroup in the group of all regular linear maps of the subspace of the linear forms in the algebra of polynomials $R = C[x_1, \ldots, x_m]$. The description of the homogeneous components $R_n^{\mathcal{G}}$ of the so-called subalgebra of invariants $R^{\mathcal{G}} = \{f \in R \mid \forall G \in \mathcal{G},\ f^G = f\}$ constitutes the central problem of the theory of invariants. This problem is equivalent to the determination of the formal series $M_{\mathcal{G}}(t) = \sum_{n\ge 0} (\dim R_n^{\mathcal{G}})\,t^n$, the Molien series of $\mathcal{G}$. The answer, in the explicit form of the rational function $M_{\mathcal{G}}(t)$, is provided by Molien’s formula:
$$M_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G\in\mathcal{G}} \frac{1}{\det(I - tG)}.$$
The so-called Pólya theory in combinatorics studies the numbers d(τ) of G-schemes of a given type, that is, the series $L_{\mathcal{G}}(t) = \sum_\tau d(\tau)\,t^\tau$. The formula for $L_{\mathcal{G}}(t)$, which is the central part of Pólya theory, can be found in a similar way [6]. This example [6] does not exhaust the connections of Molien’s results with problems of contemporary mathematics. In particular, a somewhat more general variant of Molien’s formula just considered admits applications to the noncommutative theory of invariants. Apparently, this new interest in the innovative papers [3, 4] of Molien is yet another confirmation of their lasting value.

References
[1] W. Gustafson. Review of S. Sehgal’s Topics in group rings. Bull. Amer. Math. Soc. 1, 1979, 654–657.
[2] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[3] T. Molien. Über Systeme höherer complexer Zahlen. Math. Ann. 41, 1893, 83–156.
[4] T. Molien. Über die Invarianten der linearen Substitutionsgruppen. Sitzungsber. der Königl. Preuss. Akad. d. Wiss. 52, 1897, 1152–1156.
[5] R. S. Pierce. Associative algebras. Graduate Texts in Mathematics, 88. Springer-Verlag, New York, Berlin, 1982. Russian translation: Mir, Moscow, 1986.
[6] R. Stanley. Invariants of finite groups and their applications to combinatorics. Bull. Amer. Math. Soc. 1, 1979, 475–511.
2.
[K87e] On the results of Molien about invariants of finite groups and their renaissance in contemporary mathematics Translation by J. Peetre
1. Molien’s papers [11] and [12] occupy an honorable place in the history of mathematics (cf. [1]). Nevertheless, comparatively little has been written on the wider historical-mathematical context of his classical results and their connection with the problems of contemporary mathematics [8, 15]. One of the reasons for this situation is the long-lasting complicated treatment of the interaction of the different results and notions of the theory of representations of finite groups and their group algebras, which arose after the publication of the books [2] and [22]. These outstanding books marked a landmark in the algebraic literature and became the basic textbooks for several generations of mathematicians; even nowadays their impact is great. However, questions of the history of the formation of these concepts and results are not illuminated sufficiently clearly (cf. e.g. [13]). Apparently, this distortion of history was caused by the underestimation of the contribution of S. Lie and other like-minded mathematicians taking part in the Leipzig seminars (in the second half of the 1880’s), as well as by neglecting the works of Molien and É. Cartan that were written in the old-fashioned language of the Peirce idempotents (1871). The latter happened after the brilliant presentation and generalization of the Molien-Cartan theory by Wedderburn (1907) and especially after the appearance of [14]. This is clearly seen in the history of group algebras, to which the second part of our paper is devoted. Only E. Noether was considered the founder of the theory of group algebras. Nowadays, more and more people refer to the role of A. Cayley in the genesis of the notion of group algebra, and, as regards the theory of group algebras, to that of T. Molien, cf. [5, 8, 16]. In particular W. Gustafson writes in [5]: “Most people familiar with the early history of the theory of representations of finite groups in C, think immediately of Frobenius and Burnside, who used approaches that seem unsuitable and even bizarre in the light of modern treatments. Admittedly Frobenius’ group determinant and Burnside’s Lie-theoretic approach both yielded the basic properties of complex characters. However, they said much less about the representations themselves. For this reason they have little application to the important problems of finding properties of representations over other rings[: representations over fields of finite characteristic and over rings of algebraic integers have very important applications in group theory, algebraic number theory and topology. Hence, a more flexible approach was needed.] In fact, the groundwork has been done by the little known Estonian mathematician Theodor Molien.” Let us add to this the words of H. Weyl: “The matter is closely connected with hypercomplex number systems or algebras. After Hamilton’s foundation of quaternion calculus (1843), and a long period of more or less formal research in which B. Peirce played the major role, Molien (1892) was really the first who reached several general and profound results in this direction” (cf. [23, p. 29] of the original 1939 edition1). This, as well as the recent reconstruction by J. Dieudonné, shows that Molien, undoubtedly, may be viewed as the first discoverer of the “hypercomplex aspect” of the theory of representation of groups, a discovery which is frequently ascribed to Burnside and E. Noether, cf. [13]; as to Noether’s paper [14], the definition of the group ring given there and the treatment of the whole “hypercomplex aspect” have taken a modern form. Moreover, the paper has made a major impact on the style of algebraic thinking. Later, however, it was often neglected that Noether herself knew Molien’s work well and had a high opinion of it (cf. [14]).
Recently an unexpected connection was discovered between Molien’s old result and contemporary problems of mathematics. Such a long interruption is partially explained by the fact that the actual adaptation of invariant theory and the ideas of Klein’s program, from which one should proceed when studying the corresponding groups and representations, was proceeding slowly. Therefore only in the 1950’s and 60’s did it lead to a new and essential posing of problems and applications. Below, following [21], we shall tell about the remarkable connection of the theory of invariants of finite groups with the combinatorial theory of counting, established using the formula (12) of Molien’s paper [12]. Knowledge and ability in combinatorics has become an essential component in undergraduate courses of applied mathematics. In many lecture courses on combinatorics, a central place is occupied by the so-called counting theory of Pólya, or, as it is nowadays called, the Redfield-Pólya theory. It turns out that the central result of this theory can be easily derived from an analogue of Molien’s formula.
2. Let us regard the algebra of polynomials $R = C[x_1, \ldots, x_n]$ as a vector space over the field of complex numbers C and let us represent it as the direct sum $R = R_0 \oplus R_1 \oplus R_2 \oplus \cdots \oplus R_n \oplus \cdots$, where $R_n$ is the subspace of all homogeneous polynomials (forms) of degree n. The subspace $R_1$ of linear forms will be denoted V and will be considered as a vector space with the fixed basis $x_1, \ldots, x_n$; the column $(x_1, \ldots, x_n)^t$ will be written x. Let $\mathcal{G}$ be a finite subgroup of the group GL(V) of all regular linear maps of V. As a basis is fixed in V, every element G ∈ $\mathcal{G}$ may be viewed as a matrix, and its action on a polynomial f ∈ R can be given by the formula $f^G(x) = f(Gx)$. In R we can distinguish the so-called subalgebra of invariants $R^{\mathcal{G}} = \{f \in R \mid \forall G \in \mathcal{G},\ f^G = f\}$. An essential characteristic of the algebra $R^{\mathcal{G}}$ is provided by the formal series
$$M_{\mathcal{G}}(t) = \sum_{n\ge 0} (\dim_C R_n^{\mathcal{G}})\,t^n,$$
which is called the Molien series of the group $\mathcal{G}$. According to the theorem of Hilbert the Molien series is always a rational function. The classical result of Molien, about which we spoke at the end of Subsection 1, gives an explicit formula for the determination of this rational function, namely:
$$M_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G\in\mathcal{G}} \frac{1}{\det(1 - tG)}.$$
1 Translator’s note. Kaljulaid in [K87e] refers to p. 48 of the Russian translation (Moscow, 1947).
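For a finite group of diagonal matrices both sides of Molien’s formula can be computed by hand, since every monomial is an eigenvector of every group element and the invariant monomials form a basis of the invariants. The sketch below takes, as an assumption about the example treated in the text, the group of order 8 generated by diag(−1, −1, −1) and diag(1, 1, i), and compares the number of invariant monomials of degree n with the value of the right-hand side of the formula at a sample point; all function names are illustrative.

```python
from math import comb

def diagonals():
    """Diagonal entries of the 8 elements G^a H^b, a = 0, 1 and b = 0..3."""
    return [((-1) ** a, (-1) ** a, (-1) ** a * 1j ** b)
            for a in range(2) for b in range(4)]

def molien_value(t):
    """Right-hand side of Molien's formula at a real number t with |t| < 1."""
    elements = diagonals()
    total = 0
    for d in elements:
        det = 1
        for entry in d:
            det *= 1 - t * entry      # det(1 - tG) for a diagonal matrix G
        total += 1 / det
    return total / len(elements)

def invariant_dimension(n):
    """Number of monomials x1^a x2^b x3^c of degree n fixed by every element."""
    count = 0
    for a in range(n + 1):
        for b in range(n + 1 - a):
            c = n - a - b
            # diag(d1, d2, d3) multiplies the monomial by d1^a * d2^b * d3^c
            if all(abs(d[0] ** a * d[1] ** b * d[2] ** c - 1) < 1e-9
                   for d in diagonals()):
                count += 1
    return count

def series_value(t, terms=60):
    """Truncated Molien series: sum of dim(R_n^G) * t^n for n < terms."""
    return sum(invariant_dimension(n) * t ** n for n in range(terms))
```

Comparing `series_value(0.3)` with `molien_value(0.3)`, and both with `1 / (1 - 0.3**2)**3`, gives a quick numerical check of the formula for this particular group.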
For example, let $R = C[x_1, x_2, x_3]$ and $\mathcal{G} = \langle G, H \rangle$, where
$$G = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \quad\text{and}\quad H = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix}, \qquad i^2 = -1.$$
Then $\mathcal{G}$ is an Abelian group of order 8 such that $R^{\mathcal{G}} = C[x_1^2, x_2^2, x_3^4](1 \oplus x_1 x_2)$. As
$$\mathcal{G} = \{\operatorname{diag}(1,1,1),\ \operatorname{diag}(1,1,i),\ \operatorname{diag}(1,1,-1),\ \operatorname{diag}(1,1,-i),\ \operatorname{diag}(-1,-1,1),\ \operatorname{diag}(-1,-1,i),\ \operatorname{diag}(-1,-1,-1),\ \operatorname{diag}(-1,-1,-i)\},$$
the Molien series of $\mathcal{G}$ is given by
$$M_{\mathcal{G}}(t) = \frac{1}{8}\left[\frac{1}{(1-t)^3} + \frac{1}{(1-t)^2(1-it)} + \frac{1}{(1-t)^2(1+t)} + \frac{1}{(1-t)^2(1+it)} + \frac{1}{(1+t)^2(1-t)} + \frac{1}{(1+t)^2(1-it)} + \frac{1}{(1+t)^3} + \frac{1}{(1+t)^2(1+it)}\right] = \frac{1}{(1-t^2)^3}.$$
The Molien series carries important information about the algebra $R^{\mathcal{G}}$, the study of which is also the central problem in invariant theory. The theory of invariants arose in England in the middle of the 19th century in the form of a generalization of the theory of determinants as an algebraic instrument for the description of connections and configurations in projective geometry. At the beginning the foreground was occupied by the actual numerical computation of invariants of the group of all homogeneous linear transformations. This “combinatorial” line of development of the theory was initiated by Cayley (1846). From determinants he proceeded to more general invariants (i.e. to algebraic expressions in the coordinates, which are changed in a definite way under non-degenerate transformations) and in 1854–59 he obtained a complete system of them for cubic and biquadratic forms. This was followed by important work by Sylvester, Clebsch, Cremona, Beltrami, Capelli, and others; the first of these authors invented most of the terminology in invariant theory. As a result, the so-called symbolic method was developed, and it is of current interest in modern combinatorics (cf. [21]). Number theory also gave an impulse to the development of invariant theory: the arithmetic theory (Gauss) of binary quadratic forms forced the study of invariants of the group of integer unimodular matrices. This line of reasoning found its sequel in the works of Eisenstein, Jacobi and Hermite. The subsequent “abstract” development of invariant theory moved the direct computation of invariants into the background, and the main attention turned to general notions and relations.
The final results on the key problems of the classical theory (the existence of finitely many generators of the algebra of invariants and of a finite basis for the syzygies) were obtained by Hilbert (1890–92). After these achievements the interest in problems about invariants dropped abruptly. But in the 1930's they again attracted
interest due to developments in physics. At the same time the general formulation of the problem of invariants originated, in the form set forth at the beginning of this section; it provides the basis for reducing the problem of invariants to a special case of the general problems of representation theory. Above we mentioned the underestimated role of S. Lie's approach in the development of the representation theory of groups. We may add that the theory of invariants was productively unified with Lie's infinitesimal methods by E. Study. Today his results have become a source of ideas that support the derivation of concrete differential equations for the invariants. This fact, as well as the increased interest in this connection on the part of contemporary discrete mathematics, shows that this approach has not exhausted all its possibilities.
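The Molien series computed earlier in this section can be checked numerically. The following Python sketch is only an illustration and not part of the original text. It assumes that $G$ consists of the eight diagonal matrices diag(e, e, i^k) with e = ±1 and k = 0, ..., 3 (this is read off from the eight denominators above), and all function names are hypothetical. It compares the average of 1/det(I − tA) with the closed form 1/(1 − t²)³, and it also counts invariant monomials degree by degree; for a group of diagonal matrices these monomials span the invariants.

```python
from math import comb

# Assumption: G is the group of the eight diagonal matrices diag(e, e, i**k),
# e = +1 or -1, k = 0..3 (read off from the eight denominators in the text).
GROUP = [(e, e, 1j ** k) for e in (1, -1) for k in range(4)]


def molien_average(t):
    """Average of 1/det(I - t*A) over the group (Molien's formula)."""
    total = 0
    for diagonal in GROUP:
        det = 1
        for a in diagonal:
            det *= 1 - t * a
        total += 1 / det
    return total / len(GROUP)


def closed_form(t):
    """The closed form 1/(1 - t^2)^3 stated in the text."""
    return 1 / (1 - t * t) ** 3


def invariant_count(degree):
    """Number of monomials x^a y^b z^c of the given degree fixed by all of G.

    For a group of diagonal matrices these monomials form a basis of the
    homogeneous component of that degree in the algebra of invariants.
    """
    count = 0
    for a in range(degree + 1):
        for b in range(degree + 1 - a):
            c = degree - a - b
            if all(abs(u ** a * v ** b * w ** c - 1) < 1e-9 for u, v, w in GROUP):
                count += 1
    return count


def series_coefficient(degree):
    """Coefficient of t^degree in 1/(1 - t^2)^3: C(d/2 + 2, 2) for even d, else 0."""
    return comb(degree // 2 + 2, 2) if degree % 2 == 0 else 0
```

Under the stated assumption, `molien_average(t)` and `closed_form(t)` should agree for any sample value with |t| < 1, and `invariant_count(d)` should equal `series_coefficient(d)` in every degree d.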
3. The most interesting applications of the ideas and results of Molien's paper [12] took place in the past decade [the 1970's]. Let us now familiarize ourselves with a generalization of Molien's formula that has led to a surprisingly wide range of applications in contemporary combinatorics. To this end let us consider the decomposition $V = V_1 \oplus \cdots \oplus V_m$, where $V_i$ is the subspace of $V$ spanned by the basis element $x_i$. Let $\mathcal{G} \le GL(V)$ be a finite subgroup such that for each element $G \in \mathcal{G}$ there exists a permutation $\pi_G \in S_m$ with the property that $G(V_i) = V_{\pi_G(i)}$, $i = 1, \dots, m$. In this situation we say that $\mathcal{G}$ is a monomial group; it consists of monomial matrices, i.e. matrices having precisely one non-zero element in each row. If, for some $G \in \mathcal{G}$, $C = \{i_1, \dots, i_t\}$ is a cycle of the permutation $\pi_G$, then $\{i_1, \dots, i_t\} \subseteq \{1, \dots, m\}$ and $\pi_G(i_k) = i_{k+1}$ for $1 \le k \le t-1$, while $\pi_G(i_t) = i_1$. In view of the monomiality of $G$ there exist numbers $\alpha_1, \dots, \alpha_t \in \mathbb{C}$ such that $G(x_{i_k}) = \alpha_k x_{i_{k+1}}$ for $1 \le k \le t-1$ and $G(x_{i_t}) = \alpha_t x_{i_1}$; denote by $\gamma_G(C)$ the product $\alpha_1 \alpha_2 \cdots \alpha_t$. The type of a monomial $x_1^{n_1} \cdots x_m^{n_m}$ is the sequence $\tau = (\tau_1, \tau_2, \dots)$, where $\tau_i$ is the number of indices $j$ with $n_j = i$, i.e. $\tau_i = |\{j \mid n_j = i\}|$. Let $R_\tau$ be the subspace of the $\mathbb{C}$-algebra $R$ spanned by all monomials of type $\tau$. As the type does not change under the action of a monomial matrix $G \in \mathcal{G}$, we have $G(R_\tau) = R_\tau$. If we set $R^{\mathcal{G}}_\tau = R^{\mathcal{G}} \cap R_\tau$, then the decomposition $R^{\mathcal{G}} = \bigoplus_\tau R^{\mathcal{G}}_\tau$ grades $R^{\mathcal{G}}$ as a $\mathbb{C}$-space; let us note, however, that $R^{\mathcal{G}}_\tau \cdot R^{\mathcal{G}}_\sigma$ is not always contained in some $R^{\mathcal{G}}_\mu$. Molien's formula suggests a path to finding the generating function in the (infinitely many) variables $t = (t_1, t_2, \dots)$
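\[
L_{\mathcal{G}}(t) = \sum_{\tau} (\dim_{\mathbb{C}} R^{\mathcal{G}}_\tau)\, t^\tau = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_{C} \bigl(1 + \gamma_G(C)\, t_1^{|C|} + \gamma_G(C)^2\, t_2^{|C|} + \cdots\bigr),
\]
where $t^\tau = t_1^{\tau_1} t_2^{\tau_2} \cdots$ and $C$ runs through all cycles of the permutation $\pi_G$. For example, for the monomial group
\[
\mathcal{G} = \left\{
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},\
\begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix},\
\begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix},\
\begin{pmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & -1 \end{pmatrix}
\right\}
\]
we obtain
\[
\begin{aligned}
L_{\mathcal{G}}(t) = \frac{1}{4}\bigl[ & (1 + t_1 + t_2 + t_3 + \cdots)^3 + (1 - t_1 + t_2 - t_3 + \cdots)^3 \\
& + (1 + t_1 + t_2 + t_3 + \cdots)(1 + t_1^2 + t_2^2 + t_3^2 + \cdots) \\
& + (1 - t_1 + t_2 - t_3 + \cdots)(1 + t_1^2 + t_2^2 + t_3^2 + \cdots) \bigr] \\
= \frac{1}{2}(1 + E) & \bigl[(1 + E)^2 + 3O^2 + 1 + P\bigr]
= 1 + 2\sum_{k=1}^{\infty} t_{2k} + 2\sum_{k=1}^{\infty} t_k^2 + 3 \sum_{\substack{j < k \\ j \equiv k \ (\mathrm{mod}\ 2)}} t_j t_k + \cdots,
\end{aligned}
\]
where $E = \sum_{k \ge 1} t_{2k}$, $O = \sum_{k \ge 1} t_{2k-1}$, $P = \sum_{k \ge 1} t_k^2$, and the dots stand for the terms of degree three in the $t_i$.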
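The example above can be checked by brute force. The Python sketch below is an illustration under stated assumptions, not the author's method: the four group elements are encoded as pairs (perm, scalars) meaning x_j ↦ scalars[j]·x_{perm[j]}, the series is truncated at total degree N = 4 (the variable t_i is given weight i), and all names are hypothetical. It expands the product formula for $L_{\mathcal{G}}(t)$ and compares the coefficients with the dimensions of the spaces of invariants of each type, computed as the average of the traces of the group elements on the monomials of that type.

```python
from fractions import Fraction
from itertools import product

N = 4  # truncation: keep only types tau with sum(i * tau_i) <= N

# Assumed encoding of the four monomial matrices: x_j -> scalars[j] * x_{perm[j]}.
GROUP = [
    ((0, 1, 2), (1, 1, 1)),      # identity
    ((0, 1, 2), (-1, -1, -1)),   # minus identity
    ((1, 0, 2), (1, 1, 1)),      # swap x_1 and x_2
    ((1, 0, 2), (-1, -1, -1)),   # swap x_1 and x_2, then change all signs
]
ZERO = (0,) * N


def cycles(perm, scalars):
    """Yield (length, gamma) for each cycle C of perm; gamma is gamma_G(C)."""
    seen = set()
    for start in range(len(perm)):
        if start in seen:
            continue
        length, gamma, j = 0, 1, start
        while j not in seen:
            seen.add(j)
            gamma *= scalars[j]
            length += 1
            j = perm[j]
        yield length, gamma


def weight(tau):
    """Total degree of the monomials of type tau."""
    return sum((i + 1) * c for i, c in enumerate(tau))


def multiply(p, q):
    """Product of two truncated series stored as {tau: coefficient}."""
    out = {}
    for ta, ca in p.items():
        for tb, cb in q.items():
            tau = tuple(a + b for a, b in zip(ta, tb))
            if weight(tau) <= N:
                out[tau] = out.get(tau, 0) + ca * cb
    return out


def cycle_factor(length, gamma):
    """Truncation of 1 + gamma*t_1^|C| + gamma^2*t_2^|C| + ..."""
    factor = {ZERO: Fraction(1)}
    for i in range(1, N + 1):
        if i * length <= N:
            tau = tuple(length if k == i - 1 else 0 for k in range(N))
            factor[tau] = Fraction(gamma) ** i
    return factor


def series_from_formula():
    """Right-hand side of the generalized Molien formula, truncated."""
    total = {}
    for perm, scalars in GROUP:
        term = {ZERO: Fraction(1)}
        for length, gamma in cycles(perm, scalars):
            term = multiply(term, cycle_factor(length, gamma))
        for tau, c in term.items():
            total[tau] = total.get(tau, 0) + c / len(GROUP)
    return {tau: c for tau, c in total.items() if c != 0}


def series_by_traces():
    """dim of the invariants of each type, as an average of traces on monomials."""
    total = {}
    for exps in product(range(N + 1), repeat=3):
        if sum(exps) > N:
            continue
        tau = tuple(sum(1 for e in exps if e == i) for i in range(1, N + 1))
        for perm, scalars in GROUP:
            image, coef = [0] * 3, 1
            for j, e in enumerate(exps):
                image[perm[j]] = e
                coef *= scalars[j] ** e
            if tuple(image) == exps:  # the monomial is mapped to a multiple of itself
                total[tau] = total.get(tau, 0) + Fraction(coef, len(GROUP))
    return {tau: c for tau, c in total.items() if c != 0}
```

Both functions return a dictionary indexed by the type $\tau = (\tau_1, \dots, \tau_4)$; for instance the key `(0, 1, 0, 0)` corresponds to $t_2$ and `(1, 0, 1, 0)` to $t_1 t_3$. Raising `N` extends the comparison to higher degrees at the cost of a longer enumeration.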
4. The results on invariants of finite groups, in which interest arose again in the 1950's, admit various important applications in contemporary mathematics. It is especially noteworthy that the general theorem of Pólya, which plays such an eminent role in combinatorics, is a special case of the generalization of Molien's formula described in Subsection 3. Apparently this was first noticed by Stanley [21]. Let us now briefly describe the ideas that led to the so-called Pólya theory. It has its origin in Cayley's paper (1875) on the counting of hydrocarbons. However, the method proposed there turned out to be impractical, and so chemists did not pay much attention to it. Nevertheless, in the following 30 years many showed interest in that technique, but on the mathematical level there was still no progress; for a survey of these attempts see [7]. The remarkable paper [18] was written by Redfield (1927). This paper remained unknown for a long time, although it contained many ideas and results that were later (1934–37) rediscovered by G. Pólya. The neglect of [18] was partly caused by its discouraging terminology and its hard-to-penetrate presentation. Pólya's work was likewise preceded by the paper [10], whose authors promote the idea that the terminology and technique of group representations are useful for the counting of isomers. Let us note the interesting fact that one of the cornerstones of the theory is everywhere called the “theorem (or lemma) of Burnside”, although according to [13] it was known to Cauchy and Frobenius long before the appearance of the book [2]. The work of Pólya, and in particular his final paper [17], became a landmark in counting theory because of its influence on the subsequent development. The results of the Redfield–Pólya theory and its generalizations nowadays form an important chapter of modern combinatorics. The connection between this theory and Molien's formula mentioned above can be briefly described as follows.
Let us consider the case where the monomial group $\mathcal{G}$ consists of permutation matrices. Then each element $G \in \mathcal{G}$ induces a substitution on the set $F$ of all functions $f \colon \{1, \dots, m\} \to \mathbb{N}$, defined by $(Gf)(i) = f(\pi_G(i))$. If we now identify the function $f$ with the monomial $x^f = x_1^{f(1)} x_2^{f(2)} \cdots x_m^{f(m)}$, then the action of $\mathcal{G}$ on the $\mathbb{C}$-algebra $R$ satisfies the relation $G(x^f) = x^{G^{-1}f}$, where $(G^{-1}f)(i) = f(\pi_{G^{-1}}(i))$.
The action of $\mathcal{G}$ on $F$ partitions the set $F$; its classes (the so-called $\mathcal{G}$-schemes) are the orbits of this action, i.e. we write $f \sim g$ if $g = Gf$ for some $G \in \mathcal{G}$. If $f \sim g$, then the multisets $\{f(1), \dots, f(m)\}$ and $\{g(1), \dots, g(m)\}$ coincide, and therefore $x^f$ and $x^g$ have the same type. Thus one can speak of the type of a $\mathcal{G}$-scheme. The main problem of Pólya's counting theory is to determine the number of $\mathcal{G}$-schemes of a given type $\tau$. Consequently, denoting the sought number by $d(\tau)$, the counting theory problem can
be thought of as the problem of finding the generating series $L_{\mathcal{G}}(t) = \sum_\tau d(\tau) t^\tau$. The answer is given by the theorem of Pólya, which is obtained from the formula for $L_{\mathcal{G}}(t)$ given above by specializing it to this case:
\[
\sum_{\tau} d(\tau)\, t^\tau = L_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_{C} \bigl(1 + t_1^{|C|} + t_2^{|C|} + t_3^{|C|} + \cdots\bigr),
\]
where $C$ runs through all cycles of the permutation $\pi_G$.

This example does not exhaust the connections of Molien's formula with modern mathematics. In algebra, in recent years there has been great interest in the non-commutative analogue of the situation considered in Subsection 2. This amounts to studying the subalgebra of invariants $R^G$ of a finite group $G$ in the algebra $R = \mathbb{C}\langle x_1, \dots, x_n \rangle$ of polynomials in non-commuting variables $x_i$. The corresponding generalization of Molien's formula and its various applications are discussed in [4]. The authors of this paper developed the analogue of Molien's formula for a non-commutative compact topological group² $G$ and used it for solving subtle (discrete) algebraic problems. Besides, the above-mentioned generalization of Molien's formula finds use in the theory of multipartitions, in coding theory and in other areas, clarifying the problems considered and providing a unified approach, simpler proofs and possibilities of generalization. However, a more detailed analysis of these problems requires new notions and results, and so it goes beyond the bounds of the present publication. The interested reader may consult the papers [4, 19, 20], which exhibit the importance of the paper [12] and the unprecedented value of Molien's results. Our account suffices to show how unfounded was the rather narrow appreciation of T. Molien's scientific activity in the country at the turn of the century, which forced him to leave Tartu, the town where he wrote his classical papers [11] and [12] on the theory of algebras and groups. [9], [3], [4], [6]

² For such groups, the formula is called the Molien–Weyl formula.

References
[1] Nicolas Bourbaki. Éléments d'histoire des mathématiques. Masson, Paris, 1984. Russian translation: Gos. Izdat. Inostr. Lit., Moscow, 1963.
[2] W. Burnside. Theory of groups of finite order. University Press, Cambridge, 1897.
[3] A. Cayley. On the analytical forms called trees, with application to the theory of chemical combinations. Rep. Brit. Assoc. Adv. Sci. 45, 1875, 257–305.
[4] W. Dicks and E. Formanek. Poincaré series and a problem of S. Montgomery. Linear and Multilinear Alg. 12, 1982, 21–30.
[5] W. Gustafson. Review of S. Sehgal's “Topics in group rings”. Bull. Amer. Math. Soc. 1, 1979, 654–657.
[6] T. Hawkins. Cayley's counting problem and the representation of Lie algebras. In: Proc. of the Int. Congress of Math., August 3–11, 1986. Amer. Math. Soc., Providence, RI, 1987, 1642–1656.
[7] H. Henze and C. Blair. The number of isomeric hydrocarbons of the methane series. J. Amer. Chem. Soc. 53, 1931, 3077–3085.
[8] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[9] N. F. Kanunov. F. E. Molin's work “On invariants of groups of linear substitutions”. Historical Mathematical Research 30, 1986, 306–338.
[10] A. Lunn and J. Senior. Isomerisms and configuration. J. Phys. Chem. 33, 1929, 1027–1079.
[11] T. Molien. Über Systeme höherer complexer Zahlen. Math. Ann. 41, 1893, 83–156.
[12] T. Molien. Über die Invarianten der linearen Substitutionsgruppen. Sitzungsber. der Königl. Preuss. Akad. d. Wiss. 52, 1897, 1152–1156.
[13] P. M. Neumann. A lemma that is not Burnside's. Math. Scientist 4, 1979, 133–141.
[14] E. Noether. Hyperkomplexe Grössen und Darstellungstheorie. Math. Zeit. 30, 1929, 641–692.
[15] K. Parshall. Joseph Wedderburn and the structure theory of algebras. Arch. Hist. Exact Sci. 32, 1989, 223–349.
[16] R. S. Pierce. Associative algebras. Graduate Texts in Mathematics, 88. Springer-Verlag, New York, Berlin, 1982. Russian translation: Mir, Moscow, 1986.
[17] G. Pólya. Kombinatorische Anzahlbestimmungen für Gruppen, Graphen und chemische Verbindungen. Acta Math. 68, 1937, 145–254.
[18] J. Redfield. The theory of group-reduced distributions. Amer. J. Math. 49, 1927, 433–455.
[19] N. Sloane. Error-correcting codes and invariant theory. Amer. Math. Monthly 84, 1977, 82–107.
[20] L. Solomon. Partition identities and invariants of finite groups. J. Comb. Theory, Ser. A 23 (2), 1977, 148–175.
[21] R. Stanley. Invariants of finite groups and their applications to combinatorics. Bull. Amer. Math. Soc. 1, 1979, 475–511.
[22] B. L. van der Waerden. Moderne Algebra, I; II. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer, Berlin, 1930; 1931.
[23] H. Weyl. The classical groups. Their invariants and representations. Princeton University Press, Princeton, N.J., 1939. Russian translation: Gos. Izdat. Inostr. Lit., Moscow, 1947.
3. Theodor Molien, about his life and mathematical work as seen a century later.
(A biographical sketch and a glimpse of his work)
Xerox³ copy of handwritten original [c. 1991], edited by J. Peetre, corrections by A. Zubkov

Contents of the chapter
1. A biographical sketch and a glimpse of his thesis
2. Molien's 1897 papers on group rings and invariants
3. Molien type formulae in Combinatorics
4. Noncommutative versions of Molien type formulae
References
Science and art are the two sides of our life. Usually science considers all that is determined by some laws, so that its development can be predicted. But what is history? If we know the laws of processes, we can predict them. But often, in the development of some idea, there appears a point where we have two or infinitely many choices: chaos appears. If we look back, there were laws and predictability. But looking forward we see chaos. The historian cannot write the history of what has not happened. The history of things that have not happened but might have happened, that is art: to go back to some point and try again in a new direction and with new connections in mind. Considered in such a way, the history of mathematical ideas can, I believe, be a useful thing for a mathematician. Something like this has happened with some of the ideas of Molien.
3.1. A biographical sketch and a glimpse of his thesis
Theodor Molien was born [in Riga] on September 10, 1861. His great-grandfather was a Swede who had settled near Reval/Tallinn in the 18th century and was a teacher at the local school there.⁴ Molien's grandfather (Andrei [Andrew]) was a watchmaker who had settled in Riga. His father, Eduard Molien, was educated at the Riga Gymnasium and afterwards at Dorpat/Tartu University, where he received a diploma as a teacher of classical languages in 1843. He then worked as a private teacher in Riga.

³ Editors' Note. The symbol [GAP!] is used where a portion of the text has, regretfully, been lost in the process of Xeroxing.
⁴ Editor's Note. According to Kanunov [12, p. 7], the great-grandfather, Johan Molien, moved to Livonia from Göteborg in Sweden in 1751. He came to live in a small town near Reval/Tallinn (Kanunov says “mestechko”).
⁵ Editor's Note. His full name reads Theodor Georg Andreas.

Theodor Molien himself⁵ was a student at the Riga Gymnasium in 1872–79; after his father's death he wanted very much to support and please his mother, so he was very careful in his studies. The whole family (he himself, two sisters and their mother) moved to Tartu in 1880. There he became a student of mathematics, with the aim of preparing himself to be an astronomer; he was primarily influenced by the famous observatory and the intense
scientific work carried out there since the days of W. Struve. He attended the lectures of P. Helmling (mathematics), F. Minding (mechanics), A. Oettinger (physics) and P. Schwarz (astronomy). He was greatly engaged by the lectures and seminars of a young Swedish astronomer and mathematician, Lindstedt. Anders Lindstedt was born in 1853 and, after obtaining a doctor's degree in astronomy at Lund, served as an astronomer at the Tartu Observatory from 1879. After Minding's retirement, Lindstedt served as Professor of applied mathematics at Tartu University (1883–86). He published works on celestial mechanics and integral calculus in the Memoirs of the Petersburg Academy. His lectures were new, original and influential for the students and, what is important in our context, among them were courses on algebra and algebraic geometry. What was absolutely new for Tartu was his seminar for students, whose object was to support their scientific work. It was in this seminar that Molien received much encouragement and advice. As a result Molien wrote and published two papers in astronomy and obtained his diploma [8, 24]. Anders Lindstedt was the first person to recognize Molien's talent, and he insisted that Molien remain at the University to prepare himself for the doctor's degree. He also insisted that Molien be given a stipend to continue his (then just beginning) studies in pure mathematics in Germany, namely to participate in the famous seminar of Felix Klein in Leipzig. Under the influence of this seminar (from 1886 on it was directed by S. Lie) Molien's interests finally shifted from astronomy to pure mathematics. Molien remained in Leipzig for two years, writing there (under Klein) his master's thesis on elliptic functions, which he presented in October 1885 in Tartu [26].⁶ For the next 15 years Molien was a docent at Dorpat (soon afterwards renamed Yurjev⁷) University, teaching in a huge variety of fields. Among his courses were some that were new for Tartu, e.g.
on quaternions and other hypercomplex numbers, lectures on Gauss's theory of the division of the circle, etc. All this time Molien kept in contact with the Leipzig seminar. And so it happened that he was among the very few who knew of W. Killing's work on simple Lie algebras⁸ and, with E. Study and F. Engel, he considered Killing's theory as a paradigm for his own investigations of hypercomplex numbers. His thesis advisor was Friedrich Schur, who had worked in Leipzig with S. Lie at the time when Killing was working on the structure of semisimple Lie algebras. Using this paradigm, Molien succeeded in solving some problems (the corresponding paper in Mathematische Annalen appeared in 1892 [28]). In September of the same year he presented these results as a doctoral dissertation at Tartu. Molien's results on hypercomplex numbers were quickly appreciated by the experts: they were included in S. Lie's monograph, and two years later Molien also received the Ch. Hermite Gold Medal from the Paris Academy of Sciences. Let us stop our story for a moment and take a glimpse at some mathematical details.

⁶ Editor's Note. After his return to Sweden, Anders Lindstedt was a Professor of Mathematics and Theoretical Mechanics at the Royal Institute of Technology (KTH), Stockholm, in 1886–1909 and also the Rector of this school in 1902–1909. He also had several other assignments as a civil servant. He died in 1939.
⁷ Translator's Note. After the Christian name of the Kiev king Yaroslav the Wise, who in 1030, during a short campaign, founded here a small town, Yurjev, as indicated in a Russian chronicle. It was recaptured by the Estonians about 1060.
⁸ Quite recently (Mathematical Intelligencer 11, no. 3, 1989) Killing's papers in the “Mathematische Annalen” were characterized by John Coleman as the greatest mathematical papers of all times; only the Elementa of Euclid and Newton's Principia he considers [to] have been more influential. Really, Wilhelm Killing had discovered the entire theory of simple Lie groups, i.e. what is now called Coxeter groups, Weyl groups, Dynkin diagrams . . . Slowly then, beginning [GAP!].
The successful experience of Gauss in number theory had at least two consequences. First, the theory of algebraic numbers was created (E. Kummer, L. Kronecker, R. Dedekind, to name only a very few!). Second, there followed quaternions and biquaternions by W. Hamilton in 1837, and matrices by A. Cayley in 1855. Then (1884) J. Sylvester noticed the possibility
\[
(a_{ij}) = \sum_{i,j} a_{ij} E_{ij} \quad \text{with} \quad E_{ij} \cdot E_{k\ell} = \delta_{jk} E_{i\ell}.
\]
There remains a small step to “n-ary numbers” and to Dedekind's extraction of the “hypercomplex aspect” of all these new tools. Thus the way was opened to a general theory of finite-dimensional associative algebras. Among the first general results: Karl Weierstrass proved that 3-dimensional numbers do not exist, i.e. that there are no 3-dimensional $\mathbb{R}$-algebras without zero divisors, and Frobenius' theorem was proved. For people connected with Lie's seminar in Leipzig, a turning point in the story was provided by the following remark of H. Poincaré (1884): the multiplication of n-ary numbers, $(\sum x_i e_i)(\sum y_i e_i) = \sum z_i e_i$, is given by equations $z_i = \varphi_i(x_1, \dots, x_n; y_1, \dots, y_n)$ that determine a Lie group. This observation was also made by Scheffers, Study and others; their understanding of it, in relation to W. Killing's penetrating results and notions, was taken by Molien as a paradigm for his investigation of associative $\mathbb{C}$-algebras. A finite-dimensional algebra is said to be simple if it has no non-trivial two-sided ideals, and semisimple if its only nilpotent ideal (the radical) is zero; nilpotency of an ideal means the existence of $m \in \mathbb{N}$ such that any product of $\ge m$ factors is 0. According to Molien, every semisimple $\mathbb{C}$-algebra is isomorphic to a direct sum of simple $\mathbb{C}$-algebras. Moreover, for every such simple component $S_i$ there exists $n_i \in \mathbb{N}$ such that $S_i \cong M_{n_i}(\mathbb{C})$. Specializing these results to the case where the basis $\{e_1, \dots, e_n\}$ is a group led Molien to many results on group representations. As pointed out by Hawkins and Gustafson, Molien was the first to discover the “hypercomplex aspect” of this theory. Some details of this story deserve special attention; they will be provided later. Five years later one of the graduates of the prestigious École Normale Supérieure, encouraged by Picard, Darboux, Poincaré . . . , and having already made rigorous sense out of W.
Killing's (1888–1890) papers on semisimple Lie algebras (dissertation 1894), entered the story. Élie Cartan was the man who clarified the notions of radical and of simple and semisimple algebras and further proved the uniqueness of the decomposition; his report on this appeared in 1897. He also considered the $\mathbb{R}$-case. To finish our story: in 1907, Joseph Wedderburn generalized the theory to any field $k$ (instead of $\mathbb{C}$ or $\mathbb{R}$). In this general situation, the simple algebras are, as before, full matrix algebras, although now not over $k$ itself but over a suitable division $k$-algebra. In the case $k = \mathbb{R}$ there are only three division $k$-algebras: $\mathbb{R}$, $\mathbb{C}$ and $\mathbb{H}$; this fact is known as Frobenius' theorem. As Wedderburn returned to the Peirce approach via idempotents, and this approach culminated in E. Noether's paper, whose style became standard in algebra for a long time, Molien's name fell into oblivion for at least 50 years. This happened despite the fact that E. Noether herself highly respected both Molien's and Cartan's contributions. Perhaps one of the reasons was also that [GAP!].
There is no possibility to go into further details. Perhaps Karen Parshall's report [18] on Joseph Wedderburn deserves your attention; one third of it is devoted to the history of the Molien–Cartan results. And so, of course, do Thomas Hawkins's brilliant papers [11, 12] on the Hesse principle, Cayley's counting principle and other topics in this “Lie field”. After these comments let us continue our account of Molien. During the years following 1892 he simplified some proofs in his Mathematische Annalen paper, and published three further papers (two of them in a local journal) on finite substitution groups, using his theory of algebras. Quite quickly, Frobenius underlined the importance of Molien's work on group representations. Nevertheless, Molien remained a docent at Yurjev University until the very beginning of the 20th century (1900). As an example of the arguments raised against him when he applied for a professorship [e.g. at Kharkov University], the Committee (Lyapunov, Struve, Steklov, Koval'skiĭ) declared: “. . . we have not been able to form an independent opinion about the degree of originality of Molien's work, as it lies far away from the mainstream of mathematical thought, and so the Committee knows these matters only superficially. This new-born theory of algebras seems to be a complicated and artificial construction motivated by the pure desire to generalize usual numbers, and therefore it cannot be justified properly . . . ” So it happened that Molien was forced to accept an offer from the Tomsk Technological Institute (in Siberia). Probably this was to some extent due to the fact that a friend of Molien lived there, a certain P. Kadik,⁹ who had studied together with Molien in Tartu and who, after obtaining a master's degree at Tartu University in 1885, had settled in Tomsk. He taught at the Gymnasium and had corresponded with Molien all these years. In Tomsk Molien worked until his death in 1941.
He set the standards for many mathematical courses and wrote a series of lecture notes (differential calculus, differential equations, geometry): in the period 1902–1909 he published notes for 12 such courses. He was the first Professor of Mathematics in Siberia. Although he was highly esteemed by both students and colleagues, he was forced to retire in 1911. During the next three years nobody knew that there existed a circular granting him the title of “Emeritus”: it was well hidden somewhere in the Russian Ministry of Education. And so the officials did not allow him to teach at the Institute. Molien therefore organized a mathematical seminar outside the Institute, in which most of the Tomsk mathematicians participated. He also gave some survey lectures on algebra and arithmetic for teachers in Ufa, and lectured at the higher courses for women in Tomsk. In 1917 the mathematical faculty at Tomsk University was opened,¹⁰ and his colleagues from the Institute days called him to return as Professor at this new University. From then on, for more than 20 years, almost all mathematics students participated in Molien's seminars, which were most often devoted to elliptic functions and to the theory of surfaces. He had many postgraduate students (in Lie algebras, minimal surfaces, function theory), and he is also viewed as the founder of the Tomsk school of differential geometry.

⁹ Editor's Note. Maybe Peteris Kadikis (1857–1923), a Latvian who studied mathematics in Tartu and was a private docent there.
¹⁰ Note by Aleksandr Zubkov. Tomsk University itself was founded in 1879. In 2004 they celebrated their 125th anniversary.

He did not stop his efforts to continue working in algebra despite his very intensive pedagogical work in other fields. For instance, in 1930 he published a note in which he gave an example of a transcendental equation having an algebraic number as one of its
roots, while not all conjugates of this “algebraic” root are roots of the equation. In 1935 he attempted some systematic work in the theory of algebras. In his last years he was very interested in hypergeometric series; he left an almost finished manuscript giving a systematic survey of the theory. There are also almost finished papers about Galois groups: there he wants to find linear groups with [GAP!] a given Galois group is contained [GAP!]. Furthermore, there are almost finished methodological notes on Lobachevsky's views in geometry, on Cremona transformations . . . There are lecture notes, e.g. notes on the theory of elliptic functions (from the time of the Klein–Lie seminars) and notes on the history of mathematics. There are reprints of Hurwitz, Dehn, Klein, Kneser, Kronecker, Minkowski, Study, Frobenius, Schur, Engel and others, and letters from Hurwitz, Klein, A. Kneser, Frobenius, I. Schur, Struve. All these and other items of the Molien Archive were left by his daughter Eliza to her blind student V. D. Fatneva, a Latinist. What will be the further fate of this heritage? In Siberia Molien has not been forgotten. In 1986 a bas-relief of him was placed on the house in Nikitin Street where he had lived, and his portraits hang at Tomsk University. The well-known Russian algebraist A. Mal'cev designated Molien as the first professor and the patriarch of Siberian mathematics in the pre-war period. Recently Professor Leonid Bokut, the leader of a well-known school of ring theory in Siberia (Efim I. Zelmanov and A. R. Kemer were [among] his postgraduate students), visited Tartu and declared in his talk that Th. Molien should be considered the first real classic in the field of algebra in the Russian Empire of that time. Sources for further details: a booklet in Russian with comments on Molien's dissertation by Kanunov (a graduate of Tomsk University) [14] and, similarly, a booklet with Russian translations of his main (1892 and 1897) papers [16].
3.2. Molien’s 1897 papers on group rings and invariants To get a 3-dimensional R-space with basis G = {x, y, z} we take all formal Rcombinations αx x + αy y + αz z. Similarly if |G| > 3 and, instead of R, there is any field K, we get the space V (G, K) = {α | α = g∈G αg g}. If G is a group, then its multiplication can be extended (distributively) to V (G, K). In this way we obtain the group ring KG. The elements of KG, i.e. the formal series sums g∈G αg g can be interpreted def
as mappings α : G → K with finite support, α(g) = αg ; their multiplication in KG is called convolution. Quite often E. Noether is considered to be the only creator of group algebras. Nevertheless, A. Cayley (in 1854) dealt with the ring C[S3 ]. A look at Molien’s paper (1897) on invariants of substitution groups shows that some first fundamental results in this field are due to him. The genesis of the the notion of “group algebra” can be illustrated by the diagram in Figure 111. More is true: in 1895–97 Molien discovered (independently of G. Frobenius) the basic facts in group representation theory. He was formally motivated by 11Editor’s note. Part of this chart is missing in the Xerox version. We are, however, convinced that it must be Euler that points to Hamilton; in 1770 he gave a parametrization of rotations in R3 , which can be interpreted in terms of quaternions. This has made some authors to view Euler as a forerunner of Hamilton: if a is a quaternion of unit length, one associates to it an orthogonal transformation given by x → a−1 xa (see e.g. [3, p. 3-4]). Another predecessor of Hamilton was C. F. Gauss. In posthumous work (cf. [10, especially p. 358]), he parameterized rotations with the aid of 4-tuplesx = (x0 , x1 , x2 , x3 ) ∈ R4 . If there are two rotations corresponding to x and y respectively, and z corresponds to their composition, he wrote down
270
C HAPTER V. HISTORY OF MATHEMATICS
Group algebras 6 gOOO nnn OOO nnn OOO n n n OO' n n n n nv nn Group Representation Hypercomplex numbers Theory / (Dedekind, Peirce, Noether) (Frobenius, Molien, < aDD hPPP x I.Schur, Noether) DD PPP xx O DD PPP xx PPP DD xx x P DD x PPP x DD xx DD xx Lie Theory Vector spaces DD xx x DD (Poincaré, Killing) (Hamilton, Grassmann) x S S hRRR x D j x D j RRR S S D j xx RRR j j S S DDD xx RRR j x D R S xx j j Substitution groups Matrix algebras Algebraic numbers (Cayley, Sylvester) (Cauchy, Galois, Jordan) (Kummer, Kronecker, Dedekind) O O O (Cayley, Molien, Noether)
Z[i] (Gauss)
Algebraic equations, Galois Theory
H iRRR RRR mm6 (Hamilton) m m RRR RRR mmm RRR mmm RRR mmm m m RR mm Parametrization of rotations
(Lagrange, Gauss, Abel, Galois)
(Euler, Gauss)
Fig. 1: Genesis of the the notion “group algebra”
the problem of determining the representation of minimal degree for a group. This problem had been suggested by F. Klein’s attempt to generalize Galois theory. The main step in Molien’s approach can be well illustrated by having a look at the problem of studying group determinants – the formal motivation for G. Frobenius. Let G be a finite group, |G| = n, and let {xg |g ∈ G} be n independent variables (over C). Frobenius’ theory of representations of finite groups in its historical context def was concerned with factorization of the group determinant Dg = det )xh−1 g ), with h, u ∈ G, viewed as a polynomial in C[xg |g ∈ G]. Take the field K = C(xg |g ∈ G) of rational functions over C. The group algebra KG can be viewed as a K-space V with all elements of G as its basis. Right multiplication by an element XG = g∈G xg on KG gives an endomorphism of V with matrix )xh−1 g ): xg g = xg (hg) = xh−1 u u, h ∈ G −→ h · Xg = h · g∈G def
g∈G
u∈G
where $u = hg$, so that $g = h^{-1}u$. We see that $h \cdot X_G = \sum_{u \in G} x_{h^{-1}u}\, u$ is again a $K$-linear combination of all basis elements $u \in G$, so the matrix of this $K$-endomorphism is $(x_{h^{-1}u})$. As a result, the group determinant $D_G$ is interpreted as the determinant of the endomorphism of $V$ given by right multiplication by $X_G$ on $KG$. As $\operatorname{char} K = 0$ (N.B. $K = \mathbb{C}(x_g \mid g \in G)$), the group ring $KG$ is known to be semisimple. Therefore it is isomorphic as a $K$-algebra to a direct product of full matrix algebras (over $K = \mathbb{C}(x_g)$):
$$\psi = (\psi_1, \dots, \psi_s) \colon KG \longrightarrow M_{n_1}(K) \times \dots \times M_{n_s}(K). \tag{85}$$
the components of z in terms of the ones of x and y, which again corresponds to the multiplication of the corresponding quaternions.
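The interpretation of the group determinant as the determinant of right multiplication by $X_G$ can be tried out numerically in the simplest case. The sketch below assumes $G$ is the cyclic group $\mathbb{Z}/n$ written additively (an illustrative choice, not taken from the text); then all blocks in (85) are $1 \times 1$, each $\psi_j$ is a character $g \mapsto \omega^{jg}$ with $\omega = e^{2\pi i/n}$, and the determinant of the matrix $(x_{h^{-1}u})$ should equal the product of the values $\psi_j(X_G)$. The function names and the sample values of $x_g$ are hypothetical.

```python
import cmath

def group_matrix(n, x):
    """Matrix (x_{h^{-1} u}) for the cyclic group Z/n written additively:
    rows are indexed by h, columns by u, and h^{-1} u is (u - h) mod n."""
    return [[x[(u - h) % n] for u in range(n)] for h in range(n)]

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(map(complex, row)) for row in matrix]
    size, result = len(a), 1 + 0j
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0j
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            for c in range(col, size):
                a[r][c] -= factor * a[col][c]
    return result

def character_product(n, x):
    """Product over the n characters psi_j(g) = w**(j*g) of psi_j(X_G)."""
    w = cmath.exp(2j * cmath.pi / n)
    product = 1 + 0j
    for j in range(n):
        product *= sum(x[g] * w ** (j * g) for g in range(n))
    return product

# Arbitrary sample values for the variables x_g, g in Z/5.
x = [2.0, -1.0, 3.0, 0.5, 1.5]
group_determinant = det(group_matrix(5, x))
product_of_characters = character_product(5, x)
```

For a non-Abelian group the factors $\det \psi_j(X_G)$ are no longer linear forms, and the same comparison would need the irreducible matrix representations themselves.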
Every element $Y \in M_n(\mathbb{C})$ corresponds to an endomorphism of row spaces $\mathbb{C}^n \to \mathbb{C}^n$, $(z_1, \dots, z_n) \mapsto (z_1, \dots, z_n)Y$. It follows that the endomorphism $M_n(\mathbb{C}) \to M_n(\mathbb{C})$ given by right multiplication by $Y$ has the determinant $(\det Y)^n$. Indeed, as a right $M_n(\mathbb{C})$-module, $M_n(\mathbb{C})$ is isomorphic to $\mathbb{C}^n \oplus \dots \oplus \mathbb{C}^n$ ($n$ summands, one for each row), and so right multiplication by $Y$ on $M_n(\mathbb{C})$ can be viewed as an endomorphism $\hat{Y}$ of $\mathbb{C}^n \oplus \dots \oplus \mathbb{C}^n$ with the matrix
$$\begin{pmatrix} Y & 0 & \cdots & 0 \\ 0 & Y & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & Y \end{pmatrix}$$
with the determinant $(\det Y)^n$. Using this result together with formula (85) we get the following formula:
$$D_G = \det(\psi_1(X_G))^{n_1} \cdots \det(\psi_s(X_G))^{n_s}.$$
It appears that this is the complete factorization of $D_G$ in $\mathbb{C}(x_g)$, and so we have obtained a solution to Frobenius’ question [9]. All that has been said above is true for any $k$ instead of $\mathbb{C}$. K. Johnson (1988) raised the question (in a combinatorial context) whether the group determinant determines the group $G$. It was proved recently by E. Formanek and D. Sibley [8] that this is indeed true in the nonmodular case, i.e. if $\operatorname{char} k \nmid |G|$. More precisely, they established the following.
THEOREM 3.1. If $G$ and $H$ are finite groups, $\operatorname{char} k \nmid |G|$, and $\varphi \colon G \to H$ is a bijection (of them as sets!) such that $\hat{\varphi}(D_G) = D_H$ for $\hat{\varphi}(x_g) = x_{\varphi(g)}$, then $G \cong H$ as groups.
Next, we are going to give some details about Molien’s formula, another remarkable result in his 1897 paper [32]. Take the polynomial ring $R = \mathbb{C}[x_1, \dots, x_n]$ and, viewing it as a $\mathbb{C}$-space, present it in the form $R = \bigoplus_{i=0}^{\infty} R_i$, where $R_i$ is the subspace of all homogeneous polynomials (forms) of degree $i$, $i = 0, 1, 2, \dots$. The subspace $V = R_1$ of linear forms has $x_1, \dots, x_n$ as its basis, thus is $n$-dimensional. Fix any finite subgroup $G \le GL(V)$ in the group $GL(V)$ of all $\mathbb{C}$-linear automorphisms of $V$. An action of $G$ on $R$ is induced by the formula
$$f^A(x) \overset{\mathrm{def}}{=} f(xA), \qquad x = (x_1, \dots, x_n), \quad A = (a_{ij}).$$
This yields the subalgebra of $G$-invariants in $R$, $R^G = \{f \in R \mid \forall A \in G,\ f^A = f\}$, with the homogeneous components $R_i^G = R^G \cap R_i$. Substantial information about the subalgebra $R^G$ is given by the formal series
$$M_G(t) \overset{\mathrm{def}}{=} \sum_{i \ge 0} (\dim_{\mathbb{C}} R_i^G)\, t^i,$$
called the Hilbert-Poincaré series of $R^G$, sometimes also its Molien series. Indeed, this series is a rational function in $t$, and Molien proved (1897) the following theorem.
THEOREM 3.2. Let $R = \mathbb{C}[x_1, \dots, x_n]$ and $G \le GL_n(\mathbb{C})$ as above, and let $G = \{A_1, \dots, A_g\}$ be all its elements. Then the generating function for the numbers $\dim_{\mathbb{C}} R_i^G$ of linearly independent invariant forms of degree $i$ is given by
$$M_G(t) = \frac{1}{g} \sum_{\alpha=1}^{g} \frac{1}{\det(I - tA_\alpha)}. \tag{86}$$
EXAMPLE 3.1. For
$$G = C_2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}$$
we have $R^G = \mathbb{C}[x_1, x_2]^G = \mathbb{C}[x_1^2, x_2^2] \oplus x_1x_2\, \mathbb{C}[x_1^2, x_2^2]$ and
$$M_G(t) = \frac{1}{2}\left( \frac{1}{(1-t)^2} + \frac{1}{(1+t)^2} \right) = \frac{1+t^2}{(1-t^2)^2}.$$
EXAMPLE 3.2. Let $R = \mathbb{C}[x_1, x_2, x_3]$ and $G = \langle A, B \rangle$ with
$$A = \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix} \quad \text{and} \quad B = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix}, \qquad i^2 = -1.$$
We have that $|G| = 8$ and that $G$ is Abelian,
$$G = \{ \operatorname{diag}(1,1,1),\ \operatorname{diag}(1,1,i),\ \operatorname{diag}(1,1,-1),\ \operatorname{diag}(1,1,-i),\ \operatorname{diag}(-1,-1,1),\ \operatorname{diag}(-1,-1,i),\ \operatorname{diag}(-1,-1,-1),\ \operatorname{diag}(-1,-1,-i) \},$$
and $R^G = \mathbb{C}[x_1^2, x_2^2, x_3^4](1 \oplus x_1x_2)$. According to (86) we get (R. Stanley [??])
$$M_G(t) = \frac{1}{8}\left[ \frac{1}{(1-t)^3} + \frac{1}{(1-t)^2(1-it)} + \frac{1}{(1-t)^2(1+t)} + \frac{1}{(1-t)^2(1+it)} + \frac{1}{(1+t)^2(1-t)} + \frac{1}{(1+t)^2(1-it)} + \frac{1}{(1+t)^3} + \frac{1}{(1+t)^2(1+it)} \right] = \frac{1}{(1-t^2)^3}.$$
REMARK 3.3. Two nonisomorphic groups can have the same Molien series: e.g. the dihedral group $D_4$ and the Abelian group $C_2 \times C_4$ both have the series
$$M_{D_4}(t) = \frac{1}{(1-t^2)(1-t^4)} = M_{C_2 \times C_4}(t).$$
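Molien’s formula (86) can be compared with a direct count of invariants when the group consists of diagonal matrices, because then the invariant monomials span the ring of invariants. The following is a minimal sketch under that assumption: each diagonal matrix is stored as a tuple of exponents $e$ with diagonal entries $i^e$, and the two sample groups are read off from Examples 3.1 and 3.2 as the groups generated by $\operatorname{diag}(-1,-1)$ and by $\operatorname{diag}(-1,-1,-1)$, $\operatorname{diag}(1,1,i)$ respectively. All function names are hypothetical.

```python
from itertools import product

I_POWERS = [1, 1j, -1, -1j]  # i**e for e = 0, 1, 2, 3

def generate_group(generators):
    """Close a set of diagonal matrices, stored as exponent tuples e with
    diagonal entries i**e, under multiplication (exponents add mod 4)."""
    n = len(generators[0])
    group = {tuple([0] * n)}
    frontier = list(group)
    while frontier:
        g = frontier.pop()
        for h in generators:
            gh = tuple((a + b) % 4 for a, b in zip(g, h))
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return sorted(group)

def inverse_series(poly, order):
    """Power series 1/poly up to t**order, assuming poly[0] == 1."""
    inv = [1 + 0j]
    for d in range(1, order + 1):
        inv.append(-sum(poly[k] * inv[d - k]
                        for k in range(1, min(d, len(poly) - 1) + 1)))
    return inv

def molien_coefficients(group, order):
    """Coefficients of (1/|G|) * sum over A in G of 1/det(I - tA)."""
    total = [0j] * (order + 1)
    for g in group:
        det_poly = [1 + 0j]  # det(I - tA) = product of (1 - lambda_j t)
        for e in g:
            lam = I_POWERS[e]
            det_poly = [c - lam * p
                        for c, p in zip(det_poly + [0], [0] + det_poly)]
        for d, c in enumerate(inverse_series(det_poly, order)):
            total[d] += c
    return [round((c / len(group)).real) for c in total]

def invariant_counts(generators, order):
    """Number of invariant monomials x^a in each degree 0..order."""
    n = len(generators[0])
    counts = [0] * (order + 1)
    for a in product(range(order + 1), repeat=n):
        if sum(a) <= order and all(
                sum(ai * ei for ai, ei in zip(a, g)) % 4 == 0
                for g in generators):
            counts[sum(a)] += 1
    return counts

ORDER = 8
c2_generators = [(2, 2)]                 # diag(-1, -1)
ex32_generators = [(2, 2, 2), (0, 0, 1)]  # diag(-1,-1,-1) and diag(1,1,i)
molien_c2 = molien_coefficients(generate_group(c2_generators), ORDER)
molien_ex32 = molien_coefficients(generate_group(ex32_generators), ORDER)
```

Comparing `molien_c2` with `invariant_counts(c2_generators, ORDER)`, and likewise for the second group, checks the first few coefficients of the two series computed in the examples.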
For any polynomial $f(x)$, its mean
$$\tilde{f}(x) = \frac{1}{g} \sum_{\alpha=1}^{g} f(xA_\alpha)$$
is also $G$-invariant. It is clear that, more generally, any symmetric expression in the polynomials $f(xA_1), \dots, f(xA_g)$ is again a $G$-invariant. There exists a finite polynomial basis for $R^G$, i.e. a set of $G$-invariants $f_1, \dots, f_\ell$, $\ell \ge n$, such that any $G$-invariant $f$ can be written as a polynomial in $f_1, \dots, f_\ell$. Then there are, of course, polynomial equations relating $f_1, \dots, f_\ell$, called syzygies12. E.g., $f_1 = x_1^2$, $f_2 = x_1x_2$, $f_3 = x_2^2$ form a polynomial basis for $\mathbb{C}[x_1, x_2]^{C_2}$ with the syzygy $f_1f_3 - f_2^2 = 0$. The existence of a polynomial basis, and a method for finding one, are given by the following.
THEOREM 3.4 (E. Noether [17]). The ring of invariants $R^G = \mathbb{C}[x_1, \dots, x_n]^G$ for $G \le GL_n(\mathbb{C})$ has a normal polynomial (or integrity) basis, with not more than $\binom{n+g}{n}$ invariants in it, and their degrees not exceeding $g$, $g = |G|$. Such a polynomial basis may be obtained by averaging over $G$ all monomials $x_1^{a_1} \cdots x_n^{a_n}$ with $\sum_i a_i \le g$, i.e. all monomials of degree at most $g$.
Among polynomial bases the most important are the so-called “good polynomial bases”. It is not hard to prove that there always exist $n$ algebraically independent $G$-invariants. A good polynomial basis for $R^G$ consists of homogeneous $G$-invariants $f_1, \dots, f_\ell$ ($\ell \ge n$) where:
(1) $f_1, \dots, f_n$ are algebraically independent, and, furthermore,
(2) we have
$$R^G = \begin{cases} \mathbb{C}[f_1, \dots, f_n], & \text{if } \ell = n; \\ \mathbb{C}[f_1, \dots, f_n] \oplus f_{n+1}\mathbb{C}[f_1, \dots, f_n] \oplus \dots \oplus f_\ell\, \mathbb{C}[f_1, \dots, f_n], & \text{if } \ell > n. \end{cases}$$
In other words, any $G$-invariant can be written as a polynomial in ([GAP!] $\ell > n$) as such a polynomial (in [GAP!]). This means that $f_1, \dots, f_n$ are “free invariants” in the sense that they can be used as often as needed, while $f_{n+1}, \dots, f_\ell$ are “transient” and can be used at most once. It is interesting to point to the following theorem proved by M. Hochster and J. Eagon [13] (1971), and independently by E. Dade [4] (1964).
THEOREM 3.5. Any finite group $G$ has a good polynomial basis of invariants.
For this good polynomial basis the syzygies are given by a simple rule:
• if $\ell = n$, then there are no syzygies;
• if $\ell > n$, then there are $(\ell - n)^2$ syzygies, which express the products $f_if_j$ ($i > n$, $j > n$) in terms of $f_1, \dots, f_\ell$.
12 Editor’s note. The word “syzygy” was, in this mathematical context, apparently first used by David Hilbert. Etymology: from Latin syzygia, Greek συζυγία, yoked together, in turn from σύν, together, and ζυγόν, yoke, the last word appearing as a loan in many languages, not only Indo-European ones, such as English, German, Estonian, Finnish, Russian.
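Let the degrees of a good polynomial basis be known for $R^G$: $n_1 \overset{\mathrm{def}}{=} \deg f_1, \dots, n_\ell \overset{\mathrm{def}}{=} \deg f_\ell$. Then the Molien series of $R^G$ is given by
$$M_G(t) = \begin{cases} \dfrac{1}{\prod_{i=1}^{n} (1 - t^{n_i})}, & \text{if } \ell = n; \\[2ex] \dfrac{1 + \sum_{j=n+1}^{\ell} t^{n_j}}{\prod_{i=1}^{n} (1 - t^{n_i})}, & \text{if } \ell > n. \end{cases} \tag{87}$$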
(These formulae can be verified by expanding the right hand sides in powers of $t$ and then comparing with $R^G = \mathbb{C}[f_1, \dots, f_n] \oplus f_{n+1}\mathbb{C}[f_1, \dots, f_n] \oplus \dots \oplus f_\ell\, \mathbb{C}[f_1, \dots, f_n]$, if $\ell > n$.)
EXAMPLE 3.3. Let
$$G = C_2 = \left\{ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix} \right\}$$
be our group, i.e. we take the cyclic group of order 2. Its homogeneous invariants are $f_1 = x_1^2$, $f_2 = x_2^2$ and $f_3 = x_1x_2$. One sees that this is a good polynomial basis with $n_1 = n_2 = n_3 = 2$. So we have $R^G = R^{C_2} = \mathbb{C}[x_1^2, x_2^2] \oplus x_1x_2\, \mathbb{C}[x_1^2, x_2^2]$. This means that any $C_2$-invariant can be written uniquely as a polynomial in $x_1^2$ and $x_2^2$ plus (perhaps!) $x_1x_2$ times another such polynomial. Here $\ell = 3 > 2 = n$, so by (87)
$$M_{C_2}(t) = \frac{1 + t^2}{(1 - t^2)(1 - t^2)} = \frac{1 + t^2}{(1 - t^2)^2}.$$
There is the single syzygy $x_1^2 x_2^2 = (x_1x_2)^2$.
REMARK 3.6. At the same time, the polynomials in the above example, taken in a different order, $x_1^2, x_1x_2, x_2^2$, do not give a good polynomial basis! It suffices to notice that $R^G \ni x_2^4 \notin \mathbb{C}[x_1^2, x_1x_2] \oplus x_2^2\, \mathbb{C}[x_1^2, x_1x_2]$.
REMARK 3.7. As a consequence of the above Hochster-Eagon-Dade theorem, for any finite $G$, its Molien series can be put in the form (87), as there exists a good polynomial basis whose degrees match the powers of $t$ in (87).
REMARK 3.8. However, the converse to (2) is, in general, not true. Indeed, if we take the group
$$G = \left\langle \begin{pmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}, \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & i \end{pmatrix} \right\rangle,$$
then it has the Molien series
$$M_G(t) = \frac{1}{(1 - t^2)^3}, \tag{88}$$
which, by multiplication of both denominator and numerator by $1 + t^2$, can also be written as
$$M_G(t) = \frac{1 + t^2}{(1 - t^2)^2 (1 - t^4)}. \tag{89}$$
As seen above, there exists a good basis corresponding to $M_G(t)$ in the form (89), which gives us
$$\mathbb{C}[x_1, x_2, x_3]^G = \mathbb{C}[x_1^2, x_2^2, x_3^4] \oplus x_1x_2\, \mathbb{C}[x_1^2, x_2^2, x_3^4],$$
but none corresponding to the form (88): such a basis would need three algebraically independent invariants of degree 2, whereas the invariants of degree 2 are spanned by $x_1^2$, $x_1x_2$, $x_2^2$ and involve only two variables. It is a question of N. J. Sloane (1977): to which forms of $M_G(t)$ do there correspond good polynomial bases, and to which not? There are old results by Shephard-Todd (1954), but, in general, it seems to be open.
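The two ways (88) and (89) of writing the same Molien series, and the shortage of invariants in degree 2, can be looked at concretely. The sketch below expands both rational forms as power series and lists the invariant monomials of degree 2, assuming the group of Remark 3.8 acts diagonally so that a monomial $x_1^{a_1}x_2^{a_2}x_3^{a_3}$ is invariant exactly when $a_1 + a_2 + a_3$ is even and $a_3$ is divisible by 4 (this reading of the garbled generators is an assumption). The helper names are hypothetical.

```python
from itertools import product

def series_from_factors(numerator, denominator_degrees, order):
    """Expand numerator(t) / prod(1 - t**d) as integer coefficients
    up to t**order."""
    coeffs = (list(numerator) + [0] * (order + 1))[:order + 1]
    for d in denominator_degrees:
        for k in range(d, order + 1):  # multiply by 1/(1 - t**d)
            coeffs[k] += coeffs[k - d]
    return coeffs

def invariant_monomials(degree):
    """Exponent triples (a1, a2, a3) of invariant monomials of a given degree
    for the diagonal group generated by diag(-1,-1,-1) and diag(1,1,i)."""
    return [a for a in product(range(degree + 1), repeat=3)
            if sum(a) == degree and sum(a) % 2 == 0 and a[2] % 4 == 0]

ORDER = 12
form_88 = series_from_factors([1], [2, 2, 2], ORDER)        # 1/(1-t^2)^3
form_89 = series_from_factors([1, 0, 1], [2, 2, 4], ORDER)  # (1+t^2)/((1-t^2)^2(1-t^4))
degree_two_invariants = invariant_monomials(2)
```

Both expansions agree, while every exponent triple in `degree_two_invariants` has $a_3 = 0$, so any three invariants of degree 2 live in $\mathbb{C}[x_1, x_2]$ and cannot be algebraically independent.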
3.3. Molien type formulae in Combinatorics.13
During the past decades this old theme has been combined with new ones, so in order to gain greater coherence in understanding combinatorial and algebraic problems, I shall give briefly three such results.
3.3.1. Let $V = V_1 \oplus \dots \oplus V_n$, with all $\dim V_i = 1$ and $x_i$ as the basis vector of $V_i$. Let $\mathcal{G} \le GL(V)$ be a finite subgroup such that for every $G \in \mathcal{G}$ there exists a $\pi_G \in S_n$ with $V_iG = V_{\pi_G(i)}$, $i = 1, 2, \dots, n$. In this case $\mathcal{G}$ is called a monomial group; it consists of monomial matrices, i.e. of matrices such that every line (row or column) contains exactly one non-zero element of $\mathbb{C}$. For any cycle $C = (i_1, \dots, i_t)$ of $\pi_G$ we have $\{i_1, \dots, i_t\} \subseteq \mathbf{n}$, $\pi_G(i_k) = i_{k+1}$ (if $1 \le k \le t-1$) and $\pi_G(i_t) = i_1$. Monomiality of $\mathcal{G}$ means that there exist $\alpha_1, \dots, \alpha_t \in \mathbb{C}$ with $x_{i_k}^G = \alpha_k x_{i_{k+1}}$ (if $1 \le k \le t-1$) and $x_{i_t}^G = \alpha_t x_{i_1}$. Put $\gamma_G(C) = \alpha_1 \cdots \alpha_t$. For any monomial $x_1^{a_1} \cdots x_n^{a_n}$ let $\tau$ be its type, the sequence $\tau = (\tau_1, \tau_2, \dots)$ with $\tau_i \overset{\mathrm{def}}{=} \#\{a_k \mid a_k = i\}$. Next, take the subspace $R_\tau$ spanned by all monomials of type $\tau$. Then for any $G \in \mathcal{G}$ it is clear that $R_\tau G = R_\tau$. Therefore, setting $R_\tau^{\mathcal{G}} = R^{\mathcal{G}} \cap R_\tau$, we see that $R^{\mathcal{G}} = \bigoplus_\tau R_\tau^{\mathcal{G}}$ is a grading of the $\mathbb{C}$-space $R^{\mathcal{G}}$, and that the following Molien-type formula is true:
$$P_{\mathcal{G}}(t) = \sum_\tau (\dim R_\tau^{\mathcal{G}})\, t^\tau = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_C \left( 1 + \gamma_G(C)\, t_1^{|C|} + \gamma_G(C)^2\, t_2^{|C|} + \dots \right),$$
with $C$ here covering all cycles of the substitution $\pi_G$, and $t = (t_1, t_2, \dots)$ being indeterminates; $|C|$ denotes the length of the cycle $C$, and [GAP!]. See Stanley [???].
3.3.2. In the special case of the monomial matrices being permutation matrices, every element $G \in \mathcal{G}$ induces a substitution on the set $F$ of functions $f \colon \mathbf{n} \to \mathbb{N}$ by $f^G(i) = f(\pi_G(i))$. To every such function there corresponds the monomial $x^f = x_1^{f(1)} x_2^{f(2)} \cdots x_n^{f(n)}$, and so $\mathcal{G}$ acts on the $\mathbb{C}$-algebra $R = \mathbb{C}[x_1, \dots, x_n]$ by the formula $(x^f)^G = x^{f^G}$, where $\forall\, i \in \mathbf{n}$, $f^G(i) = f(\pi_G(i))$. The orbits under this action are called $\mathcal{G}$-schemes; they are given by the following equivalence on $F$: $f \sim g \iff \exists\, G \in \mathcal{G},\ g = f^G$. This implies that the multisets $\{f(1), \dots, f(n)\}$ and $\{g(1), \dots, g(n)\}$ coincide; in particular, the monomials $x^f$ and $x^g$ have the same type. The main problem of the Redfield-Pólya theory can be described as the question of finding the number $d(\tau)$ of $\mathcal{G}$-schemes of a given type $\tau$. The answer is given by the following formula:
$$\sum_\tau d(\tau)\, t^\tau = P_{\mathcal{G}}(t) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_C \left( 1 + t_1^{|C|} + t_2^{|C|} + \dots \right),$$
where $C$ runs over all cycles of $\pi_G$.
13 See Tambour [23] for other finite lattices and their automorphism groups (homogeneous – [GAP!] degrees).
3.3.3. Torbjörn Tambour also recently published a paper on this topic [23] (1989). His problem is the following. Let $\mathcal{G}$ be a finite group acting on a finite set $S$. It induces an action on $P_k(S)$, the set of $k$-subsets of $S$: $\{s_1, \dots, s_k\}^G = \{\pi_G(s_1), \dots, \pi_G(s_k)\}$. Denote by $p_k$ the number of $\mathcal{G}$-orbits of this action. Tambour aims at finding $\sum_k p_k t^k$. [GAP!] generalize [GAP!] to other lattices.
PROBLEM. Finite vector spaces or equivalences of some others.
A function-theoretic interpretation of this question is possible. First, we interpret any $k$-subset of $\mathbf{n}$ as the image $\operatorname{Im} f$ of a suitable injection $f \colon \mathbf{k} \to \mathbf{n}$. The action of $\mathcal{G}$ on $\mathbf{n}$ induces an action of $\mathcal{G}$ on $P_k(\mathbf{n})$: $f \mapsto f^G$ with the rule $\forall\, i \in \mathbf{k}$, $f^G(i) = \pi_G(f(i))$. It follows that $\operatorname{Im} f$ is fixed by $G$ if and only if $\operatorname{Im} f = (\operatorname{Im} f)\pi_G$, from which it again follows that $\operatorname{Im} f$ is a (disjoint) union of cycles of $\pi_G$. The converse, “if $\operatorname{Im} f$ is a union of cycles of $\pi_G$, then it is fixed by $G$”, is obvious. So it follows that
$$1 + \sum_{k \ge 1} i_k^G\, t^k = \prod_{C < \pi_G} (1 + t^{|C|}); \tag{90}$$
here $i_k^G$ denotes the number of $G$-fixed points of $(P_k(\mathbf{n}), \mathcal{G})$; all possibilities of putting together the various cycles of the substitution must be taken account of. Forming the sum $\frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}}$ on both sides of (90) and using the Cauchy-Frobenius lemma on the left hand side, we get
$$\frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \prod_{C < \pi_G} (1 + t^{|C|}) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \Big( 1 + \sum_{k \ge 1} i_k^G\, t^k \Big) = 1 + \sum_{k \ge 1} \Big( \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} i_k^G \Big) t^k = 1 + \sum_{k \ge 1} p_k t^k.$$
(In the last step we used the Cauchy-Frobenius Lemma.) To see the similarity of this result with Molien’s formula, Tambour interprets this formula in the following way. Take $V$ to be the $\mathbb{C}$-space with basis $S = \mathbf{n}$, and let $V_C$ be its subspace with the cyclic basis $C < \pi_G$. With this choice, $\pi_G|_{V_C}$ has the matrix
$$[G]_C = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix},$$
and so we get $\operatorname{per}(I_{|C|} + t[G]_C) = 1 + t^{|C|}$. Now, by the permanent analogue of Laplace’s formula,
$$\prod_{C < \pi_G} (1 + t^{|C|}) = \prod_{C < \pi_G} \operatorname{per}(I_{|C|} + t[G]_C) = \operatorname{per}(I_n + t[G]_n),$$
[GAP]14
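Tambour’s counting formula, obtained by averaging (90) over the group, can be tested against a brute-force enumeration of orbits. The sketch below assumes a small sample group, the six rotations of a hexagon acting on its vertices (chosen only for illustration); it compares the averaged cycle products with the actual number of orbits on $k$-subsets, and evaluates one permanent $\operatorname{per}(I_n + t[G]_n)$ at an integer value of $t$. All names are hypothetical.

```python
from itertools import combinations, permutations

def cycle_lengths(perm):
    """Cycle lengths of a permutation given as a list of images."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start not in seen:
            length, j = 0, start
            while j not in seen:
                seen.add(j)
                j = perm[j]
                length += 1
            lengths.append(length)
    return lengths

def cycle_product(perm):
    """Coefficients of the product over cycles C of (1 + t**|C|)."""
    poly = [1] + [0] * len(perm)
    for length in cycle_lengths(perm):
        poly = [poly[k] + (poly[k - length] if k >= length else 0)
                for k in range(len(poly))]
    return poly

def orbit_counts_by_formula(group):
    """Coefficients p_k obtained by averaging the cycle products."""
    total = [0] * (len(group[0]) + 1)
    for perm in group:
        for k, c in enumerate(cycle_product(perm)):
            total[k] += c
    return [c // len(group) for c in total]

def orbit_counts_brute_force(group):
    """Number of orbits on k-subsets, found by listing the orbits."""
    n = len(group[0])
    counts = []
    for k in range(n + 1):
        remaining = set(combinations(range(n), k))
        orbits = 0
        while remaining:
            subset = next(iter(remaining))
            for perm in group:
                remaining.discard(tuple(sorted(perm[s] for s in subset)))
            orbits += 1
        counts.append(orbits)
    return counts

def permanent(matrix):
    """Permanent straight from the definition (fine for small sizes)."""
    n = len(matrix)
    total = 0
    for sigma in permutations(range(n)):
        term = 1
        for i in range(n):
            term *= matrix[i][sigma[i]]
        total += term
    return total

rotations = [[(i + r) % 6 for i in range(6)] for r in range(6)]
by_formula = orbit_counts_by_formula(rotations)
by_counting = orbit_counts_brute_force(rotations)

# per(I + t[G]) for the rotation by two steps, evaluated at t = 2.
t, rotation = 2, rotations[2]
matrix = [[(1 if i == j else 0) + (t if rotation[i] == j else 0)
           for j in range(6)] for i in range(6)]
per_value = permanent(matrix)
product_value = sum(c * t ** k for k, c in enumerate(cycle_product(rotation)))
```

The brute-force permanent has factorial cost, so this check is only practical for very small $n$.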
3.4. Noncommutative versions of Molien type formulae
3.4.1. Representations of $S_n$
Let $k$ be a field of characteristic 0. Then we have $k[S_n] = M_{n_1}(k) \times \dots \times M_{n_\kappa}(k)$, a direct product of full matrix rings. Here $\kappa$ is the number of distinct irreducible representations of $S_n$, and $n_1, \dots, n_\kappa$ are the dimensions of the corresponding simple modules. The number of simple factors equals the number of partitions of $n$:
$$\lambda = (\lambda_1, \dots, \lambda_\kappa), \qquad \lambda_1 \ge \dots \ge \lambda_\kappa > 0, \qquad \lambda_1 + \dots + \lambda_\kappa = n.$$
Here $n = |\lambda|$ is the weight (or size) of $\lambda$, and $\kappa$ is the length of $\lambda$. To each partition $\lambda$ there corresponds a Young diagram $D(\lambda)$ with $\kappa$ rows and $\lambda_i$ boxes in the $i$-th row, e.g. for $\lambda = (3, 2, 2, 1)$:
□ □ □
□ □
□ □
□
There exists an algorithm which associates representations with partitions of $n$. Let us also add that $k[S_n]$, $n \ge 2$, has exactly two 1-dimensional representations:
• the trivial representation: $\lambda = (n)$, $D(\lambda)$ = a single row of $n$ boxes;
• the sign representation: $\lambda = (1, \dots, 1)$, $D(\lambda)$ = a single column of $n$ boxes.
14 Editor’s note. Section 3 stops abruptly on p. 19 of the manuscript, the formula $\operatorname{per}(I_n + t[G]_n)$ being barely visible. The text continues then only on p. 33 with Section 4. Thus as much as some 10 pages, regretfully, may be missing in the present book.
The set of diagrams is partially ordered by $D_1 \le D_2$ if $D_2$ can be obtained from $D_1$ by adding boxes, see e.g. Figure 2.
[Figure 2: Young diagrams ordered by inclusion.]
Fig. 2: The lattice of Young diagrams
For any diagram $D = D(\lambda)$ let $M(D)$ be the corresponding simple $S_n$-module. Then we have the following two fundamental facts:
(1) Any two-sided ideal in $k[S_n]$ is a direct product of some of the matrix algebras in the direct decomposition of $k[S_n]$, and each such matrix algebra corresponds to a diagram.
(2) Let $u < n$; if $M$ is an $S_n$-module, then its restriction $M|_{S_u}$ is an $S_u$-module. Let $u > n$; if $M$ is an $S_n$-module, then its induction is given by $M{\uparrow}^{S_u} \cong k[S_u] \otimes_{k[S_n]} M$.
The following is true:
THEOREM 3.9 (Branching Theorem). Suppose that $M(D)$ is an irreducible $S_n$-module. Let $A_1, \dots, A_s$ be all diagrams of weight $n - 1$ which precede $D$, and let $B_1, \dots, B_t$ be all diagrams of size $n + 1$ which follow $D$. Then
$$M(D)|_{S_{n-1}} \cong M(A_1) \oplus \dots \oplus M(A_s),$$
and
$$M(D){\uparrow}^{S_{n+1}} \cong M(B_1) \oplus \dots \oplus M(B_t).$$
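The Branching Theorem has a simple numerical shadow: taking dimensions, $\dim M(D)$ must equal the sum of the dimensions of the $M(A_j)$, and $(n+1)\dim M(D)$ the sum of the dimensions of the $M(B_j)$. The sketch below checks this for all diagrams with at most 7 boxes. It computes dimensions with the hook length formula, a standard fact that is brought in here as an assumption (the text itself only mentions that an algorithm exists); the function names are hypothetical.

```python
from math import factorial

def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def dimension(shape):
    """Hook length formula: n! divided by the product of all hook lengths."""
    hooks = 1
    for i, row in enumerate(shape):
        for j in range(row):
            arm = row - j - 1
            leg = sum(1 for r in shape[i + 1:] if r > j)
            hooks *= arm + leg + 1
    return factorial(sum(shape)) // hooks

def remove_one_box(shape):
    """Diagrams of weight n - 1 obtained by deleting one corner box."""
    result = []
    for i, row in enumerate(shape):
        if i + 1 == len(shape) or shape[i + 1] < row:
            smaller = shape[:i] + (row - 1,) + shape[i + 1:]
            result.append(tuple(r for r in smaller if r > 0))
    return result

def add_one_box(shape):
    """Diagrams of weight n + 1 obtained by adding one box."""
    result = []
    for i, row in enumerate(shape):
        if i == 0 or shape[i - 1] > row:
            result.append(shape[:i] + (row + 1,) + shape[i + 1:])
    result.append(shape + (1,))  # new row at the bottom
    return result

branching_holds = all(
    dimension(shape) == sum(dimension(a) for a in remove_one_box(shape))
    and (n + 1) * dimension(shape) == sum(dimension(b) for b in add_one_box(shape))
    for n in range(1, 8) for shape in partitions(n))

dim_3221 = dimension((3, 2, 2, 1))
```

The factor $n + 1$ in the second check is the index of $S_n$ in $S_{n+1}$, which is the dimension multiplier for an induced module.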
3.4.2. Representations of $GL(n, k)$
Let $V$ be a vector space over $k$ of dimension $n$; it is a standard $GL(n, k)$-module. The group $GL(n, k)$ acts diagonally also on $V^{\otimes \kappa}$: $(v_1 \otimes \dots \otimes v_\kappa)G = v_1G \otimes \dots \otimes v_\kappa G$. The symmetric group $S_\kappa$ acts on $V^{\otimes \kappa}$ by permuting positions: $(v_1 \otimes \dots \otimes v_\kappa)\pi = v_{\pi^{-1}(1)} \otimes \dots \otimes v_{\pi^{-1}(\kappa)}$. It is possible to describe the structure of $V^{\otimes \kappa}$ as a $GL(n, k)$-module. Let $M(D_1), \dots, M(D_t)$ be the full set of irreducible $S_\kappa$-modules corresponding to Young diagrams with $\kappa$ boxes, and set $m_i \overset{\mathrm{def}}{=} \dim M(D_i)$, $n_i \overset{\mathrm{def}}{=}$ the multiplicity of $M(D_i)$ in $V^{\otimes \kappa}$. Then
$$V^{\otimes \kappa} = U_1 \oplus \dots \oplus U_t \cong n_1 M(D_1) \oplus \dots \oplus n_t M(D_t),$$
where $U_i$ is the sum of all irreducible $S_\kappa$-submodules of $V^{\otimes \kappa}$ isomorphic to $M(D_i)$. It is manifest that each $U_i$ is $GL(n, k)$-invariant and that
$$V^{\otimes \kappa} = U_1 \oplus \dots \oplus U_t \cong m_1 N(D_1) \oplus \dots \oplus m_t N(D_t),$$
with $N(D_1), \dots, N(D_t)$ being non-isomorphic irreducible $GL(n, k)$-modules (or zero!), while each $U_i$ is the sum of all irreducible submodules in $V^{\otimes \kappa}$ isomorphic to $N(D_i)$, and $n_i = \dim_k N(D_i)$. Let us add that the numbers $n_i$ and $m_i$ here can be computed directly from the Young diagrams by some ingenious algorithms. It appears that $N(D_i) \ne 0$ (i.e. $n_i \ne 0$) if and only if the Young diagram $D_i$ has $\le n = \dim V$ rows. One has the following fundamental theorem.
THEOREM 3.10. The irreducible $GL(n, k)$-submodules of $V^{\otimes \kappa}$ are in 1-1 correspondence with the Young diagrams with $\kappa$ boxes and $\le n = \dim_k V$ rows. As a $GL(n, k)$-module, one has
$$V^{\otimes \kappa} \cong m_1 N(D_1) \oplus \dots \oplus m_s N(D_s),$$
where $D_1, \dots, D_s$ are all the Young diagrams with $\kappa$ boxes and $\le n$ rows; the $N(D_1), \dots, N(D_s)$ are non-isomorphic irreducible $GL(n, k)$-modules, and the multiplicity $m_i$ is the dimension of the irreducible $S_\kappa$-module $M(D_i)$.
For distinct $\kappa$, the irreducible $GL(n, k)$-modules which occur are non-isomorphic; this follows from the fact that $V^{\otimes \kappa}$ is a vector space of dimension $n^\kappa$, so that the action of $GL(n, k)$ on $V^{\otimes \kappa}$ gives rise to a homomorphism $GL(n, k) \to GL(n^\kappa, k)$, $(a_{ij}) \mapsto (f_{pq}(a_{ij}))$, the $f_{pq}$ being homogeneous polynomials of degree $\kappa$, and all finite dimensional $GL(n, k)$-modules arise from this construction. To describe briefly the representation theory of $GL(n, k)$, we need one more notion – the Grothendieck ring $S = S(GL(n, k))$ of $GL(n, k)$-modules: let $[M]$ be the element of the ring of equivalence classes of $GL(n, k)$-modules represented by the module $M$, and take addition and multiplication of these elements to be
$$[M] + [N] = [M \oplus N]; \qquad [M] \cdot [N] = [M \otimes_k N],$$
with $GL(n, k)$ acting here diagonally on $M \otimes_k N$.
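Theorem 3.10 implies a dimension count: summing $m_i \cdot \dim_k N(D_i)$ over all diagrams with $\kappa$ boxes must give $\dim V^{\otimes \kappa} = n^\kappa$. The sketch below checks this for small $n$ and $\kappa$. It relies on two standard formulas that are assumptions brought in from outside the text: the hook length formula for $m_i$ and the hook content formula $\prod (n + j - i)/h(i,j)$ for $\dim N(D_i)$, which vanishes automatically when the diagram has more than $n$ rows. The names are hypothetical.

```python
from math import factorial, prod

def partitions(n, largest=None):
    """Yield all partitions of n as weakly decreasing tuples."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def hooks_and_contents(shape):
    """Yield (hook length, content j - i) for every box (i, j)."""
    for i, row in enumerate(shape):
        for j in range(row):
            leg = sum(1 for r in shape[i + 1:] if r > j)
            yield (row - j - 1) + leg + 1, j - i

def sn_dimension(shape):
    """Dimension of the simple module of the symmetric group (hook lengths)."""
    return factorial(sum(shape)) // prod(h for h, _ in hooks_and_contents(shape))

def gl_dimension(shape, n):
    """Dimension of the GL(n)-module of this shape (hook content formula)."""
    numerator = prod(n + c for _, c in hooks_and_contents(shape))
    return numerator // prod(h for h, _ in hooks_and_contents(shape))

def tensor_power_dimension(n, kappa):
    """Sum of m_i * dim N(D_i) over all diagrams with kappa boxes."""
    return sum(sn_dimension(s) * gl_dimension(s, n) for s in partitions(kappa))

count_matches = all(tensor_power_dimension(n, kappa) == n ** kappa
                    for n in range(1, 5) for kappa in range(0, 7))
```

For instance, with $n = 3$ and $\kappa = 4$ the diagram $(1,1,1,1)$ has four rows and contributes nothing, in line with the row condition in the theorem.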
3.4.3. Brief summary of the representation theory of $GL(n, k)$
(1) Every finite dimensional $GL(n, k)$-module is a direct sum of irreducible $GL(n, k)$-modules in a unique way, as described above;
(2) There exists an isomorphism (called the character map) $\chi \colon S(GL(n, k)) \to \mathbb{Z}[x_1, \dots, x_n]^{S_n}$ between the Grothendieck ring of finite dimensional $GL(n, k)$-modules and the ring of symmetric functions in $n$ commuting variables $x_1, \dots, x_n$;
(3) If $M$ is a finite dimensional $GL(n, k)$-module, $\chi[M]$ its character, and $G \in GL(n, k)$ has eigenvalues $\alpha_1, \dots, \alpha_n$, then the trace of $G$, as a linear operator on $M$, is $\chi[M](\alpha_1, \dots, \alpha_n)$;
(4) There is a 1-1 correspondence between irreducible $GL(n, k)$-modules and partitions $\lambda = (\lambda_1, \dots, \lambda_n)$ of length $\le n$. If $M_\lambda$ is the irreducible module corresponding to $\lambda$, its character $\chi[M_\lambda]$ is denoted $S_\lambda$ and called the Schur function associated with $\lambda$;
(5) $[M : k] = \chi[M](1, \dots, 1)$, i.e. the dimension of the module $M$ is equal to the value of its character $\chi[M]$ at $(1, \dots, 1)$.
AGREEMENT. If $m < n$ and $\lambda = (\lambda_1, \dots, \lambda_m)$ is a partition of length $m$, then there exist two distinct irreducible modules $M_\lambda(m)$ and $M_\lambda(n)$ for $GL(m, k)$ and $GL(n, k)$ respectively, with distinct Schur functions $S_\lambda(m)(x_1, \dots, x_m)$ and $S_\lambda(n)(x_1, \dots, x_n)$, and they are related by
$$S_\lambda(m)(x_1, \dots, x_m) = S_\lambda(n)(x_1, \dots, x_m, 0, \dots, 0).$$
3.4.4. Relatively free algebras and their character series
Let $k$ be a field of characteristic zero, and $V$ a vector space over $k$ with basis $x_1, \dots, x_n$. Let
$$R = k[x_1, \dots, x_n] = k[V] = k \oplus V \oplus S^2(V) \oplus S^3(V) \oplus \dots = \bigoplus_i R_i,$$
where $S^i(V)$ denotes the $i$-th symmetric power of $V$, i.e. the subspace in $R$ spanned by the monomials in $x_1, \dots, x_n$ of degree $i$. So this is the case of the free commutative algebra $R = k[V]$, or, in other words, of the polynomial ring in $x_1, \dots, x_n$. There exists also a non-commutative analogue of this algebra, namely the free associative algebra of rank $n$,
$$R = k\langle x_1, \dots, x_n \rangle = k\langle V \rangle = k \oplus V \oplus (V \otimes V) \oplus V^{\otimes 3} \oplus \dots = \bigoplus_i R_i.$$
It is obvious that $k[V] = k\langle V \rangle / C$, where $C$ is the commutator ideal in $k\langle V \rangle$. As $C$ is homogeneous (i.e. $C = \bigoplus_i C_i$ with $C_i = C \cap V^{\otimes i}$), the grading in $k[V]$ is induced by the one in $k\langle V \rangle$. In both cases $GL(n, k) = GL(V)$ leaves invariant the homogeneous components $R_i$ in the induced action of $GL(V)$ on $R$, so giving the group of homogeneous automorphisms of $R$. And for any (finite) subgroup $\mathcal{G} \le GL(V)$ we can study the fixed ring $R^{\mathcal{G}}$, the subalgebra of $\mathcal{G}$-invariants. There are three important classical results:
(1) Molien (1897): for $\operatorname{char} k = 0$ one has the Poincaré series
$$H(k[V]^{\mathcal{G}}) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \frac{1}{\det(I - Gt)};$$
(2) E. Noether (1916): $k[V]^{\mathcal{G}}$ is finitely generated as a $k$-algebra;
(3) Shephard-Todd (1954), Chevalley (1955): for $\operatorname{char} k = 0$ the fixed algebra $k[V]^{\mathcal{G}}$ is a free commutative algebra (i.e. it is itself a polynomial algebra) if and only if $\mathcal{G}$ is generated by pseudo-reflections. Here an element $G \in \mathcal{G}$ is called a pseudo-reflection if it has the eigenvalue 1 with multiplicity $n - 1 = \dim V - 1$.
Next, we want to describe some (quite recent) noncommutative extensions of these (classical) theorems. For this we need some more notions. For any graded $k$-algebra $R = k \oplus R_1 \oplus R_2 \oplus \dots$ with all its homogeneous components finite dimensional over $k$, the Hilbert (or Poincaré) series of $R$ is the formal series
$$H(R) = 1 + \sum_{i \ge 1} (\dim_k R_i)\, t^i.$$
Let $R = k\langle X \rangle = k\langle x_1, \dots, x_n, \dots \rangle$ be a free associative algebra of countably infinite rank, while $k\langle V \rangle$ remains a free associative algebra of finite rank. We call an ideal $T$ in $k\langle X \rangle$ (or in $k\langle V \rangle$) a T-ideal if $T$ is closed under all $k$-algebra endomorphisms. Then $k\langle X \rangle / T$ is called a relatively free algebra, and $k\langle V \rangle / T$ a relatively free algebra of rank $n$. In this last case we have
$$R = k\langle V \rangle / T = k\langle x_1, \dots, x_n \rangle / T = \bigoplus_{i \ge 0} R_i \quad \text{with } R_i = V^{\otimes i} / (T \cap V^{\otimes i}).$$
The character series of $R$ is defined by
$$\chi(R) = 1 + \sum_{i \ge 1} \chi[R_i]\, t^i,$$
where $\chi$ is the character map in the “Main Representation Theorem”. So $\chi(R)$ is a formal series in $t$ with coefficients in $\mathbb{Z}[x_1, \dots, x_n]^{S_n}$, i.e. $\chi(R) \in \mathbb{Z}[x_1, \dots, x_n]^{S_n}[[t]]$; and it can be written also in terms of Schur functions:
$$\chi(R) = \sum_\lambda a(\lambda)\, S_\lambda\, t^{|\lambda|},$$
where $a(\lambda) \in \mathbb{Z}_{\ge 0}$ and $S_\lambda$ is a homogeneous polynomial in $x_1, \dots, x_n$ of degree $|\lambda|$. Now, a brief summary of the main properties of T-ideals.
3.4.5. Additional observations on T-ideals
REMARK 3.11. If $R$ is commutative, it satisfies $[x_1, x_2] = x_1x_2 - x_2x_1$; here $T = C$, the commutator ideal of $k\langle X \rangle$, and $C$ is generated as a T-ideal by $[x_1, x_2]$.
REMARK 3.12. If $R$ is a finite dimensional $k$-algebra of dimension $n$, then $R$ satisfies the standard identity of degree $n + 1$,
$$S_{n+1}(x_1, \dots, x_{n+1}) = \sum_{\sigma \in S_{n+1}} \operatorname{sign}(\sigma)\, x_{\sigma(1)} \cdots x_{\sigma(n+1)}.$$
REMARK 3.13. The ring of $n \times n$ matrices over $k$, $M_n(k)$, satisfies $S_{2n}(x_1, \dots, x_{2n})$ (the Amitsur-Levitzki Theorem).
REMARK 3.14. The ring of upper triangular $n \times n$ matrices over $k$ satisfies $(x_1x_2 - x_2x_1)^n$, and its T-ideal of identities is $C^n$, where $C$ is the commutator ideal of $k\langle X \rangle$.
REMARK 3.15. Let $E = k\langle v_1, v_2, \dots \rangle / J$ be the exterior (or Grassmann) algebra over an infinite dimensional vector space with basis $v_1, v_2, \dots$, where $J$ is the ideal generated by $v_i^2$ and $v_iv_j + v_jv_i$. Then $E$ satisfies $[x_1, x_2, x_3] = (x_1x_2 - x_2x_1)x_3 - x_3(x_1x_2 - x_2x_1)$, and this polynomial generates the T-ideal of identities of $E$.
REMARK 3.16. $E \otimes E$ satisfies $[[x_1, x_2]^2, x_3]$ and $[x_1, x_2, [x_3, x_4], x_5]$, and these two polynomials generate the T-ideal of identities of $E \otimes E$ (Popov).
REMARK 3.17. Let $T \triangleleft k\langle X \rangle$ be a T-ideal, $T(n) = k\langle x_1, \dots, x_n \rangle \cap T$. Then, if $m < n$, we have
$$\chi(k\langle x_1, \dots, x_n \rangle / T(n)) = \sum_\lambda a(\lambda)\, S_\lambda(x_1, \dots, x_n)\, t^{|\lambda|} = \phi(x_1, \dots, x_n; t)$$
and
$$\chi(k\langle x_1, \dots, x_m \rangle / T(m)) = \sum_\lambda a(\lambda)\, S_\lambda(x_1, \dots, x_m)\, t^{|\lambda|} = \phi(x_1, \dots, x_m, 0, \dots, 0; t).$$
Note that the coefficient $a(\lambda)$ is the same in both equations. Here it is also important that this coefficient $a(\lambda)$ is independent of the number of variables involved. Thus, $\chi(k\langle X \rangle / T)$ may be regarded as well-defined and equal to $\sum_\lambda a(\lambda)\, S_\lambda\, t^{|\lambda|}$. This step can be formalized if we introduce the “ring of symmetric functions of infinitely many variables”, and then we write the character series as $\sum_\lambda a(\lambda)\, S_\lambda\, t^{|\lambda|}$, a formal series with the $S_\lambda$ as symmetric functions in variables $x_1, \dots, x_n, \dots$, the number of which we may ignore. Let us also notice that if $k\langle V \rangle / T$ has the character series $\phi(x_1, \dots, x_n; t)$, then from (5) in the “Main Representation Theorem” it follows that its Poincaré series is $\phi(1, \dots, 1; t)$.
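The standard identities of Remarks 3.12 and 3.13 can be spot-checked by direct evaluation. The sketch below evaluates the standard polynomial on integer matrices: $S_4$ on four pseudo-random $2 \times 2$ matrices (the Amitsur-Levitzki case $n = 2$), and $S_3$ on three matrix units to see that the degree cannot be lowered. The helper names and the random seed are illustrative assumptions, and a numerical check on sample matrices is of course not a proof.

```python
from itertools import permutations
import random

def sign(perm):
    """Sign of a permutation given as a tuple of 0-based images."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def standard_polynomial(matrices):
    """Sum over all permutations of sign * (product in permuted order)."""
    n = len(matrices[0])
    total = [[0] * n for _ in range(n)]
    for perm in permutations(range(len(matrices))):
        term = matrices[perm[0]]
        for index in perm[1:]:
            term = mat_mul(term, matrices[index])
        s = sign(perm)
        for i in range(n):
            for j in range(n):
                total[i][j] += s * term[i][j]
    return total

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
s4_on_2x2 = standard_polynomial([random_matrix(2, rng) for _ in range(4)])

# S_3 is not an identity of 2x2 matrices: evaluate on e11, e12, e21.
e11, e12, e21 = [[1, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]
s3_on_units = standard_polynomial([e11, e12, e21])
```

Exact integer arithmetic is used so that a vanishing result is exactly the zero matrix rather than a small floating-point residue.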
EXAMPLE 3.4. Let $R = k\langle V \rangle = k \oplus V \oplus V^{\otimes 2} \oplus \dots$. Now $\chi(V) = x_1 + \dots + x_n$.15 Hence, we get
$$\chi(k\langle V \rangle) = 1 + (x_1 + \dots + x_n)t + (x_1 + \dots + x_n)^2 t^2 + \dots = \frac{1}{1 - (x_1 + \dots + x_n)t}.$$
EXAMPLE 3.5. We have $k[V] = k\langle V \rangle / C$, where $C$ is the commutator ideal of $k\langle V \rangle$, so
$$k[V] = k \oplus V \oplus S^2(V) \oplus \dots \oplus S^i(V) \oplus \dots,$$
where $S^i(V)$ is the $i$-th symmetric power of $V$. Hence $\chi[S^i(V)]$ is the $i$-th complete symmetric function in $x_1, \dots, x_n$, that is, the coefficient of $t^i$ in $\frac{1}{1 - x_1t} \cdots \frac{1}{1 - x_nt}$:
$$\frac{1}{1 - x_1t} \cdots \frac{1}{1 - x_nt} \overset{\mathrm{def}}{=} \sum_{i \ge 0} S_{(i)}(x_1, \dots, x_n)\, t^i.$$
So we obtain
$$\chi(k[V]) = \sum_{i \ge 0} S_{(i)}(x_1, \dots, x_n)\, t^i = \sum_{i \ge 0} S_{(i)}\, t^i;$$
here $(i)$ denotes the partition with one part equal to $i$ (a horizontal strip).
Here E. Formanek adds the character series for $k\langle V \rangle / M_2$ (cf. Formanek-Halpin-Li [6]), for $k\langle V \rangle / T(E)$, $E$ the infinite dimensional exterior algebra (Krakowski-Regev), and for $k\langle V \rangle / T(E \otimes E)$ (A. Popov). He writes: “To the best of my knowledge the above examples are the only ones for which the character series is completely known.” And further: “. . . Unfortunately, it appears that the only way to determine the character series of $k\langle V \rangle / M(k)$ completely is by understanding its rational structure.” This was done for $2 \times 2$ matrices ($k = 2$) by Formanek-Halpin-Li-Procesi-Drensky, but it is a much harder task for larger $k$, even for $k = 3$. For general $k$ the problem must be difficult to solve, since it is clearly related to the problem of classifying sets of $k \times k$ matrices under conjugation, which is generally considered to be unsolvable.
3.4.6. To my knowledge, very little is indeed known beyond this, but nevertheless there are some positive results. Several years ago I proved the following theorem: every variety of $k$-algebras can be uniquely decomposed into a product of indecomposable varieties, and this was done in some sense constructively. In other terms, it means that every T-ideal $T$ of $k\langle X \rangle$ can be uniquely decomposed as $T = T_1 \cdots T_\kappa$ (with some $\kappa$), all $T_j$ being indecomposable T-ideals. This is intimately connected with the Bergman-Lewin result on FI-rings. Here it is important that, using this theorem, one can prove the following result.
THEOREM 3.18. Let $T$ be any T-ideal in $k\langle X \rangle$. Then $T$ can be decomposed uniquely as a product $T = T_1 \cdots T_\kappa$ with each factor $T_i$ an indecomposable T-ideal, and the following formula holds true:
$$\chi(k\langle X \rangle / T) = \sigma_1\big(\chi(k\langle X \rangle / T_1), \dots, \chi(k\langle X \rangle / T_\kappa)\big) + (S_{(1)}t - 1) \cdot \sigma_2\big(\chi(k\langle X \rangle / T_1), \dots, \chi(k\langle X \rangle / T_\kappa)\big) + (S_{(1)}t - 1)^2 \cdot \sigma_3(\dots) + \dots + (S_{(1)}t - 1)^{\kappa - 1} \cdot \sigma_\kappa(\dots),$$
where $\sigma_i(\dots)$ denotes the $i$-th elementary symmetric expression in the formal series $\chi(k\langle X \rangle / T_1), \dots, \chi(k\langle X \rangle / T_\kappa)$.
15 $V$ is the sum of one-dimensional subspaces, say, $V = V_1 \oplus \dots \oplus V_n$. This implies that $[V] = [V_1 \oplus \dots \oplus V_n] = [V_1] + \dots + [V_n]$, which again implies that $\chi(V) = x_1 + \dots + x_n$.
The proof will be given below. This theorem is interesting because it reduces the general problem (for any T-ideal) to the case of irreducible T-ideals. [These factors can be quite effectively found by the D-construction technique; in this way I solved an old problem by Yu. Mal’cev regarding the triangular matrix algebra, and achieved much the same as had been done, independently, by Plamen Siderov [20] using (complicated) direct calculations in $k\langle X \rangle$.] And so, knowing the character series for all irreducible varieties, it is possible to find the character series of any relatively free algebra. Let us add here (we shall explain it in detail later) that, taking in these character series all $x_i = 1$, we can find, in principle, the Poincaré series of any relatively free algebra! From these results it is possible to find $\chi(k\langle X \rangle / M(r))$ for $r \ge 3$ in some cases when there exists a suitable block-structure for the matrices present there; for some matrix algebras it is possible to bring all elements simultaneously into block-triangular shape. When can all operators in an algebra (over $\mathbb{C}$) be brought simultaneously into block-triangular shape? Cf. the Suprunenko-Tyshkevich theorem16.
PROOF
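OF THEOREM 3.18. We are going to use the following
LEMMA 3.19 (E. Formanek [7, p. 10]). Let $T$ and $U$ be T-ideals in $k\langle V \rangle$. Then $TU$ is a T-ideal, and
$$\chi(k\langle V \rangle / TU) = \chi(k\langle V \rangle / T) + \chi(k\langle V \rangle / U) + (S_{(1)}t - 1)\, \chi(k\langle V \rangle / T) \cdot \chi(k\langle V \rangle / U).$$
Now our Theorem follows from my main theorem for $k$-algebras (my dissertation [K79a], see Section 4 in Chapter I or [K76]). Indeed, from this it follows that for any T-ideal $T$ there is a (unique) decomposition $T = T_1 \cdots T_\kappa$ (with some $\kappa$) into indecomposable T-ideals $T_1, T_2, \dots, T_\kappa$. The last assertion of the Theorem now follows by induction on $\kappa$. We illustrate the induction step by the special case $T = T_1T_2T_3 = TUW$; the reduction $\kappa \to \kappa - 1$ is of the same sort as the reduction $3 \to 2$ below.
16 Editor’s note. Probably it is in the paper [22].
As $TUW = (TU)W$, we have, by Formanek’s lemma,
$$\begin{aligned} \chi(k\langle V \rangle / TUW) &= \chi(k\langle V \rangle / TU) + \chi(k\langle V \rangle / W) + (S_{(1)}t - 1)\, \chi(k\langle V \rangle / TU) \cdot \chi(k\langle V \rangle / W) \\ &= \big(\chi(k\langle V \rangle / T) + \chi(k\langle V \rangle / U)\big) + (S_{(1)}t - 1)\, \chi(k\langle V \rangle / T)\, \chi(k\langle V \rangle / U) + \chi(k\langle V \rangle / W) \\ &\quad + (S_{(1)}t - 1) \cdot (\text{exactly the same thing}) \cdot \chi(k\langle V \rangle / W) \\ &= \big(\chi(k\langle V \rangle / T) + \chi(k\langle V \rangle / U) + \chi(k\langle V \rangle / W)\big) \\ &\quad + (S_{(1)}t - 1) \cdot \big(\chi(k\langle V \rangle / T)\, \chi(k\langle V \rangle / U) + \chi(k\langle V \rangle / T)\, \chi(k\langle V \rangle / W) + \chi(k\langle V \rangle / U)\, \chi(k\langle V \rangle / W)\big) \\ &\quad + (S_{(1)}t - 1)^2 \cdot \chi(k\langle V \rangle / T)\, \chi(k\langle V \rangle / U)\, \chi(k\langle V \rangle / W) \\ &= \sigma_1\big(\chi(k\langle V \rangle / T), \chi(k\langle V \rangle / U), \chi(k\langle V \rangle / W)\big) + (S_{(1)}t - 1)\, \sigma_2(*, *, *) + (S_{(1)}t - 1)^2 \cdot \sigma_3(*, *, *). \end{aligned}$$
[In the last line the arguments of $\sigma_2$ and $\sigma_3$, indicated by the three stars $*$, are the same as the arguments of $\sigma_1$.]
General case: the induction $\kappa - 1 \to \kappa$. We write $T = T_1 \cdots T_\kappa = (T_1 \cdots T_{\kappa - 1})T_\kappa = U \cdot W$, put $u = S_{(1)}t - 1$, and let $\sigma_i^{(m)}$ denote the $i$-th elementary symmetric expression in $\chi(k\langle V \rangle / T_1), \dots, \chi(k\langle V \rangle / T_m)$. Using the Formanek lemma as the main tool, we find:
$$\begin{aligned} \chi(k\langle V \rangle / T) &= \chi(k\langle V \rangle / U) + \chi(k\langle V \rangle / W) + u\, \chi(k\langle V \rangle / U) \cdot \chi(k\langle V \rangle / W) \\ &= \chi(k\langle V \rangle / T_1 \cdots T_{\kappa - 1}) + \chi(k\langle V \rangle / T_\kappa) + u\, \chi(k\langle V \rangle / T_1 \cdots T_{\kappa - 1}) \cdot \chi(k\langle V \rangle / T_\kappa) \\ &= [\text{by the induction hypothesis}] \ \sum_{i=1}^{\kappa - 1} \sigma_i^{(\kappa - 1)} u^{i-1} + \chi(k\langle V \rangle / T_\kappa) + \sum_{i=1}^{\kappa - 1} \sigma_i^{(\kappa - 1)}\, \chi(k\langle V \rangle / T_\kappa)\, u^{i} \\ &= \big(\sigma_1^{(\kappa - 1)} + \chi(k\langle V \rangle / T_\kappa)\big) + \sum_{i=2}^{\kappa - 1} \big(\sigma_i^{(\kappa - 1)} + \sigma_{i-1}^{(\kappa - 1)}\, \chi(k\langle V \rangle / T_\kappa)\big) u^{i-1} + \sigma_{\kappa - 1}^{(\kappa - 1)}\, \chi(k\langle V \rangle / T_\kappa)\, u^{\kappa - 1} \\ &= \sigma_1^{(\kappa)} + \sigma_2^{(\kappa)} u + \dots + \sigma_{\kappa - 1}^{(\kappa)} u^{\kappa - 2} + \sigma_\kappa^{(\kappa)} u^{\kappa - 1} = \sum_{i=1}^{\kappa} \sigma_i^{(\kappa)} u^{i-1}, \end{aligned}$$
where we used $\sigma_i^{(\kappa)} = \sigma_i^{(\kappa - 1)} + \sigma_{i-1}^{(\kappa - 1)}\, \chi(k\langle V \rangle / T_\kappa)$. From this calculation the induction step follows. □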
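The algebra behind this induction is an identity between commuting quantities, so it can be sanity-checked with plain rational numbers standing in for the formal series. In the sketch below the values in `chis` play the role of the series $\chi(k\langle V \rangle / T_i)$ and `u` plays the role of $S_{(1)}t - 1$; these numbers are arbitrary placeholders, and the function names are hypothetical. Repeated use of the rule in Formanek’s lemma is compared with the closed form $\sum_i \sigma_i u^{i-1}$.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def combine(chi_t, chi_u, u):
    """The rule of Formanek's lemma: the value for TU from those for T and U."""
    return chi_t + chi_u + u * chi_t * chi_u

def iterated(values, u):
    """Apply the lemma factor by factor: ((T1 T2) T3) ... T_kappa."""
    result = values[0]
    for value in values[1:]:
        result = combine(result, value, u)
    return result

def elementary_symmetric(values, i):
    """i-th elementary symmetric expression in the given values."""
    return sum(prod(subset) for subset in combinations(values, i))

def closed_form(values, u):
    """sigma_1 + u*sigma_2 + ... + u**(kappa-1) * sigma_kappa."""
    return sum(u ** (i - 1) * elementary_symmetric(values, i)
               for i in range(1, len(values) + 1))

chis = [Fraction(3, 2), Fraction(-2, 5), Fraction(7, 3), Fraction(1, 4)]
u = Fraction(5, 7)
lhs = iterated(chis, u)
rhs = closed_form(chis, u)
```

One way to see why the two agree: the rule is equivalent to $1 + u\,\chi_{TU} = (1 + u\,\chi_T)(1 + u\,\chi_U)$, so the iterated value satisfies $1 + u \cdot \text{result} = \prod_i (1 + u\,\chi_i)$.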
So one needs to know the character series for the irreducible T-ideals only. In this way we can overcome (using my thesis [K79a]) the difficulties described by Formanek in the case $k\langle V \rangle / M(r)$, $r \ge 3$. It ought also to be stressed that this Theorem is true for any field $k$ (of any characteristic, not only for $k$ with $\operatorname{char} k = 0$, as in Formanek’s paper!) if and only if Theorem 2 in Formanek’s paper is true for any field – as our results about the triangular product constructions are proved for any field. This is important, because in this way one can hope to be able to reprove some of the results of Dicks-Formanek [5] and Almkvist-Fossum [2] on the calculation of the Poincaré series for relatively free algebras in the positive characteristic case ($\operatorname{char} k = p > 0$).
3.4.7. For any finite group $\mathcal{G}$, let $\mathrm{Rep}(k, \mathcal{G})$ be the set of isomorphism classes of finitely generated $k\mathcal{G}$-modules; let $[M]$ be the isomorphism class of such a module $M$. It is an (additive) monoid under $[M] \oplus [M'] = [M \oplus M']$. As $\mathcal{G}$ is finite, the Krull-Schmidt Theorem implies that it is a free monoid, freely generated by the classes of indecomposable finitely generated $k\mathcal{G}$-modules. Take $[M] \cdot [M'] = [M \otimes_k M']$ with the diagonal action of $\mathcal{G}$ on $M \otimes M'$. Let us consider the $k$-space of additive maps $\mathrm{Rep} \to k$, e.g. $\psi \colon \mathrm{Rep} \to k$, $\psi(M) \overset{\mathrm{def}}{=} [M^{\mathcal{G}} : k]$. The ones among them which also preserve multiplication in the monoid $\mathrm{Rep}$ are called $k$-characters of $\mathrm{Rep}$. Each $G \in \mathcal{G}$ defines a character $\chi_G \colon \mathrm{Rep} \to k$ by $M \mapsto \operatorname{trace}(G \colon M \to M)$. If we now take $\mathcal{G} \le GL(V)$ and $M$ any $GL(V)$-module, $[V : k] = n$, and $G \in GL(V)$ has the eigenvalues $\alpha_1, \dots, \alpha_n$ as a linear operator on $V$, we know by our Main Representation Theorem that $\operatorname{trace} G = \chi[M](\alpha_1, \dots, \alpha_n)$. Let $M^{\mathcal{G}} \overset{\mathrm{def}}{=} \{m \in M \mid mG = m \text{ for all } G \in \mathcal{G}\}$, i.e. the fixed points of $M$. It follows from the inner product formula for characters that
$$[M^{\mathcal{G}} : k] = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \chi_G[M]. \tag{91}$$
Here $\chi_G[M]$ is the trace of $G$ as a linear operator on $M$.
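Now we shall see that all this material about the character series has intimate connections with Molien's formula and its analogues. Taking $kV/T = k \oplus R_1 \oplus R_2 \oplus \dots$ (and writing $R_0 = k$), we have
$$(kV/T)^{\mathcal{G}} = k \oplus R_1^{\mathcal{G}} \oplus R_2^{\mathcal{G}} \oplus \dots,$$
and so
$$H((kV/T)^{\mathcal{G}}) = \sum_{i \ge 0} [R_i^{\mathcal{G}} : k]\, t^i = \sum_{i \ge 0} \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \chi_G[R_i]\, t^i = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \Big( \sum_{i \ge 0} \chi_G[R_i]\, t^i \Big) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \chi_G(kV/T),$$
where the second equality holds in view of (91) and $\chi_G(kV/T)$ stands for the series $\sum_{i \ge 0} \chi_G[R_i]\, t^i$.

Let us add that if $\chi(kV/T) = \varphi(x_1, \dots, x_n; t) \in \mathbb{Z}[x_1, \dots, x_n][[t]]$, then $\chi_G(kV/T) = \varphi(\alpha_1, \dots, \alpha_n; t)$, where $\alpha_1, \dots, \alpha_n$ are the eigenvalues of $G$ as a linear operator on $V$. Thus, we have reached the

Molien Theorem for Relatively Free Algebras: Let $T$ be a T-ideal in $kV$ and $\mathcal{G} \le GL(V)$ a finite subgroup. Then the Hilbert series of the algebra of invariants $(kV/T)^{\mathcal{G}}$ is given by
$$H((kV/T)^{\mathcal{G}}) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \chi_G(kV/T).$$

EXAMPLE 3.6. As we saw above,
$$\chi(kV) = \frac{1}{1 - (x_1 + \dots + x_n)t},$$
so
$$\chi_G(kV) = \frac{1}{1 - (\alpha_1 + \dots + \alpha_n)t} = \frac{1}{1 - (\operatorname{trace} G)t}.$$
We now get
$$H((kV)^{\mathcal{G}}) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \frac{1}{1 - (\operatorname{trace} G)t}.$$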
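EXAMPLE 3.7. In the commutative case we saw that
$$\chi(kV/T) = \frac{1}{1 - x_1 t} \cdots \frac{1}{1 - x_n t},$$
so
$$\chi_G(kV/T) = \frac{1}{1 - \alpha_1 t} \cdots \frac{1}{1 - \alpha_n t} = \frac{1}{\det(1 - Gt)}.$$
We get the classical formula of Molien:
$$H((kV/T)^{\mathcal{G}}) = \frac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \frac{1}{\det(1 - Gt)}.$$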
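The two formulas of Examples 3.6 and 3.7 are easy to check by machine on a small group. The following is only a minimal sketch under stated assumptions: the ground field is taken to be the rationals (so that division by the group order, as in (91), is harmless), the group is the two-element group permuting the coordinates of a 2-dimensional space, and all function names are hypothetical, chosen for this illustration. The polynomial det(1 - tG) is obtained from the Faddeev-LeVerrier recursion, which is a choice of convenience and not a method taken from the text.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))


def det_one_minus_tg(g):
    """Coefficients [p_0, ..., p_n] of the polynomial det(1 - t*g) in t.

    Faddeev-LeVerrier gives det(x - g) = x^n + p_1 x^(n-1) + ... + p_n,
    and then det(1 - t*g) = 1 + p_1 t + ... + p_n t^n.
    """
    n = len(g)
    m = [[Fraction(0)] * n for _ in range(n)]
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        gm = matmul(g, m)
        # M_k = g * M_(k-1) + p_(k-1) * identity
        m = [[gm[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        coeffs.append(-trace(matmul(g, m)) / k)
    return coeffs


def inverse_series(p, order):
    """Coefficients of 1/p(t) up to t^order, assuming p[0] == 1."""
    q = [Fraction(1)]
    for m in range(1, order + 1):
        top = min(m, len(p) - 1)
        q.append(-sum(p[k] * q[m - k] for k in range(1, top + 1)))
    return q


def molien_commutative(group, order):
    """Coefficients of (1/|G|) * sum over g of 1/det(1 - t*g) (Example 3.7)."""
    series = [inverse_series(det_one_minus_tg(g), order) for g in group]
    return [sum(s[i] for s in series) / len(group) for i in range(order + 1)]


def molien_free(group, order):
    """Coefficients of (1/|G|) * sum over g of 1/(1 - trace(g)*t) (Example 3.6)."""
    return [sum(Fraction(trace(g)) ** i for g in group) / len(group)
            for i in range(order + 1)]


# Illustrative group: the identity and the swap of the two coordinates.
swap_group = [[[1, 0], [0, 1]], [[0, 1], [1, 0]]]
commutative = molien_commutative(swap_group, 5)
free = molien_free(swap_group, 5)
```

For this group the commutative series begins 1, 1, 2, 2, 3, 3, in agreement with 1/((1 - t)(1 - t^2)), the Hilbert series of the symmetric polynomials in two variables. The free series begins 1, 1, 2, 4, 8, 16, which matches a direct count: in degree i ≥ 1 the swap permutes the 2^i words in two letters without fixed points, leaving 2^(i-1) orbits.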
It is perhaps also worthwhile to finish with the analogues of all three classical theorems (due to Molien, Shephard-Todd and Chevalley) for a free associative algebra $kV$ of finite rank:
(1) $H((kV)^{\mathcal{G}}) = \dfrac{1}{|\mathcal{G}|} \sum_{G \in \mathcal{G}} \dfrac{1}{1 - (\operatorname{tr} G)t}$;
(2) (Formanek): $(kV)^{\mathcal{G}}$ is finitely generated if and only if $\mathcal{G}$ is scalar, i.e. it consists of scalar matrices only, and then $(kV)^{\mathcal{G}} = k(V^{\otimes |\mathcal{G}|})$;
(3) (Kharchenko): $(kV)^{\mathcal{G}}$ is a free associative algebra.

[1], [2], [4], [5], [6], [7], [8], [9], [11], [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [24], [25], [26], [27], [28], [29], [30], [31], [32], [33], [34], [35], [36]

References
[1] G. Almkvist, W. Dicks, and G. Formanek. Hilbert series of fixed free algebras and noncommutative classical invariant theory. J. Algebra 93, 1985, 189–214.
[2] G. Almkvist and R. Fossum. Decomposition of exterior and symmetric powers of indecomposable Z/pZ-modules in characteristic p and relations to invariants. In: Séminaire d'Algèbre Paul Dubreil, Lect. Notes Math., 641. Springer-Verlag, Berlin, Heidelberg, New York, 1978, 1–111.
[3] W. Blaschke. Kinematik und Quaternionen. Mathematische Monographien herausgegeben von Wilhelm Blaschke, 4. VEB Deutscher Verlag der Wissenschaften, Berlin, 1960.
[4] E. C. Dade. Answer to a question of R. Brauer. J. Algebra 1, 1964, 1–4.
[5] W. Dicks and G. Formanek. Poincaré series and a problem of S. Montgomery. Linear and Multilinear Algebra 12 (1), 1982/83, 21–30.
[6] G. Formanek, P. Halpin, and W.-C. W. Li. Poincaré series of the ring of 2 × 2 generic matrices. J. Algebra 69 (1), 1981, 105–112.
[7] G. Formanek. Noncommutative invariant theory. In: Group actions on rings (Brunswick, Maine, 1984), Contemp. Math., 43. Am. Math. Soc., Providence, RI, 1985, 87–119.
[8] G. Formanek and D. Sibley. The group determinant determines the group. Proc. Am. Math. Soc. 112, 1991, 649–656.
[9] F. G. Frobenius. Über die Darstellung der endlichen Gruppen durch lineare Substitutionen. In: Sitzungsber. Preuss. Akad. Wiss. Berlin, Berlin, 1897, 357–361.
[10] C. F. Gauss. Mutation des Raumes. In: Carl Friedrich Gauss Werke, Band 8. König. Gesell. Wissen., Göttingen, 1900, 357–361.
[11] T. Hawkins. Hesse's principle of transfer and the representation of Lie algebras. Arch. Hist. Exact Sci. 39 (1), 1988, 41–73.
[12] T. Hawkins. Cayley's counting problem and the representation of Lie algebras. In: Proc. of the Int. Congress of Math., August 3–11, 1986. Amer. Math. Soc., Providence, RI, 1987, 1642–1656.
[13] M. Hochster and J. Eagon. Cohen-Macaulay rings, invariant theory, and the generic perfection of determinantal loci. Am. J. Math. 93, 1971, 1020–1058.
[14] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[15] D. Krakowski and A. Regev. The polynomial identities of the Grassmann algebra. Trans. Am. Math. Soc. 181, 1973, 429–438.
[16] T. Molien. Number systems. Nauka, Novosibirsk, 1985.
[17] E. Noether. Der Endlichkeitssatz der Invarianten endlicher Gruppen. Math. Ann. 77, 1916, 89–92.
[18] K. Parshall. Joseph H. M. Wedderburn and the structure theory of algebras. Arch. Hist. Exact Sci. 32 (3–4), 1985, 223–349.
[19] K. Parshall. In pursuit of the finite division algebra theorem and beyond: Joseph H. M. Wedderburn, Leonard E. Dickson, and Oswald Veblen. Arch. Internat. Hist. Sci. 33 (111), 1983, 274–299.
[20] P. N. Siderov. A basis for identities of an algebra of triangular matrices over an arbitrary field. PLISKA Stud. Math. Bulgar 2, 1981, 143–152.
[21] B. Sturmfels. Algorithms in invariant theory. Texts and Monographs in Symbolic Computation. Springer-Verlag, Wien, New York, 1993.
[22] P. N. Siderov. The similarity and substitutive equivalence of (0,1)-matrices. Dokl. Nats. Akad. Nauk Belarusi 22, 1978, 485–487, 571.
[23] T. Tambour. A theorem of Molien type in combinatorics. European J. Combin. 10, 1989, 197–199.
Supplement. List of scientific papers of Th. Molien.¹⁷
[24] T. Molien. Bahn des Kometen 1880, III. Astronomische Nachrichten 2519, 1883, 353–362.
[25] T. Molien. Zusatz zur Bahnbestimmung des Kometen 1880 III. Astronomische Nachrichten 2519, 1883, 353–362.
[26] T. Molien. Über gewisse, in der Theorie der elliptischen Functionen auftretenden Einheitswurzeln. Berichte der k. Sächsischen Gesellschaft der Wissenschaften, 1885.
[27] T. Molien. Über lineare Transformation der elliptischen Functionen. Master's thesis, Dorpat, 1885.
[28] T. Molien. Über Systeme höherer komplexen Zahlen. Math. Ann. 41, 1893, 83–156.
[29] T. Molien. Berichtigung zum Aufsatze "Ueber Systeme höherer complexen Zahlen". Math. Ann. 42, 1893, 308–312.
[30] T. Molien. Eine Bemerkung zur Theorie der homogenen Substitutionsgruppen. Sitzungsberichte der Naturforschenden Gesellschaft der Universität Jurjew 18, 1897, 259–274.
[31] T. Molien. Über die Anzahl der Variablen einer irreduzibeln Substitutionsgruppe. Sitzungsberichte der Naturforschenden Gesellschaft der Universität Jurjew 18, 1897, 277–288.
[32] T. Molien. Über die Invarianten der linearen Substitutionsgruppen. Sitzungsber. der Königl. Preuss. Akad. d. Wiss. 52, 1897, 1152–1156.
[33] T. Molien. Über gewisse transzendente Gleichungen. Math. Ann. 103, 1930, 35–37.
[34] T. Molien. Lösung der Aufgabe 148. Jahresber. Dtsch. Math.-Ver. 44, 1934, 35–37.
[35] T. Molien. Zahlensysteme mit einer Haupteinheit (Systems of higher complex numbers with a principal unit). Uch. Zap. Tomsk Ped. Inst. Mat. Mekh. Kuibyshev-Univ., Tomsk 1 (1), 1935, published 1937, 217–224.
[36] T. Molien. On a certain transformation of the hypergeometric series. Uch. Zap. Tomsk Ped. Inst. Mat. Mekh. Kuibyshev-Univ., Tomsk 1, 1935, published 1937, 119–121.
¹⁷ Editor's Note. Several text-books by Molien are also mentioned in Kanunov's book [14], ultra.
4. Notes on five 19th century Tartu mathematicians (Backlund, Kneser, Lindstedt, Molien, Weihrauch)
Edited and translated by J. Peetre with the assistance of V. Ufanrovsky
Preamble (by J. Peetre). These notes represent material found by Uno Kaljulaid in archives in Tartu. Kaljulaid sent me the manuscript probably a few years before his death, without any explanation. At the top of one of the pages about Molien there was, however, a note to me in Estonian, of which I can discern the following words: “Jaak . . . Vaadata ja tõlkida.” (To look and to translate.) The language of the original was Russian (occasionally German, or other languages), translated into Estonian by Kaljulaid. Here everything is rendered in English. Sometimes the Russian or German has been preserved for emphasis. I express the hope that this compilation will be of interest to historians of mathematics. Recall that the language of administration at Dorpat/Tartu University was Russian, whereas, until 1893, all teaching took place in German.
The text is divided into five sections, one for each of the five men treated. All of them had a connection to mathematics, although Backlund was primarily an astronomer. Two of them were native Swedes, and another, Molien, had at least Swedish ancestry. At the end of each of these sections there is a separate list of references, compiled by me. The footnotes are, likewise, by me.
The text is usually rendered verbatim. The Reader will probably have no difficulty in distinguishing what is taken directly from the archives from the occasional interpretations by Kaljulaid. The pages of the various “folders” in the archives have been numbered 1, 2, . . . , empty (or skipped?) pages being sometimes denoted, by Kaljulaid, by a question mark (?). Some headings inside the chapters are printed in bold letters. Passages which I have been unable to decipher are indicated as [?????].
Johan Oskar Backlund (1846-1916)
The astronomer Backlund was a native Swede; he studied in Uppsala and became a docent there in 1875. He was an assistant astronomer at the Observatory of the Royal Swedish Academy of Sciences (KVA¹⁸) in Stockholm in 1875-1876, and an astronomer at Dorpat University 1876-1879. From there he moved to Russia and was an adjunct astronomer at the Pulkovo Observatory¹⁹ in 1879-1887. Backlund became a member of the Russian Academy of Sciences in 1887. In 1895 he was appointed director of the Observatory. Backlund's scientific work was mainly devoted to the study of Encke's comet.²⁰ One of Backlund's two sons, Helge Backlund (1878-1958), became a geologist and was first a professor in Åbo (1918-1924), and then in Uppsala (1924-1943).²¹
¹⁸ Kungliga Vetenskaps Akademien.
¹⁹ Near St. Petersburg; founded in 1839 by the German-born astronomer Wilhelm Struve (1793-1864), who studied in Dorpat and taught there 1813-1839. Struve was the first to determine, in 1837, the parallax of fixed stars. The Russian name of the Observatory is a corruption of Finnish Purkola, an estate given by Czar Peter to his wife Catharine, a Lithuanian peasant woman. Two more Swedes who worked in St. Petersburg were the mathematician Anders Johan Lexell, an assistant of Leonhard Euler, and Georg Lindhagen, who married Struve's daughter.
²⁰ This is the comet with the smallest period of revolution about the Sun. It was identified by Johann Franz Encke (1791-1865) in 1819. Encke also found the first signs of a non-gravitational force acting on it [2].
²¹ Interesting information about the Backlund family can be found in the book [1]. Backlund also had a daughter Elsa, who became a painter. A well-known picture “På skidor” (On skis) has Helge as model. After the Revolution she returned to Sweden together with her mother, the Secret Counciloress Ulrika.
Folder 1207: March 25, 1876 – March 7, 1879.
Page 4. Extract from the proposal of the Physico-Mathematical Faculty on March 25, 1876. The Faculty proposes to the Council to nominate Johan Oskar Backlund, as the single candidate, for the vacant position of astronomy observer at the University of Dorpat. J. O. Backlund was born on April 8, 1846, entered Uppsala University as a student in 1866, became a Candidate of Philosophy in 1872, defended on December 5, 1874 [a thesis] pro gradu philosophico “Beräkning af relativa störningar för Planeten (112) Iphigenia”, was appointed docent at Uppsala University in January 1875, and obtained (in March?) the degree of doctor. Besides that he has written an as yet unpublished article on the influence of Encke's comet on the shortening of the orbit of Earth; KVA has, however, awarded him the Ferner prize for it. . . . By the way, these works show that Dr. Backlund is an outstanding theoretical astronomer, and, on the other hand, he was recommended by the director of the Astronomical Observatory of Stockholm, Professor Dr. Hugo Gyldén²², who has been Backlund's adviser for two years already.
Page 6. Excerpt from the Journal des Conseils der Kaiserlichen Universität Dorpat, March 26, 1876, No. 162. Having listened to the presentation of Dr. J. O. Backlund, the single candidate for the position of astronomy observer, the Council held the election; 34 voted for him and there were no votes against. The Council decides to ask the Curator to confirm Dr. Backlund in this position, and to refund him his travel expenses for the trip to Tartu.
Page 9. On April 13, 1876 there is the confirmation (which had arrived in Dorpat on April 27, 1876) signed (in Riga?) by the Curator Saburov, together with the decision to reimburse the travel expenses of Backlund to the amount of 437 rubles 75 kopeks; Backlund expressed the wish to obtain this money in Dorpat. [The Curator had earlier asked where Backlund wanted this money, in Stockholm or in Dorpat. Such were the rules.]
Page 21. Extract from the minutes of the Council of the University on April 13, 1877. [Item on the agenda:] Does the Physico-Mathematical Faculty support Dr. Backlund's application for a summer vacation so that he could participate in the Congress of Astronomers in Stockholm in August?
²² Hugo Gyldén (1841-1896), Finnish-Swedish astronomer, employed at Pulkovo, St. Petersburg 1863-1871, was then an astronomer at KVA, worked mainly on the perturbation theory of planets, and had a great influence on the development of this subject in Sweden [2]. See also [3, Chapter 9].
He was given a summer vacation + 28 days. Signed by the Minister of National Education, Count Dmitriĭ Tolstoĭ.
Page 34. On Dec. 2, 1877 Curator Saburov decided to pay Dr. Backlund the sum of 150 rubles for lectures delivered on “Algebraische Analysis” (3 hours weekly).
Page 40. On May 24, 1878 [there is] a new order by “Curator Saburov (?)” about payment of 150 rubles for the running semester (that is, the spring semester of 1878) for lectures (3 hours a week) on “Ausgewählte Theile der Elementar-Mathematik” (from the funds of the vacant docentship).
Page 42. The Department of National Education informed Curator Saburov on Jan. 10, 1879 that the Director of the Nikolaĭ Observatory of Pulkovo had submitted to the Ministry of Education a proposal for appointing Dr. Backlund to the vacant position of Adjunct Observer. It is asked whether Dorpat has any objections to this.
Page 47. Extract from the Journal of the Council of Dorpat University. On Feb. 12, 1879 it is replied that there are no objections, and it is decided to inform the Curatorship that there are no hindrances to the transfer of Dr. Backlund.
Page 48. The Council of the Imperial Dorpat University confirms that Dr. Oskar A. Backlund²³, foreigner, not a Russian citizen, 32 years of age, after finishing the science course at Uppsala University with the degree of Candidate, and further in 1875 with the degree of Doctor of Philosophy, was a docent at Uppsala University 1875-76, and earlier employed as an astronomer at Stockholm, was, on April 13, 1876, appointed astronomer at Dorpat University by a decision of the Curator of the Dorpat District of Education, in accordance with the results of voting in the Council, from which position (being in Class VII) he has been transferred to that of adjunct astronomer at the Nikolaĭ Main Astronomical Observatory from Feb. 24, 1879 on. The salary in Dorpat was 10000 rubles a year. He fulfilled his obligations satisfactorily. Backlund is married to Ulrika (née Widebeck) and has the sons Hjalmar (born on Apr. 5, 1877) and Helge (born on Aug. 24, 1878). Wife and sons are of Evangelical-Lutheran confession.

Folder 1208: July 29, 1876 – February 28, 1880.
On August 20, 1876, there is a letter from the Custom Office (City of Saint Petersburg, Custom of the Port) stating that 40 rubles, 80 kopeks has to be paid back to Backlund, which he had paid for a pianoforte brought into the country by him from abroad.
Generally speaking, the information about Backlund in Tartu is, as far as substance goes, rather scant. Two sons were born to him here . . . ²⁴ One needs a supplementary explanation of what he did here, and why he left.
²³ Editor's note. The name of Dr. Backlund appears on Russian documents as Oskar Andreevich Baklund.
²⁴ One of the sons was the aforementioned future geologist Helge Backlund, born in 1878.
References
[1] B. Jangfeldt. Svenska vägar till S:t Petersburg: kapitel ur historien om svenskarna vid Nevans stränder (Swedish routes to St. Petersburg. Chapters from the story of the Swedes on the shores of the Neva). Wahlström & Widstrand, Stockholm, 2000.
[2] Nationalencyklopedin (Swedish National Encyclopedia).
[3] L. Gårding. Matematik och Matematiker. Matematiken i Sverige före 1950. Lund University Press, Lund, 1994. English translation: Mathematics and mathematicians. Mathematics in Sweden before 1950. In: History of Mathematics, 13. American Mathematical Society, Providence, RI; London Mathematical Society, London, 1998.
Adolf Kneser (1862²⁵-1930)
Kneser was born in Grüssow, Mecklenburg, Germany. He studied mathematics in Berlin under Kronecker and Weierstrass. In 1884 he became a private docent in Marburg, but in 1886 he moved to Breslau (Wrocław), as a professor and successor of O. Staude, who had left for Dorpat. He was thus made a professor at the early age of 27. However, when Staude accepted an invitation to Rostock in 1889, Kneser became his successor for the second time. In 1893 the previously German-language university became a Russian university, named Yurjev University, and the language of teaching became Russian. At least one of the professors, Oettingen²⁶, quit. In this context Kneser drafted a letter of protest on behalf of the German professors. Finally, Kneser himself returned to Germany, in 1900. It should, however, be emphasized that he had good relations with several Russian mathematicians; Steklov, in particular, was very close to him. Eventually he got an invitation back to Breslau, where he then stayed for the rest of his life. Adolf Kneser gave rise to a mathematical dynasty; both his son Helmuth²⁷ and his grandson Martin became famous mathematicians.
²⁵ Note that Kneser was born in the same year as David Hilbert.
²⁶ Arthur Joachim von Oettingen (1836-1920), Baltic-German physicist and meteorologist, taught at Tartu 1863-1893 and in Leipzig 1894; upon the “Russification” of the University he refused to teach in Russian and so had to emigrate, and taught one year, 1894, in Leipzig. In 1899 he went to Transvaal, and returned from there via “Eastern Africa” (see [5, Part 5]); by this is probably meant German Eastern Africa (Deutsch Ostafrika), that is, present-day Tanzania. Oettingen started regular meteorological observation in 1865, and founded the first meteorological station in Estonian territory in 1876.
²⁷ A second son Lorents Friedrich died as a soldier in World War.
Folder 402/3/805: November 22, 1888 – April 11, 1900.
Certificate and place of birth: Grand Duchy of Mecklenburg-Schwerin, born on March 19, 1862 in Grüssow, Mecklenburg; father Adolph Hermann Kneser, who worked as a Lutheran pastor; mother Friederike Wilhelmine Filippe Augusta (née Kolman); the full name of their son was Julius Christian Carl Adolph.
• On November 22, 1888, the Council of the Physico-Mathematical Faculty proclaimed that the Private Docent Adolph Kneser from Breslau is the only candidate for the vacant extra-ordinary professorship in applied mathematics. His basic training was obtained at the Gymnasium of Rostock, from which he graduated in 1879 with a certificate of maturity. Then he studied at the Universities of Rostock, Heidelberg, and Berlin; in the latter he obtained the degree of Doctor of Philosophy on March 8, 1884. In the same year he began to read mathematics at Marburg University as a private docent. In 1886 he moved to Breslau, where he began to work upon the leave of Professor Staude, and he is still (i.e., in 1888) a private docent there. Until now he has mainly given lectures on: analytical mechanics; function theory; the theory of algebraic equations; the numerical solution of equations; number theory; determinants; Fourier series; algebraic analysis; calculus of variation and integral calculus. He has a number of papers (on mathematical physics; algebraic equations; Kronecker's principle that number theory only then gets its true arithmetical nature when everything is founded in a purely arithmetical way; etc.). From these abstract questions Kneser has lately passed on to geometrical problems of the distribution of real twisting of plane curved lines in space, not investigated sufficiently before²⁸. With this paper he has shown himself to be a skilled applied mathematician. Furthermore, he has a paper that fills a very essential gap in Halphen's text-book of elliptic functions²⁹; here he has also proved himself a powerful analyst. He has also read analytical mechanics, which was a novelty for Tartu.
The Faculty has obtained from outstanding scientists rather good characterizations of Kneser. Weierstrass recommends considering Kneser as the first candidate. Kronecker characterizes him as a good and many-sided laborer, who finds his themes independently and delivers lectures which lack formalism and are dictated by independent thinking. He adds that already in the gymnasium Kneser displayed an unusual ability in presenting mathematics, and that later, being already a lecturer in Marburg or in Breslau, he continued exhibiting this ability. Professor [H.] Weber³⁰ (in Marburg) says that Kneser is free from the one-sidedness of only one school, that he possesses a broad knowledge in many domains of mathematics, and that his presentation has an enjoyable clarity. The students here appreciated him and listened to his lectures; also in Breslau, where he was active in a much broader spectrum, he had the same success. Professor Hurwitz (Königsberg) says that hiring him would be a great victory for the Faculty.
²⁸ It may be [1].
²⁹ This paper was probably not published; the only paper by Kneser dealing with elliptic functions seems to be [2], but Halphen is not mentioned there at all.
³⁰ Heinrich Weber (1842-1913), well-known German mathematician, mainly distinguished for his work in Algebra and Number Theory. He was a professor in Marburg 1885-1892. In the last year he obtained a call to Göttingen and, finally, in 1895, one to Strassburg (Strasbourg) [4].
Professor Schröter³¹ (Breslau) adds that as a comrade he is agreeable, because he is free from waywardness and superiority; Professor Mayer adds that his overall impression of him is very high. He is diffident, and in the beginning he does not show his feelings, but he is always reliable and obliging; he has a trustworthy and efficient character.
• On Feb. 1, 1889, Kneser gave his oath of office (“Sittliches Gelübde”) to Tartu University.
• On Mar. 1, 1890, he asks for permission to go abroad; in 1891 he signs such applications as Professor of Mathematics.
• On Dec. 18, 1892, it is told in the minutes of the Council that 400 rubles has been paid to Kneser for the first half of this year for his lectures in newer Algebra and Geometry (from money available for the vacant position in the Chair of Pure Mathematics).
• On Feb. 11, 1893, there is an application from Kneser to participate, in September, in an already ongoing international exhibition of mathematical models in München, and likewise in another meeting taking place at the same time. The use of models in both pure and applied mathematics gains an ever greater importance; it will also be necessary to renew the collection in the Mathematical Cabinet. (This gets the support of the Dean!)
• On Nov. 11, 1893, there is an interesting note: The Board of the Imperial University of Yurjev pronounces, on the question whether there might be any objections to the marriage of the Ordinary Professor and Councillor of State Adolph Kneser with the daughter of the deceased landowner Lorentz Booth (Laura Booth), that no such objections exist.
• On Nov. 7, 1893, there is an entry by the Council that Kneser has obtained a vacation for domestic reasons during the winter break.
• On Nov. 28, 1895, there is a letter from the Rector giving permission for Kneser to go abroad during the winter break.
• On March 18, 1896, Kneser asks for permission to go to Germany for personal reasons. (He was then a professor of applied mathematics.)
• On Feb. 28, 1897, there is in the Council of the Imperial Yurjev University an interesting communication by Kneser: He has accepted the editorship of the Part “Calculus of Variations” in the Enzyklopädie, which is composed by united scientists abroad, and academies. For this it is necessary to visit libraries abroad for a long time, because the corresponding biographic information is not available here. He intends also to participate in the Meeting of Natural Scientists (Gesellschaft Deutscher Naturforscher und Ärzte) in Braunschweig in September.
• April 22, 1898. The application has been approved by the Councillor of the Yurjev University, from the Riga District of Education, but at the end there is an interesting note: “Independently of this, taking into account that a professor's being away from the university at his lecturing time may lead to insufficient acquisition of the study course by students, His Grace asked me to notify the Council that it should not only serve as a middleman when considering the business trips of teachers, but also follow the needs of the faculties, being responsible for studies.”
• On May 30, 1898, Kneser asks for 400 rubles for lectures in function theory (4 hours weekly).
• On March 13, 1899, Kneser submits an application for travelling abroad in the period June 10 - August 20 with the goal of participating in the assembly of researchers in the history of the Calculus of Variations. (This was approved on May 29, 1899.)
• On Feb. 25, 1900, the Faculty asks that Kneser be allowed to go abroad from July 1 to September 5 in connection with the lectures on the historical development and the present state of mathematics (only a few distinct subfields of mathematics were selected). These were arranged by the German mathematicians following the example of the “British Association for Advancement of Sciences”. Kneser is supposed to give a talk on the Calculus of Variations. The last two summer vacations he has worked in foreign libraries. Now he intends to present his lecture also to the Mathematical Society (Deutsche Mathematiker-Vereinigung), at its joint meeting with the German Society of Natural Scientists in Aachen, in the second half of 1900. Missing lectures [at home university ?] can be compensated for later.
• On Oct. 11, 1900, he asks for leave for 28 days because of urgent family reasons (that cannot wait because of imperative circumstances).
• On Oct. 13, 1900, Kneser turns to the Rector with an appeal to be relieved from his duties starting from October 25, 1900. He asks also for financial support during one year for a 10 year service (his service class was V); this was approved on Feb. 7, 1901. There is a corresponding proposal of the Rector (1429 rubles and 60 kopeks).
³¹ Heinrich Eduard Schroeter (Schröter) (1829-1892), German mathematician, devoted himself mainly to elliptic functions and synthetic geometry.
Service Record of A. Kneser.
The Councillor of State, Doctor of Philosophy Adolph Kneser was an ordinary professor of applied mathematics, 38 years of age, of Evangelical-Lutheran Faith. (A foreigner who has not given the oath to become a Russian citizen.) On Aug. 20, 1889 appointed extra-ordinary professor of mathematics and theoretical mechanics at Dorpat, now Yurjev, University, in the Chair of Applied Mathematics. In 1890 he was elected ordinary professor, which was on January 22, 1891.
Has been on vacation: 1889 in the summer abroad; 1890 in the summer abroad; 1891 – 1892 and 1893 in the winter abroad; 1894 in the summer abroad; 1895 in the summer; 1896 March 6–31.
Wife: Laura [née Booth]; sons Lorents Friedrich (September 17/29, 1896) and Helmuth³² (4/16 1898), living with the parents. Both wife and children are of Evangelical-Lutheran Faith. A. Kneser was in service at Tartu University as a professor of applied mathematics in the period January 23, 1889 – October 25, 1900. (This testimony was given at the request of A. Kneser himself on August 17, 1929; he writes that he needs this information for “settling his problem of widow's pension”; see also the paper by P. Müürsepp³³ [3].)
References
Two selected mathematical papers by A. Kneser
[1] A. Kneser. Bemerkung über die Frenet-Serret'schen Formeln und die analytische Unterscheidung recht und links gewundener Raumkurven. J. Reine Angew. Math. 113, 1894, 89–101.
[2] A. Kneser. Elementarer Beweis für die Darstellung der elliptischen Functionen als Quotienten beständig convergenter Potenzreihen. J. Reine Angew. Math. 82, 1888, 309–330.
Biographies
[3] P. Müürsepp. Professor of mathematics Adolf Kneser (1862-1930) and the Tartu University. In: Items from History of Science in the Estonian SSR. Academy of Science of the Estonian SSR, Tartu, 1971, 56–71.
[4] S. Gottwald, H.-J. Ilgauds, and K.-H. Schlote (eds.) Lexikon bedeutender Mathematiker (Dictionary of eminent mathematicians). Bibliographisches Institut, Leipzig, 1990.
Auxiliary reference
[5] J. C. Poggendorff's biographisch-litterarisches Handwörterbuch, Vierter Band (Die Jahre 1888 . . . ). Barth, Leipzig, 1904.
³² Later, like his father, a famous mathematician.
³³ Peeter Müürsepp (1918-1999), Estonian mechanist and historian of science, has written books about the self-taught Estonian astro-optician Bernhard Schmidt (1879-1935), inventor of the famous Schmidt telescope, and about the mathematician Carl Friedrich Gauss.
Anders Lindstedt (1854-1939)
Anders Lindstedt was born in Sundborn, Sweden, in 1854. He studied and worked at Lund University 1872-1879, except for one year, 1874-1875, when he served as an astronomer at Hamburg Observatory. In 1879-1886 he was at Dorpat University, first as an astronomer and then, 1883-1886, as professor of applied mathematics. In this period he devoted himself, scientifically, mostly to the problem of the secular perturbations of Mercury, an enigma settled by Albert Einstein on the basis of the latter's theory of general relativity in 1916. After his return to Sweden in 1886, Lindstedt was a professor of mathematics and general theoretical mechanics at the Royal Institute of Technology in Stockholm (KTH) in the years 1886-1909, being its rector 1902-1909. He then quit his professorship, and after that period he worked only in government service. He was an assistant under secretary (departementsråd) 1909-1916 and President of the Insurance Council (Försäkringsrådet) 1917-1924. He was the driving force behind the law of a general pension (allmän folkpension) of 1913, and has been given the epithet “Father of the Swedish Social Insurance” (Svenska socialförsäkringens fader). [13]
“A small folder”: April 4, 1879 – February 19, 1886.
Page 4. On May 3, 1879, a letter was sent to the Customs Office in Libau (Liepāja, Latvia), asking that goods belonging to Lindstedt be delivered to the merchant Egbert Dassel (the cost of transportation was paid by Dorpat University).
Page 7. Below, a message that two boxes have arrived at the Customs Office in Reval in the name of “Observator Dr. And. Lindstedt”, containing [household] equipment, underwear, dresses. To be given to the merchant Geppener.
Page 15. [Direction of District of Education]: According to the application of the University management from December 13, I permit giving Astronomer Lindstedt 150 rubles for the course taught this semester (3 hours weekly) on the theory of elliptic and Abelian functions.
Page 35. To the Collegium for National Education, Department for international exchange of scientific literature. The University management sends back the parcel from January 9, as Mr. Lindstedt did not return to Dorpat in this semester and has been dismissed from the position of Professor of this University. (Secretary Tamberg.)
“A big folder”: 1879 – 1886.
Page 19. There are the following notes:
• 1878, Lund, dissertation “Beobachtungen des Mars während seiner Opposition 1877” [9].
• Feb. 24, 1879. The Council of the University recommends him for the position of astronomer.
• March 24, 1879. [The Curator of Dorpat University] appoints Lindstedt astronomer.
Page 26. On March 27, 1879, the Council of the University receives a letter on this appointment from Curator Saburov about his decision from February 27, 1879.
Page 31. The order that a transfer of money (85 half-imperials³⁴) should be made to A. Lindstedt for travel expenses so that he could arrive.
Page 36. Inauguration to the service on May 19, 1879 (signature of A. Lindstedt).
Page 47. On November 10, 1879 a sum of 159 rubles is given to Lindstedt from the funds left over by the vacant docentship for “Lectures on analytic functions” (during the current semester, 3 hours per week). [Thus, in the fall semester of 1879 he already read a whole series of courses 3 hours weekly.]
³⁴ In tsarist Russia, a half-imperial was a gold coin worth 5 rubles and 50 kopeks.
Page 49. In the first semester of 1880 Lindstedt offers a course for his colleagues: On elliptic functions. (Proposed by the Dean, Oettingen.)
Page 52. On May 13, 1880 (from the District of Education), according to the decision of the Council of the University, 150 rubles have been allocated to Lindstedt for giving the course “Theory of elliptic functions” in the spring semester of 1880.
Page 54. In the second semester of 1880 Lindstedt offers the course (?) “Morphische Funktionen”³⁵.
Page 55. Request of the Dean Oettingen to the Council: “Neue Geometrie und Algebra” in the first semester of 1881.
Page 56. Report by Dr. A. Lindstedt to the “Conseil der Kaiserlichen Universität Dorpat” that in the second semester of 1879 and in the first and second semesters of 1880 he has given the following courses:
• “Vorlesungen über allgemeine Theorie der analytischen Funktionen”
• “Über allgemeine Theorie der elliptischen und Abel’schen Funktionen”
so wie auch das er für das jetzt eingehende Semester den Auftrage hat über neuere Geometrie und Algebra vorzutragen und die Übungen in dem mathematischen Seminar zu leiten. Ausserdem wage ich die Bitte hinzuzufügen ein s. g. testimonem für die Zeit zu erhalten, während welcher ich mich an der hiesigen Universität bis jetzt aufgehalten habe. (And for the coming semester he has the commission to lecture on “Newer Geometry and Algebra”, and to conduct the exercises in the mathematical seminar. Moreover, he dares to add the request to obtain a so-called testimony for the time he has spent at this University so far.) Signed: Dorpat, 16. Januar 1881. Dr. And. Lindstedt, Astronomer.
Page 59. New appeal of the Dean Oettingen to the Council of the University. In the second semester Dr. Lindstedt ought to read a course on “die Theorie der Algebraischer Curven” (the theory of algebraic curves), and he therefore asks for 300 rubles for conducting these lectures. The request is approved on May 20, 1881 (Curator of Dorpat University).
Page 72. Request by Dr. Lindstedt to obtain 200 rubles for a trip to Stockholm during the summer vacation, where he intends to learn thoroughly the perturbation theory of Academician Gyldén, which he has briefly mentioned. His (Lindstedt’s own) paper appears to be a further development of the theory of the planet Mercury. The theories available until now have not been capable of explaining its orbit. It seems that, in particular, the secular perturbations cause anomalies which cannot be explained with the methods known so far. Lindstedt wrote: “However, until now this has not been clarified sufficiently, and so I have begun to compute anew the perturbations of Mercury using the theory of Hansen³⁶, and almost completed the computations of the perturbations of Venus and the Earth. Among the other planets only Jupiter may have a more significant influence and, regarding it, I have the hope to finish completely before the beginning of the summer, so that it will be possible to see how adequate the method that I applied is. Otherwise the method suggested by Academician Gyldén should be viewed as the only one which can lead to a complete success. For these reasons, and at the request of the Physico-Mathematical Faculty to give lectures describing the main features of this method as an example of the application of elliptic functions, I would like to ask the Council to support my appeal.” On April 14 he obtains this permission.
Page 79. On May 23, 1882, the decision is made by the Curator Saburov for the Council of the Imperial Dorpat University to pay astronomer Lindstedt 300 rubles for his lecture course “Theory of analytic functions”, and for conducting the mathematical seminar (from special sources of the University). [On May 20, one can see the name of Weihrauch in the Council of the University, and on this day there was a vote in favor of Lindstedt’s trip in the summer vacation.]
Page 90. On Dec. 10, 1882 once more 300 rubles are allocated for lectures and for conducting the seminar.
Page 92. On May 5, 1883 there is again a demand of Dean Oettingen to pay 300 rubles (200 + 100) to Anders Lindstedt for lectures on “Newer Geometry and Algebra” and for directing the seminar.
Page 98. Below there is the proposal to obtain for Dr. A. Lindstedt the rank of professor, May 13, 1883. From the Dean’s proposal to present A. Lindstedt as the single candidate for the position of professor of applied mathematics, opening up on August 12, as an ordinary professor:
Lindstedt was born on July 27, 1854 in Sundborn near Falun in Sweden, and got his first education . . . . On Sept. 16, 1872 he entered Lund University. He became a Candidate of Philosophy on May 23, 1874 and was, from July on until June 1875, an astronomer at Hamburg Observatory. The stay in Hamburg was followed by the summer semester in Leipzig. Upon his return to Lund he became a Licentiate of Philosophy on May 28, 1877, and a Doctor of Philosophy on June 6, 1877; among the promovendi he had the position of first auster. On Aug. 1, 1877, he became private docent in Lund, and from there he got an invitation for a position of Astronomer in Dorpat on May 27, 1879.
³⁵ Perhaps monogenic functions?
³⁶ Peter Andreas Hansen (1795-1874), Danish-born self-taught astronomer, worked at the Gotha Observatory. Developed a perturbation theory together with Palowsky. Karl Rudolph Palowsky (1817-1881), German astronomer, was the assistant of Hansen in 1850-56, later moved to Washington and died there.
In the years 1873 and 1874 Lindstedt took part in a continuing search for small planets and comets. Being an astronomer in Hamburg, he continued observations for the astral zone +80° − 81°. Later on he carried out observations of comets with the refractor and determined the “meridian circle of the [?????] fundamental stars”, Astronom. Nachr. 2046-2048. In Leipzig, he worked on the [GAP] of stars together with the [GAP]. In Lund he had for his use the meridian circle, with the aid of which he could determine the plane of the fundamental stars. In the following years the meridian circle was used for a careful study of permanent errors of distribution. All this material was used in his Ph.D. thesis. Lindstedt has published the papers [1–8]. As a private docent in Lund, Lindstedt lectured on practical astronomy, and in the last term also on differential equations. At Dorpat University, from the second semester of 1879 on, he has given, at the request of the Faculty, the lecture courses on the theory of analytic functions and on newer geometry and algebra. In the second semester of 1880 Lindstedt introduced practical exercises, in order to induce students to independent work; from the first semester of 1881 these have been carried out each semester, at the request of the Faculty. In the Observatory he finished the observations of the star zone +70° − 75°, and, to a large extent, also finished the calculations on this basis. Although Lindstedt’s aforementioned activities are thus connected with Astronomy, his papers were rather devoted to higher Pure and Applied Mathematics. The topics of the lectures read by him in our University concern applied and higher problems in more recent mathematics. On the other hand, Lindstedt’s papers on differential equations form an inseparable part of mechanics, with which he has to occupy himself besides the topics hitherto lectured on. His newest paper on the form of the integral for the general case of the 3-body problem is of great importance.
Although a Swede by birth, Lindstedt masters German fairly well. His speech is precise and concise, his way of seeing things is clear and general. His attractive personality and the clarity of his character are well known to all Council members. It remains to add that the faculty is happy to have the possibility to present such a candidate. On May 30, 1883 there is a presentation to the Council. In the election there were 31 votes for him, among them 2 written ones; nobody voted against him. To ask Mr. Curator to take pains to transfer Astronomer Lindstedt to the Chair of Applied Mathematics, in the capacity of an ordinary professor. This decision was confirmed to Mr. Minister of National Education on Oct. 3, 1883. (Report of the Senate on Oct. 14, 1883, Nr. 84.)
Page 123. Letter, dated Jan. 26, 1884, by Oettingen himself to the Council of the University:
He [Lindstedt] wants to go on a mission abroad for a scientific purpose during the summer vacation of 1884. Purpose: “I intend to study the new methods of Professor Weierstrass in Berlin, because the investigations of this great mathematician in the theory of functions are until now only partly printed, and known only to the narrow circle of his closest disciples; but these investigations are the basis of one of the most important mathematical disciplines, they are of very great value in themselves, and they are especially important for me in connection with my work on the mechanics of celestial bodies. Moreover, I plan to visit Leipzig, Heidelberg, and Königsberg in order to study the structure and the activities of their seminars, with the intention of implementing this knowledge in our own mathematical seminar.” Besides this he is interested in looking at the Mathematical Cabinet and in trying to decide which of the presently existing models of curves and surfaces are the most suitable in teaching, as our Mathematical Cabinet does not yet have such models, the acquisition of which is necessary for the learning of modern spatial geometry. In the vote on this decision 25 were for it and 1 against. On Apr. 23, 1884 this request was granted (by the Curator of the Dorpat Educational District).
Pages 128–133. Another appeal submitted by A. Lindstedt on Apr. 22, 1885 for a business trip during the summer vacation. Granted on May 17, 1885.
Page 139. A highly interesting list:
• 1879 II Semester: analytic functions – 3 hours.
• 1880 I Semester: elliptic functions – 3 hours.
• 1880 II Semester: elliptic and Abelian functions – 3 hours.
• 1881 I Semester: newest geometry and algebra – 3 hours.
• 1881 II Semester: theory of algebraic surfaces – 4 hours.
• 1882 I Semester: analytic functions, Part I – 4 hours.
• 1882 II Semester: analytic functions, Part II – 4 hours.
• 1883 I Semester: newest geometry and algebra – 4 hours.
Page 144. Appeal to let him spend the winter vacation abroad. (Secretary H. Treffner.)
Page 145. Discussion of the wish of Ewa Lindstedt (née Petersson), who wants to make a trip abroad together with the children (sons: Samuel, born May 1, 1880; Gustav, born May 11, 1882; and Folke, born Sept. 14, 1884); the University has no objections. Daughter Hilda (born June 12, 1881). Of Evangelic-Lutheran faith. Honored with the Order of St. Stanislav, 3rd class, in Oct. 1882. Both Anders and his wife Ewa remained Swedish citizens during his service in Dorpat.
• On Nov. 19, 1885 (appeal by A. Lindstedt) for a trip abroad together with his daughter Hilda.
Page 153. From a presentation by A. Lindstedt to the Council on Nov. 27, 1885, in the form of the following letters:
“richtet der Endesunterzeichnete das ergbenste Gezuch ihn vom 2. Dezember bis zum Schluss des Semesters einen zu einem Urlaub zu einer Reise in Finnland und Russland in Wohlfartsangelegenheiten bewilligen zu wollen”. Signed: Prof. A. Lindstedt.
Page 156. richtet der Endesunterzeichnete das ergebenste Gezuch ihm für den Dauer von 3. Wochen, von Anfang des kommenden Semesters abgerechnet, an und für Wohlfartsangelegenheiten einen Urlaub zu wollen. Signed: Falun in Schweden den 22. Dezember, 1883, Professor Dr. A. Lindstedt.
• On Jan. 14, 1886 there is a letter to the Council from the Dean Weihrauch: . . . habe ich die Ehre zu berichten, dass der Bewilligung des beigelegten, von Herr Professor Anders Lindstedt eingereichte Entlassungsgezuches seitens der Physico-mathematischen Fakultät kein Hinderniss im Wege steht. Signed: Dean Weihrauch.
A letter to the Council (by A.L.): . . . richtet der Endesunterzeichnete das ergebniste Gezuch ihm, wegen eines von ihm befogten Rufes nach Stockholm, den Abschied aus den Staatsdienst, vom 31. Januar ab, bei den hohen oberen erwirken zu wollen. Signed: Falun in Schweden den 30. Dezember, 1883, Professor Dr. A. Lindstedt.
Resignation of Lindstedt from his position as Ordinary Professor of Applied Mathematics from Jan. 31, 1886 on. Application approved by the Ministry of National Education on Feb. 4, 1886. [10], [11], [12], [14]
References
Publications of Lindstedt covered by the Jahrbuch der Fortschritte der Mathematik³⁷ (1882-1891)
[1] A. Lindstedt. Über ein Theorem des Herrn Tisserand aus der Störungstheorie. Acta Math. IX, 1887, 381–384. JFM 19.1218.01.
[2] A. Lindstedt. Sur la détermination des distances mutuelles dans le problème des trois corps. Ann. de l’Éc. Norm. (3) I, 1884, 85–102. JFM 16.1105.01.
[3] A. Lindstedt. Über die allgemeine Form der Integrale des Dreikörperproblems. Astr. Nachr. 2503, 1883. JFM 15.0980.01.
[4] A. Lindstedt. Über die Bestimmung der gegenseitigen Entfernungen in dem Probleme der drei Körper. Astr. Nachr. 2557, 1883. JFM 15.0982.03.
[5] A. Lindstedt. Sur la forme des expressions des distances mutuelles, dans le problème des trois corps. Comptes rendus XCVII, 1883, 1276–1278, 1353–1356. JFM 15.0982.04.
[6] A. Lindstedt. Über die Integration einer gewissen Differentialgleichung. Astr. Nachr. 2482, 1883. JFM 15.0983.01.
[7] A. Lindstedt. Beitrag zur Integration der Differentialgleichungen der Störungstheorie. Petersburg und Leipzig Voss’s Sortiment, 1883. JFM 15.0983.02.
[8] A. Lindstedt. Zur Theorie der Fresnel’schen Integrale. Wiedemann Ann. (2) XVII, 1882, 720–725. JFM 14.0836.01.
³⁷ Extracted from the Jahrbuch database. The ordering is inverse to the chronological.
Other publications of Lindstedt
[9] A. Lindstedt. Undersökning av meridiancirkeln på Lunds observatorium jemte bestemning af densammas polhöjd. (Investigation of the meridian circle at Lund Observatory, together with a determination of its polar height.) In: Thesis, Lund, 1877. Also: Lunds Universitets Årskrift, 13 (1876-77).
[10] A. Lindstedt. Beobachtungen des Mars während seiner Opposition 1877 angestellt auf der Sternwarte zu Lund. In: Lunds Universitets Årskrift, 14, Lund, 1876.
Auxiliary references
[11] Svenskt biografiskt Lexikon XIII, 1898, 612–617.
[12] Svensk Uppslagsbok (Swedish Encyclopedia).
[13] Nationalencyklopedin (Swedish National Encyclopedia).
[14] B. Lindblad. Anders Lindstedt, obituary. Populär Astronomisk Tidskrift 20 (3–4), 1939, 134–135.
Theodor Molien (1861-1941)³⁸
Molien was born in Riga in a family of Swedish descent. He studied in Dorpat, and in Leipzig, taking part in Lie’s famous seminar. He became a docent in Dorpat in 1885. Unable to get a professorship elsewhere, he moved to Tomsk (in Siberia) in 1900, and lived there for the rest of his life. Although generally little known, Molien has to be viewed as a pioneer in contemporary algebra. For more details about this, see the previous paper Uno Kaljulaid, “Theodor Molien, about his life and mathematical work as seen a century later. (A biographical sketch and a glimpse of his work.)” in Section 3 of this Chapter, as well as the book by Kanunov quoted there. For a short but excellent biography of Molien, which also summarizes his scientific achievements well, we refer to Bashmakova [1].
“A big folder” no. 2333: January 18, 1880 – January 17, 1901. Theodor Georg Andreas Molien, born on August 29, 1861 (Riga). Service record:
Councillor of State, Doctor of Pure Mathematics Fedor Eduardovich Molien, Docent of Mathematics, 39 years old and by birth of Evangelic-Lutheran faith, Knight of the Order of the Holy Stanislav III class. Has the Government Medal of Alexander III.
• Has a salary of 1200 rubles a year.
³⁸ Editor’s Note. After this paper was completed (May 2004), we became aware of the thesis of L. B. Stiller [3]. It contains an interesting historical note about Theodor Molien (and Friedrich Ameling) [3, Sec. 6.1.3, p. 76–82]. In particular, Molien’s notable achievements as a theoretician of chess are discussed there, something which Kanunov [2] and his other biographers seem to have overlooked.
• Having finished the complete course at the University of Dorpat/Yurjev, he was nominated for the degree of Candidate of Astronomy (August 5, 1883) and for the degree of Master of Pure Mathematics by the Council of the University (October 24, 1885; he defended his Master’s thesis, which was also certified by the Council on October 29, 1885).
• On November 29, 1885 he was appointed (on the basis of a decision of the Council) Docent of Mathematics at the University mentioned. On December 20, 1888 he was appointed to the rank of Court Councillor (November 19, 1885) and to the rank of Collegiate Councillor (November 19, 1888).
• He has not participated in war, has not been punished.
• Travel abroad.
- In 1886 Molien was abroad during the summer break;
- In 1887 Molien was abroad during the summer break and likewise during the winter break;
- In 1889 Molien was abroad during the summer;
- In 1898 Molien was abroad during the summer;
- In 1899 Molien was abroad during the summer.
• Married to: Elise, née Baranius
• Children:
- son Benedikt, October 20, 1895;
- daughter Elise, March 28, 1894³⁹.
• In 1892 he was sent to Moscow University during the first semester with the object of improving his Russian.
• September 30, 1892. The Council of the University conferred on him the degree of doctor of pure mathematics.
• November 19, 1893. From the last year nominated for State Councillor.
• January 1, 1899. For diligent service and special work he was given the Order of St. Stanislav of the 3rd degree.
• August 11, 1899. He is sent abroad on a scientific mission with the Robert grant during the second semester.
• December 16, 1900. He is appointed ordinary professor of mathematics at Tomsk University of Technology (from September 1, 1900 on).
³⁹ Petr Krylov and Aleksandr Nikolskiĭ in Tomsk have kindly communicated to me the following information: F. E. Molin had a son and a daughter. The son, Benedikt, was killed in a battle of the Civil War in 1919. Elise was an assistant professor of the Department of Classical Philology in Tomsk State University. She died in 1988 and had no children. The family archive was acquired by the Scientific Library of Tomsk State University in 1994. Unfortunately, the documents of the archive are not sorted yet. For this reason, the archive is unavailable to investigators.
• He is a Russian citizen.
Curriculum Vitae
I was born on August 29, 1861 in the City of Riga, I got my first education at the Riga Gymnasium of Imperator Nikolaĭ I, and began to study astronomy at Dorpat University in 1880. In 1883 I obtained the degree of Candidate of Astronomy; after finishing the course I studied mathematics for 3 semesters in Leipzig. In May 1885 I passed the exam for the degree of Master of Pure Mathematics and, in October of the same year, I defended my Master’s thesis. At the end of the year 1885 I became a Docent at Dorpat University. I defended my Doctoral thesis in September 1892. Having got a mission to Moscow in the beginning of 1892, I began to get more closely acquainted with the Russian language, allowing me to listen to lectures at the university. Signed by Th. F. Molin. (He wrote his name himself in this way, in Cyrillic letters.) [1], [2], [3], [4], [5]
List of scientific papers
[1] T. Molien. Bahn des Kometen 1880, III. Astronomische Nachrichten 2519, 1883, 353–362.
[2] T. Molien. Zusatz zur Bahnbestimmung des Kometen 1880 III. Astronomische Nachrichten 2519, 1883, ???–???.
[3] T. Molien. Über gewisse, in der Theorie der elliptischen Functionen auftretenden Einheitswurzeln. Berichte der k. Sächsischen Gesellschaft der Wissenschaften, 1885.
[4] T. Molien. Über lineare Transformation der elliptischen Functionen. Master’s thesis, Dorpat, 1885.
[5] T. Molien. Über Systeme höherer komplexen Zahlen. Math. Ann. 41, 1893, 83–156.
• In November 1894. The Educational District of Riga forwarded from the French Ambassador to the Rector a brochure and a medal on the occasion of the 70th birthday of the geometer Hermite, to be handed over to Molien. Molien has received these awards.
• On January 20, 1901, a letter to the Rector arrived from the Director of the Tomsk Technical Institute with the request to clarify to what extent and from what sources his salary was paid.
“A small folder”: 1885 – 1901. Page 3 (Dec. 5, 1885). In reply to the request, a curriculum vitae for Theodor Molien, Docent of Mathematics at the Mathematical Faculty of Imperial Dorpat University, Master of Mathematical Sciences was sent to the Curatorship of Dorpat.
Page 8. Service record drawn up in 1887 (1888?).
A docent, 26 years of age, of Evangelic-Lutheran faith, has no distinguishing signs, has a salary of 900 rubles. . . . “Having finished the science curriculum at Dorpat University with the degree of Candidate of Astronomy in 1883, he obtained in 1885 at the same university the degree of Master of Sciences in mathematical sciences. From November 29, 1885 on (Nr. 5999) promoted to docent at the Mathematical Faculty of Dorpat University. He was on vacation abroad: 1886 during the summer break, 1887 at the time of the whole winter vacation and remained during the term. Unmarried.”
Page 10. On May 5, 1890 there is an order “To the Board of the Imperial Public Library”: American Journal of Mathematics, Vol. IV, 1881.⁴⁰ He asks for the possibility to keep it for 2 weeks; the journal named is needed for scientific work. Signature: Docent Master Th. Molien. The answer arrives on May 5, 1890 (in the library there is no such delivery). This is followed by a letter.
Page 12. The request from Th. Molien to the Highly Honored Board (der Kaiserlichen Universität Dorpat) to obtain from the Imperial Academy of Sciences, for a short time, the following books, which are needed in his research as a docent⁴¹:
(1) Proceedings of the American Acad. of Arts and Sciences II ser., vol. II, 1867-73.
(2) Proceedings of the American Acad. of Arts and Sciences, vol. X, XI, 1875.
(3) American Journal of Mathematics, Vol. IV, 1881.
• On January 18, 1901, a letter of the Chancellor of the Dorpat Educational District (from Jan. 15, Nr. 294, Riga) to the Council of the University, confirming that in accordance with Order no. 83 dated December 16, 1900, F. Molin has been appointed to the position of ordinary professor at Tomsk University of Technology.
Page 15. The Rector’s letter to the Director of the Tomsk Technical Institute from January 22, 1901, where he makes a complaint and asks him to reimburse 49 rubles paid to Molien as salary for December 16, 1900 – Jan. 1, 1901, and to send this sum as fast as possible to Yurjev University. The letter refers to § 5 of Chapter 1 of the Order issued by the Ministry of National Education (to pay a docent’s salary of 1200 rubles a year).
Page 16. The Rector’s letter to the Department of National Education (from January 15, 1901, Nr. 294, Riga) that begins with the same story, and then at the end comes a note:
⁴⁰ This volume contains a posthumous paper by Benjamin Peirce (1809-1880). It has the title “Linear Associative Algebra. With Notes and Addenda by C.S. Peirce, son of the Author” (pp. 97-229).
⁴¹ The last one is a paper by B. Peirce entitled “Linear associative algebras”. It is conceivable that the former two also contain reports by the same author.
“According to this, I inform the Department of National Education that if this sum comes in time to the University, it will be transferred to the Department, according to the official regulation No. 28984 from November 30, 1898.”
Page 17. However, the 392 rubles were received. A receipt for this is also given.
Page 18. Now follows a letter from Tartu University (by the Rector?) to the Department of National Education, stating that there is a letter from Molien, saying that he has obtained a salary of 392 rubles for the position of docent from September 1, 1900 to January 1, 1901.
References
Publications of Molien
See the bibliography of the paper “Theodor Molien, about his life and mathematical work . . . ”, this Volume, Section 3.
Other references
[1] I. G. Bashmakova. Fedor Eduardovich Molin. In: Dictionary of scientific biography, Vol. IX. Charles Scribner’s Sons, New York, 1974, 457–458.
[2] N. F. Kanunov. Fedor Eduardovich Molien. Nauka, Moscow, 1983.
[3] L. B. Stiller. Exploiting symmetry on parallel architectures, Ph. D. thesis. Johns Hopkins University, Baltimore, Maryland, 1995.
Karl Weihrauch (1841-1891)⁴²
Karl Weihrauch (also Carl Wayrauch) was born in Mainz, Germany in 1841. He studied first at Heidelberg and then at Giessen, where he got his Ph.D. in 1860, and went to Livonia⁴³ in 1862, where he worked as Senior Teacher (Oberlehrer) of mathematics at various secondary schools: from 1862 on at the Private Gymnasium in Birkenruh near Wenden (Latvian: Cēsis) and then from 1862 on at the Crown Gymnasium (Kronsgymnasium) in Ahrensburg.⁴⁴ In 1869 Weihrauch defended a master’s thesis at Dorpat/Tartu, on a problem dealing with partitions. The question concerns the number f_n(A) of solutions to the Diophantine equation a_1 x_1 + · · · + a_n x_n = A. Subsequently he published several papers on this or related problems of algebra in Schlömilch’s Zeitschrift für Mathematik. Later he served as a Professor of Physical Geography and Meteorology in Dorpat, all the time continuing to publish mathematical papers as well.
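The counting problem mentioned above can be illustrated by a minimal sketch. It assumes that the coefficients a_i are positive integers and that solutions are sought in non-negative integers x_i; these conventions are assumptions made here for illustration, as is the function name count_solutions. It is not a reconstruction of Weihrauch's own method, only a standard dynamic-programming count.

```python
def count_solutions(coeffs, total):
    """Count non-negative integer solutions of a_1*x_1 + ... + a_n*x_n = total.

    Assumes every coefficient is a positive integer (illustrative convention).
    """
    # ways[t] = number of solutions with right-hand side t,
    # using only the coefficients processed so far
    ways = [1] + [0] * total
    for a in coeffs:
        for t in range(a, total + 1):
            ways[t] += ways[t - a]
    return ways[total]


if __name__ == "__main__":
    # x_1 + 2*x_2 = 4 has the solutions (4, 0), (2, 1), (0, 2)
    print(count_solutions([1, 2], 4))
```

With coefficients 1, 2, ..., n the count equals the number of partitions of A into parts of size at most n, which is the connection with partitions noted in the text.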
⁴² Editor’s Note. In German, the name means “frankincense”.
⁴³ Historical province comprising most of present-day Estonia and Latvia. The Livonians were a Fenno-Ugrian people who around 1200 AD lived on both sides of the Gulf of Riga; the City of Riga was founded by the Germans on their territory in 1201; today this people, and their language (close to Estonian), are practically extinct.
⁴⁴ Town nowadays called Kuressaare (Kingissepp in Soviet times, named after the Estonian revolutionary Viktor Kingissepp (1888–1922), executed after the abortive communist coup in 1922) on the island Saaremaa (Ösel).
“A big folder” no. 3183, Stock 402 no. 3/277.
Page 9. Dr. Carl Weyhrauch (Weihrauch Johann-Karl-Friedrich Filippovich) was born in Mainz on November 23, 1841. Has received a certificate from Pastor Bauder of the Evangelic Congregation in Mainz on November 20, 1862 (that he has begun as a servant of this religious belief and been confirmed, being admitted to The Holy Secret [?????] in 1850.)
Page 11. In 1875 a German passport had been issued to him.
Page 23. Dr. Weihrauch was until 1871 (on February 12, 1871 there is a letter to the Council of the University about this) senior teacher of mathematics at the Ahrensburg Gymnasium. He was then promoted to Docent of Physics of the Earth at the University (from July 1, 1871 on).
Page 36. On September 25, 1871 he asks the Council of the University for an allowance of 100 rubles, to cover his expenses in connection with his moving from Ahrensburg to Dorpat.
Page 48. There is a record with some previously approved applications by Docent Weihrauch to send him on a vacation abroad so that he could acquaint himself with some observatories. He obtains permission in order to improve his health, and to recover his strength (on the basis of a new application) for 29 days (he had pneumonia and was recovering from a throat infection), but he needs medical treatment and he ought to abstain from lecturing this term and seek a better climate. That this should be for a longer period in the South is demanded by his doctor. He went to Trieste in the summer of 1874, and he now asks for a prolongation of his sojourn there. He lectured on the following subjects: meteorology; physical geography; terrestrial magnetism; algebraic analysis; determinants; continued fractions; Diophantine [undetermined] equations; and practical work. During these three years he has made many meteorological observations, 6 times each day, and the corresponding calculations, which have been presented in print, and in numerous reports.
Concerning mathematics he has sent for printing to Zeitschrift für Mathematik an expansion of material contained in Chapter 1 of his master’s thesis, as well as 2 papers to the journal Determinantenlehre [24]. From this the conclusion can be drawn that great powers are hidden in him. However, as Doc. Weihrauch has not published any monograph in meteorology or in the field of mathematics, the Faculty is not going to propose him for a position of ordinary professor but proposes him for a position of extraordinary professor of meteorology and physical geography (this position was opened on January 1, 1875). On December 26, 1874 there was an election, and in the ballot 32 of the votes were in favor of him and nobody was against (this was ratified by the Minister of Education Count Tolstoĭ on March 29, 1875).
Page 88. On November 6, 1875 Weihrauch asks for a possibility to spend the winter holidays abroad. This was granted on December 4, 1875.
Presentation of the Physico-Mathematical Faculty on March 3, 1877 in the Council of the University
From his election to extra-ordinary professor of meteorology and physical geography at the end of 1874 on, Weihrauch did not only keep all instruments in good shape, but also repaired them and increased their number. On the top of the 6 usual daily observation he added 2 more hours. He has published in the Dorpater Meteorologische Beobachtungen in 1874 and 1875. In connection with these observations one can rely on the absolute exactness and in all data, provided by professor Weihrauch. This is connected with his methodical control of the computations. Also there are the 10 year averages, a comprehensive paper, which was sent to the printer this year. Of great interest is also the influence of the moon’s position on weather. This occupies him already for two years and involves a lot of calculating. During the past two years Weihrauch has published, in the area of pure mathematics the following: (1) Über unimodulare Determinanten. (2) Ueber die allgemeine unbestimte Gleichung mit vier Unbekanten. (3) Theorie der Restproductsumme 1. (These 3 papers are printed in Schlömilchs Zeitschrift für Mathematik und Physik.45.) The Faculty expressed its opinion over the unprecedented educational gifts of Professor Weihrauch. In this connection they propose him for a promotion to an ordinary professor. The election took place on March 8, 1877 (32 for him, 2 of them in letter form, no voices against). Page 113. On November 9, 1878 he asks for permission to be sent abroad for 6 months with a scientific purpose, beginning January 1, 1879 to acquaint himself with some central observatories with meteorological surveys. This was granted on December 13, 1878 in the Ministry of Education (The Curator Saburov46). On January 31, 1879 there arrives a letter from Weihrauch, already in Trieste. [He had the intention also to participate in the Congress of Meteorology. This took place in Rome in April 1879, and the Faculty allocated to him 400 rubles for this purpose. 
In the voting there were 30 for him, and 3 against.] In the summer of 1881 he asks for permission to spend his vacation abroad because of personal reasons. In the summer of 1884 the same procedure is repeated. Page 162. Professor Weihrauch and Professor von Kennel47 ask for 400 rubles for investigating Lake Peipus48 during the summer vacation from the point of view of zoology and physical geography (32 for him, 1 against). For many years naturalists in Germany, Switzerland and France have been engaged in the study of small or big inland bodies of water. Therefore we have the following goal: an exact determination of the fauna; opening up, hitherto unknown lower supplies; the spread of horizontal as well as lower animals; their resettling. Connection with fish. Great praise and explanation why Lake Peipus is so special. This was approved on April 17, 1889.
45 These seem to be preliminary titles of the papers in view; see the References below.
46 Andreĭ Saburov was appointed Curator of the University in 1875. In 1880 he became Minister of National Education. [39, p. 121.]
47 Julius Thomas von Kennel was appointed professor of zoology at Dorpat University in 1886; he had previously been private docent at Würzburg University. [39, pp. 194, 270, 347, 357].
48 Big lake forming the frontier between Estonia and Russia.
CHAPTER V. HISTORY OF MATHEMATICS
• On June 26, 1889 one discusses the need to send Weihrauch abroad to rest and to improve his health [on the basis of an attestation] given to him by his family doctor (former Professor [at Tartu] von Holst49) that he has to go to Wiesbaden in order to cure his podagra, from which he has been suffering over a period of years. (Then he was the Dean of the Physico-Mathematical Faculty, an ordinary professor and a Councillor of State.) From July 7, 1889 Dr. Weihrauch was sent on vacation (Brunner50 was then the Pro-Dean) until 1889.
• On September 6, 1889 Weihrauch asks again for 2 months of vacation abroad, so that he could recover his “respiratory health”. Because of his illness he has not been able to use the possibility for a vacation. Due to this he asks to be relieved from his duties, from the change of semesters on, as his illness has deteriorated and he has to leave earlier. [On July 19, 1890 it was decided to let him remain in service for another 5 years, taking into account that he has been employed as a teacher for 25 years.]
Page 188. [continued] Service Record:
• At the University of Dorpat Councillor of State (July 7, 1865 – July 7, 1870);
• At the University of Dorpat Docent of Physical Geography (July 1, 1871 – January 1, 1875);
• At the University of Dorpat – Extra-Ordinary Professor in the chair of Theoretical Geography and Meteorology (January 1, 1875 – April 16, 1877);
• At the University of Dorpat – Ordinary Professor (April 16, 1877 – June 1, 1890).
After 25 years of service he retired with a pension on June 1, 1890. As a professor he received a salary of 2400 rubles; as dean 400 rubles; as pension 1429 rubles yearly. Karl Weihrauch died on December 7, 1891. Wife Matilde Weihrauch – children Robert (June 8, 1877), Eliza (July 24, 1878) and Carl (February 15, 1882). Full names of the children: Filipp-Alexander-Robert; Karolina-Eliza-Johanna; Karl-Ernest.
“A small folder” no 3205, Stock 402 no. 3/278: 1871–1891. In 1889 Weihrauch was a professor of physical geography and mineralogy at Dorpat University. He died in 1891 and then the sum of 300 rubles was given to his wife Matilde Weihrauch (who had 3 small children) to cover various expenses connected with the burial of her husband.
49 Probably Johannes von Holst (1823–1906), gynaecologist. On his initiative, the building of the delivery ward in Tartu was rebuilt in 1860–1861. He had many famous students. [39, p. 248.]
50 Georg Bernhard Brunner (1835-1892), professor of Agricultural Economy in Dorpat 1876-1890. [39, pp. 201, 202.]
4. Notes on five 19th century Tartu mathematicians
Appendix. On the early life of Karl Weihrauch. (Excerpt from [41, pp. 123–125])
Karl Weihrauch was born in Mainz on November 11, 1841 as the son of the school teacher Philipp Weihrauch. His mother’s name was Anna Elisabeth, née Schmidt [43]. K. Weihrauch got his basic training (elementary and secondary school (gymnasium)) in Mainz, and then studied mathematics and chemistry at Heidelberg University. In the years 1858-1860 he continued his studies at Giessen University, where he received the degree of Ph. D. on July 13, 1860. For one year he was an assistant teacher in the Mainz Gymnasium. In 1861 he moved to Estonia as a private teacher, and a year later as a mathematics teacher to the Birkenruhe (Latvian: Berzaine) Private School near Cēsis (Wenden) [44]. From here he sent, on February 17, 1863, an application to Tartu University, in which he asked for permission to pass an exam for obtaining the profession of senior teacher of mathematics. This exam took place on April 8, 1863, in front of a committee, to which belonged the Rector Professor F. Bidder51 and the professors of the Physico-Mathematical Faculty P. Helmling52, H. Mädler53 and L. F. Kämtz54. To all questions of the exam (except to question 1) he gave the answer “very good”. In addition to the exam he had to give a trial lecture and present a written paper, which obtained the approval of the committee. It was also remarked there about the written paper “Versuch einer Behandlung einiger Gegenstände aus der Wärmelehre” that “it shows that the author has the ability to treat scientific questions independently, and possesses a maturity for pedagogical work” [45]. On April 15, 1863 K. Weihrauch obtained an order from the Curator of the Tartu Educational District to move to Arensburg (under Soviet rule Kingisepp [now Kuressaare]) as a senior teacher of mathematics.
51 Georg Friedrich Karl Heinrich Bidder (1810-1894) was a famous physiologist. [39, p. 236.]
52 Peter Helmling (1817-1901), mathematician of German extraction, taught at Dorpat from 1852 on, published papers on definite integrals and ordinary differential equations.
53 Johann Heinrich Mädler (1794-1874), taught at Dorpat 1840-1865.
54 Ludwig Friedrich Kämtz (1801-1867), physicist and meteorologist, educated in Halle, taught at Dorpat 1841-1865.

References
Publications of Weihrauch covered by the Jahrbuch der Fortschritte der Mathematik55 (1881-1891)
55 Extracted from the Jahrbuch database. The ordering is inverse to the chronological.
[1] K. Weihrauch. Über eine algebraische Determinante mit eigentümlichem Bildungsgesetz der Elemente. Schlömilch Z. XXXVI, 1891, 34–40. JFM 23.0148.03.
[2] K. Weihrauch. Über gewisse goniometrische Determinanten und damit zusammenhängende Systeme von linearen Gleichungen. Schlömilch Z. XXXVI, 1891, 71–77. JFM 23.0151.03.
[3] K. Weihrauch. Fortsetzung der neuen Untersuchungen über die Bessel’sche Formel und deren Verwendung in der Meteorologie. Schriften herausg. von der Naturforscher-Gesellschaft bei der Universität Dorpat. K. F. Koehler, Leipzig, 1890. JFM 22.1235.01.
[4] K. Weihrauch. Bildung von Taupunkt-Mitteln. Met. Zeitschr. VII, 1890, 429–432. JFM 22.1250.01.
[5] K. Weihrauch. Ableitung des mittleren Sättigungsdeficits. Met. Zeitschr. VI, 1889, 73–74. JFM 21.1246.01.
[6] K. Weihrauch. Über gewisse Determinanten. Schlömilch Z. XXXIII, 1888, 126–128. JFM 20.0148.03.
[7] K. Weihrauch. Die elementaren Ableitungen des Satzes von der “ablenkenden Kraft der Erdrotation”. Met. Zeitschr. (2) V, 1888, 81–82. JFM 20.0942.01.
[8] K. Weihrauch. Neue Untersuchungen über die Bessel’sche Formel und deren Verwendung in der Meteorologie. Th. Hoppe und E. J. Karow; K. F. Koehler, Dorpat, Leipzig, 1888. JFM 20.1270.01.
[9] K. Weihrauch. Theorie der Restreihen zweiter Ordnung. Schlömilch Z. XXXII, 1887, 1–21. JFM 19.0179.02.
[10] K. Weihrauch. Über Pendelbewegung bei ablenkenden Kräften, nebst Anwendung auf das Foucault’sche Pendel. Exner Rep. XXII, 1886, 480–491. JFM 18.0865.01.
[11] K. Weihrauch. Einfluss des Widerstandes auf die Pendelbewegung bei ablenkenden Kräften, mit Anwendung auf das Foucault’sche Pendel. Exner Rep. XXII, 1886, 643–675. JFM 18.0865.02.
[12] K. Weihrauch. Über die dynamischen Centra des Rotationsellipsoids mit Anwendung auf die Erde. In: Bull. de l. Soc. Imp. d. Nat. de Moscou, Moscow, 1886, 643–675. JFM 18.0928.02.
[13] K. Weihrauch. Über die Zunahme der Schwere beim Eindringen in das Erdinnere. Exner Rep. XXII, 1886, 396–401. JFM 18.1086.01.
[14] K. Weihrauch. Über die Berechnung meteorologischer Jahresmittel. Mattiesen, Dorpat, 1886. JFM 18.1119.01.
[15] K. Weihrauch. Über die Abweichung eines freifallenden Körpers von der Verticalen. Met. Zeitschr. II, 1885, 27–29. JFM 17.0880.02.
[16] K. Weihrauch. Ein neuer Satz aus der Anemometrie. Met. Zeitschr. I, 1885, 291–293. JFM 17.1150.01.
[17] K. Weihrauch. Über das Sättigungsdeficit. Met. Zeitschr. I, 1885, 260–264. JFM 17.1151.03.
[18] K. Weihrauch. Über doppelt-orthosymmetrische Determinanten. Schlömilch Z. XXVI, 1881, 64–70. JFM 13.0127.01.
[19] K. Weihrauch. Wert einer doppelt-orthosymmetrischen Determinanten. Schlömilch Z. XXVI, 1881, 132–133. JFM 13.0127.02.
[20] K. Weihrauch. Eine Polynomenentwickelung. Schlömilch Z. XXVI, 1881, 127–132. JFM 13.0189.01.
Other mathematical publications of Weihrauch
[21] K. Weihrauch. Beiträge zur Lehre von den unbestimmten Gleichungen ersten Grades. Programm des Gymnasium zu Ahrensburg, 1866.
[22] K. Weihrauch. Untersuchungen über eine Gleichung des ersten Grades mit mehreren Unbekannten, Dorpat, 1869.
[23] K. Weihrauch. Über die Formen, in denen die Lösungen einer diophantischen Gleichung vom ersten Grades enthalten sind. Schlömilch Z. XIX, 1874, 53–67.
[24] K. Weihrauch. Zur Determinantenlehre. Determinantenlehre, 1874.
[25] K. Weihrauch. Die Anzahl der Lösungen diophantischer Gleichungen ersten mit teilerfremden Koeffizienten. Schlömilch Z. XXII, 1877, 97–111.
[26] K. Weihrauch. Über die Ausdrücke Σf x(m) und die Umgestalltungen der Formel für die Lösungsanzahlen; Anwendung der Formeln der Kombinationslehre. Schlömilch Z. XX, 1875, 112–117.
[27] K. Weihrauch. Anzahl der Auflösungen einer unbestimmten Gleichung für einen Spezialfall von nicht teilerfremden Koeffizienten. Schlömilch Z. XX, 1875, 314–316.
[28] K. Weihrauch. Zur Konstruktion einer unimodularen Determinante. Schlömilch Z. XXI, 1876, ??–??.
[29] K. Weihrauch. Ein Satz von ebenen Viereck. Schlömilch Z. XXVI, 1881, 1–21.
[30] K. Weihrauch. Über eine algebraischen Determinante mit eigentümlichen Bildungsgesetz der Elemente. Schlömilch Z. XXVI, 1881, 34–40.
[31] K. Weihrauch. Über gewise goniometrische Determinanten und damit zusammenhängenden Bildungsgetz Systeme von linearen Gleichungen. Schlömilch Z. XXVI, 1881, 71–77.
[32] K. Weihrauch. Zusammenhang der Seiten des regelmässigen 5- und 10-Ecks mit dem Radius. Grünerts Archiv der Mathematik und Physik 45, 1866, 355–356.
[33] K. Weihrauch. Zur geometrischen Construction der vierten und der mittleren Proportionale. Grünerts Archiv der Mathematik und Physik 46, 1866, 336–337.
[34] K. Weihrauch and A. J. Oettingen. Meteorologische Beobachtungen in Dorpat 1866-70 und Kritik der Beobachtungsmethoden, I–II, Dorpat, 1866.
[35] K. Weihrauch. Anemometrischen Scalen für Dorpat. Ein Beitrag zur Klimatologie Dorpats. In: Archiv für Naturkunde Liv-, Ehst- und Kurlands. Dorpater Naturforscher Gesellschaft, Vol. 9, Dorpat, 1885.
[36] K. Weihrauch. Neue Untersuchungen über die Bessel’sche Formel und deren Verwendung in der Meteorologie, Dorpat, 1890.
Auxiliary references
[37] J. C. Poggendorff’s biographisch-litterarisches Handwörterbuch, Dritter Band (1858 bis 1883). Barth, Leipzig, 1898.
[38] J. C. Poggendorff’s biographisch-litterarisches Handwörterbuch, Vierter Band (Die Jahre 1888 . . . ). Barth, Leipzig, 1904.
[39] K. Siilivask (ed.) Tartu Ülikooli Ajalugu, II (1798-1918). Eesti Raamat, Tallinn, 1982. English translation of the entire series: Karl Siilivask (ed.), History of Tartu University 1632-1982. Perioodika, Tallinn, 1985.
[40] G. G. Levitskiĭ (ed.) Biographical Dictionary of Professors and Teachers of the Imperial Yurjev, formerly Dorpat, University one hundred years from its foundation (1802-1902), Vol. I. K. Mattisen, Yurjev, 1902.
[41] L. Kongo. Johann Karl Friedrich Weihrauch – Tartu Ülikooli esimene füüsilise geograafia ja meteoroloogia professor (Johann Karl Friedrich Weihrauch, the first professor of physical geography and meteorology at Tartu University). Tartu Ülikooli ajaloo küsimusi (Questions of the History of Tartu University.) 5, 1977, 123–137.
[42] E. Tammiksaar. Das Fach der Geographie an der Universität DorpatTartu in den Jahren 1802-1891. In: Jahrbuch der Akademischen Gesellschaft für Deutschbaltische Kultur in Tartu (Dorpat) Band 1., Tartu, 1996, 78–102. See the Section “Johann Karl Friedrich Weihrauch: Geograph oder Geophysiker?” written by E. Tammiksaar, pp. 20.-23.
References for the Appendix
[43] Deutsch-Baltisches Biografisches Lexikon 1710-1960. Böhlau Verlag, Köln, Wien, 1970.
[44] Birkenruher Album 1825-1892, St. Petersburg, 1910.
[45] Archive of the Estonian Soviet State Archives, f. 402, nim. 3, s.-ü. 277, l. 15.
CHAPTER VI Popularization of Mathematics
1. [K68a] and [K69b] On the geometric methods of Diophantine Analysis
“What’s the good of that?” said Rabbit. “Well,” said Pooh, “we keep looking for Home and not finding it, so I thought that if we looked for this Pit, we’d be sure not to find it, which would be a Good Thing, because then we might find something that we weren’t looking for, which might be just what we were looking for, really.” A. A. Milne, The house at Pooh corner
The domain of Number Theory which is concerned with problems and results about the search for integer solutions of algebraic equations is [nowadays] referred to as Diophantine Analysis. Its elementary problems [early] caught the attention of many mathematicians.1 But solutions of seemingly different problems were usually obtained each time by a separate artifice, so in the opinion of mathematicians “chaos” ruled there for a long time. In the course of the 20th century greater clarity about these matters was brought by the flourishing of algebraic geometry, a discipline whose main objects of study are the so-called algebraic varieties. Namely, it became clear that with each algebraic equation, or system of equations, one can associate a certain algebraic variety; the solution, or the solutions, can then be interpreted as points of this variety. The Diophantine problem now consists of finding all points with integer or rational coordinates on it. One may ask in which way such a geometric point of view is more advantageous than the earlier methods used in Diophantine Analysis. The answer is the following: in the case of an algebraic variety one has to deal with a whole series of algebraic and topological structures – it is a topological space in several different topologies, an analytic space, a Lie group etc. The theory of the structures referred to here is today very rich in fundamental results and ideas, which, along with arithmetical considerations, can be used to great advantage in the theory of equations. Algebraic Geometry provides a language for the clarification of such simple2 notions as the number of unknowns in the equations, the degree of the equations, change of variables etc. The geometric methods brought order into the “chaos” of Diophantine Analysis, classifying its problems according to the invariants of the corresponding varieties. An example of such an invariant is the dimension of the variety. 
In our paper we will mainly deal with one dimensional varieties, which usually are called algebraic curves. We shall here become more closely acquainted with the notion of algebraic curve and with the classification of algebraic curves.3 A major part of the actual material was obtained by L. J. Mordell, A. Weil and C. L. Siegel in the years 1920–30. The Reader will here have a chance to penetrate rather deeply into the results and problems of the theory of elliptic curves.
1 Translators’ note. For the history of Diophantine analysis, see the book [1] written by I. G. Bashmakova. The traditional view has been that Diophantus wrote just a collection of problems, whereas this author advocates the opinion that the Greeks, indeed, possessed deep insights in algebraic geometry.
2 . . . and thus possessing from the point of view of Diophantine question a completely mystical meaning . . .
I. Algebraic introduction
– John, what topic did you treat in mathematics at school?
– Addition.
– How much does it make, if you add three to two apples?
– I do not know. We did it with oranges.
English anecdote
1.1. Number fields4
Let us compare two rather well-known domains of numbers: the set of integers and the set of rational numbers. They will be denoted by Z and Q, respectively. Adding, subtracting and multiplying integers, we again get integers. In this sense one says that the domain of integers is closed under the three operations mentioned. More exactly, one says that the domain of integers Z is a ring. But division is not always possible in the domain of integers. Now the question “is the integer b divisible by the integer a?” is equivalent to the question “does the equation ax = b have a solution in integers?”. Therefore the ring of integers Z is an example of a commutative ring in which the equation ax = b (a ≠ 0) does not always have a solution which is also an element of the same ring. The situation is different for the domain of rational numbers Q, which is closed with respect to division. In this domain Q each equation ax = b (a ≠ 0, a, b ∈ Q) has a solution x = b/a ∈ Q. Thus we have reached a simple example of a so-called field. A field is a commutative ring in which all equations ax = b (a ≠ 0) have a unique solution; or, in other words, it is a commutative ring with a unit element such that each element other than zero has an inverse element.5 As examples of fields we have the domain of rational numbers Q and the domain of real numbers R, and likewise the domain of complex numbers C = {a + bi | a, b ∈ R, i = √−1}. Complex numbers are usually identified with points in the real plane (cf. Figure 1).
3 See also [7], Section 2.
4 A more prepared Reader can begin with Section 1.3. Translator’s note. For an introduction to basic notions of algebra such as group, semigroup, ring, field etc., we refer to the classical texts by B. L. van der Waerden [19], S. Lang [11], and P. M. Cohn [4]. We mention further two excellent books by A. G. Kurosh [9, 10].
5 See E. Gabovitsh, Algebra põhimõisted I-V. (In Estonian: The fundamental notions of algebra). In: Mathematics and Our Age 6-10. (Translator’s note. Reprinted in the latter’s book Stories about contemporary mathematics. (Estonian) Valgus, Tartu, 1967.) See also footnote 4.
[Fig. 1. The complex number z = x + iy represented as a point of the real plane with coordinate axes x and y and origin 0.]
Let us give here also another, less known way of presenting some other properties of the last field. Complex numbers can be viewed as second order square matrices. Indeed, consider the ring of real second order square matrices
\[ R_2 = \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \ \text{where } a, b, c, d \in R \right\}. \]
We introduce now a one-to-one correspondence
\[ a + bi \longleftrightarrow \begin{pmatrix} a & b \\ -b & a \end{pmatrix}. \]
In particular, to a real number a there corresponds a so-called scalar matrix:
\[ a \longleftrightarrow \begin{pmatrix} a & 0 \\ 0 & a \end{pmatrix}, \]
and to the imaginary unit i the matrix:
\[ i \longleftrightarrow \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \]
Making use of the recipes for adding and multiplying matrices and complex numbers, we make sure that this correspondence establishes an isomorphism between the domain of complex numbers C (each field is a ring) and a certain subring A of the ring of matrices $R_2$. As this subring is isomorphic to the field C, it is likewise a field. At the same time, the ring of matrices $R_2$ is far from being a field. Indeed, it contains divisors of zero:
\[ \begin{pmatrix} a & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & b \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix} \longleftrightarrow 0. \]
But in a field there cannot be divisors of zero. By the way, the last assertion is easy to verify: if we should have ab = 0 (a ≠ 0, b ≠ 0), then one gets $b = a^{-1}ab = a^{-1}0 = 0$, that is b = 0, a contradiction.
In the fields Q, R and C given in the examples, there are infinitely many elements. The simplest finite field consists of two elements – the zero element 0 and the unit element e, with
0 · 0 = 0 · e = e · 0 = 0, e · e = e, 0 + 0 = 0, 0 + e = e + 0 = e, e + e = 0.
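The correspondence between complex numbers and matrices described above can be checked mechanically. The following Python sketch is only an illustration and is not part of the original text; the helper names `to_matrix`, `mat_add` and `mat_mul` are hypothetical, and matrices are stored as nested tuples.

```python
def to_matrix(z):
    """Map a + bi to the real 2x2 matrix ((a, b), (-b, a))."""
    a, b = z.real, z.imag
    return ((a, b), (-b, a))


def mat_add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(x + y for x, y in zip(r, s)) for r, s in zip(m, n))


def mat_mul(m, n):
    """Ordinary product of two 2x2 matrices."""
    return tuple(
        tuple(sum(m[i][k] * n[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


z, w = complex(1, 2), complex(3, -1)

# The correspondence respects addition and multiplication.
assert to_matrix(z + w) == mat_add(to_matrix(z), to_matrix(w))
assert to_matrix(z * w) == mat_mul(to_matrix(z), to_matrix(w))

# The matrix of i squares to the matrix of -1.
I = to_matrix(1j)
assert mat_mul(I, I) == to_matrix(-1)

# The full matrix ring has divisors of zero, so it is not a field.
A = ((2, 0), (0, 0))
B = ((0, 0), (0, 3))
assert mat_mul(A, B) == ((0, 0), (0, 0))
```

Small integer entries are used so that the floating point comparisons are exact; for general real entries one would compare up to a tolerance.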
We obtain a series of examples of finite fields by taking the fields of remainder classes Z/(p), where p is an arbitrary fixed prime number. They are defined as follows. In the domain of integers Z we introduce a partition into classes, putting into one class all integers which give the same remainder upon division by p. In this way one gets the remainder classes $\bar 0, \bar 1, \bar 2, \dots, \overline{p-1}$; here we denote by $\bar k$ the class consisting of the integers $p \cdot n + k$. Addition and multiplication of the classes thus obtained are defined, for $0 \le m, n < p$, by the formulae
\[ \bar m + \bar n = \begin{cases} \overline{m + n}, & \text{if } m + n < p, \\ \overline{m + n - p}, & \text{if } m + n \ge p; \end{cases} \qquad \bar m \cdot \bar n = \bar r, \ \text{if } mn = p \cdot q + r, \ 0 \le r < p. \]
For example, Z/(3) consists of the classes $\bar 0, \bar 1, \bar 2$, where
\[ \bar 0 + \bar 0 = \bar 0, \quad \bar 0 + \bar 1 = \bar 1, \quad \bar 0 + \bar 2 = \bar 2; \quad \bar 1 + \bar 1 = \bar 2, \quad \bar 1 + \bar 2 = \bar 0; \quad \bar 2 + \bar 2 = \bar 1; \]
\[ \bar 0 \cdot \bar 0 = \bar 0, \quad \bar 0 \cdot \bar 1 = \bar 0, \quad \bar 0 \cdot \bar 2 = \bar 0; \quad \bar 1 \cdot \bar 1 = \bar 1, \quad \bar 1 \cdot \bar 2 = \bar 2; \quad \bar 2 \cdot \bar 2 = \bar 1. \]
We leave it to the Reader to check that [passing to the case of general p] the set of classes $\bar 0, \bar 1, \bar 2, \dots, \overline{p-1}$ equipped with these operations of addition and multiplication is a field, which we denote by Z/(p).
Next, we consider an arbitrary field K. Its unit element (the solution of the equation ax = a (a ≠ 0)) shall be denoted e. The field K being closed under addition and multiplication, the integer multiples of the unit element 0, ±e, ±2e, ±3e, . . . , ±ne, . . . are likewise elements of K. Let us first have a look at the case when the elements ne, for different multipliers n, are all distinct. Such a field K is said to be of characteristic 0. One can make the correspondence e → 1 and verify that the ring {0, ±e, ±2e, ±3e, . . .} is isomorphic to the ring Z. Thus one can say that a field K of characteristic 0 contains the domain of integers, that is, Z ⊂ K. But K is also closed with respect to division and so contains all fractions m/n, m, n ∈ Z, n ≠ 0. In other words, if K is a field of characteristic 0, then Q ⊂ K. This result tells us that all fields of characteristic 0 must be infinite. Among the examples of fields of characteristic 0 are the domains Q, R and C already known to us.
The other logically possible case is when there exist m ≠ n, m, n ∈ Z, such that me = ne. Assume, for instance, that m > n. As (m − n)e = 0, we deduce that there exists a natural number u such that ue = 0. Let p be the smallest natural number such that pe = 0. As a field does not have zero divisors, it is easy to see that p must be prime. In this case one says that K is of characteristic p. In a similar way as above one can now prove that each field of characteristic p contains the field of remainder classes Z/(p). At the same time, Z/(p) is the simplest example of a field of characteristic p.
Let there be given a field K. Each field E which contains the given field K (K ⊂ E) is called an extension of K and is written E/K. An extension E/K is called finite, if there exist elements $\alpha_1, \dots, \alpha_m \in E$ such that each α ∈ E can be written in a unique way as $\alpha = p_1 \alpha_1 + \dots + p_m \alpha_m$, where $p_i \in K$. Then the extension E/K may be viewed as a finite dimensional vector space over the base field K.
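The check left to the Reader above, that the remainder classes modulo a prime form a field, can at least be tested by brute force for small moduli. The sketch below is an illustration under stated assumptions: classes are represented by the integers 0, …, p − 1, and the function names `add_mod`, `mul_mod`, `has_all_inverses` and `zero_divisors` are hypothetical.

```python
def add_mod(m, n, p):
    """Sum of the classes of m and n (0 <= m, n < p), following the case formula."""
    s = m + n
    return s if s < p else s - p


def mul_mod(m, n, p):
    """Product of the classes of m and n: the remainder r in mn = p*q + r."""
    return (m * n) % p


def has_all_inverses(p):
    """True if every non-zero class modulo p has a multiplicative inverse."""
    return all(
        any(mul_mod(a, b, p) == 1 for b in range(1, p)) for a in range(1, p)
    )


def zero_divisors(n):
    """Pairs (a, b), a <= b, of non-zero classes modulo n whose product is zero."""
    return [
        (a, b)
        for a in range(1, n)
        for b in range(a, n)
        if mul_mod(a, b, n) == 0
    ]


# The tables for Z/(3) given in the text.
assert add_mod(2, 2, 3) == 1
assert mul_mod(2, 2, 3) == 1

# A prime modulus gives a field; a composite one gives zero divisors instead.
assert has_all_inverses(7)
assert not has_all_inverses(6)
assert zero_divisors(6) == [(2, 3), (3, 4)]
assert zero_divisors(7) == []
```

The composite modulus 6 shows why the characteristic of a field must be prime: the classes of 2 and 3 are non-zero, yet their product is zero.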
Next, we look at an example. The set of real numbers {p + q√2, where p, q ∈ Q}, which is denoted Q(√2), turns out to be a field, that is, it is closed under addition, subtraction, multiplication and division (please, check!). Taking here q = 0 we see that Q ⊂ Q(√2). The extension Q(√2)/Q is finite, because it can be seen as a 2-dimensional vector space over Q. One of its bases is {1, √2}.
Let us now have another look at finite fields. In a finite field E the multiples ne cannot all be distinct. Therefore such a field must be of characteristic p > 0. It follows that E must contain the field of remainders Z/(p) and so E must be an extension of Z/(p), and, of course, a finite one. Therefore any finite field E can be viewed as a vector space of finite dimension over the finite field Z/(p). Let n be the dimension of E: dim E = n. Now it is easy to see that the corresponding n-dimensional vector space must consist of $p^n$ elements (for the field Z/(p) has p elements). From this it is seen that in a finite field the number of elements must be a power of the characteristic. The converse is also true. Indeed, for any number $q = p^n$, where p is a prime number, there exists a field $F_q$ with q elements; all such fields with q elements are isomorphic among themselves.
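The invitation "please, check!" concerning the numbers p + q√2 can be followed with exact rational arithmetic. The class below is a minimal sketch, not a construction taken from the text: the name `QSqrt2` and its methods are hypothetical, and an element p + q√2 is stored as the pair (p, q) of fractions.

```python
from fractions import Fraction


class QSqrt2:
    """An element p + q*sqrt(2) with rational p and q."""

    def __init__(self, p, q=0):
        self.p, self.q = Fraction(p), Fraction(q)

    def __eq__(self, other):
        return (self.p, self.q) == (other.p, other.q)

    def __repr__(self):
        return f"{self.p} + {self.q}*sqrt(2)"

    def __add__(self, other):
        return QSqrt2(self.p + other.p, self.q + other.q)

    def __sub__(self, other):
        return QSqrt2(self.p - other.p, self.q - other.q)

    def __mul__(self, other):
        # (p + q*s)(p' + q'*s) with s*s = 2
        return QSqrt2(
            self.p * other.p + 2 * self.q * other.q,
            self.p * other.q + self.q * other.p,
        )

    def inverse(self):
        # 1/(p + q*s) = (p - q*s)/(p^2 - 2q^2); the denominator vanishes
        # only for p = q = 0, because sqrt(2) is irrational.
        norm = self.p**2 - 2 * self.q**2
        if norm == 0:
            raise ZeroDivisionError("zero has no inverse")
        return QSqrt2(self.p / norm, -self.q / norm)


x = QSqrt2(1, 1)               # 1 + sqrt(2)
y = QSqrt2(Fraction(1, 2), 3)  # 1/2 + 3*sqrt(2)

# Closure under division: x * x^(-1) = 1 and y * y^(-1) = 1.
assert x * x.inverse() == QSqrt2(1)
assert y * y.inverse() == QSqrt2(1)

# sqrt(2) * sqrt(2) = 2.
assert QSqrt2(0, 1) * QSqrt2(0, 1) == QSqrt2(2)
```

The pair (p, q) is exactly the coordinate vector of the element in the basis {1, √2}, which is the sense in which Q(√2) is a 2-dimensional vector space over Q.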
1.2. Algebraic number fields
Consider the equation p(x) = 0, where p is a polynomial with rational coefficients. Letting n be the degree of this equation, it follows from the fundamental theorem of algebra that the equation has n solutions in C [counted with multiplicity]. However, these solutions need not at all be rational numbers. Complex numbers which are solutions of such an equation are called algebraic numbers. All rational numbers are algebraic numbers, because each q ∈ Q is the solution of the equation x − q = 0. The numbers $\sqrt[n]{q}$ (q ∈ Q), likewise, are algebraic numbers, being solutions of the equation $x^n − q = 0$. It can be shown that the sum, difference, product and quotient of algebraic numbers are again algebraic numbers. Therefore all algebraic numbers form a field containing Q. It will be denoted Ω. It can be proved that if a complex number z is the solution of an equation P(z) = 0, where P(z) is a polynomial whose coefficients are algebraic numbers, then z is an algebraic number. This result shows that Ω is an algebraically closed field, that is, all solutions of an equation P(z) = 0, where P(z) is any polynomial over Ω, belong to this field. Another example of an algebraically closed field is C. Let us consider the extension Ω/Q. This extension is no longer finite, but it contains subfields A which are finite extensions of Q, for example, the field Q(√2). In a narrower sense, one means by an algebraic number field a finite extension of Q all of whose elements are algebraic numbers. Thus we have for each algebraic number field A
Q ⊂ A ⊂ Ω ⊂ C.
The following remarkable description of algebraic number fields is due to Leopold Kronecker. He proved that each algebraic number field is isomorphic to a field of remainder classes of polynomials Q[x]/(f(x)). In order to understand this notion let us say the following. Here Q[x] denotes the set of polynomials with rational coefficients; one checks readily that this is a ring. 
In this ring, as in the ring of integers Z, not every element (a polynomial) is divisible by another element (likewise a polynomial), so it is possible to speak of the remainder under division. In the ring Q[x], a role analogous to the one of prime numbers [in the ring Z] is played by the irreducible polynomials, that is, polynomials which cannot be written as a product of polynomials of lower degree. Here arise remainder classes under division by a polynomial f(x), and the set of these remainder classes is, in case f(x) is irreducible, a field Q[x]/(f(x)). Kronecker’s theorem is precisely about such fields. For example, the field Q(√2) is isomorphic to the field $Q[x]/(x^2 − 2)$. In passing, we remark also that the notion of irreducible polynomial and of the remainder classes with respect to such a polynomial can be introduced also in the case of the ring R[x] of polynomials with real coefficients. It is possible [and easy!] to show that the field of remainder classes $R[x]/(x^2 + 1)$ is isomorphic to C. This gives us yet another possibility of defining complex numbers. In each algebraic number field A there exists a number Θ ∈ A such that each α ∈ A can be written in the form
\[ \alpha = a_0 + a_1 \Theta + \dots + a_{n-1} \Theta^{n-1}, \quad \text{where } a_i \in Q, \]
and n is the minimal degree of a polynomial [with rational coefficients] with Θ as a solution.
Finally, some supplementary remarks. There are plenty of numbers which are not solutions to any equation p(x) = 0, where p(x) is a polynomial with rational coefficients. Such numbers are called transcendental numbers. Their existence was established by Joseph Liouville in 1844. In 1874 Georg Cantor showed that there are many more transcendental numbers than algebraic numbers: the set of algebraic numbers is countable, but the set of real numbers has the power of the continuum. Among transcendental numbers there are the well-known π = 3.14159 . . . and e = 2.718281 . . . Many new examples of transcendental numbers are provided by the theorem of A. O. Gel’fond, stating that the number $\alpha^\beta$ is transcendental provided that α and β are algebraic numbers, α is neither 0 nor 1, and β is irrational. By proving this theorem Gel’fond solved, in 1934, the famous seventh problem of Hilbert. In Number Theory and in Diophantine Geometry, especially, algebraic number fields are of major importance. In what follows we shall understand by a field almost always an algebraic number field.
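The remainder-class fields of Kronecker mentioned above can be made concrete in the quadratic case. The following sketch assumes that a class modulo $x^2 − c$ is represented by its remainder $a_0 + a_1 x$, stored as the pair (a0, a1); the function name `mul_mod_quadratic` is hypothetical. With c = −1 one recovers the multiplication of complex numbers, with c = 2 that of Q(√2).

```python
from fractions import Fraction


def mul_mod_quadratic(u, v, c):
    """Multiply a0 + a1*x by b0 + b1*x and reduce modulo x^2 - c.

    The product is a0*b0 + (a0*b1 + a1*b0)*x + a1*b1*x^2, and in the ring
    of remainder classes x^2 is replaced by c.
    """
    a0, a1 = u
    b0, b1 = v
    return (a0 * b0 + c * a1 * b1, a0 * b1 + a1 * b0)


# R[x]/(x^2 + 1): the class of x behaves like the imaginary unit i.
assert mul_mod_quadratic((0, 1), (0, 1), -1) == (-1, 0)
product = mul_mod_quadratic((1, 2), (3, -1), -1)
assert complex(*product) == complex(1, 2) * complex(3, -1)

# Q[x]/(x^2 - 2): the class of x behaves like sqrt(2).
half = Fraction(1, 2)
assert mul_mod_quadratic((0, 1), (0, 1), 2) == (2, 0)
assert mul_mod_quadratic((1, 1), (-1, 1), 2) == (1, 0)   # (1 + s)(-1 + s) = 1
assert mul_mod_quadratic((half, 0), (4, 2), 2) == (2, 1)
```

In the notation of the text, the class of x plays the role of the number Θ, and the pair (a0, a1) lists the coefficients of the expansion of an element in powers of Θ for n = 2.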
1.3. The notion of the n-dimensional projective space
Let K be an arbitrary field. In the sequel we will often consider the set of n-tuples $K^n$, that is, the set
\[ K^n = K \times K \times \dots \times K = \{ (k_1, \dots, k_n), \ \text{where each } k_i \in K \}, \]
or some of its subsets. How does one introduce geometry into the set $K^n$? Let us first look at two special cases.
[Fig. 2. The real line R with the origin O (corresponding to 0), the point E (corresponding to 1) and the point X (corresponding to x).]
It is well-known that the set of real numbers R is in a one-to-one correspondence with the points of a line. This correspondence can be obtained as follows. On the line one selects arbitrarily an origin O and a unit vector $\overrightarrow{OE} = \vec{e}_1$ (Figure 2); we agree that 0 corresponds to the point O, 1 to the point E, and an arbitrary real number x to the end point X of the vector $\overrightarrow{OX} = x \cdot \overrightarrow{OE}$. In an analogous way, a one-to-one correspondence between the pairs of real numbers $(x_1, x_2)$ and the points of a plane can be obtained by means of a so-called frame (Figure 3). Often pairs of real numbers and points of a plane are simply identified.
[Fig. 3. A frame in the plane: two axes R through the origin 0 with unit vectors $\vec{e}_1$ and $\vec{e}_2$; a point $x = (x_1, x_2)$ and the point $(p_1, p_2)$ with its coordinates $p_1$ and $p_2$ marked on the axes.]
The set R × R of pairs of real numbers is denoted $R^2$. In this correspondence there corresponds to the pairs $(p_1, p_2)$, $p_1, p_2 \in Q$, a certain subset, which will be denoted $Q^2$. It follows also that the set $Q^2$ may be viewed as a 2-dimensional vector space [over Q]. In general, we may consider the n-dimensional vector space $V^n(K)$ over an arbitrary field K. Its elements will again be called points. Let $e_1, \dots, e_n$ be a basis in this vector space. Then one can express each element $x \in V^n(K)$ uniquely in the form
\[ x = x_1 e_1 + x_2 e_2 + \dots + x_n e_n, \quad \text{where each } x_i \in K. \tag{92} \]
Let us consider the correspondence $x \to (x_1, x_2, \dots, x_n)$. It follows from equation (92) that between the elements x of $V^n(K)$ and the n-tuples $(x_1, x_2, \dots, x_n)$, where $x_i \in K$, there arises a one-to-one correspondence $x \longleftrightarrow (x_1, x_2, \dots, x_n)$. In view of this one-to-one correspondence one can identify the elements x of $V^n(K)$ with the n-tuples $(x_1, x_2, \dots, x_n) \in K^n$. In the sequel we speak of the space $K^n$ and its points $x = (x_1, x_2, \dots, x_n)$. In the special case n = 2 the space $K^2$ is called the plane.
Next, let us consider the space $K^{n+1}$. In this space we denote by $(K^{n+1})^*$ the subset of elements distinct from the origin (0, . . . , 0). Similarly, we define $K^*$ as the subset of elements of K distinct from zero. We define also a multiplication of points of $(K^{n+1})^*$ by elements of the field as follows. If $k \in K^*$ and $(x_0, \dots, x_n) \in (K^{n+1})^*$, we set
\[ k \cdot (x_0, \dots, x_n) = (k x_0, \dots, k x_n). \]
Clearly, the multiplication of points in $(K^{n+1})^*$ by elements of $K^*$ gives again elements of $(K^{n+1})^*$. Thus we have defined a composition $(k, x) \mapsto kx$, in other words a mapping $K^* \times (K^{n+1})^* \to (K^{n+1})^*$. This composition makes it possible to introduce in the point set $(K^{n+1})^*$ a decomposition into classes. The points $x = (x_0, \dots, x_n)$ and $y = (y_0, \dots, y_n)$ are considered equivalent if there exists a $k \in K^*$ such that $x = ky$, that is, $x_0 = ky_0, \dots, x_n = ky_n$. We denote this equivalence by the letter $E$. We thus have on the point set $(K^{n+1})^*$ a decomposition into classes, and the set of classes $(K^{n+1})^*/E$ is called the $n$-dimensional projective space $\mathbb{P}^n(K)$. The equivalence classes of $E$ (that is, the points of the space $\mathbb{P}^n(K)$) can be identified with the lines in $K^{n+1}$ through the origin [rays]. In the special case $K = \mathbb{R}$ and $n = 2$ this construction gives the ordinary projective plane⁶, so that we have a generalization of the notion of projective plane to the case of an arbitrary field and general dimension. The problems of Diophantine Geometry require that we consider the projective space $\mathbb{P}^n(K)$.
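The equivalence $E$ can be tried out on a computer. The following is only a minimal sketch for the case $K = \mathbb{Q}$; the helper names `normalize` and `equivalent` are hypothetical and not part of the text. Each class is represented by the unique point on its ray whose first non-zero coordinate equals 1, so two points are equivalent exactly when their representatives coincide.

```python
from fractions import Fraction

def normalize(point):
    """Canonical representative of a class in P^n(Q): scale the point
    so that its first non-zero coordinate becomes 1."""
    coords = [Fraction(c) for c in point]
    pivot = next((c for c in coords if c != 0), None)
    if pivot is None:
        # the origin is excluded from (K^{n+1})^*
        raise ValueError("the origin does not define a projective point")
    return tuple(c / pivot for c in coords)

def equivalent(x, y):
    """True if x = k*y for some non-zero rational k."""
    return len(x) == len(y) and normalize(x) == normalize(y)

print(equivalent((1, 2, 3), (2, 4, 6)))  # same ray through the origin
print(equivalent((1, 2, 3), (1, 2, 4)))  # different rays
```

Choosing a normalized representative is a design convenience; any other rule that picks exactly one point per ray would serve equally well.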
II. Algebraic curves

Mathematicians are like Frenchmen: whatever you tell them, they at once interpret it in their own language and it has become something quite different ...
J. W. Goethe
1.4. Curves and their arithmetic

Now we shall investigate how the solution of an equation can be interpreted as a geometric problem. In the plane one can consider point sets of a rather varied kind. What is a curve? Let there be given the equation $p(x, y) = 0$ where the left hand side is a polynomial with real coefficients, that is, the equation

$$A_m(x) y^m + A_{m-1}(x) y^{m-1} + \dots + A_1(x) y + A_0(x) = 0,$$

where $A_i(x) = a^{(i)}_{k_i} x^{k_i} + \dots + a^{(i)}_1 x + a^{(i)}_0$ is a polynomial with real coefficients. We single out in the plane all the points $(x, y)$ whose coordinates satisfy this equation. The subset thus obtained is a curve. For example, the solutions of the equation $x^2 + y^2 = 1$ can be interpreted as the points $P = (\cos\alpha, \sin\alpha)$ in the plane $\mathbb{R}^2$ (cf. Figure 4). Such a definition of the notion of a curve looks perfectly reasonable, but it is not complete. Indeed, considering the equation $x^2 + y^2 + 1 = 0$, one sees that there are "curves" without a single point. Trying to evade this unpleasant circumstance, we agree to regard it as a "lawful" act to seek solutions not in the real plane but in the complex plane $\mathbb{C}^2$. There we permit as "lawful" points $(x, y)$, where $x$ and $y$ are complex numbers. This creates some confusion: the coefficients of the equation are in one domain of numbers, the coordinates of the sought point in another. But it turns out that there are many other similar situations. Taking account of this circumstance we extend the geometric interpretation of the equation as follows. Consider the equation $p(x, y) = 0$ where the coefficients of the polynomial on the left hand side are in an arbitrary field $K$, and we seek points $(x, y) \in L^2$, where $L/K$ is an arbitrary extension of $K$. Then the previously given interpretation is the special case $L = K = \mathbb{R}$.

[Fig. 4: the unit circle in the $(x, y)$-plane with the point $P$ at angle $\alpha$.]

⁶ Cf. [13, page 16]. Translator's note. For an introduction to projective geometry, see [17], also available in paperback. We mention also, quite generally, the book [3]. See further this Chapter, Section 8.

In order to interpret geometrically the solution of Diophantine systems of equations, we require the notion of an affine variety. Let $L/K$ be an extension of the field $K$. We consider the system of equations⁷

$$f_1(x_1, \dots, x_n) = 0, \quad f_2(x_1, \dots, x_n) = 0, \quad \dots, \quad f_m(x_1, \dots, x_n) = 0,$$

where each $f_i$ is a polynomial in $n$ variables over $K$. The solutions of this system give us a point set in $L^n$, which is called an affine variety. Such a geometric interpretation is expedient if the Diophantine problem amounts to finding integer solutions. But if one requires solutions with rational coordinates, then it is better to connect the problem with a so-called projective variety. Let us familiarize ourselves with this new notion. Let $F(x_0, \dots, x_n)$ be a polynomial over a field $K$, i.e. a sum

$$F(x_0, \dots, x_n) = \sum_{\alpha} k_\alpha x_0^{\alpha_0} x_1^{\alpha_1} \cdots x_n^{\alpha_n},$$
where $k_\alpha \in K$ and the $\alpha_i$ are non-negative integers. The expressions $k_\alpha x_0^{\alpha_0} x_1^{\alpha_1} \cdots x_n^{\alpha_n}$ are called monomials, and the integer $\alpha_0 + \alpha_1 + \dots + \alpha_n$ is called the order of the monomial. The order of the polynomial $F$ is the biggest order of its monomials. We write $F$ in the form $F = H_0 + H_1 + \dots + H_m$, where we denote by $H_i = H_i(x_0, \dots, x_n)$, $i = 0, 1, \dots, m$, the sum of all monomials of order $i$ in $F$. Each of these polynomials will be called a form of order $i$ or homogeneous form of order $i$. More precisely, a form of order $i$ is a sum of a collection of monomials of order $i$. For example, the polynomial

$$2x_0^2 x_1^5 x_2 + \tfrac{3}{5} x_0 x_1^6 x_2 + 12 x_0^4 x_1^3 x_2$$

is a form of order 8 over the field $\mathbb{Q}$. A form $H$ is said to be irreducible if there do not exist any forms $P$ and $Q$ of positive order over the field $K$ such that $H = P \cdot Q$. A projective algebraic variety in $\mathbb{P}^n(K)$ is a set of points determined by a certain system of homogeneous equations

$$H_1(x_0, \dots, x_n) = 0, \quad H_2(x_0, \dots, x_n) = 0, \quad \dots, \quad H_m(x_0, \dots, x_n) = 0.$$

Thus all polynomials $H_i$ here are forms over $K$. But every point set in $\mathbb{P}^n(K)$ is a set of rays through the origin in $K^{n+1}$. Therefore we can view a projective algebraic variety as a certain cone in $K^{n+1}$. Next we look at an important example of such a variety (Figure 5).

⁷ Translator's note. Observe that the number $m$ of equations need not equal the number $n$ of variables.
[Fig. 5: a ray through the origin $(0, 0, 0)$ containing the point $a$ and representing the projective point $P$.]
We pick in the projective plane $\mathbb{P}^2(K)$ a suitable coordinate system $(x_0, x_1, x_2)$ and write a certain form of order $m$ over $K$, $F(x) = F(x_0, x_1, x_2)$. We assume that this form is irreducible. Let $P \in \mathbb{P}^2(K)$. We know that to the point $P$ there corresponds a certain equivalence class in the space $K^3$, i.e., a ray through the point $(0, 0, 0)$. If now a point $a = (a_0, a_1, a_2)$ on this ray satisfies the equation $F(x) = 0$, that is, if $F(a) = 0$, then every other point $k \cdot a$, $k \in K^*$, of the ray also satisfies the same equation, since $F(ka) = k^m F(a) = 0$. Thus the equation is satisfied by the entire equivalence class to which $a$ belongs, that is, by the corresponding point in the projective plane. In other words, we can speak of the set of points in the projective plane $\mathbb{P}^2(K)$ which satisfy the equation $F(x) = 0$. This set of points is called an irreducible algebraic curve of order $m$. The field $K$ is called a field of definition of the curve; the equation $F(x) = 0$ is the equation of the curve for the given system of coordinates. Each form $F$ over $K$ splits into a product of irreducible forms over this field: $F = F_1^{\alpha_1} \cdots F_r^{\alpha_r}$. To each form $F_i$ there corresponds an irreducible algebraic curve $\Gamma_i$. Therefore we consider the system of curves $(\Gamma_1, \dots, \Gamma_r)$ as a general algebraic curve, the curves $\Gamma_i$ as its components, and the integers $\alpha_i$ as the multiplicities of the components $\Gamma_i$.

We consider some examples. Let $K = \mathbb{R}$. The simplest example of an irreducible algebraic curve is the straight line $x_1 + x_2 - x_0 = 0$; a second order algebraic curve is the circle $x_1^2 + x_2^2 - x_0^2 = 0$ (cf. Figure 6), while $x_1^3 + x_1 x_0^2 - x_2 x_0^2 = 0$ is a third order irreducible curve (cf. Figure 7). A reducible projective curve is given by $x_1^2 - x_2^2 - x_0^2 - 2x_2 x_0 = 0$.⁸ Its components are the straight lines $\ell_1 : x_1 + x_2 + x_0 = 0$ and $\ell_2 : x_1 - x_2 - x_0 = 0$ (cf. Figure 8).

[Fig. 6: the circle in the $(x, y)$-plane.]

[Fig. 7: the third order curve in the $(x, y)$-plane.]

[Fig. 8: the straight lines $\ell_1$ and $\ell_2$ in the $(x, y)$-plane, with the values $-1$ and $1$ marked on the axes.]

⁸ Translator's note. Indeed, one may write $x_1^2 - x_2^2 - x_0^2 - 2x_2 x_0 = x_1^2 - (x_2 + x_0)^2$.

We remark that if an algebraic curve $\Gamma$ is given over the complex field $\mathbb{C}$, then there is a certain surface connected with it. Indeed, let the curve $\Gamma$ be given by the equation

$$p(x, y) \equiv A_m(x) y^m + \dots + A_1(x) y + A_0(x) = 0,$$

where the coefficients of the polynomials $A_i$ are complex numbers. Then this equation is satisfied by a certain algebraic function $y = f(x)$. It is known that with each such function there is connected a Riemann surface.⁹ This remark will later be used in connection with the introduction of the genus of a curve.
What is the arithmetic of a curve? Let us first look at an interesting example. Consider cubic curves over the field $\mathbb{Q}$ of rational numbers. Each such curve $\Gamma$ is a variety given by a third order Diophantine equation on the plane $\mathbb{P}^2(K)$, where $K$ is an extension of $\mathbb{Q}$ (for instance, we can take $K$ to be a suitable algebraic number field). Points $(x_0, x_1, x_2)$ of $K^3$ are called rational if their coordinates $x_0, x_1, x_2$ are rational numbers. Logically, there are then the following three possibilities:
(1) There are no rational points on $\Gamma$.
(2) There are finitely many rational points on $\Gamma$.
(3) There are infinitely many rational points on $\Gamma$.
The following three examples indicate that all three possibilities occur in practice.

EXAMPLE 1.1. On the curve $x_0^3 + p x_1^3 + p^2 x_2^3 = 0$, where $p$ is a prime number, there are no rational points.

EXAMPLE 1.2. On the curve $x_0^3 + x_1^3 + x_2^3 = 0$ there are just three rational points: $(-1, 1, 0)$, $(0, 1, -1)$ and $(1, 0, -1)$.

EXAMPLE 1.3. On the curve $a x_0^3 + b x_1^3 + c x_2^3 = 0$ there are infinitely many rational points if $(a, b) = (a, c) = (b, c) = 1$ (i.e., the integers are pairwise coprime), if $a, b, c > 1$, and if these three numbers are square-free.¹⁰

In connection with these three situations the following questions are of interest.
(1) Find a method for deciding, for each cubic curve, whether it has rational points or not.
(2) Find a method for deciding, when the cubic has rational points, whether they are finite or infinite in number.
(3) If there are infinitely many, is it possible to find them from the knowledge of finitely many rational points?
The answers to the first two are not known. However, the answer to the third one is known; this is the Mordell-Weil theorem. This theorem will be discussed in some detail in Part III of the paper (see Section 1.10). Let us now look at the main case and state the following problem. Let there be given an algebraic variety $V$ over the field $K$ in the projective space $\mathbb{P}^n(K)$. Does the variety $V$ have $K$-rational points?¹¹ What are the structure and properties of this set of points?

⁹ See the paper [14] written by Ü. Lumiste. If we restrict ourselves to the study of connected compact Riemann surfaces, then, from the point of view of algebraic geometry, this amounts to the study of irreducible curves without singular points. For an introduction to Riemann surfaces see also [6]. For an overall view of Riemann surfaces we likewise recommend the corresponding articles in [5].
Every step forward towards the solution of these difficult questions is of direct interest from the point of view of Diophantine systems of equations. In the study of the aforementioned questions we find something interesting already in the case of one-dimensional varieties, that is, when we are dealing with algebraic curves. In what follows we shall also deal with this special case. A few words about history. The creation of Diophantine geometry for arbitrary fields took place in the years 1930-1955 through the work of O. Zariski, B. L. van der Waerden and A. Weil. This was not only an attempt at a greater generality of the treatment but also reflected a wish to apply, in Diophantine Geometry, new technical tools and methods. Only from them could one hope for the solution of the fascinating but enormously difficult Diophantine problems.

¹⁰ For example, the curve $3x_0^3 + 5x_1^3 + 7x_2^3 = 0$ or the curve $6x_0^3 + 35x_1^3 + 11x_2^3 = 0$. The proofs of the facts stated in the examples are easy, and the Reader will find them readily. It is equally easy to check that together with the rational point $(x_0, x_1, x_2)$ also $(x_0(bx_1^3 - cx_2^3),\ x_1(cx_2^3 - ax_0^3),\ x_2(ax_0^3 - bx_1^3))$ is a rational point on the same curve.
¹¹ In an affine variety, all the coordinates of a $K$-rational point are elements of the ground field $K$.
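The elementary facts about cubic curves quoted in this Section can be checked by direct computation. The sketch below is only an illustration: the function names are hypothetical, and the curve $x_0^3 + 2x_1^3 - 3x_2^3 = 0$ with the point $(1, 1, 1)$ is chosen purely for convenience (it does not satisfy the hypotheses of Example 1.3). The code verifies the three points of Example 1.2 and applies the map from the footnote on producing a further rational point.

```python
def on_cubic(a, b, c, p):
    """True if p = (x0, x1, x2) satisfies a*x0^3 + b*x1^3 + c*x2^3 = 0."""
    x0, x1, x2 = p
    return a * x0**3 + b * x1**3 + c * x2**3 == 0

def next_point(a, b, c, p):
    """Map sending a point of the curve to another point of the same curve,
    as stated in the footnote to Example 1.3."""
    x0, x1, x2 = p
    return (x0 * (b * x1**3 - c * x2**3),
            x1 * (c * x2**3 - a * x0**3),
            x2 * (a * x0**3 - b * x1**3))

# The three rational points of Example 1.2 on x0^3 + x1^3 + x2^3 = 0.
fermat_points = [(-1, 1, 0), (0, 1, -1), (1, 0, -1)]
print(all(on_cubic(1, 1, 1, p) for p in fermat_points))

# Illustrative curve (an assumption, not from the text): x0^3 + 2*x1^3 - 3*x2^3 = 0.
p = (1, 1, 1)
q = next_point(1, 2, -3, p)
print(q, on_cubic(1, 2, -3, q))
```

Python integers have arbitrary precision, so the map can be iterated without overflow, although the coordinates grow quickly.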
1.5. Birational equivalence of algebraic curves

A few words about the notion brought out in the heading of this Section. On the set of all algebraic curves it is possible to introduce a decomposition into classes which is called the birational equivalence of curves. This equivalence is related to birational geometry, one of the classical divisions of mathematics, mainly cultivated around the turn of the past century by Italian mathematicians. Of great interest are those objects which are the same, from the point of view of birational geometry, for all curves in a class, the so-called birational invariants. The most important birational invariant is the genus, introduced in geometry by Bernhard Riemann. The genus gives a possibility for a classification of curves, of the importance of which in Diophantine Geometry we spoke in the Introduction. The third Part of this paper (see 1.8) is devoted to this classification.

Let there be given a curve $\Gamma$ with the equation $f(x, y) = 0$. We consider rational functions $\varphi(x, y) = \dfrac{\alpha(x, y)}{\beta(x, y)}$, where $\alpha$ and $\beta$ are polynomials over $K$ and $\beta$ is not divisible by $f$. The function $\varphi$ is considered as trivial, and we write $\varphi = O(\Gamma)$, if $f \mid \alpha$ (i.e., $\alpha$ is divisible by the polynomial $f$). We are interested in points of the curve whose coordinates are in $K$. Such points will be called $K$-rational. It will be expedient also to consider $K$-algebraic points, with coordinates belonging to some extension of the field $K$. Next, let $(x_0, y_0) \in \Gamma$ be a point on the curve (regardless of whether it is rational or algebraic). Then $\varphi(x_0, y_0)$ is determined if $\beta(x_0, y_0) \neq 0$. There are only finitely many points of $\Gamma$ such that $\beta(x_0, y_0) = 0$. Indeed, as $\beta$ is not divisible by $f$ and $f$ is irreducible ($\Gamma$ is an irreducible curve), by elimination theory¹² the number of solutions of the system

$$\beta(x, y) = 0, \qquad f(x, y) = 0$$

is $\le (\deg \beta) \cdot (\deg f)$, where $\deg f$ indicates the degree of the polynomial $f$. Thus $\varphi(x, y)$ is determined on $\Gamma$ except for finitely many points.
It turns out that the triviality of a rational function on $\Gamma$ is equivalent to its triviality at all algebraic points¹³. Indeed, if $\varphi = O(\Gamma)$, then $\varphi(x_0, y_0) = 0$ holds true thanks to the fact that $f \mid \alpha$, because then $\alpha(x_0, y_0) = 0$ for all points $(x_0, y_0) \in \Gamma$. Conversely, if $\varphi(x_0, y_0) = 0$ for all algebraic points of $\Gamma$, but $\varphi \neq O(\Gamma)$, then $\alpha$ is not divisible by $f$ and the system

$$f(x, y) = 0, \qquad \varphi(x, y) = 0$$

would have only finitely many solutions. But this is a contradiction, because there are infinitely many algebraic points on the curve.

¹² See [8, p. 280-285]. Translator's note. See also the references in footnote 4. We remark that Kangro's book [8] in many respects is reminiscent of Kurosh [10] mentioned there. For elimination theory, see in particular [19, Chapter I], or [16].
¹³ The triviality of a function $\varphi$ at a point $(x_0, y_0)$ of $\Gamma$ means that $\varphi(x_0, y_0)$ is determined and $\varphi(x_0, y_0) = 0$. For rational points the analogous statement is not true.
Next, we introduce a decomposition into classes for rational functions $\varphi(x, y) = \dfrac{\alpha(x, y)}{\beta(x, y)}$. To this end, we declare two rational functions $\varphi(x, y)$ and $\psi(x, y)$ to be equivalent on the curve if the function $\varphi - \psi$ is trivial on the curve. Collecting all functions that are equivalent to each other on the curve into one class, we get a decomposition of rational functions into classes. Each such equivalence class will be called a rational function on the curve $\Gamma$. Let $\bar\varphi$ and $\bar\psi$ be two arbitrary classes and $\omega \in \bar\varphi$ and $\tau \in \bar\psi$ arbitrary rational functions in these classes. We define addition and multiplication of classes by the formulae

$$\bar\varphi + \bar\psi = \overline{\omega + \tau}, \qquad \bar\varphi \cdot \bar\psi = \overline{\omega \cdot \tau}.$$

In other words, with the help of arbitrary representatives of classes one introduces operations on classes. It is easy to check that the set of classes forms a field, denoted by $K(x, y)$. The field $K(x, y)$ shall be called the field of rational functions on the curve $\Gamma$. For example, on the straight line $y = 1$ one has the field of rational functions $K(x, y) = K(x)$. Each element $\bar\varphi$ of $K(x, y)$ can uniquely be represented in the form

$$\varphi = \alpha_0(x) + \alpha_1(x) y + \dots + \alpha_{m-1}(x) y^{m-1},$$

where the $\alpha_i(x)$ are rational functions and $m = \deg_y f(x, y)$¹⁴. At first sight one might believe that the two functions $x$ and $y$ play a "privileged role" in $K(x, y)$. But this is so only apparently, since for each non-constant function $y' \in K(x, y)$ we can find $x' \in K(x, y)$ such that $x = \varphi(x', y')$, $y = \psi(x', y')$, and there exists a polynomial $g$ over $K$ with $g(x', y') = 0$, that is, $K(x, y) \equiv K(x', y')$. If two algebraic curves have the same field of rational functions, then we say that they are birationally equivalent. Examples of birational equivalence of curves will be found in Section 1.7. Let there be given the curve $\Gamma : f(x, y) = 0$ and the curve $\Gamma' : g(x', y') = 0$.
It turns out that a necessary and sufficient condition for the curves $\Gamma$ and $\Gamma'$ to be birationally equivalent is that there exist rational functions $\varphi, \psi, \varphi', \psi'$ such that

$$x' = \varphi'(x, y), \quad y' = \psi'(x, y), \quad x = \varphi(x', y'), \quad y = \psi(x', y').$$

If the point $(x_0, y_0) \in \Gamma$ is $K$-rational, then, evidently, the point $(\varphi'(x_0, y_0), \psi'(x_0, y_0)) \in \Gamma'$ is $K$-rational, and vice versa. Here we assume, of course, that the functions $\varphi, \psi, \varphi', \psi'$ are determined at the points considered. But these functions fail to be determined only at finitely many points. Now we have arrived at an essential fact:

THEOREM 1.1. There is a one-to-one correspondence between the points of two birationally equivalent curves, provided one excludes a finite set of points (where the functions $\varphi, \psi, \varphi', \psi'$ are not determined).
¹⁴ Here $\deg_y$ denotes the degree of the polynomial $f(x, y)$ with respect to $y$.
1.6. Singular points of a curve

We consider a curve $\Gamma$ given by the equation $f(x, y) = 0$. The polynomial $f(x, y)$, as a function of two variables, has the partial derivatives $f_x, f_y, f_{xy}, \dots$. A point $P$ on $\Gamma$ is called an $r$-fold point if at this point all derivatives of $f$ up to order $r - 1$ vanish, but there is a derivative of order $r$ which is different from zero; if $r > 1$ we call an $r$-fold point a singular point. The number and multiplicity of such singular points is bounded: for an algebraic curve of order $n$ without multiple components, with points $P_i$ of multiplicities $r_i$, $i \in I$, one has

$$\sum_{i \in I} r_i (r_i - 1) \le n(n - 1).$$

For an irreducible curve there is an even stronger inequality:

$$\sum_{i \in I} r_i (r_i - 1) \le (n - 1)(n - 2).$$
As an example, we consider the issue of singular points on a cubic curve. Let the curve $\Gamma$ be given by the equation $f(x, y) = 0$. We use a system of coordinates with the origin on $\Gamma$. Then the polynomial $f$ does not have a constant term and we write it as $f(x, y) = ax + by + g(x, y)$, where the polynomial $g$ contains only monomials of degree two and three. This gives readily $f_x(0, 0)$ and $f_y(0, 0)$:

$$\frac{\partial f}{\partial x}\Big|_{(0,0)} = a, \qquad \frac{\partial f}{\partial y}\Big|_{(0,0)} = b.$$

If $a = b = 0$, then $(0, 0)$ is a singular point, because then $f_x(0, 0) = 0$ and $f_y(0, 0) = 0$. Consider the points of intersection of $\Gamma$ with the straight line $y = kx$:

$$0 = f(x, kx) = x(a + bk) + g(x, kx) = x(a + bk) + x^2 \ell(x), \tag{93}$$

where $\ell(x)$ is a linear polynomial, that is, $\ell(x) = cx + d$. From this equation we can find $x$. The value $x = 0$ satisfies the equation. If $a + bk \neq 0$, then $x = 0$ is a simple solution of (93) and, in this case, we call the straight line $y = kx$ a secant of $\Gamma$. If, however, $a + bk = 0$, then $x = 0$ is a double solution of (93) and we call $y = kx$ a tangent of $\Gamma$. In particular, if $(0, 0)$ is a singular point, then $a = b = 0$ and $x = 0$ is a double solution for every line through this point. This allows us to conclude that on an irreducible cubic curve there is not more than one singular point. Assume that $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$ are two distinct singular points on $\Gamma$. We choose the system of coordinates such that $(0, 0) \in \Gamma$ and the direction of the unit vectors such that $x_1 \neq x_2$. We draw through the points $P_1$ and $P_2$ a straight line and form the equation for finding the points of intersection between the line and $\Gamma$. As $P_1$ and $P_2$ are both singular points, then, in view of the above, $x = x_1$ and $x = x_2$ both must be double solutions. But then we have produced at least 4 solutions, counted with multiplicity. This is a contradiction, since the degree of the equation is $\le 3$ (the cubic and the straight line have at most three points of intersection). This proves the assertion.¹⁵ □

¹⁵ Further examples in [12, p. 35]. Translator's note. Similar examples can be found in the excellent book [20, e.g. p. 57].
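The criterion for a singular point and the multiplicity of the solution $x = 0$ on a line $y = kx$ can be tested numerically. The following is a minimal sketch: the representation of a polynomial as a dictionary mapping exponent pairs $(i, j)$ to coefficients, the function names, and the two sample cubics $y^2 = x^3 + x^2$ and $y^2 = x^3 + x$ are assumptions made for the illustration and do not come from the text.

```python
def evaluate(poly, x, y):
    """Value of a polynomial given as {(i, j): coefficient of x^i y^j}."""
    return sum(c * x**i * y**j for (i, j), c in poly.items())

def partial(poly, var):
    """Formal partial derivative with respect to x (var=0) or y (var=1)."""
    result = {}
    for (i, j), c in poly.items():
        exponent = (i, j)[var]
        if exponent == 0:
            continue
        key = (i - 1, j) if var == 0 else (i, j - 1)
        result[key] = result.get(key, 0) + c * exponent
    return result

def is_singular(poly, x, y):
    """True if (x, y) lies on the curve and both first derivatives vanish."""
    return (evaluate(poly, x, y) == 0
            and evaluate(partial(poly, 0), x, y) == 0
            and evaluate(partial(poly, 1), x, y) == 0)

def multiplicity_at_origin(poly, k):
    """Multiplicity of the root x = 0 of f(x, k*x), for a curve through (0, 0)."""
    by_degree = {}
    for (i, j), c in poly.items():
        by_degree[i + j] = by_degree.get(i + j, 0) + c * k**j
    return min(d for d, c in by_degree.items() if c != 0)

nodal = {(0, 2): 1, (3, 0): -1, (2, 0): -1}   # y^2 - x^3 - x^2
smooth = {(0, 2): 1, (3, 0): -1, (1, 0): -1}  # y^2 - x^3 - x

print(is_singular(nodal, 0, 0), is_singular(smooth, 0, 0))
print(multiplicity_at_origin(nodal, 2), multiplicity_at_origin(smooth, 2))
```

For the first curve every line through the origin meets it there with multiplicity at least 2, as expected at a singular point; for the second curve a general line is a secant.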
1.7. Examples of birational equivalence

The following result gives a whole series of examples.

THEOREM 1.2. A cubic curve $\Gamma$ without singular points but with at least one rational point is birationally equivalent to a curve whose equation is

$$y^2 = x^3 + Ax + B, \quad \text{with} \quad 4A^3 + 27B^2 \neq 0.$$
PROOF. Let us choose the system of coordinates such that the origin coincides with one of the rational points $Q \in \Gamma$. Then the equation of the curve $F_3(x, y) = 0$ has no constant term and so the polynomial $F_3$ can be expressed as a sum of forms, $F_3(x, y) = H_1(x, y) + H_2(x, y) + H_3(x, y)$. The points of intersection of the straight line $y = tx$ with $\Gamma$ are found from the equation

$$0 = F_3(x, tx) = H_1(x, tx) + H_2(x, tx) + H_3(x, tx) = x H_1(1, t) + x^2 H_2(1, t) + x^3 H_3(1, t).$$

We see that $x = 0$ is a solution of this equation. The remaining solutions are found from the second order equation

$$H_1(1, t) + x H_2(1, t) + x^2 H_3(1, t) = 0, \tag{94}$$

which gives

$$x = \frac{-H_2(1, t) \pm \sqrt{H_2(1, t)^2 - 4 H_1(1, t) \cdot H_3(1, t)}}{2 H_3(1, t)} = \frac{-H_2(1, t) \pm z}{2 H_3(1, t)},$$

where we have denoted the square root by the symbol $z$. As $H_1(1, t)$, $H_2(1, t)$, $H_3(1, t)$ are polynomials in $t$, we see that $x$ is expressed rationally in terms of $t$ and $z$; as $y = tx$, also $y$ is then expressed rationally in terms of $t$ and $z$. The reasoning given means that $\Gamma$ is birationally equivalent to the curve

$$z = \sqrt{H_2(1, t)^2 - 4 H_1(1, t) \cdot H_3(1, t)},$$

or, what is the same, the curve

$$z^2 = H_2(1, t)^2 - 4 H_1(1, t) \cdot H_3(1, t) = P_4(t).$$

We seek now the tangent to $\Gamma$ through the given rational point $Q$, which we chose as the origin of the coordinates. In the previous Section we saw that the tangent intersects $\Gamma$ in a rational point $O$.¹⁶ This point is now chosen as a new origin for coordinates. Hence, for some $t_0 \in K$ the straight line $y = t_0 x$ is tangent to $\Gamma$ at the point $Q$ and passes through the point $O$. As $Q$ is a point of tangency, taking $t = t_0$ we have a multiple root of equation (94), so that the discriminant of this quadratic equation vanishes; in other words, $P_4(t_0) = 0$. Putting $\tau = t - t_0$ we expand $P_4(t)$ in powers of $\tau$:

$$P_4(t) = S_4(\tau) = a\tau + b\tau^2 + c\tau^3 + d\tau^4 = z^2.$$

This gives

$$\left(\frac{z}{\tau^2}\right)^2 = d + c\,\frac{1}{\tau} + b\,\frac{1}{\tau^2} + a\,\frac{1}{\tau^3}.$$

Denoting here $\dfrac{1}{\tau} = v$ and $\dfrac{z}{\tau^2} = u$ and taking $au = \alpha$, $av = \beta$, we obtain

$$a^2 u^2 = a^3 v^3 + a^2 b v^2 + a^2 c v + a^2 d \quad \text{and} \quad \alpha^2 = \beta^3 + b\beta^2 + ac\beta + a^2 d.$$

Finally, making the substitution $\gamma = \beta + \dfrac{b}{3}$ allows us to bring this equation into the desired form. As all changes of variable made have been rational, the assertion of the theorem is proved. □

¹⁶ Indeed, from the equation $\ell(x) = cx + d = 0$ we obtain $x_3 = -\dfrac{d}{c} \in K$, which also gives $y_3 \in K$.
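The last step of the proof can be made explicit. Assuming that the shift which removes the quadratic term is $\gamma = \beta + b/3$, the cubic $\beta^3 + b\beta^2 + ac\beta + a^2 d$ becomes $\gamma^3 + A\gamma + B$ with $A = ac - b^2/3$ and $B = a^2 d - abc/3 + 2b^3/27$. The sketch below computes these coefficients with exact rational arithmetic and compares both sides at a sample value; the function name and the sample numbers are hypothetical.

```python
from fractions import Fraction

def weierstrass_coefficients(a, b, c, d):
    """Coefficients (A, B) with
    beta^3 + b*beta^2 + a*c*beta + a^2*d == gamma^3 + A*gamma + B
    for gamma = beta + b/3 (the shift assumed in the lead-in)."""
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    A = a * c - b**2 / 3
    B = a**2 * d - a * b * c / 3 + 2 * b**3 / 27
    return A, B

# Sample coefficients and a sample value of beta (arbitrary choices).
a, b, c, d = 2, 3, 5, 7
beta = Fraction(4)
A, B = weierstrass_coefficients(a, b, c, d)
gamma = beta + Fraction(b, 3)

left = beta**3 + b * beta**2 + a * c * beta + a**2 * d
right = gamma**3 + A * gamma + B
print(A, B, left == right)
```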
III. The classification of algebraic curves.

Schüler: "Kann Euch nicht eben ganz verstehen."
Mephistopheles: "Das wird nächstens schon besser gehen, wenn Ihr lernt alles reduzieren und gehörig klassifizieren."
(The student: "I do not quite understand you now." Mephistopheles: "It will soon be much easier for you, when you have learnt to reduce and to classify appropriately.")
J. W. Goethe, "Faust"
1.8. The genus of an algebraic curve

Let there be given an irreducible algebraic curve $\Gamma$. To each point $P_i$ of $\Gamma$ we associate a natural number $r_i \ge 1$ (see Section 1.6), the order of the point. If the degree of $\Gamma$ is $n$, we have seen that one has the inequality

$$\sum_{P_i \in \Gamma} (r_i - 1) r_i \le (n - 1)(n - 2).$$

With each algebraic curve one can associate a non-negative integer $g$, its genus, which in the simplest cases can be found from the formula¹⁷

$$g = \frac{(n - 1)(n - 2)}{2} - \frac{\sum_{P_i \in \Gamma} (r_i - 1) r_i}{2}.$$

We saw above that in the case $K = \mathbb{C}$ there is associated to any algebraic curve a compact Riemann surface. Each such surface is, however, topologically equivalent to a "sphere with handles", and so the single topological invariant of the topological structure of this surface, the number of the "handles", determines the genus of the surface. In the special case $K = \mathbb{C}$ (that is, when the field of definition of the curve is the complex domain $\mathbb{C}$) the genus of the curve is nothing but the genus of the Riemann surface [14] corresponding to it.¹⁸ When we look at algebraic curves over number fields, the genus of a curve is determined not only by the structure of all algebraic points on the curve, but to a large extent also by the structure of its $\mathbb{Q}$- and $\mathbb{Z}$-points (cf. below the Mordell-Weil theorem). From the point of view of Diophantine problems it is therefore a significant fact that the genus $g$ is a birational invariant, being equal for all curves belonging to the same class of birationality. This gives a possibility for a classification of curves, and thereby also a classification of the corresponding Diophantine problems.

¹⁷ Each projective algebraic curve can be put into one-to-one correspondence with a plane curve with only ordinary 2-fold singularities (without multiple tangents at these points); so, the genus $g$ being a birational invariant, the given evaluation formula is universal (for, using it, we can find in each birational equivalence class a curve of genus $g$).
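The genus formula of this Section is easy to evaluate mechanically. The following sketch assumes the formula in the form $g = \frac{(n-1)(n-2)}{2} - \frac{1}{2}\sum (r_i - 1) r_i$, valid in the simplest cases mentioned above; the function name `genus` and its arguments (the degree and the list of multiplicities of the singular points) are hypothetical.

```python
def genus(n, multiplicities=()):
    """Genus of a plane curve of degree n whose singular points have the
    given multiplicities r_i, by the formula of Section 1.8."""
    defect = sum(r * (r - 1) for r in multiplicities)
    # (n-1)(n-2) and every r(r-1) are even, so the division is exact
    return ((n - 1) * (n - 2) - defect) // 2

print(genus(1), genus(2))      # straight lines and conics
print(genus(3), genus(3, [2])) # smooth cubic, cubic with one double point
print(genus(4))                # smooth quartic
```

Simple points ($r_i = 1$) contribute nothing to the sum, so only the singular points need to be listed.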
1.9. About classification
Curves of genus 0 are called rational. As the genus $g$ of a curve of order $n$ without singular points is given by the formula

$$g = \frac{1}{2}(n - 1)(n - 2),$$

we see that curves of degree one, i.e. straight lines, and second order curves are rational. On the other hand, David Hilbert and Adolf Hurwitz showed in 1890 that every rational curve is isomorphic to a plane second order curve $a_0 x_0^2 + a_1 x_1^2 + a_2 x_2^2 = 0$. The corresponding isomorphism is obtained by a substitution of variables where the coefficients determining the substitution belong to the field of definition of the curve. Curves of genus $g = 1$ are called elliptic; they are isomorphic to cubic curves without singular points. From this we see¹⁹ that if an elliptic curve has a rational point then it is birationally equivalent to a curve with equation $y^2 = x^3 + Ax + B$. Elliptic curves have received their name from the fact that if $K = \mathbb{C}$ then they admit a parametrization in terms of elliptic functions ([14, p. 261]²⁰). In the case when the ground field is $K = \mathbb{C}$, an elliptic curve corresponds to a Riemann surface of genus $g = 1$, that is, a torus. A torus is topologically the direct product of two circles. Therefore it is possible to define on it the structure of a compact Lie group. If an elliptic curve has a rational point, then one can take this point on the corresponding Riemann surface to be the zero element, and the composition can be expressed in terms of algebraic functions in the coordinates. A complex compact Lie group which at the same time is also an algebraic variety is called an Abelian variety. Basing himself on a series of fundamental results, A. Weil arrived at the opinion that any development in the area of elliptic curves will also mean major progress in the theory of general Abelian varieties. Therefore it is understandable why the class of these curves has been given paramount interest in Algebraic Geometry.

¹⁸ Translator's note. For Riemann surfaces, see also the reference indicated in footnote 9.
¹⁹ Cf. Theorem 1.2.
²⁰ Editors' note. Page number is given according to the Russian translation.
Curves of genus $g > 1$ are called non-elliptic.²¹ In the beginning of the 20th century the following conjecture was raised in Diophantine Geometry: non-elliptic curves over number fields have only a finite number of rational points. Despite the efforts of many mathematicians a corresponding theorem has not been established. Yu. I. Manin showed that the proof of this conjecture reduces to the generalized Mordell conjecture.²² It is not possible to give here a closer account of these curves, as so far this subject is rather difficult to study and a satisfactory theory is still missing.
1.10. Rational curves

Curves with genus $g = 0$ are called rational. The genus of a curve $\Gamma$ of degree $n$ is given by

$$g = \frac{(n - 1)(n - 2)}{2} - \frac{\sum_{P_i \in \Gamma} (r_i - 1) r_i}{2},$$

so that the criterion for a curve of order $n$ to be rational is

$$\sum_{P_i \in \Gamma} (r_i - 1) r_i = (n - 1)(n - 2).$$

The simplest curve is the straight line $y = 1$, for which the field of rational functions is $K(x, 1) = K(x)$. In the preceding Section we saw that straight lines are rational curves. This gives at once the following necessary condition for rationality. For a curve with field of rational functions $K(x, y)$ to be rational it is necessary that there exist a function $\varphi \in K(x, y)$ in terms of which $x$ and $y$ can be expressed rationally over the field $K$.²³ In order to illustrate the application of this condition we prove that the curve

$$x_0^{2/3} + x_1^{2/3} + x_2^{2/3} = 0 \tag{95}$$

is rational. To this end we observe that the values $x_0 = i$, $x_1 = \sin^3\alpha$, $x_2 = \cos^3\alpha$ satisfy equation (95). Taking

$$t = \tan\frac{\alpha}{2} = \frac{1 - \cos\alpha}{\sin\alpha},$$

we find

$$1 + t^2 = \frac{1 - \cos\alpha}{\sin\alpha} \cdot \frac{2}{\sin\alpha} = \frac{2}{\sin\alpha} \cdot t,$$

which yields

$$\sin\alpha = \frac{1 + t^2}{2t}, \qquad \cos\alpha = \frac{i(1 - t^2)}{2t}.$$

Thus we find that

$$c x_0 = -8t^3; \qquad c x_1 = i(1 + t^2)^3; \qquad c x_2 = (1 - t^2)^3, \tag{96}$$

where $0 \neq c \in \mathbb{C}$. In other words, as $-i c x_1 + c x_2 = 2 + 6t^4$ and $6t^4 = \dfrac{3t}{4}(8t^3) = -\dfrac{3t}{4}(c x_0)$, we obtain

$$t = \frac{4i}{3}\left(\frac{x_1}{x_0}\right) - \frac{4}{3}\left(\frac{x_2}{x_0}\right) + \frac{8}{3c x_0} \in \mathbb{C}\left(\frac{x_1}{x_0}, \frac{x_2}{x_0}\right), \quad c \in \mathbb{C}. \tag{97}$$

The relations (96) and (97) allow us to use the previous rationality condition, from which we conclude that the curve (95) is rational. We now answer the question raised in Section 1.4 about the rationality of quadratic curves.

²¹ As an example, one has the curves $p(x) y^2 + q(x) = 0$, where $\deg p(x) = g$, $\deg q(x) = g + 2$, and the equation $p(x) \cdot q(x) = 0$ has no multiple zeros. Such a curve is of degree $g + 2$, and its only singular point has multiplicity $g$. Therefore, by the formula given in Section 1.8, its genus is given by

$$\frac{(g + 2 - 1)(g + 2 - 2)}{2} - \frac{g(g - 1)}{2} = \frac{g(g + 1)}{2} - \frac{g(g - 1)}{2} = g.$$

²² Cf. Section 2.
²³ Indeed, this condition is also a sufficient one.
THEOREM 1.3. If a second order curve has a rational point, then there are (in the case of an infinite field $K$) infinitely many such points.

PROOF. Let $\Gamma$ be a second order curve with equation $f(x, y) = 0$ which has a rational point $Q = (x_0, y_0) \in \Gamma$. Consider the straight lines through this point: $x - x_0 = t(y - y_0)$. We seek the points of intersection of $\Gamma$ with such a line. To this end we have to solve the following quadratic equation in $y$: $f(x_0 + t(y - y_0), y) = 0$. We know one solution, $y = y_0$. The second solution can be expressed, in view of Viète's rule, in terms of $y_0$ and the coefficients of this quadratic equation, that is, it is expressed rationally in terms of $x_0$, $t$, $y_0$ and elements of $K$. In other words, as $x_0, y_0 \in K$, $y$ is expressed rationally in terms of $t$. But then $x = x_0 + t(y - y_0)$ is, likewise, expressed rationally in terms of $t$. The reasoning given shows that the points of intersection with $\Gamma$ of the "rational" straight lines through the point $Q$ ($t \in K$) are rational points. □

If $K = \mathbb{Q}$, then the presence of rational points on a second order curve can be checked by an effective process of calculation (given by the Minkowski-Hasse Theorem).
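The construction in the proof can be carried out with exact rational arithmetic. The sketch below assumes that the conic is written as $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$ and that the lines through $Q = (x_0, y_0)$ are parametrized as $x - x_0 = t(y - y_0)$; the function name `second_point` and the coefficient ordering are hypothetical. Writing $s = y - y_0$, the substitution gives a quadratic in $s$ without constant term, so the second solution is minus the linear coefficient divided by the quadratic one.

```python
from fractions import Fraction

def second_point(conic, q, t):
    """Second intersection of the conic A x^2 + B xy + C y^2 + D x + E y + F = 0
    with the line x - x0 = t (y - y0) through the rational point q = (x0, y0)."""
    A, B, C, D, E, F = (Fraction(v) for v in conic)
    x0, y0 = (Fraction(v) for v in q)
    t = Fraction(t)
    if A * x0**2 + B * x0 * y0 + C * y0**2 + D * x0 + E * y0 + F != 0:
        raise ValueError("q does not lie on the conic")
    quad = A * t**2 + B * t + C                    # coefficient of s^2
    lin = 2 * A * x0 * t + B * (x0 + t * y0) + 2 * C * y0 + D * t + E  # coefficient of s
    if quad == 0:
        return None  # no second intersection in the affine plane
    s = -lin / quad  # the other root, besides s = 0
    return (x0 + t * s, y0 + s)

circle = (1, 0, 1, 0, 0, -1)  # x^2 + y^2 - 1 = 0
for t in (1, 2, Fraction(1, 3)):
    x, y = second_point(circle, (1, 0), t)
    print(t, (x, y), x**2 + y**2 == 1)
```

Each rational value of $t$ gives a rational point of the circle, in line with the statement of the theorem for $K = \mathbb{Q}$.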
1.11. Elliptic curves

Let us begin with the birational classification of elliptic curves. If the base field $K$ is algebraically closed, then each elliptic curve over $K$ is birationally equivalent to a curve in the so-called "Weierstrass normal form":

$$y^2 = x^3 + ax + b, \qquad a, b \in K.$$

Two curves with such an equation are birationally equivalent if and only if their absolute invariants coincide, the absolute invariant $j$ of an elliptic curve being given by the formula

$$j = \frac{4a^3}{4a^3 + 27b^2}, \qquad j \in K.$$

If the base field $K$ is not algebraically closed, the classification requires the use of a cumbersome technical apparatus. If on the elliptic curve $\Gamma$ there is a $K$-rational point $O$²⁴, we can endow the set of its $K$-rational points, which we denote by $G(\Gamma, K) = G$, with the structure of an Abelian group with zero element at the point $O$. We do this in the following way (see Figure 9). As zero in the group $G$ we take the point $O$. The inverse to a given point $P$ is the point $-P$ obtained as the third point of intersection with $\Gamma$ of the straight line through $O$ and $P$. Next, let there be given two rational points $P_1$ and $P_2$ on $\Gamma$ (see Figure 10).

[Fig. 9: an elliptic curve with the points $O$, $P$ and $-P$.]

²⁴ If $K$ is a finite field, then such a point always exists (Theorem of F. K. Schmidt). In the general case the existence of such a point may be a truly serious question.
Fig. 10 (the points P1, P2, Q, O and P1 + P2 = −Q on the curve Γ)
We draw a secant through these two points and take the inverse −Q of the third point of intersection Q obtained in this way. We define P1 + P2 = −Q. If P1 ≡ P2 we take, instead of a secant, the tangent at the point P1 ≡ P2. It is possible to show that the addition of points defined in this way makes the set G = G(Γ, K) into an Abelian group. 25 This group turns out to be a birational invariant, that is, the groups of elliptic curves belonging to one and the same birationality class are isomorphic. This circumstance permits us to give a much better idea of the group G. Indeed, the group G(Γ, K) being a birational invariant, the result given in the beginning of this Section permits us to replace the elliptic curve Γ with a curve Γ′ in normal form:
\[ \Gamma' : \; y^2 = x^3 + ax + b. \]
Passing to homogeneous coordinates, writing ξ1/ξ0 = x, ξ2/ξ0 = y, ∞ = (0, 0, 1), we get for the equation of Γ′
\[ \xi_0 \xi_2^2 = \xi_1^3 + a\,\xi_0^2 \xi_1 + b\,\xi_0^3. \]

25 It is possible to define on the set G a binary algebraic operation, denoted ◦, by the formula P1 ◦ P2 = Q. R. H. Bruck and V. D. Belousov call such a system a TS-quasigroup. (Translator's remark. TS stands for totally symmetric. See [2].) This algebraic object allowed Yu. I. Manin, recently, to realize an interesting geometric idea and find an essential generalization of results corresponding to the results of classical Diophantine Geometry. See [15].
Clearly ∞ ∈ Γ′. We set ∞ ≡ O, that is, we take ∞ as the zero element of the group G(Γ′, K). As G(Γ′, K) ≅ G(Γ, K) (the groups being isomorphic), we obtain for the group G(Γ, K) the following structure formulae 26:

If P = (x, y), then −P = (x, −y);

If P1 = (x1, y1) and P2 = (x2, y2) with x1 ≠ x2, then P1 + P2 = P3 = (x3, y3), where
\[ x_3 = -(x_1 + x_2) + \Big(\frac{y_1 - y_2}{x_1 - x_2}\Big)^2, \qquad y_3 = -y_1 - \frac{y_1 - y_2}{x_1 - x_2}\,(x_3 - x_1). \]

Let us look at elliptic curves over the rational field, that is, the special case K = Q. In 1901, H. Poincaré made the conjecture that this group has a finite number of generators. This assertion was confirmed: in 1922 L. J. Mordell obtained its proof. Six years later A. Weil managed to extend the theorem to arbitrary number fields. This theorem, which is called the Mordell-Weil theorem, has several important applications in Diophantine Geometry. Several proofs of this theorem have been given, but they are all non-effective, that is, they give only an upper bound for the number of generators, but not a method for finding these generators. So in most cases the structure of the group G(Γ, K) remains unknown to us. However, in 1935, Tryggve Nagell gave the following method for finding the elements of finite order in G(Γ, Q). The elliptic curve Γ has to be presented in the normal form
\[ y^2 = x^3 - Ax - B, \qquad A, B \in \mathbb{Z}. \]
Then all rational points of finite order on Γ (that is, Q-points of finite order) must have integer coordinates x and y; moreover, either y = 0 or y² is an integer divisor of [the discriminant] 4A³ − 27B². Nagell's result shows that all Q-points of finite order can be found by checking the finitely many points allowed by these conditions. The English mathematicians B. J. Birch and H. P. F. Swinnerton-Dyer recently made a number of highly credible conjectures about the structure of the group of Q-points on an elliptic curve (see [18]). These conjectures are based on empirical material obtained using computers, because checking the assertions required a large number of bulky computations.

Despite the apparently fragmentary form of the material set out here, we still venture to hope that the Reader has found some confirmation of Lagrange's words:

As long as algebra and geometry developed separately, their progress was slow and the applications limited. By making friends each got new vigor from the other, and now they move with a much greater speed in the direction of completion.

26 Translator's note. In the theory of elliptic functions, these formulae are equivalent to the Weierstrass addition theorem. See e.g. the book of Hurwitz mentioned in footnote 9.
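The secant construction of Section 1.11 and Nagell's criterion can be combined into a small experiment. What follows is only a sketch under stated assumptions: the curve is taken in the form y² = x³ − Ax − B with integers A, B; the tangent slope (3x² − A)/(2y) used for doubling is the standard one and is not written out in the text; the bound of 12 on the order of a point is borrowed from a later theorem of Mazur, not from this article; and all function names are illustrative.

```python
from fractions import Fraction
from math import isqrt

O = None  # the zero element (the point at infinity)

def add(P1, P2, A):
    """Sum of two points on y^2 = x^3 - A x - B by the secant/tangent rule."""
    if P1 is O:
        return P2
    if P2 is O:
        return P1
    (x1, y1), (x2, y2) = P1, P2
    if x1 == x2 and y1 == -y2:          # P2 = -P1
        return O
    if P1 == P2:                        # tangent (assumed standard slope)
        lam = (3 * x1 * x1 - A) / (2 * y1)
    else:                               # secant
        lam = (y1 - y2) / (x1 - x2)
    x3 = lam * lam - x1 - x2
    y3 = -(y1 + lam * (x3 - x1))        # reflect the third intersection point
    return (x3, y3)

def candidates(A, B):
    """Integer points allowed by Nagell's criterion: y = 0 or y^2 | 4A^3 - 27B^2."""
    D = 4 * A**3 - 27 * B**2
    assert D != 0, "singular curve"
    ys = [0] + [y for y in range(1, isqrt(abs(D)) + 1) if D % (y * y) == 0]
    pts = []
    for y in ys:
        bound = 1 + max(abs(A), abs(B + y * y))   # Cauchy bound on integer roots
        for x in range(-bound, bound + 1):
            if x**3 - A * x - B == y * y:
                pts.append((Fraction(x), Fraction(y)))
                if y:
                    pts.append((Fraction(x), Fraction(-y)))
    return pts

def order(P, A, max_order=12):
    """Order of P if it is at most max_order, else None."""
    Q = P
    for n in range(1, max_order + 1):
        if Q is O:
            return n
        Q = add(Q, P, A)
    return None

# Example: y^2 = x^3 + 1, that is A = 0, B = -1.
orders = {P: order(P, 0) for P in candidates(0, -1)}
```

For this example curve the candidates are (−1, 0), (0, ±1) and (2, ±3), and the repeated additions return to O after 2, 3 and 6 steps respectively. Nagell's conditions are necessary only, so in general each candidate still has to be checked as `order` does here.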
References
[1] I. G. Bashmakova. Diophantus and Diophantine Equations. Dolciani Mathematical Expositions (20). Math. Assoc. Amer., Washington, DC, 1997. Translated from Russian: Nauka, Moscow, 1972.
[2] V. D. Belousov. Some remarks on TS-quasigroups. Kišinev. Gos. Univ. Učen. Zap. 91, 1967, 3–8.
[3] M. Berger. Geometry, I–II. Universitext. Springer-Verlag, New York, 1987.
[4] P. M. Cohn. Algebra, Vol. 1–3. Second Edition. John Wiley & Sons, Chichester, et al., 1981; 1989; 1991. Russian translation: Mir, Moscow, 1968.
[5] M. Hazewinkel (ed.). Encyclopaedia of Mathematics. Kluwer Academic Publishers, Dordrecht, Boston, London, 1988. A translated and expanded version of a Soviet mathematics encyclopedia, in ten volumes.
[6] A. Hurwitz. Vorlesungen über Allgemeine Funktionentheorie und Elliptische Funktionen; herausgegeben und ergänzt durch einen Abschnitt über geometrische Funktionentheorie von R. Courant. Die Grundlehren der mathematischen Wissenschaften, Bd. 3. Springer Verlag, Berlin, New York, 1964.
[7] U. Kaljulaid. Lenin prize for work in Diophantine geometry. Math. and Our Age 14, 1968, 108–110. (See [K68b].)
[8] G. Kangro. Kõrgem algebra (Higher algebra). Eesti Riiklik Kirjastus, Tallinn, 1962. (Estonian.)
[9] A. G. Kurosh. Lectures in general algebra. Fizmatgiz, Moscow, 1962. English translation: Pergamon Press, Oxford, London, Edinburgh, New York, 1965.
[10] A. G. Kurosh. A course of higher algebra. Nauka, Moscow, 1975.
[11] S. Lang. Algebra. Reading, Massachusetts, 1965. Russian translation: Mir, Moscow, 1968.
[12] Ü. Lumiste. Diferentsiaalgeomeetria (Differential geometry). Eesti Riiklik Kirjastus, Tallinn, 1963. (Estonian.)
[13] Ü. Lumiste. The notion of space in geometry. Geometry and transformation groups. Math. and Our Age 14, 1968, 3–21.
[14] Ü. Lumiste. Riemann as the founder of topology and the general curved space. Math. and Our Age 11, 1966, 65–76. (Estonian.)
[15] Yu. I. Manin. Cubic hypersurfaces, I. Izv. Math. Nauk 32 (6), 1968.
[16] A. Seidenberg. Elements of the theory of algebraic curves. Addison-Wesley Pub. Co., Reading, Mass., London, Don Mills, Ont., 1968.
[17] J. G. Semple and T. T. Kneebone. Algebraic projective geometry. Oxford Science Publications. Oxford University Press, New York, etc., 1998.
[18] H. P. F. Swinnerton-Dyer and B. J. Birch. Elliptic curves and modular functions. Lecture Notes in Math. 476, 1975, 2–32.
[19] B. L. van der Waerden. Moderne Algebra, I; II. Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete. Springer, Berlin, 1930; 1931. Many subsequent editions, from the 4th ed., 1950, on, with the changed title Algebra, dropping the word “Moderne” (Modern). Russian translation: Nauka, Moscow, 1979.
[20] R. J. Walker. Algebraic curves. Princeton Mathematical Series, vol. 13. Princeton University Press, Princeton, N. J., 1950.
2.
[K68b] Lenin prize for work in Diophantine geometry Comments by M. Tsfasman
In the year 1967 the young Moscow mathematician Yu. I. Manin was honored with the Lenin prize. Yuri Ivanovich Manin was born in Simferopol 27 on February 16, 1937. In 1958 he got his diploma from Moscow State University, and began his research studies under the direction of Prof. I. R. Shafarevich. In 1960 Manin defended his Candidate thesis “On the theory of Abelian varieties”. There the author considers the theory of algebraic curves over a field of finite characteristic and discovers a number of original analogies with the classical case, where the base field is the field of complex numbers. Since 1960 Manin has worked at the Steklov Institute of the Soviet [now Russian] Academy of Sciences.28 In 1963 he defended his Doctoral thesis “The theory of formal series in the case of a finite field”. Since 1965 he has also been a professor at Moscow University, where he directs, together with Shafarevich, an extensive activity in the founding of a school of Algebraic Geometry.

Manin's domain of research is Diophantine Geometry, a subject which has its roots in antiquity; in the third century A.D. Diophantos raised the problem of finding solutions in rational numbers of an equation with rational coefficients. The study of problems of this kind began already in early medieval times in China and in India. More serious progress was, however, obtained in the work of such classics as Euler, Lagrange, and Gauss. A new stage in the study of these problems was opened up in the 20th century, when various possibilities for classifying and studying Diophantine problems with the aid of methods of algebraic geometry were found. A new branch of mathematics, Diophantine Geometry, arose, where the main object of study is the structure of some given “arithmetic” variety over some field and its dependence on the “arithmetic” of this field. Today these questions are seen as a touchstone for the methods of algebraic geometry.
Already the arithmetic of one-dimensional varieties, or algebraic curves, is of interest. The connection with Diophantine problems here is the following. To each algebraic equation in two variables, whose coefficients are elements of some field, there corresponds an algebraic curve over this field. To each algebraic curve there corresponds a non-negative integer, the genus of the curve. The curves of genus 0 are the rational curves, the ones of genus 1 the elliptic ones. It has been found that the arithmetic of an algebraic curve, that is, the nature of the Diophantine problem, depends on the genus of the curve.

27 Translator's note. In the Crimea, now belonging to Ukraine (population 352000 in 1995).
28 Editors' note. He is now the director of the Max-Planck-Institute of Mathematics in Bonn, Germany.
In 1922 L. J. Mordell raised the following conjecture in Diophantine geometry: on a non-elliptic curve over a number field (that is, a field consisting of complex numbers) there are only a finite number of rational points (i.e., points with rational coordinates). For a variety of reasons, the solution of this question acquired a special importance for Diophantine Geometry, but despite efforts by many mathematicians no path was found leading even near to its solution. A number of observations led to the generalized Mordell conjecture (A. Néron, S. Lang): also in the case of a not finitely generated field or a function field (where some of the generators may be transcendental over the base field) one has only finitely many rational points. This generalized conjecture was considered as inaccessible as Mordell's original conjecture. But in 1961 Manin succeeded in proving the following:

THEOREM 2.1. Every non-elliptic curve over a finitely generated field has either finitely many rational points or else it is possible to transform it, by a change of variable, into a curve in whose equation there are no transcendental coefficients.

In the case of a number field, this question reduces to a special case which it is not possible to treat with the methods developed so far. The proof of Manin's theorem is rather complicated. It relies on several deep algebraic, topological and analytic methods.

In his doctoral dissertation Manin developed a theory for commutative formal groups. This is of course a sequel to the local theory of Lie groups. It is well known that in a neighborhood of the unit element in a Lie group one can introduce a system of real coordinates such that if the elements X and Y are sufficiently close to the unit element then the coordinates of the element Z = X · Y can be expressed in terms of the coordinates of X and Y; in this way one obtains a collection of power series zi = αi(x1, . . . , xn; y1, . . . , yn).
This collection of convergent power series, which satisfies the axioms of a group, defines the structure of a local Lie group. The notion of formal Lie group arises from here if we drop the requirement of the convergence of the series, that is, if one considers the power series as formal power series, the coefficients of which, however, are assumed to be members of a field of finite characteristic. Jean Dieudonné worked out an apparatus for the study of commutative formal groups, which plays about the same role as Lie algebras in the study of local Lie groups. In this research he was led to the problem of classification of commutative formal groups up to isomorphism. He realized the complexity of the problem, noting that not even the complexity of the problem was amenable to analysis. In Manin's doctoral dissertation this problem was given a definitive solution; the author also obtained other remarkable results.

The name Yuri Ivanovich Manin has become well known among mathematicians all over the world. He has repeatedly been invited to lecture in France and Italy. The broader mathematical public in the Soviet Union knows him as the editor of the Russian translation of the algebra books of Bourbaki's “Elements of mathematics”, and further as the author of popular articles treating interesting problems in algebra (see, for instance, The Encyclopedia of Elementary Mathematics, IV. Moscow, 1963 [Russian]). There arises, of course, the question how it is possible to work simultaneously in so many directions and on truly difficult problems. Manin's teacher Shafarevich says the following about this 29:

29 From the journal Molodoĭ Kommunist (The Young Communist), No. 3, 1964.
All who know Manin are amazed at his bright mathematical talent and his ability to work hard and with aspiration. It has happened that he has finished a paper, where he had to overcome great obstacles and the presentation of which required many tens of pages, and one might have expected that now there would be a break . . . But already the following day he had completely dug himself into the solution of another problem. Probably it is his extraordinary capacity for work that explains how he can simultaneously treat so many things. And these things not only do not seem to obstruct his principal activity but even seem to assist him in it.

Manin, addressing himself to young Readers, writes as follows 30:

Basing myself on my own modest personal experience, I would like to say to those who are 16-17 years of age: don't be afraid of the scientific literature! Any one of you will be able to understand what is written there and follow what is known, and what is not, what problems are posed and the solution of which is under way. But one should not believe that this is easy. It is hard. But not harder than occupying oneself, besides the usual school curriculum, with music, or with matters concerning radio31.
Comments.
The above article was written when Manin was 31, a young very promising mathematician. He himself says that the prize he got was somewhat embarrassing for him, and he considered it as an advance for work yet to be done. Now he is 66 and one of the most famous mathematicians of his generation. Even the formal list of his distinctions is impressive. He has been an invited speaker at five ICM congresses; recipient of the Moscow Mathematical Society Award, the Lenin Prize for work in Algebraic Geometry, the Brouwer Gold Medal for work in Number Theory, the Frederic Esser Nemmers Prize in Mathematics, the Rolf Schock Prize in Mathematics, the Georg Cantor Medal, the King Faisal International Prize for Mathematics; elected member of at least 8 academies; and so on.

The diversity of his mathematical interests is striking. Although his research is extremely broad and characterized by special attention to the interrelations of different branches of science, it has two principal centers: the meeting point of number theory and algebraic geometry, and that of algebra and physics. To name briefly some fields he developed, at least the following come to mind: the function field analogue of the Mordell conjecture; the Gauss-Manin connection; formal groups and Dieudonné modules; cubic forms and arithmetic of rational varieties; a counter-example to the Lüroth problem for threefolds; modular forms, p-adic theory of automorphic functions, and Manin-Drinfel'd symbols; distribution of rational points on algebraic varieties; approaches to the theory of real multiplication; matrix solitons; instantons; homogeneous super spaces and super strings; the Polyakov measure and the Selberg zeta function; mirror symmetry; quantized theta functions, quantum cohomology and Frobenius manifolds; quantum computing.

Manin's impact by far exceeds the research results of his own.
His books on algebraic geometry, K-theory, cubic forms, linear algebra, homological algebra, mathematical logic, number theory, gauge fields, elementary particles, quantum cohomology are widely read. The number and quality of his students, the influence of his knowledge and ideas, his enticing lecturing style, the broadness of his intellect, his agreeable personality, all this forms a unique image of a scientist and scholar we admire. The list of mathematicians considering him as a teacher can hardly be given in full. The Ph.D. theses of A. Beilinson, A. Belskiĭ, V. Berkovich, I. Cherednik, V. Danilov, E. Demidov, V. Drinfel'd, M. Frumkin, A. Geronimus, El Hushi, V. Iskovskih, G. H. Höhn, D. Kanevskiĭ, M. Kapranov, R. Kaufmann, Kha Huy Khoai, K. Kii, V. Kolmykov, V. Kolyvagin, P. Kurchanov, D. Lebedev, D. Leites, A. Levin, B. Martynov, Hoang Le Minh, G. Mustafin, A. Panchishkin, I. Penkov, A. Roitman, G. Shabat, A. Shermenev, V. Shokurov, A. Skorobogatov, Yu. Tschinkel, M. Tsfasman, B. Tsygan, Yu. Vainberg, A. Vaintrob, A. Verevkin, M. Vishik, S. Vladuts, A. Voronov, M. Wodzicki, Yu. Zarhin bear Manin's name as thesis advisor. The list of his students is much vaster, including M. Kontsevich, S. Merkulov, V. Serganova, I. Zaharevich, and many others.

His non-mathematical interests are no less wide-ranging than his mathematical ones. He has published research and expository papers on literature, linguistics, glotto-genesis, mythology, semiotics, physics, history of culture, and philosophy of science. The example he set for those around him was not that of a monomaniac mathematician, but of a deep scholar “par excellence” for whom the penetration into the mystery of knowledge is much more important than professional success.

Michael Tsfasman

30 From the journal Molodoĭ Kommunist (The Young Communist), No. 3, 1964.
31 Translator's note. The contemporary Reader could perhaps substitute IT for the word radio.
[1], [2]
References
[1] Yuri I. Manin, Selected papers. World Scientific Series in 20th Century Mathematics, 3. World Scientific Publishing Co., Inc., River Edge, NJ, 1996.
[2] Dedicated to Yuri I. Manin on the occasion of his 65-th birthday. Moscow Math. J. 2 (3), 2002, 108–110.
3.
[K69c] The history of solving equations
In this paper we shall be concerned with the algebraic equation in one variable
\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0 \quad (n \ge 1), \]
where the coefficients ai are complex numbers32. For equations of degree three and four, solution formulae were found only in the 16th century (in Italy); in the case n = 5, however, all attempts to find solution formulae turned out to be fruitless. The problem of finding solution formulae for equations of higher degree appeared in a new light at the end of the 18th century, when J. L. Lagrange discovered the notion of transformation group and, upon applying it to equations, found basic principles for their study. In the general case the problem was solved, some 60 years later, by É. Galois, whose ideas turned out to be fertile not only in algebra but also in geometry (the work of S. Lie, F. Klein, É. Cartan etc.), in the theory of differential equations, and elsewhere. The reason for this was probably that Galois studies in his work important mathematical structures and their interrelations in “pure form”. He was the first to state that the future mathematics is l'analyse de l'analyse (the analysis of analysis), the object of which is the study of mathematical structures 33. The Reader will observe that Galois was a forerunner of the famous Nicolas Bourbaki. The latter's widely spread (but not generally accepted) point of view, that mathematics is a hierarchy of structures and that its central problem (and even that of natural science) is the study of these structures and of their interrelations, is a confirmation of this. The author hopes to acquaint the Reader below (in broad outline) with the proof of the following fact:

THEOREM 3.1. An arbitrary n-th order algebraic equation
\[ a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n = 0 \quad (n \ge 5), \tag{98} \]
whose coefficients ai are independent in the complex domain, cannot be solved in terms of radicals.

This means that there cannot exist a formula which produces the solution of an arbitrary n-th order algebraic equation by applying to its coefficients ai a finite number of arithmetical operations and extractions of roots. One should here bear in mind that the equation (98) of course has solutions, but these cannot be expressed in terms of the coefficients ai by a “nice” formula. In many concrete special cases there may be solution formulae in terms of radicals. But often these formulae are so complicated that in order to find the solutions of (98), and in order to learn the properties needed in its application, one uses indirect methods. 34

32 As the complex numbers got “priority” only in the beginning of the 19th century, we consider, for a while (Sections 3.1–3.4), only equations with real coefficients and seek only their real solutions. Nevertheless it is possible to obtain from the formulae found also complex solutions of these equations.
33 The contemporary, deeply founded notion of mathematical structure is due to [2, Chapitre I: Description de la mathématique formelle; Chapitre II: Théorie des ensembles.]
3.1. Equations solvable in terms of radicals

The White Rabbit put on his spectacles. “Where shall I begin, please your Majesty?” he asked. “Begin at the beginning,” the King said gravely, “and go on till you come to the end: then stop.”
Lewis Carroll, Alice in Wonderland
1. The first information about the solution of algebraic equations comes from Ancient Egypt. (Cf. [14, p. 110].) The attempts to solve non-linear algebraic equations showed that the solution can no longer be expressed in terms of the equation's coefficients by applying to them a finite number of arithmetical operations (addition, subtraction, multiplication, division); in other words, it appeared that the solution cannot always be expressed in rational terms by the given quantities. For instance, already from the solution formula for the quadratic equation
\[ x^2 + px + q = 0, \qquad x = -\frac{p}{2} \mp \sqrt{\frac{p^2}{4} - q}, \]
it is seen that in addition to the arithmetical operations there comes a square root extraction. This formula was known already in Ancient Babylon.

2. The ancient Greeks rediscovered the solution formula of the quadratic equation, expressing it in geometric terms. It is also well known that the ancients liked to reduce the solution of algebraic equations to the search for the intersection of two auxiliary curves, or else to the repeated application (“iteration”) of this procedure. For example, the equation y³ = ab² was solved by intersection of two conic sections, the parabola y² = bx and the hyperbola xy = ab. The duplication of the cube is a special case of this problem, for a = 2b. The problem of intersecting a sphere by a plane in such a way that the areas of the two segments arising are in a given ratio to each other led to a cubic equation, for the solution of which geometrical methods were again employed. (We leave it to the Reader to derive, as a problem, this cubic equation and to find the corresponding conic sections.) The main attention was directed to the case when the auxiliary curves were circles or straight lines. Each such construction (by ruler and compass) reduces to finding the intersection of two straight lines; a straight line and a circle; or two circles and, if necessary, iteration of this method.
When in the 17th century the method of coordinates was taken into use, it became manifest that in the case at hand the application of the geometrical method is equivalent to the sequential solution of a chain of linear and quadratic equations. In other words, the possibility of carrying out these constructions was related to the problem of the solvability of the cubic equation in square roots. 35

34 For a short introduction, see [7, 11]. Editors' note: See also article [12].
35 In greater detail about this in [9].
3. In the middle ages, many attempts were made to solve the cubic equation; let us recall, for instance, the attempt of the famous Italian mathematician Leonardo Fibonacci36, around 1225. Finally, Scipione del Ferro, professor at the university of Bologna, succeeded, in 1506–1515, in finding a solution formula for the equation a + bx + x³ = 0. But he kept his discovery secret, revealing it only to his student Antonio Maria Fiore. The latter had a dispute with his gifted compatriot Niccolo Tartaglia, which was later followed by a public disputation. The problem sent to Tartaglia required a method of solution for the equation x³ + ax² = b. Tartaglia had prepared himself well for the contest: he found not only the solution of x³ + ax² = b but also that of x³ + ax = b (in 1535)! He published the results first as a pentagram (1539) and then with a full description. But they became widely known only through the well-known treatise written by the Italian mathematician Geronimo Cardano, Ars magna, sive de regulis algebraicis (The great art, or on the rules of algebra, 1545). Therefore the solution formula for the equation x³ + bx + a = 0 is also known as Cardano's formula,
\[ x = \sqrt[3]{-\frac{a}{2} + \sqrt{\Delta}} + \sqrt[3]{-\frac{a}{2} - \sqrt{\Delta}}, \qquad \text{where } \Delta = \frac{a^2}{4} + \frac{b^3}{27}. \]
But the general cubic equation a0 + a1x + a2x² + a3x³ = 0 can be reduced to the above special case by the rational change of variable 37
\[ x = y - \frac{a_2}{3a_3}. \]
Therefore Cardano's formula can also be used to solve the general cubic equation. This substitution was presumably known to Cardano, because he transformed all his cubic equations to a form where the quadratic term was absent. However, it was François Viète who was to present the change of variable just mentioned. The Dutchman Jan Hudde simplified Viète's treatment considerably in 1658, whereby the solution of the cubic equation took more or less its present day form. In doing this he used the symbolic formalism invented by R. Descartes.

4. Several facts about equations of the fourth degree [or quartic equations] were known already to Apollonius (around 200 B.C.). Also Arabian mathematicians, in the middle ages, knew how to solve some such equations. For example, in order to solve the equation x⁴ + px³ = q they sought the intersection between the parabola y = x² and the hyperbola y² + pxy − q = 0. After having been successful in the case n = 3, the 16th century mathematicians began to look intensively for a solution formula for the equation of the fourth degree. Thus Cardano worked for a long time on this, but his efforts gave no harvest and he gave up. Instead he directed his student Luigi Ferrari to continue. In 1545, Ferrari found the sought method. The equation a0 + a1z + a2z² + a3z³ + z⁴ = 0 can be brought to the form r + qx + px² + x⁴ = 0 by changing the variable z = x − a3/4. Introducing a new parameter y, we can present the last equation as
\[ (x^2 + p + y)^2 = (p + 2y)x^2 - qx + (p^2 - r + 2py + y^2). \]
We determine y in such a way that the right hand side of the last equation is a perfect square: it transpires that this is so if and only if
\[ 4(p + 2y)(p^2 - r + 2py + y^2) = q^2. \]
But this is a cubic equation for y. After y has been found (it turns out that it suffices to know only one root of the cubic), we write the right hand side as (Px + Q)² and obtain for the determination of x two quadratic equations:
\[ x^2 + p + y = Px + Q \qquad \text{and} \qquad x^2 + p + y = -(Px + Q). \]

36 Translator's note. Also known as Leonardo of Pisa or Pisanus; Leonardo's father's name was Bonacci, so the son wrote himself as Leonardo filius Bonacci; the name Fibonacci was first used only in the first half of the 19th century, by the Italian mathematician and mathematical historian Guglielmo Libri Carucci della Sommaja (1803–1865), at the same time a great scoundrel (a thief of old books). Leonardo's algebra book starts with the sentence in Latin: Incipit liber Abbaci compositus a Leonardo filio Bonnaci Pisano, in anno 1202. (Quoted from [10, p. 133].)
37 Translator's note. Known as a Tschirnhaus transformation.
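The two reductions just described, Cardano's formula for the cubic and Ferrari's auxiliary cubic for the quartic, can be followed step by step in code. This is only an illustrative sketch, assuming floating-point complex arithmetic (so the results are approximate) and hypothetical function names; the expansion of the condition 4(p + 2y)(p² − r + 2py + y²) = q² into the cubic 8y³ + 20py² + (16p² − 8r)y + (4p³ − 4pr − q²) = 0 was done by hand for this sketch and does not appear in the text.

```python
import cmath

def cardano(b, a):
    """All three roots of x^3 + b*x + a = 0 by Cardano's formula."""
    s = cmath.sqrt(a * a / 4 + b ** 3 / 27)      # square root of Delta
    u3 = -a / 2 + s
    if abs(u3) < 1e-12:                          # avoid dividing by u = 0 below
        u3 = -a / 2 - s
    if abs(u3) < 1e-12:                          # then a = b = 0: triple root 0
        return [0j, 0j, 0j]
    u = complex(u3) ** (1 / 3)                   # one complex cube root
    omega = complex(-0.5, 3 ** 0.5 / 2)          # primitive cube root of unity
    # The second cube root is tied to the first by u*v = -b/3.
    return [u * omega ** k - b / (3 * u * omega ** k) for k in range(3)]

def cubic(a0, a1, a2, a3):
    """Roots of a0 + a1 x + a2 x^2 + a3 x^3 = 0 via x = y - a2/(3 a3)."""
    c0, c1, c2 = a0 / a3, a1 / a3, a2 / a3
    b = c1 - c2 ** 2 / 3
    a = 2 * c2 ** 3 / 27 - c1 * c2 / 3 + c0
    return [y - c2 / 3 for y in cardano(b, a)]

def quadratic(c1, c0):
    """Roots of x^2 + c1 x + c0 = 0."""
    d = cmath.sqrt(c1 * c1 - 4 * c0)
    return [(-c1 + d) / 2, (-c1 - d) / 2]

def ferrari(a0, a1, a2, a3):
    """Roots of a0 + a1 z + a2 z^2 + a3 z^3 + z^4 = 0 by Ferrari's method."""
    # Remove the cubic term with z = x - a3/4.
    p = a2 - 3 * a3 ** 2 / 8
    q = a1 - a2 * a3 / 2 + a3 ** 3 / 8
    r = a0 - a1 * a3 / 4 + a2 * a3 ** 2 / 16 - 3 * a3 ** 4 / 256
    # Auxiliary cubic for the parameter y (expanded form, see lead-in).
    ys = cubic(4 * p ** 3 - 4 * p * r - q ** 2, 16 * p ** 2 - 8 * r, 20 * p, 8)
    y = max(ys, key=lambda t: abs(p + 2 * t))    # one root suffices
    P = cmath.sqrt(p + 2 * y)
    if abs(P) > 1e-12:
        Q = -q / (2 * P)
    else:                                        # degenerate case p + 2y = 0
        Q = cmath.sqrt(p * p - r + 2 * p * y + y * y)
    # x^2 + p + y = Px + Q  and  x^2 + p + y = -(Px + Q)
    xs = quadratic(-P, p + y - Q) + quadratic(P, p + y + Q)
    return [x - a3 / 4 for x in xs]

# Example: (z-1)(z-2)(z-3)(z-4) = 24 - 50 z + 35 z^2 - 10 z^3 + z^4.
roots = ferrari(24, -50, 35, -10)
```

Working with complex numbers throughout sidesteps the case distinction on the sign of Δ; when Δ is negative the real roots are reached through complex cube roots, so they come back with tiny imaginary rounding errors.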
We see that the solution of the quartic equation is reduced to the successive solution of equations of lower degree. Also this method became known to the algebraists by the mediation of Cardano’s treatise Ars Magna. We remark that the solution of the quartic equation can readily be given a geometric interpretation. Indeed, taking y = x2 the general equation can be written as y 2 + a3 xy + a2 y + a1 x + a0 = 0. The original algebraic problem is now reduced to finding the equations in the xy-plane for the intersections of two-dimensional second order curves.38 A strong push forward for further development of the theory of equations was given by the plan of Descartes, according to which algebra should rise to primacy in mathematics, to be an adequate mean in the posing and in the study of geometric problems. However, after the creation of calculus, the attention of the mathematicians moved in a quite different direction, but not for very long. The successful solution for n = 2, 3, 4 had put on the agenda the finding of a solution formula for the fifth order (or quintic) equation. Nobody had (on the base of induction) any doubts about that such a formula would be found sooner or later. Among others mathematician such as R. Descartes, G.-W. Leibniz, E. Bézout and L. Euler worked on this. The later found an approach differing from Ferrari’s method for reducing the quartic equation to cubic equations, and he had also the idea to apply this in the quintic case. Likewise J.-L. Lagrange made an attempt to find a solution formula for the quintic equation. But not even the efforts of all these famous mathematicians did the desired result. This arose doubt about the correctness of the position of the problem, and one began to find a proof of the solvability of the quintic equation a priori, that is, without finding directly the solution formula. 38Making the substitution y = x2 in the equation r + qx + px2 + x4 = 0, we can give it the shape
$$x^2 + y^2 + qx + (p - 1)y + r = 0.$$
This is the equation of a circle with center $\left(-\frac{q}{2}, -\frac{p-1}{2}\right)$ and radius
$$R = \sqrt{\frac{q^2}{4} + \left(\frac{p-1}{2}\right)^2 - r}.$$
Thus it is possible to find the real solutions of the general quartic equation with the aid of a circle, from the graph of the parabola $y = x^2$.
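The construction in footnote 38 can be checked numerically. The following is a minimal sketch, not part of the original text: the function name `circle_for_quartic` and the sample quartic $x^4 - 5x^2 + 4 = 0$ (real roots ±1, ±2) are assumptions chosen for illustration. It confirms that each point $(x, x^2)$ with x a real root lies on the circle.

```python
import math

def circle_for_quartic(p, q, r):
    """Center and squared radius of the circle x^2 + y^2 + q x + (p - 1) y + r = 0."""
    center = (-q / 2, -(p - 1) / 2)
    radius_sq = q * q / 4 + (p - 1) ** 2 / 4 - r
    return center, radius_sq

# Hypothetical example: x^4 - 5x^2 + 4 = 0, i.e. p = -5, q = 0, r = 4.
p, q, r = -5.0, 0.0, 4.0
(cx, cy), radius_sq = circle_for_quartic(p, q, r)

for x in (-2.0, -1.0, 1.0, 2.0):
    y = x * x  # the point (x, y) lies on the parabola y = x^2
    dist_sq = (x - cx) ** 2 + (y - cy) ** 2
    print(x, math.isclose(dist_sq, radius_sq))  # True for each real root
```

Here the circle has center (0, 3) and squared radius 5, and all four roots are recovered as abscissae of its intersections with the parabola.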
3. The history of solving equations
3.2. The plan of Lagrange
So Alice was considering in her own mind (as well as she could, for the hot day made her feel very sleepy and stupid), whether the pleasure of making a daisy-chain would be worth the trouble of getting up and picking the daisies
Lewis Carroll, Alice in Wonderland
5. Of extraordinary importance for the development of the theory of equations was the appearance, in 1770–1771, of Lagrange's memoir “Réflexions sur la résolution algébrique des équations”. It consists of four parts. In the first three the author analyzes all the methods then known for solving equations of the third and fourth degree, and likewise some equations of higher degree; the fourth is devoted to the consequences of this analysis. Lagrange succeeded in finding a general principle behind these solution methods. It turns out that in all the existing methods one has to solve certain auxiliary equations, whose coefficients are expressed rationally in the coefficients of the initial equation. The solutions of these auxiliary equations are the values of certain rational functions whose arguments are the solutions of the initial equation. Here the degree of the auxiliary equation is determined not by the shape of this function but by the number of values it takes under all possible rearrangements (or substitutions) of its arguments, that is, of the solutions of the initial equation. Lagrange reached the conclusion that finding a solution formula for an algebraic equation reduces to finding rational functions of the solutions of the equation that take the fewest possible values. If the degree of the resulting auxiliary equation is lower than the degree of the initial equation, then the solution of the equation is reduced to solving an equation of lower degree. In this way one can, under certain conditions, arrive at a solution of the given equation by iterating the procedure. This is Lagrange's plan in broad outline.
Let us get acquainted with its details.³⁹ Let there be given the general equation $a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$; its coefficients are variables which may assume arbitrary complex values and are algebraically independent over the field of complex numbers $\mathbb{C}$ (that is, they do not satisfy any algebraic equation with complex coefficients). Its solutions will be denoted $x_1, \ldots, x_n$. The latter may also be viewed as independent variables over $\mathbb{C}$, since by Viète's formulae each set of values of the $x_i$ corresponds to fixed values of the $a_i$ and vice versa. Let us consider a rational expression $\varphi(x_1, \ldots, x_n)$ in the solutions $x_i$ of the equation $f(x) = 0$, with coefficients in a transcendental extension⁴⁰ $\mathbb{C}(a_1, \ldots, a_n)/\mathbb{C}$. Let

³⁹ In explaining Lagrange's plan we use several notions about groups and fields, so this subsection may cause some Readers certain difficulties. In that case we advise them to consult the book [5]. Translator's note. Cf. Section 1, footnote 12.
⁴⁰ The elements of the field $\mathbb{C}(a_1, \ldots, a_n)$ are all quotients $\frac{f(a_1, \ldots, a_n)}{g(a_1, \ldots, a_n)}$ with $g \not\equiv 0$; here f and g are polynomials in $a_1, \ldots, a_n$ with complex coefficients. These variables are treated as independent quantities over the field $\mathbb{C}$.
CHAPTER VI. POPULARIZATION OF MATHEMATICS
us permute the solutions $x_i$ among themselves in the expression $\varphi(x_1, \ldots, x_n)$ (i.e. $x_i \to x_{\sigma(i)}$, $\sigma \in S_n$) and turn our attention to the case when $\varphi$ does not change, that is, $\varphi(x_1, \ldots, x_n) = \varphi(x_{\sigma(1)}, \ldots, x_{\sigma(n)})$. For example, $\varphi = x_1x_2 + x_3x_4$ does not change if we apply to it the substitution⁴¹
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 2 & 1 \end{pmatrix}.$$
It is easy to see that all the substitutions which do not alter a given rational expression $\varphi(x_1, \ldots, x_n)$ form a group Φ. For example, in the case of $\varphi = x_1x_2 + x_3x_4$ this group is the following one of order 8:
$$\Phi = \left\{
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 3 & 4 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 2 & 1 \end{pmatrix},
\begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 1 & 2 \end{pmatrix}
\right\}.$$
By a direct check, the Reader can verify that each substitution in $S_4$ which lies outside $\Phi \subset S_4$ changes the expression $\varphi(x_1, \ldots, x_4)$.

DEFINITION 3.2. If a rational expression $\varphi(x_1, \ldots, x_n)$ does not change under any element of a given substitution group Φ ($\Phi \subset S_n$), but does change under every substitution in $S_n$ not belonging to the subgroup Φ, then we say that $\varphi$ belongs to the group Φ.

For example, the linear expression $\omega = \alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n$, where the $\alpha_i$ are all distinct, belongs to the trivial group $\Phi = (e)$. Let the rational expression $\varphi$ belong to the subgroup $\Phi \subset S_n$. Taking an arbitrary subgroup G of $S_n$ with $G \supset \Phi$ and applying to $\varphi$ all substitutions $x_i \to x_{\sigma(i)}$, $\sigma \in G$, we obtain a collection of distinct expressions $\varphi, \varphi_1, \ldots, \varphi_{s-1}$, which we call the G-co-expressions of $\varphi$. Taking for example $G = S_n$ (with n = 4), we obtain for $\varphi = x_1x_2 + x_3x_4$ the following G-co-expressions:
$$\varphi = x_1x_2 + x_3x_4, \qquad \varphi_1 = x_1x_3 + x_2x_4, \qquad \varphi_2 = x_1x_4 + x_2x_3.$$
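The example above can be checked by brute force over all 24 elements of $S_4$. The sketch below is an illustration under stated assumptions, not part of the original text: the expression $x_1x_2 + x_3x_4$ is encoded as the set of index pairs {{1,2},{3,4}}, and the helper name `apply` is hypothetical. It computes the group Φ of substitutions fixing $\varphi$ and the set of $S_4$-co-expressions.

```python
from itertools import permutations

def apply(sigma, expression):
    """Apply x_i -> x_sigma(i) to an expression stored as a set of index pairs."""
    return frozenset(frozenset(sigma[i] for i in pair) for pair in expression)

# phi = x1*x2 + x3*x4, encoded by its two monomials {1,2} and {3,4}.
phi = frozenset({frozenset({1, 2}), frozenset({3, 4})})

# All substitutions of S4, each stored as a dict i -> sigma(i).
s4 = [dict(zip((1, 2, 3, 4), image)) for image in permutations((1, 2, 3, 4))]

stabilizer = [s for s in s4 if apply(s, phi) == phi]  # the group Phi
orbit = {apply(s, phi) for s in s4}                   # the S4-co-expressions

print(len(stabilizer))  # 8, the order of Phi
print(len(orbit))       # 3, the index (S4 : Phi)
```

The product 8 · 3 = 24 recovers the order of $S_4$, in agreement with the index formula of footnote 42.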
Equations whose solutions are co-expressions of a rational expression in the solutions of the given general equation will be called auxiliary equations. It turns out that the G-co-expressions of a rational expression $\varphi$ belonging to Φ are solutions of an auxiliary equation of degree $(G : \Phi)$;⁴² moreover, the coefficients of the auxiliary equation are “G-conservative”, that is, they do not change under the action of any element of G. The plan is based on the following theorem.

⁴¹ Applying the substitution
$$\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}$$
must be interpreted in the “natural way”: in the expression $\varphi(x_1, \ldots, x_n)$ every index i has to be replaced by $\sigma(i)$.
⁴² Let the numbers of elements of the finite groups G and Φ be |G| and |Φ| respectively. Then the index of Φ in G is the number $(G : \Phi) = \frac{|G|}{|\Phi|}$.
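[Fig. 11. A tower of subgroups $S_n \supset G \supset H \supset \cdots \supset K \supset E = (e)$; beside each group stands a rational expression belonging to it: $a_i$, $\varphi$, $\psi$, …, $\chi$, $\omega$, respectively.]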
THEOREM 3.3 (Lagrange). Let there be given two rational expressions $\varphi(x_1, \ldots, x_n)$ and $\psi(x_1, \ldots, x_n)$ of the solutions $x_i$ of the general equation. If the expression $\varphi$ does not change under all those rearrangements which do not change the expression $\psi$, then $\varphi$ can be expressed rationally in terms of $\psi$ and the coefficients of the general equation.

How can we use this result? Let there be given a “tower” of subgroups of $S_n$ (each “upper floor” contains the lower one as a subgroup), and let there stand to the right of each group a rational expression belonging to it (see Figure 11). Here the expression $\varphi$ is a solution of an auxiliary equation of degree $(S_n : G) = n_1$ whose coefficients are rationally expressible in terms of the coefficients $a_i$ of the general equation. The expression $\psi$ is a solution of an auxiliary equation of degree $(G : H) = n_2$ whose coefficients are rationally expressible in terms of the coefficients $a_i$ and the expression $\varphi$, etc. Finally, $\omega$ is a solution of an equation of degree $(K : E) = |K| = n_k$ whose coefficients are rationally expressible in terms of the coefficients $a_i$ and the expressions $\varphi, \psi, \ldots, \chi$. On the other hand, as the expression $\tau = x_1$ belongs to the group $\Phi = S_{n-1} \subset S_n$ (Φ acts on all $x_i$, $i \neq 1$), Lagrange's theorem allows us to express $\tau$ in terms of the coefficients $a_i$ and any rational expression in the $x_i$ which belongs to Φ or to one of its subgroups; as the linear expression $\omega = \alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_nx_n$ (the $\alpha_i \in \mathbb{C}$ are distinct) belongs to the trivial group E and $E \subset \Phi$, we can take $\omega$ for this expression. We see that if all $n_i < n$ we get a chain of auxiliary equations of lower degree, the solution of which ought to lead to a solution formula. This was precisely the guiding idea of Lagrange's plan. The Reader has probably observed that, to make Lagrange's plan work, it is especially important to find subgroups “of as small an index as possible” or, what is the same, “of as large an order as possible”.
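For n = 4 the first floor of such a tower can be made explicit. The sketch below is an illustration under stated assumptions, not taken from the original text: the sample roots 1, 2, 3, 4 are hypothetical, and the coefficient formulas are those of the classical resolvent cubic for $x^4 + a_3x^3 + a_2x^2 + a_1x + a_0$. It checks that the three co-expressions of $\varphi = x_1x_2 + x_3x_4$ satisfy a cubic whose coefficients are expressed through the $a_i$ alone, so that the auxiliary equation has degree 3 < 4.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """k-th elementary symmetric polynomial of the given values."""
    return sum(prod(c) for c in combinations(values, k))

# Hypothetical roots, chosen only for illustration.
roots = [Fraction(1), Fraction(2), Fraction(3), Fraction(4)]
x1, x2, x3, x4 = roots

# Coefficients of x^4 + a3 x^3 + a2 x^2 + a1 x + a0 via Viete's formulae.
a3 = -elementary_symmetric(roots, 1)
a2 = elementary_symmetric(roots, 2)
a1 = -elementary_symmetric(roots, 3)
a0 = elementary_symmetric(roots, 4)

# The three S4-co-expressions of phi = x1 x2 + x3 x4.
phis = [x1 * x2 + x3 * x4, x1 * x3 + x2 * x4, x1 * x4 + x2 * x3]

# Coefficients of (y - phi)(y - phi1)(y - phi2) = y^3 - s1 y^2 + s2 y - s3,
# computed from the roots ...
s1 = elementary_symmetric(phis, 1)
s2 = elementary_symmetric(phis, 2)
s3 = elementary_symmetric(phis, 3)

# ... agree with expressions in the coefficients a0, ..., a3 alone.
print(s1 == a2)
print(s2 == a1 * a3 - 4 * a0)
print(s3 == a0 * a3 ** 2 - 4 * a0 * a2 + a1 ** 2)
```

All three comparisons print True for this example; exact rational arithmetic is used so that the equalities are not blurred by rounding.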
Unfortunately, for n ≥ 5 Lagrange's plan is in reality not applicable. P. Ruffini already remarked that if for a subgroup G of $S_n$ we have $(S_n : G) > 2$, then this index must be ≥ 5. A.-L. Cauchy generalized this result and showed that the index
of a substitution group (that is, a subgroup of $S_n$) cannot be, at the same time, greater than two and less than the largest prime number not exceeding n. This means that for prime n there does not exist in $S_n$ a subgroup of index i with 2 < i < n. Cauchy tried to extend the result further and, indeed, it turned out to be true for n = 6 also. Finally, J. Bertrand managed to prove the theorem: if n ≥ 5, then $S_n$ has no subgroup whose index lies strictly between 2 and n. We see now why Lagrange's plan did not lead to the expected result.
3.3. On the regular polygon of 17 sides and on the fundamental theorem of algebra
The Caterpillar and Alice looked at each other for some time in silence: at last the Caterpillar took the hookah out of its mouth, and addressed her in a languid, sleepy voice. “Who are you?” said the Caterpillar. This was not an encouraging opening for a conversation. Alice replied, rather shyly, “I – I hardly know, sir, just at present – at least I know who I was when I got up this morning, but I think I must have been changed several times since then.” “What do you mean by that?” said the Caterpillar sternly. “Explain yourself!” “I can’t explain myself, I’m afraid, sir,” said Alice, “because I’m not myself, you see.”
Lewis Carroll, Alice in Wonderland
6. Simultaneously with Lagrange's memoir there appeared, in 1771, a paper by A. Vandermonde, Mémoire sur la résolution des équations (Memoir on the solution of equations). Its author reached basically the same results as Lagrange, although without attaining the clarity of the latter. In his study of the equation $x^p - 1 = 0$ (p a prime) Vandermonde obtained a series of results which, in 1796, enabled Gauss to solve completely the problem of constructing a regular polygon of p sides. That solving the equation $x^p - 1 = 0$ corresponds to constructing a regular p-gon was clear already to A. de Moivre, as is seen from the formula
$$\varepsilon_k = \cos\frac{2\pi k}{p} + i\sin\frac{2\pi k}{p}, \qquad k = 0, 1, \ldots, p - 1.$$
The technique for dividing the circle and constructing the regular polygons of three, five or fifteen sides is generally known. Already Euclid obtained from this, by repeatedly doubling the number of sides, the four series (99)–(102) of “constructible” regular polygons:
(99) regular $2^n$-gon
(100) regular $3 \cdot 2^n$-gon
(101) regular $5 \cdot 2^n$-gon
(102) regular $3 \cdot 5 \cdot 2^n$-gon
for every n = 0, 1, 2, 3, …
One might have thought that these were the only series of constructible regular polygons. However, Gauss proved that the regular polygon of 17 sides is also constructible. More exactly, he established the following theorem:

THEOREM 3.4. The regular p-gon is constructible with ruler and compass if and only if one of the following conditions is fulfilled: (α) p is a prime of the form $p = 2^n + 1$; (β) $p = 2^k$; (γ) the number p is the product of finitely many pairwise relatively prime numbers of the previous two types.

For the Reader who hears about this theorem for the first time, the meaning of these numbers will probably be somewhat mysterious. The author intends to write in the pages of our journal about Galois' criterion for the solvability of algebraic equations.⁴³ Once one has learnt this criterion and bears in mind that the geometric question is equivalent to solving the equation $x^{p-1} + x^{p-2} + \cdots + x + 1 = 0$ in terms of square roots, the mystery of these numbers disappears. In connection with Gauss's theorem we should note that the number $p = 2^n + 1$ cannot be a prime unless n is itself of the form $2^k$ for an integer k. Indeed, writing $n = 2^k \cdot m$ with m odd and setting $2^{2^k} = \alpha$, we obtain, using that m is odd, the identity
$$p = 2^n + 1 = 2^{2^k \cdot m} + 1 = \alpha^m + 1 = (\alpha + 1)(\alpha^{m-1} - \alpha^{m-2} + \cdots - \alpha + 1).$$
This shows that p is not a prime for any $m \neq 1$. The numbers $p = 2^{2^k} + 1$, k = 0, 1, 2, 3, 4, are primes, so according to our theorem the regular polygons with 3, 5, 17, 257 and 65537 sides are constructible (see the paper [13], p. 86). Already for k = 5 one has a composite number (L. Euler found the factor 641). There arises the question whether the sequence $2^{2^k} + 1$ contains a finite or an infinite number of primes. G. Eisenstein asserted that they are only finitely many (… and perhaps he had indeed reasons for such an assertion).

7. Let us next have a look at the number of solutions of an algebraic equation; this question is intimately tied to the creation of the notion of complex number. Already Apollonius knew that two conic sections cannot have more than four points of intersection, i.e. the corresponding 4th-degree equation has at most 4 solutions. Cardano began to operate, in the solution of the equation $x^2 + 40 = 10x$, with the numbers $5 + \sqrt{-15}$ and $5 - \sqrt{-15}$, calling them “sophistic numbers”, and, upon multiplying them, he verified that they were correct solutions. He was the first to regard these numbers as “lawful”, and so he arrived at the conclusion that the cubic equation has three solutions and the 4th-order equation four. The first ever to state explicitly that the n-th order equation has n solutions was P. Roth[e] (1608); he was one of the Nuremberg “reckoners”.⁴⁴ These ideas were the foundation of a conjecture with the same formulation by the eminent algebraist A. Girard (1629); in 1746 J. d'Alembert made an effort to prove

⁴³ Translator's note. See [4], also Section 6 of this book.
⁴⁴ Translator's note. Peter Rothe, d. 1617.
this conjecture and, although his attempt failed, he had found a valuable idea. In 1749 L. Euler tried to verify Girard's conjecture, and after him likewise J. L. Lagrange. The “sophistic quantities” still had to go through a long evolution, in which we can note names such as J. Wallis (1673), C. Wessel (1798) and J. Argand (1813), before one reached a clear picture of the complex numbers and their geometric representation. The first to make serious use of complex numbers was C. F. Gauss, who in this way considerably enriched the apparatus of mathematical analysis. In algebra Gauss managed to prove the general case of Girard's conjecture. This was done in his dissertation (the years 1797–1799). Building on d'Alembert's work, Gauss proved the following result.

THEOREM 3.5. For each n ≥ 1 and arbitrary complex numbers $a_0, a_1, \ldots, a_n$ with $a_n \neq 0$, the equation $p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$ has a solution: there exists a complex number α such that $p(\alpha) = 0$.

Girard's conjecture follows at once from this so-called fundamental theorem of algebra and a simple fact (already known to Cardano): the number α is a solution of the equation p(x) = 0 if and only if the polynomial p(x) is divisible by the linear polynomial x − α. For young Readers it will probably be interesting and useful to learn the proof of this theorem. We present Gauss' proof in a variant due to F. Klein and H. Weber.

PROOF. (1) Reduction. It turns out that it suffices to prove the theorem for polynomials with real coefficients. Indeed, together with the polynomial
$$f(z) = a_0 + a_1z + a_2z^2 + \cdots + a_nz^n, \qquad z \in \mathbb{C},$$
let us consider also the polynomial
$$\bar f(z) = \bar a_0 + \bar a_1z + \bar a_2z^2 + \cdots + \bar a_nz^n,$$
where the symbol $\bar a_i$ denotes the complex conjugate of the number $a_i$. Let us form the polynomial $F(z) = f(z) \cdot \bar f(z)$; then all its coefficients are real. Indeed, the polynomial F has the coefficients $A_k = \sum_{i+j=k} a_i\bar a_j$, and, as
$$\bar A_k = \sum_{i+j=k} \bar a_i a_j = \sum_{i+j=k} a_j\bar a_i = A_k,$$
we have $A_k \in \mathbb{R}$ (this is our notation for the domain of real numbers). Now let $F(\alpha) = 0$ for some $\alpha \in \mathbb{C}$. Then at least one of the relations $f(\alpha) = 0$ or $\bar f(\alpha) = 0$ must be fulfilled. If $f(\alpha) \neq 0$, then $\bar f(\alpha) = 0$, and taking complex conjugates gives $f(\bar\alpha) = 0$; in either case f has a root. So the reduction is carried out, and in what follows we may assume that all $a_i \in \mathbb{R}$.

(2) Introduction of geometric elements. Let z = x + iy. Using Newton's binomial theorem we obtain $f(z) = f(x + iy) = u(x, y) + iv(x, y)$. Each point of intersection in the xy-plane of the curves U: u(x, y) = 0 and V: v(x, y) = 0 gives a solution of our equation f(z) = 0. Therefore we have to show that these curves do indeed intersect at least once. To this end, we shall have a closer look at the behavior of these curves in the xy-plane. Viewing the polynomials u = u(x, y) and v = v(x, y) as functions of two variables, we note their continuity. Therefore we have:
1° if v(P) > 0 or v(P) < 0 at some point P of the xy-plane, then the same inequality holds also in a sufficiently small neighborhood of P. Of course, the same is true for the function u;
2° if $u(P_1) > 0$ and $u(P_2) < 0$, then on each continuous path connecting $P_1$ and $P_2$ there exists a point Q such that u(Q) = 0. This holds also for v.

[Fig. 12. The point z = x + iy in the xy-plane with its polar coordinates (r, ϕ).]
Taking polar coordinates $(r, \varphi)$ in the xy-plane (cf. Figure 12), we obtain $z = x + iy = r(\cos\varphi + i\sin\varphi)$ and $z^k = r^k(\cos k\varphi + i\sin k\varphi)$ for all k = 1, 2, …. Dividing f by its leading coefficient, we may assume $a_n = 1$; then u and v can be expressed as
$$u = a_0 + a_1r\cos\varphi + a_2r^2\cos 2\varphi + \cdots + a_{n-1}r^{n-1}\cos(n-1)\varphi + r^n\cos n\varphi,$$
$$v = a_1r\sin\varphi + a_2r^2\sin 2\varphi + \cdots + a_{n-1}r^{n-1}\sin(n-1)\varphi + r^n\sin n\varphi.$$
Writing
$$v = r^n\left[\sin n\varphi + \frac{a_{n-1}}{r}\sin(n-1)\varphi + \cdots\right],$$
it is easy to see that for sufficiently large r the function v(x, y) has on each circle (r) (in this way we denote the circle with radius r and center at the origin) the same sign as the expression $\sin n\varphi$, except near the zeros of the latter; the behavior of this last function is, however, known. We denote by $P_0, P_1, \ldots, P_{2n-1}$ the points of the circle (r) where $\varphi = 0, \frac{\pi}{n}, \frac{2\pi}{n}, \ldots, \frac{(2n-1)\pi}{n}$. In this way we obtain 2n arcs $(P_0; P_1), (P_1; P_2), \ldots, (P_{2n-1}; P_0)$, on which $\sin n\varphi$ is alternately positive and negative (cf. Figure 13).

[Fig. 13. The points $P_0, P_1, \ldots, P_9$ on the circle (r) in the xy-plane (the case n = 5), with the alternating signs + and − of $\sin n\varphi$ marked on the arcs between them.]

Outside the neighborhoods $\left(\frac{\pi}{n} \cdot k - \eta, \frac{\pi}{n} \cdot k + \eta\right)$ of the points $P_k$, where $\eta < \frac{\pi}{2n}$, v is likewise alternately positive and negative. Therefore the function v also takes the value 0 in the neighborhood of each point $P_k$ (soon we shall see that we have v = 0 exactly 2n times on the circle (r)). In an analogous way we see that, if the radius r is sufficiently large, the sign of u(x, y) is determined by the sign of $\cos n\varphi$, so that we have u > 0 at the points $P_0, P_2, \ldots, P_{2n-2}$ and in their neighborhoods, whereas at the points $P_1, P_3, P_5, \ldots, P_{2n-1}$ and in their neighborhoods we have u < 0 (here we used the continuity of u(x, y)!).

Introducing the variable $t = \tan\frac{\varphi}{2}$, we see that in view of the relations $\cos\varphi = \frac{1 - t^2}{1 + t^2}$ and $\sin\varphi = \frac{2t}{1 + t^2}$ one has $z = r\,\frac{(1 + it)^2}{1 + t^2}$. Using Newton's binomial formula we obtain
$$u = \frac{\Phi(r, t)}{(1 + t^2)^n} \quad\text{and}\quad v = \frac{\Psi(r, t)}{(1 + t^2)^n},$$
where $\deg_t \Phi \le 2n$, $\deg_t \Psi \le 2n - 1$, $\deg_r \Phi \le n$, $\deg_r \Psi \le n$. Here $\deg_t \Phi$ denotes the degree of the polynomial Φ with respect to t, and $\deg_r \Phi$ its degree with respect to r.

(3) Topological considerations. There are only finitely many circles (r) on which $\Phi \equiv 0$ or $\Psi \equiv 0$ (here the sign ≡ means vanishing identically in the variable t). Indeed, in view of the inequalities $\deg_r \Phi \le n$ and $\deg_r \Psi \le n$ we have in either case an algebraic equation in r of degree at most n; and such an equation cannot have more than n solutions (do not confuse this with the question of their existence, which we will prove only later!). The proof of this fact (by contradiction) is trivial. It is likewise easy to see that on the circles (r) on which $\Phi \not\equiv 0$ and $\Psi \not\equiv 0$ (by what was said above, there exists an $r_0$ such that these conditions are fulfilled for all $r > r_0$) the functions u and v vanish not more than 2n times, in view of the inequalities $\deg_t \Phi \le 2n$ and $\deg_t \Psi \le 2n$. Combining both considerations, we see that neither u nor v can vanish on a region of positive area. In other words, the plane falls into domains of two types (in the ones v > 0, in the others v < 0), which are separated by the curve V: v = 0. The following considerations are illustrated by Figure 14.⁴⁵
As the curve v = 0 behaves asymptotically like the curve $r^n\sin n\varphi = 0$ (i.e. for sufficiently large values of r the curves are close to each other), we know that for large r the domains v > 0 lie in the sectors
$$\left\langle \frac{2k\pi}{n}, \frac{(2k+1)\pi}{n} \right\rangle,$$
extending (in view of the continuity of v) to the interior of the circle (r).

[Fig. 14. The curves u = 0 and v = 0 in the xy-plane.]

Let us now have a look at the parts of the domains v > 0 inside the circle (r), moving along the contour V: v = 0 in such a way that these domains all the time are situated to the left. The contour of the domains v > 0 can then behave inside (r) in a rather varied way: it can return to the same sector (see the sector $(P_2; P_3)$); it can move into another sector of the type $\left\langle \frac{2l\pi}{n}, \frac{(2l+1)\pi}{n} \right\rangle$; or it can bifurcate, each branch then moving into a sector of the aforementioned type (see Figure 14). It is however easy to see that the contour V: v = 0 “departs” from the circle (r) into its interior in the neighborhood of a point of division $P_k$ of odd index (where, however, u < 0) and “arrives” back on the circle (r) in the neighborhood of a point of division $P_{k'}$ of even index (where, however, u > 0).⁴⁶ Because of the continuity of the function u = u(x, y) there must now exist a point Q on the contour between the points $P_k$ and $P_{k'}$ such that u(Q) = 0. Since also v(Q) = 0, the point Q gives a solution of f(z) = 0. Quod erat demonstrandum. □

In his dissertation Gauss put forward the conjecture that equations of higher order (n ≥ 5) do not have solutions in terms of radicals. He drew attention to this also in his famous monograph Disquisitiones arithmeticae (Studies of Arithmetic), which appeared in the years 1797–1801. In fact, it is possible to find, for each natural number n ≥ 5, an n-th order algebraic equation (even with integer coefficients) whose solutions are not expressible in terms of radicals. At the same time, by the theorem just established, each such equation has n complex solutions. From here it is also clear that the set of all algebraic numbers is much larger than the set of those which can be “written down” by radicals.

⁴⁵ Translator's note. This is not the figure given by Kaljulaid, which is defective, but is taken from the classical book of Felix Klein [6, pp. 111–112]. It concerns the cubic polynomial $f(z) = z^3 - 1$, so that $u = r^3\cos 3\varphi - 1$, $v = r^3\sin 3\varphi$. Note that, quite generally, if f is a polynomial with real coefficients, as assumed by Kaljulaid in this paper, the zeros are symmetrically situated about the real axis. The level lines depicted in Figure 14, and in Klein's book, display this symmetry, but, unfortunately, not those in the figure originally given by Kaljulaid.
⁴⁶ All the time we move in such a way that the domain v > 0 stays to the left.
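The sign pattern on a large circle, on which step (2) of the proof rests, can be observed numerically. The sketch below is only an illustration under stated assumptions: the polynomial $z^3 + z^2 - 3z + 2$ and the radius r = 10 are hypothetical choices, and the helper names `f` and `sign` are not from the original text. It evaluates u at the division points $P_k$ and v at the midpoints of the arcs between them.

```python
import cmath
import math

def f(z, coeffs):
    """Evaluate a0 + a1 z + ... + an z^n (coeffs = [a0, ..., an]) by Horner's rule."""
    result = 0
    for a in reversed(coeffs):
        result = result * z + a
    return result

def sign(value):
    return 1 if value > 0 else -1

# Hypothetical example with real coefficients and a_n = 1: z^3 + z^2 - 3z + 2.
coeffs = [2.0, -3.0, 1.0, 1.0]
n = len(coeffs) - 1
r = 10.0  # assumed to be "sufficiently large" for this polynomial

# u = Re f at the division points P_k (phi = k*pi/n).
u_signs = [sign(f(cmath.rect(r, k * math.pi / n), coeffs).real)
           for k in range(2 * n)]

# v = Im f at the midpoints of the arcs (P_k; P_{k+1}).
v_signs = [sign(f(cmath.rect(r, (k + 0.5) * math.pi / n), coeffs).imag)
           for k in range(2 * n)]

print(u_signs)  # [1, -1, 1, -1, 1, -1]
print(v_signs)  # [1, -1, 1, -1, 1, -1]
```

The alternation of u at the $P_k$ follows $\cos n\varphi$ and that of v on the arcs follows $\sin n\varphi$, because the leading term $r^n$ dominates the lower-order terms at this radius.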
In algebraic geometry one often encounters equations whose coefficients are rational functions; the solutions of such equations are algebraic functions.⁴⁷ Because of this one considers also algebraic equations over fields other than the field of complex numbers (of course, there are also many other reasons for this). The fundamental theorem of algebra is not applicable to such fields, and is replaced by the following statement: Let P be a field and $f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n = 0$ an algebraic equation over P, i.e. all $a_i \in P$. The equation f(x) = 0 cannot have more than n solutions either in P or in any of its extension fields. On the other hand, there exists an extension $F \supseteq P$ in which the equation f(x) = 0 has precisely n solutions (counting repetitions).
3.4. On the theorem of Ruffini-Abel
The goal of history is by no means merely to satisfy fruitless curiosity; learning the past must clarify the future.
P. Tannery
9. The Italian mathematician Paolo Ruffini tried in 1799 to prove that the general equation of higher order cannot be solved “in the finite extended arithmetic” (see Section 3.1). Although his proof turned out to have gaps, and despite repeated attempts (in the years 1801, 1802, 1806, 1813) Ruffini did not succeed in completing it, these papers contained many new ideas and facts that really constituted a preparatory step in the establishment of group theory. Regrettably, Ruffini's textbook Teoria generale delle equazioni, in cui si dimostra impossibile la soluzione algebraica delle equazioni generali di grado superiore al quarto (A general theory of equations, in which one demonstrates the impossibility of the algebraic solution of general equations of degree higher than four) became little known outside Italy. However, A.-L. Cauchy became acquainted with this book. He became the first (around 1815, that is, some time later) to develop the terminology and notation for the theory of substitution groups. Having obtained some basic facts, he wrote a memoir Sur le nombre des valeurs qu'une fonction peut acquérir lorsqu'on y permute de toutes les manières possibles les quantités qu'elle renferme (On the number of values which a function can acquire when one permutes in all possible manners the quantities which it contains), where he set out the basis of an entire theory. We see that by the beginning of the 19th century Lagrange's original ideas had undergone a whole range of developments: the foundations of the theory of substitution groups had been laid. However, the basic principles of field theory were still missing.

10. Niels Henrik Abel⁴⁸ busied himself with the solution of the 5th-degree equation already in his youth. Once (in 1812⁴⁹) he thought that he had solved the problem, but later he had doubts about the validity of his proof. Intense research led Abel to a correct proof

⁴⁷ See [8], as well as the additional references in Section 1, footnote 9.
⁴⁸ One can read about Abel's life in the magnificent book Ø. Ore, The remarkable life of N. H. Abel. Several editions in Western languages (German, Norwegian, English). Russian translation: Moscow, 1961.
⁴⁹ Translator's note. Born in 1802, he was about ten then; Abel died prematurely at the age of twenty-six in 1829.
(1824), and in 1826 there appeared in the Journal für die reine und angewandte Mathematik (Journal for pure and applied mathematics) his paper Démonstration de l'impossibilité de la résolution algébrique des équations générales qui passent le quatrième degré (Proof of the impossibility of the algebraic solution of general equations whose degree exceeds four). Abel formulated the problem as follows. A function v of finitely many variables $x_1, \ldots, x_n$ is called algebraic if v can be expressed in terms of $x_1, \ldots, x_n$ in the “finitely extended arithmetic” (here Abel considered only roots with a prime exponent). The solutions of an equation are treated as algebraic functions of its coefficients which, when substituted for the unknown, satisfy the equation; Abel understands solving the equation as finding the general form of such algebraic functions.

Although the Abel(-Ruffini) theorem states that the general equation of higher order is not solvable in radicals, its proof does not say anything about the solvability in radicals of concrete algebraic equations (with numerical coefficients). Examples show that there exist whole series of equations solvable by radicals: $x^n + a = 0$, $(x^n + a)^m + b = 0$, $x^{2n} + px^n + q = 0$. A more complicated and more interesting example of an n-th order equation which is solved in terms of radicals is given by
$$\sum_{k=0}^{[n/2]} (-1)^k \binom{n}{2k} x^{n-2k}(1 - x^2)^k = a, \qquad a \in \mathbb{R},\ |a| < 1,\ n \in \mathbb{N};$$
[x] denoting the integer part of x. Using the formula $\cos n\alpha + i\sin n\alpha = (\cos\alpha + i\sin\alpha)^n$ along with Newton's binomial theorem, we get
$$\cos n\alpha = \cos^n\alpha - \binom{n}{2}\cos^{n-2}\alpha\,\sin^2\alpha + \cdots = \cos^n\alpha - \binom{n}{2}\cos^{n-2}\alpha\,(1 - \cos^2\alpha) + \cdots$$
Next, writing $\cos n\alpha = a$ and $\cos\alpha = x$, we see that the equation in question is a relation for finding the cosine of an angle when the cosine of the n-fold angle is known. The fact that this equation is solvable in radicals becomes manifest from the following computation:
$$(\cos\alpha + i\sin\alpha)^n = \cos n\alpha + i\sin n\alpha = a + i\sqrt{1 - a^2},$$
which gives
$$\cos\alpha + i\sin\alpha = \sqrt[n]{a + i\sqrt{1 - a^2}}.$$
In an analogous way one finds that
$$\cos\alpha - i\sin\alpha = \sqrt[n]{a - i\sqrt{1 - a^2}}.$$
Thus we obtain
$$x = \cos\alpha = \frac{1}{2}\left(\sqrt[n]{a + i\sqrt{1 - a^2}} + \sqrt[n]{a - i\sqrt{1 - a^2}}\right).$$
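The radical formula just obtained can be tested numerically. The following is a minimal sketch under stated assumptions, not part of the original text: the names `lhs` and `radical_solution` are hypothetical, the n-th roots are taken as principal values (as Python's complex power does), and the sum runs over k = 0, …, ⌊n/2⌋, the full binomial expansion of cos nα.

```python
from math import comb, isclose, sqrt

def lhs(x, n):
    """Sum over k = 0..floor(n/2) of (-1)^k C(n, 2k) x^(n-2k) (1 - x^2)^k."""
    return sum((-1) ** k * comb(n, 2 * k) * x ** (n - 2 * k) * (1 - x * x) ** k
               for k in range(n // 2 + 1))

def radical_solution(a, n):
    """x = (1/2) (nth_root(a + i sqrt(1 - a^2)) + nth_root(a - i sqrt(1 - a^2)))."""
    w = complex(a, sqrt(1 - a * a))           # a point on the unit circle
    root_plus = w ** (1 / n)                  # principal n-th root
    root_minus = w.conjugate() ** (1 / n)     # its complex conjugate
    return ((root_plus + root_minus) / 2).real

# Hypothetical sample values with |a| < 1.
for n in (2, 3, 5, 7):
    for a in (-0.9, 0.0, 0.3, 0.75):
        x = radical_solution(a, n)
        print(n, a, isclose(lhs(x, n), a, abs_tol=1e-9))
```

With principal roots the two radicals are complex conjugates of each other, so their half-sum is the real number cos(α) with nα = arccos a; other choices of the n-th roots lead to the remaining solutions of the equation.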
So we see that the possibility remains that each concrete equation is solvable by radicals, but with solution formulae that differ from case to case, i.e. that there exists no general formula in radicals applicable to all n-th order equations. In the years 1826–1829 Abel worked very intensively on these questions, setting himself the problem of finding the conditions for an equation to be solvable in terms of radicals. The honor of solving this problem completely goes, however, to another extraordinary mathematician, Évariste Galois. Nevertheless, Abel managed to find half of Galois' criterion: if just one of the solutions of an irreducible algebraic equation can be expressed in radicals, then the Galois group of this equation is solvable (of course, he expressed himself in a quite different way). The results of this fruitful work were published in his Collected Works (1839) as the paper “Sur la résolution algébrique des équations” (On the algebraic solution of equations). Of major interest is likewise his paper “Mémoire sur une classe particulière d'équations résolubles algébriquement” (Memoir on a particular class of algebraically solvable equations). Here the following two theorems are established.
(1) If each solution of an equation can be expressed rationally in terms of one solution, that is, if, for example, $x_j = \Theta_j(x_1)$, where the $\Theta_j$ are rational functions of $x_1$, and if these functions satisfy the “commutativity relations” $\Theta_i(\Theta_k(x_1)) = \Theta_k(\Theta_i(x_1))$, then the equation can be solved in terms of radicals.
(2) If, of two solutions of an irreducible equation of prime degree, each can be expressed rationally in terms of the other, then this equation is solvable in radicals.
That Abel was, by 1829, already quite close to the results of Galois is shown clearly by his result (in a letter to Crelle dated October 18, 1828): if for an irreducible equation of prime degree any three of its solutions are connected with each other in such a way that any one of them is expressible in terms of the two remaining ones, then this equation is solvable in radicals.

11. The crown of this century-long line of development in algebra were the results of Évariste Galois (1831–32) [3]. Using a series of new mathematical notions (today fundamental in mathematics), Galois achieved extraordinary precision and generality in the treatment of these questions and ideas. The fact that he himself considered his main theorem a generalization of Gauss' theorem (see Theorem 3.4) shows that one of the causes of his success was the extraordinary depth with which he understood the results of Vandermonde and Gauss on the solvability of the equation $x^p - 1 = 0$ in terms of square roots. Galois must be seen as the founder of group theory, because many deep results and notions which today constitute the basis of this theory are due to him. For these and other reasons⁵⁰ his contemporaries did not at once acknowledge the results of Galois. As late as 1843 J. Liouville wrote to the Paris Academy of Sciences: “I hope that the Academy will find it of interest to learn that, among the manuscripts of Évariste Galois, I found a deep and exact solution to the following beautiful question: given an irreducible algebraic equation of prime degree, one asks if it is solvable in radicals.” In the course of the following decade there was a revival of the ideas

⁵⁰ To this contributed probably also the “structural style” of his presentation (the style of the future), its remarkable richness of content, and its level of abstraction, which posed a serious obstacle to the appreciation of Galois' results by most mathematicians of the period.
3. The history of solving equations
371
of Galois. In 1848 J. A. Serret taught, in Paris, the first course on Galois theory. The first coherent presentation in print appeared in 1852 in the form of Betti’s book “Sulla risoluzione del’equazioni algebriche” (On the solution of algebraic equation). In 1870, there appeared C. Jordan’s “Traité des substitutions et des équations algébriques” (Treatise of substitutions and algebraic equations), which constituted a magnificent commentary of Galois theory. To pursue further systematically the subsequent development of the ideas would be very hard within the frameworks of the present paper. 51 References [1]
[2] [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14]
N. Bourbaki. “Éléments d’histoire des mathématiques”. Actualités Sci. Ind., no. 1212. Masson, Paris, 1984. English translation: “Elements of the history of mathematics”. Springer-Verlag, Berlin, 1994. Russian translation: Moscow, 1963. N. Bourbaki. Théorie des ensembles XVII. Premiére partie: Les structures fondamentales de l’analyse. Actualités Sci. Ind., no. 1212. Hermann & Cie, Paris, 1954. L. Infeld. Whom the Gods Love: The Story of Évariste Galois. National Council of Teachers of Math., Reston, VA, 1948. U. Kaljulaid. On Galois theory. Math. and Our Age 20, 1975, 17–31. (see [K75a]). G. Kangro. Kõrgem algebra (Higher algebra) II. Eesti Riiklik Kirjastus, Tallinn, 1950. (Estonian.) F. Klein. Elementarmathematik vom höheren Standpunkt I. Dritte Auflage. Verlag von Julius Springer, Berlin, 1928. Russian translation: “Nauka", Moscow, 1987. M. Levin and S. Ulm. Handbook of computation methods. Valgus, Tallinn, 1966, 1977. Ü. Lumiste. Riemann as the founder of topology and the general curved space. Math. and Our Age 11, 1966, 65–76. (Estonian). Yu. I. Manin. On the solvability of the problem of construction with ruler and compass. Encyklopedia of Elementary Mathematics 4, 1963, 205–227. M. Marie. Histoire des sciences mathématiques et physiques. Tome I, De Thales à Diophante. GauthiersVillars, Paris, 1883. I. R. Shafarevich. On the solutions of equations of higher degree (the method of Sturm), Moscow, 1954. I. R. Shafarevich. Selected chapters from algebra. The teachng of mathematics IV (1), 2001, 1–34. E. Tamme. Pierre Fermat and mathematics of the 17th century. Math. and Our Age 5, 1964, 74–87. M. Vanem and E. Tamme. How people learned to solve equations. Math. and Our Age 16, 1969, 110–121.
51The Reader will find interesting material about this in N. Bourbaki’s book [1].
4. [K70] Additional remarks on groups
Comments by G. Traustason
– Que faut-il faire? dit le petit prince. – Il faut être très patient, répondit le renard. – Tu t’assoiras d’abord un peu loin de moi, comme ça, dans l’herbe. Je te regarderai du coin de l’œil et tu ne diras rien. Le langage est source de malentendus. Mais chaque jour, tu pourras t’asseoir un peu plus près . . . (Translation: – What does one have to do? asked the little prince. – One has to be very patient, answered the fox. – First you will sit down at some distance from me, like that, in the grass. I shall look at you out of the corner of my eye and you will say nothing. Language is a source of misunderstandings. But each day you may sit a little bit closer to me . . . ) A. de Saint-Exupéry, Le petit prince (The little prince)
The paper at hand may be viewed as a sequel to the group theory part of the survey Principles of Algebra by Evgeniĭ Gabovitsh52 [5]. Besides new notions (homomorphism, normal divisor, etc.) the Reader will here become acquainted with the “description” of cyclic groups; with the notion of solvability of groups and criteria for the “discovery” of this property; and with examples of solvable groups. He or she will also find the Feit–Thompson Theorem and the formulation of some of its consequences. An extended knowledge of groups is a good background for understanding Galois theory, for learning contemporary geometry, and elsewhere. The value of group theory in applications today requires no special comment. There are well-known applications in physics and chemistry (see e.g. [9] or [2]), not to speak of mathematics itself [11] or the foundations of differential and Riemannian geometry (see also e.g. [6, Chapter I]).

52 Translator’s note. See also the references provided in Section 1, footnote 12.

1. We consider an arbitrary group and one of its subgroups, usually denoting them by G and H. We can introduce in G a division into classes, taking as classes the orbits, that is, the sets $Hg = \{hg \mid g \in G \text{ fixed};\ h \in H \text{ arbitrary}\}$. For example, if the group G is the complex plane $\mathbb{C}$ with respect to addition of complex numbers and H is the imaginary axis, then the orbits are the vertical lines (cf. Figure 15). Orbits are sometimes called cosets, and the element g is called a representative of the class. The Reader will easily verify that as representative one may take an arbitrary “point” of the orbit; that distinct orbits do not intersect; that the orbit corresponding to the unit element coincides with the subgroup H; and, finally, that all the orbits together fill up the entire group.

[Fig. 15. The additive group $\mathbb{C}^{(+)}$ with the subgroup H (the imaginary axis through 0) and a coset H + g.]

Therefore, one has indeed a decomposition into classes, which is written as
$$G = H + Hg_2 + Hg_3 + \dots$$
or, compactly,
$$G = \bigcup_{g \in K} Hg, \quad \text{where } K = \{e, g_2, g_3, \dots\}$$
is a complete system of representatives, that is, a set containing precisely one representative of each class.

Let us have a look at the case when G is a finite group. The number of elements is called the order of the group and is denoted $|G|$. The number of distinct orbits is called the index of the subgroup H in the group G and is written $\operatorname{ind}_G H$ or, also, $(G : H)$.

THEOREM 4.1 (Lagrange). The order of a finite group is divisible by the order of each of its subgroups, the quotient being the index of the subgroup.

PROOF. We check first that the map $\varphi : H \to Hg$, given by $\varphi(h) = hg$, is one-to-one. This shows that all orbits have the same number of points, namely $|H|$. As the orbits fill out the entire group and do not intersect, we see, in view of the definition of the index, that $|G| = |H| \cdot \operatorname{ind}_G H$, which proves the assertion of the theorem. □

Thus Lagrange’s theorem says that in a finite group the orders of the subgroups are divisors of the order of the group. One can also ask if the converse is true: if a number m divides the order of a finite group, is m then the order of some subgroup? Simple examples indicate that such a “converse conjecture” is not true in general. However, it is true when m is a prime. Even more, if $|G| = p^n \cdot r$, where p is a prime number and the numbers p and r are relatively prime, then G contains subgroups of order $p, p^2, p^3, \dots, p^n$. This statement (together with a small addendum) is known as the First Theorem of Sylow.

Isn’t it possible, in all the previous reasoning, to consider, instead of the [previous] so-called right cosets, left cosets gH, that is, the sets $gH = \{gh \mid g \in G \text{ fixed};\ h \in H \text{ arbitrary}\}$? The Reader will easily work out this case (by a completely analogous argument), and as a result we obtain a “left-hand picture” of the previous “right-hand picture”. In the special case when the group operation is commutative, that is, when the equation $ab = ba$ holds for arbitrary $a, b \in G$, the two pictures coincide. In the general case we have, of course, $Hg \neq gH$.
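The decomposition into cosets and Lagrange’s theorem are easy to check by machine on a small example. The following Python sketch is an illustration added here and not part of the original paper. It assumes that the substitutions of the set {0, 1, 2} are stored as tuples of images and that a product is read as “first s, then t”; the names `mul`, `G`, `H` and the choice of the subgroup are made only for this illustration.

```python
from itertools import permutations

def mul(s, t):
    """Product s*t of two substitutions: apply s first, then t."""
    return tuple(t[s[i]] for i in range(len(s)))

# All substitutions of {0, 1, 2}: a group of order 6.
G = list(permutations(range(3)))

# A subgroup of order 2: the identity and the transposition of 0 and 1.
H = [(0, 1, 2), (1, 0, 2)]

# Right cosets (orbits) Hg and left cosets gH, collected as sets of sets.
right_cosets = {frozenset(mul(h, g) for h in H) for g in G}
left_cosets = {frozenset(mul(g, h) for h in H) for g in G}

# The index of H in G is the number of distinct orbits.
index = len(right_cosets)

print(len(G), len(H), index)        # order of G, order of H, index of H
print(right_cosets == left_cosets)  # do the two "pictures" coincide?
```

In this example the orbits have two elements each and there are three of them, in agreement with $|G| = |H| \cdot \operatorname{ind}_G H$; the last line reports that the families of right and left cosets differ for this particular H.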
Let there be given an arbitrary group G and a subgroup H. We know that the group can be covered both by right cosets Hg and by left cosets gH, that is,
$$G = \bigcup_{g \in K} Hg = \bigcup_{g \in K'} gH.$$
Here $K$ and $K'$ denote complete systems of representatives in the first and in the second case, respectively. In general we have, of course, $K \neq K'$. We ask if it is possible to choose the representatives of the cosets so that $K = K'$, that is, so that the system of representatives of the right cosets is at the same time a system of representatives of the left cosets. In the general case, the answer is negative. However, G. Miller showed, in 1910, that such a choice is always possible if H is a finite subgroup. The situation in question occurs also in some other cases. One of these will be described in the next subsection; we will not dwell on the remaining ones.53

2. Let G be a group and fix an arbitrary element $a \in G$. We define a selfmap $\sigma_a$ of G by the formula $\sigma_a(g) = a^{-1}ga$. Then $e \mapsto e$, as $a^{-1}ea = a^{-1}a = e$; $g_1 = g_2 \iff \sigma_a(g_1) = \sigma_a(g_2)$, as $a^{-1}g_1a = a^{-1}g_2a$ is equivalent to $g_1 = g_2$; and $\sigma_a(g_1 \cdot g_2) = \sigma_a(g_1)\,\sigma_a(g_2)$, as $a^{-1}(g_1g_2)a = a^{-1}g_1aa^{-1}g_2a = (a^{-1}g_1a) \cdot (a^{-1}g_2a)$. We see that the unit element e of G is a fixed point of $\sigma_a$, and further that $\sigma_a$ gives a one-to-one correspondence on G and maps products of elements into products of the corresponding images. Evidently, one can carry out the construction $a \mapsto \sigma_a$ for all $a \in G$; each $\sigma_a$ is a so-called inner automorphism of G.

An important role in group theory is played by the so-called invariant subgroups or normal divisors. These are the subgroups $N \subseteq G$ with the property that $\sigma_a(N) \subseteq N$ holds for all inner automorphisms $\sigma_a$ of G. In other words, a subgroup $N \subseteq G$ is a normal divisor if and only if for all $n \in N$ and all $a \in G$ we have $a^{-1}na \in N$. It is easy to see that this definition is equivalent to the statement that $aN = Na$ for all $a \in G$, that is, that left and right cosets with respect to N coincide. The fact that a subgroup N is a normal divisor in the group G will be written as $N \triangleleft G$.
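The two equivalent descriptions of a normal divisor given above can be tested directly on a small group. The Python sketch below is an added illustration, not the author’s; it assumes the same tuple encoding of substitutions of {0, 1, 2} with products read as “first s, then t”, and the helper names (`conj`, `is_normal`, `cosets_coincide`) are hypothetical.

```python
from itertools import permutations

def mul(s, t):
    """Product s*t: apply s first, then t."""
    return tuple(t[s[i]] for i in range(len(s)))

def inv(s):
    """Inverse substitution: exchange the two rows of the table."""
    r = [0] * len(s)
    for i, j in enumerate(s):
        r[j] = i
    return tuple(r)

def conj(a, g):
    """The inner automorphism sigma_a applied to g, i.e. a^{-1} g a."""
    return mul(mul(inv(a), g), a)

def is_normal(N, G):
    """First description: a^{-1} n a lies in N for all n in N, a in G."""
    members = set(N)
    return all(conj(a, n) in members for a in G for n in N)

def cosets_coincide(N, G):
    """Second description: aN = Na for all a in G."""
    return all({mul(a, n) for n in N} == {mul(n, a) for n in N} for a in G)

G = list(permutations(range(3)))
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]   # identity and the two 3-cycles
H = [(0, 1, 2), (1, 0, 2)]               # identity and one transposition

# sigma_a maps products to products, for every a.
hom_ok = all(conj(a, mul(g1, g2)) == mul(conj(a, g1), conj(a, g2))
             for a in G for g1 in G for g2 in G)

print(hom_ok)
print(is_normal(A3, G), cosets_coincide(A3, G))
print(is_normal(H, G), cosets_coincide(H, G))
```

Both tests agree on each subgroup: the subgroup `A3` passes them, while the two-element subgroup `H` fails them.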
53 The Reader will find interesting material about this in [12].

The unity subgroup and the group itself are normal divisors in any group; they are the so-called trivial normal divisors. If there are no other normal divisors, then we speak of a simple group.

EXAMPLE 4.1. Bijective selfmaps of a finite set are called substitutions; the order of a substitution is the number n of elements of the set under consideration. Since the “individuality” of the set is of no interest, we may view the elements of the set as the first n natural numbers. Therefore, every substitution S of order n can be codified in the form [of a matrix]
$$S = \begin{pmatrix} i_1 & \dots & i_n \\ j_1 & \dots & j_n \end{pmatrix},$$
where [the rows] $(i_1, \dots, i_n)$ and $(j_1, \dots, j_n)$ are permutations of the numbers $1, 2, \dots, n$. In this notation one should bear in mind that arbitrary rearrangements of the [vertical] columns do not change the substitution, so that we agree that
$$\begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix} \equiv \begin{pmatrix} 2 & 1 & 4 & 3 \\ 1 & 2 & 3 & 4 \end{pmatrix} \equiv \begin{pmatrix} 2 & 1 & 3 & 4 \\ 1 & 2 & 4 & 3 \end{pmatrix} \equiv \text{etc.}$$
It is possible to “multiply” substitutions with each other: the product of
$$S = \begin{pmatrix} i_1 & \dots & i_n \\ j_1 & \dots & j_n \end{pmatrix} \quad\text{and}\quad T = \begin{pmatrix} j_1 & \dots & j_n \\ k_1 & \dots & k_n \end{pmatrix}$$
is the substitution
$$S \cdot T = \begin{pmatrix} i_1 & \dots & i_n \\ k_1 & \dots & k_n \end{pmatrix}.$$
Taking account of the previous remark (about notation) the Reader will see that one can multiply [or compose] any two substitutions of order n. Thus the set $S_n$ of all substitutions of order n comes equipped with an algebraic operation (multiplication [or composition]), which, as one checks readily, is associative but (if n > 2) not commutative. This multiplication has as unit element
$$E = \begin{pmatrix} 1 & 2 & \dots & n \\ 1 & 2 & \dots & n \end{pmatrix} \equiv \begin{pmatrix} i_1 & \dots & i_n \\ i_1 & \dots & i_n \end{pmatrix} \equiv \dots,$$
and each substitution S has an inverse
$$S^{-1} = \begin{pmatrix} j_1 & \dots & j_n \\ i_1 & \dots & i_n \end{pmatrix},$$
since $S \cdot S^{-1} = S^{-1} \cdot S = E$. Thus $S_n$ is a group, usually called the (complete) symmetric group. Its subgroups are called substitution groups.

The remark about the notation above allows us to present each substitution in the so-called “normal form”
$$S = \begin{pmatrix} 1 & 2 & \dots & n \\ s_1 & s_2 & \dots & s_n \end{pmatrix},$$
from which we see that the permutation $s = (s_1, s_2, \dots, s_n)$ determines the substitution uniquely. Even more, this “new notation” makes it possible to divide all substitutions into two classes, the even and the odd ones, using the notion of inversion. One says that the numbers $s_i$ and $s_j$, $i < j$, form an inversion in the permutation $s = (s_1, s_2, \dots, s_n)$ if $s_i > s_j$. The substitution S is called even or odd depending on whether the permutation s contains an even or an odd number of inversions. For example, $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 1 & 2 \end{pmatrix}$ is an even, while $\begin{pmatrix} 1 & 2 & 3 \\ 3 & 2 & 1 \end{pmatrix}$ is an odd substitution.

It is not hard to see that the number of even substitutions is $\frac{n!}{2}$ and that these form a group $A_n \subset S_n$ [usually called the alternating group]. It turns out that the groups $A_n$ (with n ≥ 5) are simple.54

54 The Reader will find the proof of this fact, elementary as far as the tools are concerned, in [10, p. 77–78].

All these groups $A_n$ have an even number of elements. It was observed that all known finite non-commutative simple groups are of even order. Thus arose, in the early 20th
century, the difficult Burnside’s problem: prove that all finite non-commutative simple groups are of even order. This problem is now solved.55 □

In an arbitrary group G which is not simple there exists a non-trivial normal divisor N, and we can consider the decomposition of G with respect to N. It is remarkable that in the case of a normal divisor “multiplication” in the system of orbits according to the formula $Ng_1 \cdot Ng_2 = Ng_1g_2$ is “lawful”, that is, it does not depend on the choice of the representatives $g_1$ and $g_2$ in these orbits. Moreover, with respect to this multiplication the orbit N acts as “unity” and each orbit $Ng$ has an “inverse orbit”, namely $Ng^{-1}$. As a consequence, the set of orbits, denoted $G/N$, is a group with respect to this multiplication; it is called the factor group with respect to the normal divisor N. It follows from the definition of the index that $|G/N| = \operatorname{ind}_G N$. Let us familiarize ourselves with some examples.

EXAMPLE 4.2. In our previous discussion we have encountered the subgroup $A_n$ of the group $S_n$. Let us check that this is a normal divisor. Let us take the odd substitution
$$T = \begin{pmatrix} 1 & 2 & 3 & 4 & \dots & n \\ 2 & 1 & 3 & 4 & \dots & n \end{pmatrix},$$
and let us form the orbit $A_nT$; its “points” are all odd substitutions, because the product of an even and an odd substitution is always odd. Next, take an arbitrary element $S \in S_n$. If S is an even substitution, then $S \in A_n$. But if S is odd, then $S \cdot T$ is even, and as $S = (S \cdot T) \cdot T$ we have $S \in A_n \cdot T$. We see that $S_n = A_n + A_n \cdot T$. Hence $\operatorname{ind}_{S_n} A_n = 2$. But a subgroup of index 2 is always a normal divisor. Indeed, let N be a subgroup of index 2 in a group G. Then for each $a \notin N$ we have $G = N + aN = N + Na$, which implies that $aN = Na$. But this is the same as $N \triangleleft G$. □

EXAMPLE 4.3. If G is a commutative, or Abelian, group, then each subgroup of it is normal. This follows at once from the definition of a normal divisor. □

EXAMPLE 4.4. In each group G we have the subgroup $Z(G) = \{z \mid z \in G,\ zg = gz \text{ for each } g \in G\}$, which is called the center of the group G. This is a normal divisor. Indeed, for arbitrary $z \in Z(G)$ and $a, g \in G$ we have
$$gzg^{-1}a = gg^{-1}za = eza = az = azgg^{-1} = a \cdot gzg^{-1} \implies gzg^{-1} \in Z(G).$$
□

55 After the Reader has become familiar with the description of cyclic groups (Subsection 4), he or she will notice that the only commutative simple groups are those of prime order. Burnside’s problem will be discussed in Subsection 8. Commentator’s note. Nowadays the name Burnside’s problem is used in connection with another outstanding problem in group theory: whether a finitely generated group of bounded exponent must be finite.
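The notions of Examples 4.1 and 4.2 (inversions, even and odd substitutions, and the decomposition $S_n = A_n + A_n \cdot T$) lend themselves to a direct check for small n. The following Python sketch is an added illustration under stated assumptions: a substitution in “normal form” is stored as its bottom row $(s_1, \dots, s_n)$ with entries 1, …, n, products are read as “first s, then t”, and all function and variable names are chosen only for this example.

```python
from itertools import permutations
from math import factorial

def inversions(s):
    """Number of pairs i < j with s_i > s_j in the bottom row s."""
    return sum(1 for i in range(len(s))
               for j in range(i + 1, len(s)) if s[i] > s[j])

def is_even(s):
    return inversions(s) % 2 == 0

def mul(s, t):
    """Product s*t of bottom rows with entries 1..n: apply s first, then t."""
    return tuple(t[x - 1] for x in s)

n = 4
Sn = list(permutations(range(1, n + 1)))     # all n! substitutions
An = [s for s in Sn if is_even(s)]           # the even ones

T = (2, 1, 3, 4)                             # the odd substitution of Example 4.2
coset = {mul(a, T) for a in An}              # the orbit An * T

print(inversions((3, 1, 2)), inversions((3, 2, 1)))   # the two examples from the text
print(len(An), factorial(n) // 2)
print(all(not is_even(s) for s in coset), set(An) | coset == set(Sn))
```

For n = 4 the sketch reports that exactly half of the substitutions are even, that the orbit through T consists of odd substitutions only, and that the two orbits together exhaust $S_4$, so the index of $A_4$ is 2.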
EXAMPLE 4.5. In an arbitrary group G we consider its subgroup $G'$ generated by all elements $g^{-1}h^{-1}gh$, $g \in G$, $h \in G$, that is, the subgroup consisting of all elements of the form $g^{-1}h^{-1}gh$ and all possible products of such elements. This subgroup $G'$ is called the commutator subgroup of G. The subgroup $G'$ is a normal divisor of G. Indeed, it is easy to see that for all inner automorphisms $\sigma_a$ of G one has $\sigma_a(G') \subseteq G'$. That $G' \triangleleft G$ follows now from the definition. □

Immediate computations show that $(S_3)' = A_3$ and $(S_4)' = A_4$. We shall soon see that, likewise, $(S_n)' = A_n$ for all n ≥ 5. For the proof of this fact we require two auxiliary facts, each of which, taken by itself, helps to clarify the role of the commutator subgroup.

LEMMA 4.2. The factor group with respect to the commutator subgroup is Abelian.

PROOF. Let $a, b \in G$ be arbitrary. Then $aG' \cdot bG' = abG' = ba(a^{-1}b^{-1}ab)G'$, which in view of $a^{-1}b^{-1}ab \in G'$ equals $baG' = bG' \cdot aG'$. The relation $aG' \cdot bG' = bG' \cdot aG'$ thus obtained shows that $G/G'$ is Abelian. □

LEMMA 4.3. The commutator subgroup is contained in each normal divisor of the group such that the factor group with respect to it is Abelian.

PROOF. Let $N \triangleleft G$ be such that $G/N$ is Abelian, that is, for all $a, b \in G$ one has the identity $aN \cdot bN = bN \cdot aN$. This gives $abN = baN$, which in turn yields $a^{-1}b^{-1} \cdot abN = a^{-1}b^{-1} \cdot baN = N$. Thus $a^{-1}b^{-1}ab \in N$. As $a, b \in G$ are arbitrary, this relation shows that $G' \subseteq N$. □

Let us now prove that if n ≥ 5 then $(S_n)' = A_n$. To this end we observe that $(S_n : A_n) = 2$, so that $S_n/A_n$ is a group of order 2. It is easy to see that groups of order two have the same structure as the group $\{a, e \mid e \cdot e = e;\ e \cdot a = a \cdot e = a;\ a \cdot a = e\}$. But this is an Abelian group, so that in view of Lemma 4.3 $(S_n)' \subseteq A_n$. Furthermore, the Reader will notice that only in an Abelian group G do we have the relation $G' = \{e\}$. But as the groups $S_n$ (n ≥ 5) are non-commutative, $(S_n)' \neq \{e\}$. From the relation $(S_n)' \triangleleft S_n$ evidently follows the weaker relation $(S_n)' \triangleleft A_n$.
But $A_n$ is simple, so in view of $(S_n)' \neq \{e\}$ we obtain the sought relation $(S_n)' = A_n$. □

3. DEFINITION 4.4. Let there be given a single-valued function ϕ whose domain of definition is the set of all elements of a group $(G_1, \cdot)$, its values being elements of the group $(G_2, \circ)$. The function ϕ is called a homomorphism of $G_1$ into $G_2$ (or a representation of $G_1$ in $G_2$) if for arbitrary $x, x' \in G_1$ the relation
$$\varphi(x \cdot x') = \varphi(x) \circ \varphi(x')$$
holds.

A trivial example of a homomorphism of a group $G_1$ is the constant function whose value is the unit element of the group $G_2$. The set of elements of $G_1$ for which the value of ϕ is the unit element $e_2$ of $G_2$ is called the kernel of the homomorphism; notation: $\operatorname{Ker} \varphi = \{n \mid n \in G_1,\ \varphi(n) = e_2\}$.
This is a normal divisor in $G_1$, since
$$\varphi(gng^{-1}) = \varphi(g) \circ \varphi(n) \circ \varphi(g^{-1}) = \varphi(g) \circ e_2 \circ \varphi(g)^{-1} = e_2 \implies gng^{-1} \in \operatorname{Ker} \varphi.$$
Conversely, if $N \triangleleft G$, then the function $\varphi : G \to G/N$, $\varphi(g) = Ng$, is a homomorphism with kernel N. Therefore we see that the normal divisors of a group, and only the normal divisors, are kernels of homomorphisms of this group.

If the function $\varphi : G_1 \to G_2$ gives a one-to-one correspondence between $G_1$ and the domain of values $\operatorname{Im} \varphi = \varphi(G_1)$, we call it a monomorphism; in this case $\operatorname{Ker} \varphi = (e_1)$. If $\operatorname{Im} \varphi = G_2$, that is, if the domain of values is the whole of the group $G_2$, we call the function an epimorphism. If a homomorphism is at the same time a monomorphism and an epimorphism, it carries the name isomorphism. The isomorphism of two groups $G_1$ and $G_2$ will be written $G_1 \cong G_2$. As a simplest example we have the “identity homomorphism” of the group $G_1$ onto itself, that is, the function $\varphi : G_1 \to G_1$ given by the formula $\varphi(g) = g$ for all $g \in G_1$. If $G_1 = G_2$, an isomorphism is called an automorphism, that is, an automorphism of a group is an isomorphism of the group with itself. We note that the set of automorphisms of a given group G is a group, denoted $\operatorname{Aut}(G)$; the composition of automorphisms is concatenation:
$$G \xrightarrow{\ \sigma_1\ } G \xrightarrow{\ \sigma_2\ } G, \qquad \sigma_1 \cdot \sigma_2 : G \to G.$$
In the case of the groups $S_n$ (n ≥ 3, n ≠ 6) one has $\operatorname{Aut}(S_n) \cong S_n$; this theorem was proved by O. Hölder in 1895. The Reader can check that for each $a \in G$ the homomorphism $\sigma_a : G \to G$ given by $\sigma_a(g) = a^{-1}ga$ is an automorphism. Automorphisms of this type are called inner automorphisms. Evidently, the set of all inner automorphisms of G forms a subgroup of $\operatorname{Aut}(G)$. In order to get used to the new notion we familiarize ourselves with some examples.

EXAMPLE 4.6. Let us take as $G_1$ the additive group of all real numbers $\mathbb{R}^{(+)}$ and as $G_2$ the multiplicative group $\mathbb{C}^{\circ}$ of complex numbers on the unit circle. We consider the function ϕ given by the formula $\varphi(a) = e^{2\pi i a} = \cos 2\pi a + i \sin 2\pi a$. As $\varphi(a + b) = e^{2\pi i (a+b)} = e^{2\pi i a} \cdot e^{2\pi i b} = \varphi(a) \cdot \varphi(b)$, we have a homomorphism. Let us find its kernel. By definition $\operatorname{Ker} \varphi = \{a \mid a \in \mathbb{R},\ e^{2\pi i a} = 1\}$; thus $\operatorname{Ker} \varphi = \mathbb{Z}$. As, evidently, $\operatorname{Im} \varphi = \mathbb{C}^{\circ}$, we have an epimorphism. □

EXAMPLE 4.7. Let again $G_1 = \mathbb{R}^{(+)}$ and take as $G_2$ the multiplicative group of positive real numbers $\mathbb{R}^{*}$. The function $\varphi(\alpha) = e^{\alpha}$ determines an isomorphism between these groups (see Figure 16). □

EXAMPLE 4.8. Also the inverse function $\varphi = \ln : \mathbb{R}^{*} \to \mathbb{R}^{(+)}$ gives an isomorphism of groups (see Figure 17). □
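[Fig. 16. Graph of $a \mapsto e^{a}$ with the point $(a, e^{a})$ marked; the axes are labelled $G_1$ and $G_2$.]

[Fig. 17. Graph of $a \mapsto \ln a$ with the point $(a, \ln a)$ marked; the axes are labelled $G_1$ and $G_2$.]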
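The homomorphism property in Examples 4.6–4.8 can also be observed numerically. The short Python sketch below is an added illustration, not part of the original text; it uses only the standard modules `cmath` and `math`, compares floating-point values up to a small tolerance (an assumption forced by rounding), and the sample points are arbitrary.

```python
import cmath
import math

def phi(a):
    """Example 4.6: phi(a) = exp(2*pi*i*a), from the reals to the unit circle."""
    return cmath.exp(2j * math.pi * a)

pairs = [(0.25, 0.5), (1.3, -2.7), (3.0, 4.125)]   # arbitrary sample points

# phi(a + b) = phi(a) * phi(b), up to rounding error.
circle_ok = all(cmath.isclose(phi(a + b), phi(a) * phi(b), abs_tol=1e-9)
                for a, b in pairs)

# Integers lie in the kernel: phi(k) = 1.
kernel_ok = all(cmath.isclose(phi(k), 1, abs_tol=1e-9) for k in range(-3, 4))

# Example 4.7: exp turns sums into products.
exp_ok = all(math.isclose(math.exp(a + b), math.exp(a) * math.exp(b), rel_tol=1e-9)
             for a, b in pairs)

# Example 4.8: ln turns products of positive numbers into sums.
log_ok = all(math.isclose(math.log(x * y), math.log(x) + math.log(y), abs_tol=1e-9)
             for x, y in [(2.0, 8.0), (0.5, 3.0)])

print(circle_ok, kernel_ok, exp_ok, log_ok)
```

Such a check is, of course, no proof; it merely illustrates on a few numbers the identities established in the examples.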
EXAMPLE 4.9. The natural embedding $i : \mathbb{Z}^{(+)} \to \mathbb{R}^{(+)}$ is an example of a monomorphism. □

EXAMPLE 4.10. Let us now take as $G_1$ the multiplicative group $\mathbb{C}^{(\cdot)}$ of all complex numbers ≠ 0 and as $G_2$ the group $GL(2, \mathbb{R})$ of all regular 2 × 2 matrices with real elements. The function ϕ, given by the formula
$$\varphi(a + ib) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix},$$
is a representation56 of the group $\mathbb{C}^{(\cdot)}$ in $GL(2, \mathbb{R})$. It is easy to see that it is a monomorphism. □

Finally, let $\varphi : G_1 \to G_2$ be an arbitrary homomorphism. As $\operatorname{Ker} \varphi = N \triangleleft G_1$, we can form the factor group $G_1/N$. This group we may take as the domain of a new function $\hat{\varphi} : G_1/N \to \operatorname{Im} \varphi$, given by the formula $\hat{\varphi}(Ng) = \varphi(g)$. An easy check shows that $\hat{\varphi}$ is an isomorphism. Thus we have the following theorem.

56 Translator’s note. A representation of any group G is a homomorphism into a matrix group $GL(n, K)$, where K is a field or, more generally, a ring.
THEOREM 4.5 (Theorem of homomorphisms). The function $\hat{\varphi}$ is an isomorphism between the groups $G_1/\operatorname{Ker} \varphi$ and $\operatorname{Im} \varphi$.

4. Let us introduce some classes of groups. We have already spoken of Abelian groups: these were the groups where the algebraic operation is commutative. As examples we may take the multiplicative groups $\mathbb{Q}^{(\cdot)}$, $\mathbb{R}^{(\cdot)}$, $\mathbb{C}^{(\cdot)}$, that is, the corresponding sets of numbers, deprived of zero, where composition is the usual multiplication of numbers, and, further, the additive groups $\mathbb{Q}^{(+)}$, $\mathbb{R}^{(+)}$, $\mathbb{C}^{(+)}$, that is, the corresponding sets of numbers, where the operation is the usual addition of numbers.

An important subclass of these [Abelian] groups are the cyclic groups. In a cyclic group all elements are powers (if the composition is called “multiplication”) or multiples (if the composition is called “addition”) of one distinguished element, the so-called generator.

EXAMPLE 4.11. Consider the solutions of the equation $x^n - 1 = 0$, that is, the n-th roots of unity $\varepsilon_0, \varepsilon_1, \dots, \varepsilon_{n-1}$. Clearly, the product of two roots of unity is again a root of unity; likewise, the inverse of a root of unity is a root of unity. So we have a group whose elements are the n-th roots of unity, the algebraic operation being ordinary multiplication of complex numbers. As the solutions of the equation $x^n - 1 = 0$ form the vertices of a regular n-gon inscribed in the unit circle, we have $\varepsilon_k = \cos\frac{2\pi k}{n} + i \sin\frac{2\pi k}{n}$. By de Moivre’s formula $\varepsilon_1^k = \varepsilon_k$, $k = 0, 1, \dots, n-1$, which shows that we are dealing with a finite cyclic group: all elements $\varepsilon_k$ are powers of the generator $\varepsilon_1$. □

EXAMPLE 4.12. The additive group of integers $\mathbb{Z}^{(+)}$ is cyclic. Indeed, it has the generator 1, because each $n \in \mathbb{Z}$ can be written $n = n \cdot 1$. □

All subgroups and factor groups of a cyclic group are again cyclic. For the proof of the first assertion we remark that as a generator of a subgroup we can take that power of a generator of the entire group which lies in the subgroup and has the lowest positive exponent.
For the proof of the second assertion it suffices to take, as generator of the factor group, the orbit passing through a generator of the group.

Let us now apply these observations to the additive group of integers $\mathbb{Z}$. Taking an arbitrary $n \in \mathbb{Z}$, n ≠ 0, and considering the set $N \subset \mathbb{Z}$ of all integers divisible by it, we obtain a subgroup. It is easy to see that all subgroups other than (0) are of this form. As $\mathbb{Z}$ is Abelian, all its subgroups are normal divisors. Therefore we can form the factor groups $\mathbb{Z}/N = \mathbb{Z}_n$, which, being factor groups of a cyclic group, must be cyclic. In this way we have found all subgroups, all normal divisors, and all factor groups of $\mathbb{Z}$. Even more, we have determined all homomorphisms of $\mathbb{Z}$, because Theorem 4.5 allows us to restore these from their kernels, and the latter are also known to us, as they coincide with the normal divisors of $\mathbb{Z}$.

Next, we consider an arbitrary cyclic group $G_2$ with generator a and define a homomorphism whose domain of definition is $\mathbb{Z}$ and whose values lie in $G_2$ by the formula $\varphi(1) = a$. This implies that $\varphi(n) = a^n$. It is easy to see that $\operatorname{Im} \varphi = G_2$, so that we have an epimorphism. Theorem 4.5 now yields $G_2 \cong \mathbb{Z}/\operatorname{Ker} \varphi$. But all factor groups of $\mathbb{Z}$ are known to us; they are the groups $\mathbb{Z}_n$. Thus we have $G_2 \cong \mathbb{Z}_n$ for some n; in case $\operatorname{Ker} \varphi = (0)$ we obtain $G_2 \cong \mathbb{Z}$, and we have an infinite cyclic group.
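The assertion that all subgroups of a cyclic group are cyclic can be confirmed by brute force for a small factor group $\mathbb{Z}_n$. The Python sketch below is an added illustration with the assumed value n = 12; it represents $\mathbb{Z}_{12}$ by the residues 0, …, 11 under addition modulo 12, lists every subset that contains 0 and is closed under addition (in a finite group such a subset is a subgroup), and tests whether each one is generated by a single element. The helper names are hypothetical.

```python
from itertools import combinations

n = 12   # assumed modulus for the illustration

def is_subgroup(S):
    """A finite subset containing 0 and closed under addition mod n is a subgroup."""
    return all((a + b) % n in S for a in S for b in S)

def generated(g):
    """The cyclic subgroup of Z_n consisting of all multiples of g."""
    return frozenset((k * g) % n for k in range(n))

# Enumerate all subsets of Z_n that contain 0 and keep the subgroups.
subgroups = set()
for r in range(n):
    for rest in combinations(range(1, n), r):
        S = frozenset((0,) + rest)
        if is_subgroup(S):
            subgroups.add(S)

# Every subgroup found should coincide with some generated(g).
cyclic = all(any(S == generated(g) for g in range(n)) for S in subgroups)
orders = sorted(len(S) for S in subgroups)

print(cyclic)
print(orders)   # -> [1, 2, 3, 4, 6, 12]
```

The orders that occur are exactly the divisors of 12, one subgroup for each divisor, in agreement with Lagrange’s theorem and with the description of the subgroups of $\mathbb{Z}$ given above.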
In the “abstract” theory of groups (that is, in group theory) the object in view is not so much the “individual” group but rather the class of groups isomorphic among themselves. In this sense we have obtained a description of cyclic groups in terms of the additive group of integers $\mathbb{Z}$ and its factor groups $\mathbb{Z}_n$, that is, on the basis of the material “closest” to us. If n is a prime, then the group $\mathbb{Z}_n$ is simple, which is a consequence of Lagrange’s theorem (Theorem 4.1). Knowing the properties of cyclic groups, the Reader will convince him- or herself that these are the only simple Abelian groups.

5. What is a solvable group? We have seen57 that to a group G one can attach the subgroup generated by all commutators, that is, by all elements of the form $[a, b] = a^{-1}b^{-1}ab$, $a \in G$, $b \in G$: the commutator subgroup $G'$ of G. It was shown that this is a normal divisor in G. Iterating the construction, we can form the commutator subgroup of the subgroup $G'$, etc. Let $G'' = (G')'$. The iteration then gives a decreasing chain of subgroups, where each “link” is a normal divisor in the preceding one:
$$G \triangleright G' \triangleright G'' \triangleright \dots \triangleright G^{(i)} \triangleright \dots, \quad \text{with } G^{(i)} = (G^{(i-1)})'. \tag{103}$$

DEFINITION 4.6. A group G is called solvable if the chain (103) breaks off at the trivial subgroup, that is, if there exists an index n such that $G^{(n)} = (e)$.58

For example, every Abelian group G is solvable59, because in this case one has $G' = (e)$, so n = 1 here. Soon we shall also encounter non-solvable groups, so one can say that the class of solvable groups is an essentially wider class than that of Abelian groups.

In a solvable group $G \neq (e)$ one has $G' \neq G$, that is, the commutator subgroup of such a group cannot coincide with the group itself. For let $G \neq (e)$. Then it would follow from the relation $G' = G$ that $G^{(i)} = G \neq (e)$ for all i, which would contradict the solvability. Moreover, if G is solvable, then all (non-trivial) members of the chain (103) are distinct, since from $G^{(i)} = G^{(i+1)}$ it follows that $G^{(i)} = G^{(j)}$ for all j ≥ i. Note that the factors of the chain (103), that is, the groups $G/G'$, $G'/G''$, …, are Abelian groups. This follows from facts known to us: for any group the factor group with respect to its commutator subgroup is an Abelian group.

Subgroups and factor groups of solvable groups are solvable. In order to prove the first statement it suffices to make the observation:
$$H \subset G \implies H' \subset G' \implies H'' \subset G'' \implies \dots \implies H^{(n)} \subset G^{(n)} = (e) \implies H^{(n)} = (e).$$
To prove the second statement we consider the epimorphism $\varphi : G \to G/N$. As every commutator in $G/N$ is the image of some commutator in G, we observe that ϕ induces a new epimorphism $\varphi' : G' \to (G/N)'$. Iterating this observation we get the epimorphisms $\varphi'' : G'' \to (G/N)''$, $\varphi''' : G''' \to (G/N)'''$, etc. As $G^{(n)} = (e)$ and $\varphi^{(n)}$ is an epimorphism, $(G/N)^{(n)} = (N)$. □

57 See Example 4.5.
58 Here e is the unit element of G. The solvable groups have obtained their name because of the fact that algebraic equations are solvable in terms of radicals if and only if their Galois groups are solvable. Translator’s note. See also Section 6.
59 In particular, all cyclic groups are solvable.

6. In the case of finite groups one usually gives a different definition of solvability. This is done with the aid of the notion of a composition series of a group. Let us first look at a concrete situation in order to be able to use it as an example in the following general reasoning. We consider the group of order 6, $G = \{e, a, a^2, a^3, a^4, a^5 \mid a^6 = e\}$. The sets $N_1 = \{e, a^3\}$ and $N_2 = \{e, a^2, a^4\}$ are subgroups, with $N_1 \not\subset N_2$. There are no other proper subgroups.60 Indeed, in view of Theorem 4.1 the order of a subgroup must divide the order of the group itself, and the number 6 has only the non-trivial divisors 2 and 3. As G is commutative, it follows that $N_1$ and $N_2$ are normal divisors. This gives us two chains $G \triangleright N_1 \triangleright (e)$ and $G \triangleright N_2 \triangleright (e)$. It is easy to check that $G/N_1 \cong N_2/(e) \cong N_2$ and $N_1/(e) \cong N_1 \cong G/N_2$.

DEFINITION 4.7. Let G be a finite group. A decreasing chain of subgroups
$$G = N_0 \supset N_1 \supset N_2 \supset \dots \supset N_k = (e) \tag{104}$$
is called a composition series if the following two conditions are fulfilled: (1) for all $i = 0, \dots, k-1$, $N_{i+1}$ is a normal divisor of $N_i$, and (2) for no $i = 0, \dots, k-1$ does there exist in $N_i$ a normal divisor M such that $N_{i+1} \subset M \subset N_i$, $M \neq N_{i+1}$, $M \neq N_i$. The Reader sees at once that the chains $G \supset N_1 \supset (e)$ and $G \supset N_2 \supset (e)$ in the above example are composition series.

Each finite group has a composition series. According to the Jordan–Hölder Theorem (see [7, p. 286] or suitable references in Section 1, footnote 12) any two composition series have the same length, and one can find a one-to-one correspondence between them such that the corresponding factor groups are isomorphic; one may say that the composition series of a finite group are isomorphic. This theorem allows us to consider any composition series of a finite group as the “copy” of some “original” composition series. Thus, in essence, a finite group has a unique composition series, namely this “original”, all others being just “copies” of the “original”, that is, chains isomorphic to the latter. As the “original” one can, of course, take an arbitrary composition series.

It turns out that a finite group is solvable if and only if all factor groups of its composition series are cyclic groups of prime order. This is the second definition of solvability in the case of a finite group. Using it, the Reader will notice that for a finite solvable group to be simple it is necessary and sufficient that it be cyclic of prime order; that such a group is solvable is already known to us.61

60 That is, subgroups distinct from (e) and G itself.
61 . . . but it is also an immediate consequence of the second definition just given.

Likewise, the Reader sees that the order $|G|$ of a solvable group is the product of the orders of the factors of its composition series. Indeed, a repeated application of Lagrange’s Theorem (Theorem 4.1)
to the chain (104) gives
$$|G| = |G/N_1| \cdot |N_1| = |G/N_1| \cdot |N_1/N_2| \cdot |N_2| = \dots = |G/N_1| \cdot |N_1/N_2| \cdot |N_2/N_3| \cdots |N_{k-1}/N_k| \cdot |N_k|,$$
where $|N_k| = 1$. □

Here the orders of the factors may be equal. They can even all be equal; the order $|G|$ must then be of the form $p^{\alpha}$, p a prime number.

According to Lagrange’s Theorem, the order of a group must be divisible by the order of any of its subgroups. The interesting question arises whether the “converse conjecture” holds true for solvable groups.62 It turns out that in a solvable group G of order $|G| = m \cdot n$, where the numbers m and n are relatively prime, there exist subgroups of order m and n. For the proof of this fact we would have to plunge into the “technical wilderness”, which would not be suitable for the present compilation. However, it is of some interest to note that this fact is characteristic for the “nature” of solvable groups, that is, it can be used as a criterion for solvability. We mention further two criteria due to J. G. Thompson, as they can often be easily applied to decide the solvability or non-solvability of groups (see [13, 383–437]):
I. A finite group is solvable if and only if each subgroup generated by any pair of its elements is solvable.
II. A finite group is solvable if and only if it does not contain three elements distinct from unity whose orders are pairwise relatively prime and whose product equals unity.

7. The following theorem gives a series of examples of non-solvable groups. It also plays a decisive role in the proof of the Abel–Ruffini Theorem.

THEOREM 4.8. The complete symmetric groups $S_n$ (n ≥ 5) are not solvable. The groups $S_2$, $S_3$, $S_4$ are solvable.

PROOF. As a subgroup of a solvable group is solvable, it suffices, for the proof of the first statement, to find in the groups $S_n$ (n ≥ 5) subgroups which are not solvable. This is easy: the subgroups $A_n \subset S_n$ (n ≥ 5) suffice! The groups $A_n$ are simple, but their order $|A_n| = \frac{n!}{2}$ is not a prime number; hence they are not solvable.
Indeed, we saw in the previous Section that the only simple groups among the solvable groups are the cyclic groups of prime order. This proves the first assertion.

Let us consider the group $S_2$. As $|S_2| = 2! = 2$, it is cyclic of prime order and so solvable. For the proof of the solvability of $S_3$, we note that $A_3$ is solvable because, in view of $|A_3| = \frac{1}{2} \cdot 3! = 3$, it is a cyclic group of prime order. Moreover, the index $(S_3 : A_3) = 2$ is a prime, so that $(e) \subset A_3 \subset S_3$ is a composition series. Its factors are visibly cyclic of prime order. That the group $S_3$ is solvable now follows from the definition.
⁶² See also the first subsection of this paper.
4. Additional remarks on groups
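The structure of $S_4$ is somewhat more complicated. The subgroup $A_4$ consists of the elements:
\[ e = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{pmatrix}, \]
\[ a_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3 \end{pmatrix}, \quad a_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 4 & 1 & 2 \end{pmatrix}, \quad a_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 3 & 2 & 1 \end{pmatrix}, \]
\[ b_1 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 1 & 4 \end{pmatrix}, \quad b_2 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 4 & 3 & 1 \end{pmatrix}, \quad b_3 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 1 & 2 & 4 \end{pmatrix}, \]
\[ b_4 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 3 & 2 & 4 & 1 \end{pmatrix}, \quad b_5 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 1 & 3 & 2 \end{pmatrix}, \quad b_6 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 4 & 2 & 1 & 3 \end{pmatrix}, \]
\[ b_7 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 3 & 4 & 2 \end{pmatrix}, \quad b_8 = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 4 & 2 & 3 \end{pmatrix}. \]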
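The "immediate check" and the 24 conjugations discussed in the next paragraph can also be left to a machine. The following is only a minimal sketch: it assumes the twelve elements of $A_4$ are encoded by the bottom rows of their two-row symbols, and the helper names `compose` and `inverse` are introduced here purely for illustration.

```python
from itertools import product

# Bottom rows of the twelve permutations of A4 (assumed encoding: the tuple
# (2, 1, 4, 3) stands for 1->2, 2->1, 3->4, 4->3).
e = (1, 2, 3, 4)
a = [(2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]
b = [(2, 3, 1, 4), (2, 4, 3, 1), (3, 1, 2, 4), (3, 2, 4, 1),
     (4, 1, 3, 2), (4, 2, 1, 3), (1, 3, 4, 2), (1, 4, 2, 3)]


def compose(p, q):
    """Permutation obtained by applying q first and then p."""
    return tuple(p[q[i] - 1] for i in range(4))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * 4
    for i, image in enumerate(p):
        inv[image - 1] = i + 1
    return tuple(inv)


K4 = {e, *a}
A4 = K4 | set(b)

# K4 is closed under multiplication and commutative.
k4_is_subgroup = all(compose(x, y) in K4 for x, y in product(K4, repeat=2))
k4_is_commutative = all(compose(x, y) == compose(y, x)
                        for x, y in product(K4, repeat=2))

# The 24 conjugates b_i^{-1} a_j b_i all lie in K4.
conjugates = [compose(inverse(bi), compose(aj, bi)) for bi in b for aj in a]
k4_is_normal_in_a4 = all(c in K4 for c in conjugates)

# a_j^2 = e, and K4 is the union of the three subgroups {e, a_j}.
squares_are_identity = all(compose(x, x) == e for x in a)
k4_is_union_of_three = set().union(*({e, x} for x in a)) == K4

print(k4_is_subgroup, k4_is_commutative, k4_is_normal_in_a4,
      squares_are_identity, k4_is_union_of_three)
```

The checks do not depend on whether products are read from left to right or from right to left, since every condition tested is symmetric under that change of convention.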
By an immediate check one verifies that the set $K_4 = \{e, a_1, a_2, a_3\}$ is a commutative subgroup. Furthermore, calculating the 24 products $b_i^{-1} a_j b_i$, $i = 1, 2, \dots, 8$, $j = 1, 2, 3$, we see that they all belong to the subgroup $K_4$. But this means that $K_4$ is a normal divisor in $A_4$. Taking $N_4 = \{e, a_1\}$ and forming the chain
\[ (e) \subset N_4 \subset K_4 \subset A_4 \subset S_4, \]
we see that we have a composition series all of whose factors are cyclic groups of prime order. This proves that $S_4$ is solvable. □

The group $K_4$ is called the Klein 4-group (or the four group). One readily checks the relations $a_1^2 = a_2^2 = a_3^2 = e$, so that $\{e, a_1\}$, $\{e, a_2\}$, $\{e, a_3\}$ are subgroups. We see that $K_4 = \{e, a_1\} \cup \{e, a_2\} \cup \{e, a_3\}$, so that the Klein group can be presented as the union of three proper subgroups. In 1959, S. Haber and A. Rosenfeld proved that $K_4$ is the typical example of a group with this property: $K_4$ and, furthermore, those groups which can be mapped epimorphically onto $K_4$ are presentable as the union of three proper subgroups. There are no other groups with this property. One can ask which groups are presentable as the union of two proper subgroups. A simple argument by contradiction reveals that there are no such groups. The question which groups can be “covered” by $n$ proper subgroups is not easy.

8. At the end of the 1950’s the general opinion was that the theory of finite groups was in a state of “congelation”; voices arose claiming that it had exhausted itself. The reason for this was not at all the absence of unsolved problems: the problem of describing all finite simple groups is still awaiting its solution! Rather the contrary: as in Number Theory, in Group Theory it is easier to formulate problems than to solve them. One could not even say that one lacked methods: powerful methods had been developed by Hölder, Jordan, Frobenius, Molien, Burnside and Schur.
Therefore there arose the opinion that the circle of problems attainable by these methods was completely exhausted, perhaps with the exception of only a few, not very interesting cases.

In the beginning of the 1960’s there occurred a real break-through in the theory of finite groups: W. Feit and J. G. Thompson proved the following theorem [4].

THEOREM 4.9. All finite groups of odd order are solvable.
The proof of this theorem is based on ideas, results and theories due to P. Hall, G. Higman, H. Wielandt, R. Brauer, M. Suzuki, and others. This theorem, with its monumental proof, undoubtedly has a deep influence on the development of Group Theory.⁶³ Below we study two examples of the numerous consequences of this theorem.

We start with the following question: if a group $G$ is not cyclic of prime order, what can be said about the existence of subgroups $H$ of “sufficiently high” order in $G$? More exactly, we would like to prove the following conjecture: there always exists a proper subgroup $H \subset G$ such that $|H| > \sqrt[3]{|G|}$. In the case of groups $G$ of even order, R. Brauer and K. Fowler managed to prove this conjecture in 1955. There remained the case $|G|$ odd; one was not able to “conquer” this. The knowledge of the Feit-Thompson Theorem makes this problem fairly simple and so also acceptable in the framework of this paper. Indeed, by this theorem all groups of odd order are solvable. But in every finite solvable group $G$ whose order is greater than unity and not a prime, there exists a proper subgroup of order $\ge \sqrt{|G|}$.

PROOF. Let us present the number $|G| = g$ as a product of prime powers, $g = p_1^{\alpha_1} \cdots p_s^{\alpha_s}$. We consider two cases.

First, let $s = 1$. Then $g = p^{\alpha}$. As $g$ is greater than unity and not a prime, we deduce that $\alpha \ge 2$. The first theorem of Sylow tells us that $G$ has a subgroup of order $p^{\alpha-1}$. But as $\alpha \ge 2$ implies the inequality $p^{\alpha-1} \ge \sqrt{g}$, we have found the desired subgroup $H$.

Second, let $s > 1$. Then we can divide the set $P = \{p_1, p_2, \dots, p_s\}$ into two non-empty and non-intersecting (all the $p_i$ are distinct!) subsets $P'$ and $P''$, that is, we have the relations $P = P' \cup P''$ and $P' \cap P'' = \emptyset$. Let
\[ m = \prod_{p_i \in P'} p_i^{\alpha_i} \quad \text{and} \quad n = \prod_{p_j \in P''} p_j^{\alpha_j}. \]
It follows from $P' \cap P'' = \emptyset$ that the numbers $m$ and $n$ are relatively prime. There are two possible cases: (i) $m > n$, and (ii) $n > m$. Let us now apply the “natural property” of a solvable group (cf. Subsection 6).
In case (i) it guarantees that there is a subgroup $H$ in $G$ of order $m$. As $m > n$, we have $g = mn < m^2$, hence $|H| = m > \sqrt{g}$, and we have found the desired subgroup. The reasoning in case (ii) is analogous: it suffices to interchange the roles of $m$ and $n$. Our assertion is proved. □

As $|H| \ge \sqrt{|G|}$ implies $|H| > \sqrt[3]{|G|}$, the conjecture under view is established for all finite groups.

We give yet another application of the Feit-Thompson Theorem. In Subsection 2 we mentioned the following problem of W. Burnside: Do there exist non-commutative simple groups of odd order? The answer to this question is an immediate consequence of the Feit-Thompson Theorem: assuming that such a group exists, we see from this theorem that the group must be solvable. On the other hand, the class of simple solvable groups consists only of cyclic groups of prime order, thus only of commutative groups. This contradiction shows that the answer is negative.

⁶³ See the book [3], where the Reader will have a magnificent opportunity to familiarize him- or herself with contemporary Group Theory.
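The arithmetic behind the subgroup bound proved above can be traced on concrete orders. The sketch below only mirrors the two cases of the proof; it does not construct any group. The function names are hypothetical, and in the case $s > 1$ the split $P'$ = {smallest prime}, $P''$ = {the rest} is just one of the admissible choices.

```python
def prime_factorization(g):
    """Return {prime: exponent} for an integer g > 1 (trial division)."""
    factors = {}
    p = 2
    while p * p <= g:
        while g % p == 0:
            factors[p] = factors.get(p, 0) + 1
            g //= p
        p += 1
    if g > 1:
        factors[g] = factors.get(g, 0) + 1
    return factors


def subgroup_order_from_proof(g):
    """Order of the subgroup that the proof yields in a solvable group of
    order g, where g > 1 is not a prime."""
    if g < 2:
        raise ValueError("g must be greater than unity")
    factors = prime_factorization(g)
    if len(factors) == 1:
        # Case s = 1: g = p^alpha with alpha >= 2, subgroup of order p^(alpha-1).
        (p, alpha), = factors.items()
        if alpha < 2:
            raise ValueError("g must not be a prime")
        return p ** (alpha - 1)
    # Case s > 1: split the primes into P' (smallest prime) and P'' (the rest);
    # m and n are relatively prime, and the larger of them exceeds sqrt(g).
    smallest = min(factors)
    m = smallest ** factors[smallest]
    n = g // m
    return max(m, n)


for g in (15, 27, 45):
    h = subgroup_order_from_proof(g)
    print(g, h, h * h >= g)
# prints:
# 15 5 True
# 27 9 True
# 45 9 True
```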
Up to this day⁶⁴ two interesting (and difficult) questions remain unsolved.

(1) It is well-known that groups of order $p^{\alpha} \cdot q^{\beta}$, where $p$, $q$ are prime numbers, are solvable. This is Burnside’s Theorem. But so far one does not know the structure of groups whose order is divisible by precisely three distinct prime numbers. Nor does one know which simple groups have this property, and not even whether there are only finitely many such groups. That we have here a very well-founded problem should be clear from the following theorem of Thompson: if the order of a simple group has the form $p^{\alpha} \cdot q^{\beta} \cdot r^{\gamma}$ with distinct primes $p$, $q$, $r$ (say $p < q < r$), then $p = 2$, $q = 3$, and $r = 5$, $7$, $13$ or $17$.

(2) To prove that a finite group which admits an automorphism whose only fixed point is the identity must be solvable. A support for this conjecture is the following fact: if the automorphism under view (considered as an element of $\mathrm{Aut}(G)$) is of order $2^n$, $n$ an integer, then $G$ is of odd order; that $G$ is solvable now follows from the Feit-Thompson Theorem. It is essential that $G$ be finite. Indeed, there exist infinite non-solvable “linearly ordered” groups $G$. Each such group admits the automorphism $\sigma : G \to G$, $\sigma(g) = g^{-1}$, with the unity of $G$ as its single fixed point.
Comments.
1) In the proof on page 386 it is written “First, let s > 1 . . . ”, but the case s = 1 is never dealt with. However, this is easy, as any group of order $p^n$ contains a subgroup of order $p^{n-1}$.
2) The problem of classifying the non-Abelian finite simple groups, essentially going back to Galois, is now generally believed to be settled. This classification was finished in the early 1980’s. According to it there are, apart from the alternating groups, 16 infinite families of groups of Lie type (the finite analogues of the classical families of simple Lie groups, which include the projective special linear groups PSL(n, K), the projective symplectic groups, and the simple orthogonal and unitary groups) and 26 sporadic simple groups (5 of which are the Mathieu groups, discovered by E. Mathieu already in 1861 and 1873). For further discussion see e.g. the book [1].
The following two items refer to the two questions raised at the end of the paper.
3) Question 1 on p. 387. From the classification of finite simple groups one can read off exactly which simple groups have orders divisible by exactly three distinct prime numbers.
4) Question 2 on p. 387. It has now been shown, using the classification of finite simple groups, that every finite group which admits an automorphism whose only fixed point is the identity must be solvable. For more information, see the book [8].
Gunnar Traustason
⁶⁴ . . . according to the information available to the author.

References
[1] R. Carter. Simple groups of Lie type. John Wiley & Sons, London, New York, Sydney, 1989.
[2] F. A. Cotton. Chemical applications of group theory. New York, London, 1964.
[3] W. Feit. Characters of finite groups. W. A. Benjamin, Inc., New York, Amsterdam, 1967.
[4] W. Feit and J. G. Thompson. Solvability of groups of odd order. Pac. J. Math. 13, 1963.
[5] E. Gabovitš. Principles of Algebra, I – V. Math. and Our Age 6–10, 1965.
[6] S. Helgason. Differential geometry and symmetric spaces. Academic Press, New York, London, 1962.
[7] G. Kangro. Kõrgem algebra (Higher algebra) II. Eesti Riiklik Kirjastus, Tallinn, 1950. (Estonian.)
[8] E. I. Khukhro. p-Automorphisms of finite p-groups. London Mathematical Society Lecture Note Series, 246. Cambridge University Press, Cambridge, 1997.
[9] N. Kristoffel and K. Rebane. Group theory and its applications in the physics of molecules and crystals. Tartu Univ. Press, Tartu, 1961.
[10] A. G. Kurosh. Lectures in general algebra. Fizmatgiz, Moscow, 1962. English translation: Pergamon Press, Oxford, London, Edinburgh, New York, 1965.
[11] Ü. Lumiste. The notion of space in geometry. Geometry and transformation groups. Math. and Our Age 14, 1968, 3–21.
[12] Ø. Ore. On coset representatives in groups. Proc. Am. Math. Soc. 9, 1958, 665–670.
[13] J. G. Thompson. Non-solvable finite groups all of whose local subgroups are solvable. Bull. Am. Math. Soc. 74, 1968.
5. [K73a] Polynomials and formal series
This paper arose from a desire to give the Reader a handy compilation for the formulation and proof of the Ruffini-Abel Theorem. Here we treat the symmetry of polynomials and the notion of irreducibility and, moreover, the concept of formal series.

Symmetric polynomials have applications in several domains of mathematics. The Reader will find on the pages of this paper an interesting possibility to use them in the solution of algebraic equations of higher order, see [2].⁶⁵ Irreducible polynomials play in the arithmetic of the ring of polynomials about the same role as prime numbers in ordinary number theory. In recent times they have found applications in the theory of coding and decoding (see the book [1]⁶⁶). The Reader will probably find the concept of formal series especially interesting. An elegant use of this theory is, among other things, one of the tools by which Henri Cartan has refreshed the presentation, in university courses, of such an important branch of classical mathematics as the theory of functions of a complex variable.

However, our goal has not been to give a complete catalogue of the properties of any of the mathematical objects mentioned. The Reader will only learn of those properties of an object which will later be required to understand the proof of Abel’s theorem. In the opinion of the author the best way of learning something about mathematical objects is to see how they are used in achieving significant goals. In the composition of the remarks we have in an essential way used M. M. Postnikov’s book on Galois Theory [6].
5.1. Irreducibility of polynomials

Let $P$ be a field. We consider the ring $P[x]$, that is, the set consisting of all polynomials with coefficients in $P$,
\[ f(x) = a_n + a_{n-1} x + \dots + a_0 x^n, \qquad a_i \in P. \]
Addition and multiplication are defined, as in the case of polynomials with numerical coefficients, by the formulae $(f + g)(x) = f(x) + g(x)$, $(f \cdot g)(x) = f(x) \cdot g(x)$. There are no divisors of zero in the ring $P[x]$ and one can develop a theory of division similar to the one in the domain of ordinary integers; in the role of prime numbers there appear then the “irreducible polynomials”; let us familiarize ourselves with these strange “prime numbers”.

⁶⁵ Translator’s note. For symmetric functions see e.g. Chap. 11 of Kurosh’s book [10] quoted in Section 1, footnote 12.
⁶⁶ Editors’ note. We also suggest an introduction to coding theory [5].
DEFINITION 5.1. A polynomial $f(x) \in P[x]$ is called reducible over the field $P$ if there exist non-constant polynomials $f_1(x)$ and $f_2(x)$ of lower degree in the ring $P[x]$ such that $f(x) = f_1(x) \cdot f_2(x)$. In the opposite case one says that $f(x)$ is irreducible over $P$.

Next, we present an assertion which illustrates the similarity between irreducible polynomials and prime numbers.

THEOREM 5.2. If the polynomial $f(x)$ with coefficients in $P$ has a common solution with the polynomial $p(x)$, which is irreducible over $P$, then $f(x)$ is divisible by $p(x)$.

PROOF. Let $g(x) = \mathrm{GCD}(f(x), p(x))$. As the equations $f(x) = 0$ and $p(x) = 0$ have a common solution, then⁶⁷ $\deg g(x) \ge 1$. The coefficients of the polynomial $g(x)$, obtained by Euclid’s algorithm, belong to $P$. If we had $\deg g(x) < \deg p(x)$, then $p(x) = g(x) \cdot p_1(x)$, where $\deg p_1(x) < \deg p(x)$ and where the coefficients of $p_1(x)$ belong to $P$. But this contradicts the irreducibility of $p(x)$. Thus $\deg g(x) = \deg p(x)$, so that $p(x)$ differs from $g(x)$ only by a constant factor; as $g(x)$ divides $f(x)$, so does $p(x)$. □

As an example we consider the irreducibility of polynomials over number fields. We use the standard notation: $\mathbb{Z}$ for the set of integers, $\mathbb{Q}$ for the set of rational numbers, $\mathbb{C}$ for the set of complex numbers.

Let $P = \mathbb{Q}$. It turns out that it suffices to know whether a polynomial $f(x)$ with integer coefficients is irreducible in the ring⁶⁸ $\mathbb{Z}[x]$ or not. Indeed, let $f(x)$ be a polynomial with rational coefficients. We determine the least common multiple $a$ of the denominators of the coefficients $a_i$ and consider the polynomial $a f(x)$; this is a polynomial with integer coefficients and its reducibility is, evidently, necessary and sufficient for the reducibility of $f(x)$ over $\mathbb{Q}$. Thus the question of reducibility over $\mathbb{Q}$ is settled once we can clarify the reducibility of polynomials over $\mathbb{Z}$. For this several criteria are known in algebra. We list a few of them.

EISENSTEIN’S CRITERION. Let there be given a polynomial
\[ f(x) = a_n + a_{n-1} x + \dots + a_1 x^{n-1} + a_0 x^n, \qquad a_i \in \mathbb{Z}. \]
If there exists a prime number $p$ such that the following conditions⁶⁹ are fulfilled:
\[ p \nmid a_0; \qquad p \mid a_i \ \text{for } i \ne 0; \qquad p^2 \nmid a_n, \]
then $f(x)$ is irreducible over $\mathbb{Q}$. The proof is easy; assuming the contrary, the Reader will easily arrive at a contradiction. □

It follows from this criterion that there exist irreducible polynomials (over $\mathbb{Q}$) of arbitrarily high degree. Indeed, the polynomials $x^n + p$, $n = 1, 2, 3, \dots$, are irreducible.

COHN’S CRITERION. If the coefficients $a_i \in \mathbb{Z}$ of the polynomial $f(x) = \sum_{i=0}^{n} a_{n-i} x^i$ satisfy the condition $0 \le a_{n-i} \le 9$, and $f(10)$ is a prime number, then $f(x)$ is irreducible over $\mathbb{Q}$. □

⁶⁷ Here $\deg f(x)$ denotes the degree of the polynomial $f(x)$.
⁶⁸ The polynomial $f(x)$ is reducible in the ring $\mathbb{Z}[x]$ if there exist non-constant polynomials $f_1(x)$ and $f_2(x)$ with integer coefficients such that $f(x) = f_1(x) \cdot f_2(x)$ and $\deg f_i(x) < \deg f(x)$, $i = 1, 2$.
⁶⁹ Here $p \mid a_i$ means that the number $a_i$ is divisible by $p$, and $p \nmid a_i$ its contrary.
According to Cohn’s criterion the polynomials $f_1(x) = x^3 + 8x^2 + 2x + 3$, $f_2(x) = 2x^3 + 6x + 3$, $f_3(x) = 2x^3 + x^2 + 2x + 9$ are irreducible over $\mathbb{Q}$.

Among less known criteria of irreducibility, we note the following. In order for a polynomial $f(x) = x^n + a_1 x^{n-1} + \dots + a_n$, $a_i \in \mathbb{Z}$, to be irreducible (over $\mathbb{Q}$) it is sufficient that one of the following conditions is fulfilled: (1) |a1 | > |1 + a2 | + |a3 | + · · · + |an |, 3 √ (2) a2 > 0, a2 > √ (|a1 | + |a3 | + . . . |an |), 2 √ (3) a1 = 0, a2 > 0, an = 0, √ a2 > 3(|a3 | + · · · + |an |), √ (4) n > 4, a4 > 0, a4 > 4 2(1 + |a1 | + |a2 | + · · · + |an |),
an = 0.
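The criteria of Eisenstein and Cohn stated earlier in this subsection lend themselves to a mechanical test. The following sketch assumes the text's indexing, in which the list `[a_0, a_1, ..., a_n]` starts with the leading coefficient $a_0$ and ends with the constant term $a_n$; the function names are hypothetical. A result of `None` or `False` only means that the criterion does not apply, not that the polynomial is reducible.

```python
def is_prime(n):
    """Primality test by trial division (adequate for small integers)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def eisenstein_witness(coeffs):
    """Return a prime p satisfying Eisenstein's conditions for
    coeffs = [a_0, ..., a_n] (a_0 leading, a_n constant), or None."""
    lead, const = coeffs[0], coeffs[-1]
    # A suitable prime must divide the constant term, so it is at most |a_n|.
    for p in range(2, abs(const) + 1):
        if (is_prime(p)
                and lead % p != 0                        # p does not divide a_0
                and all(c % p == 0 for c in coeffs[1:])  # p divides a_1, ..., a_n
                and const % (p * p) != 0):               # p^2 does not divide a_n
            return p
    return None


def cohn_applies(coeffs):
    """True if all coefficients are digits 0..9 and f(10) is a prime."""
    if not all(0 <= c <= 9 for c in coeffs):
        return False
    value = 0
    for c in coeffs:          # Horner evaluation of f at x = 10
        value = 10 * value + c
    return is_prime(value)


print(eisenstein_witness([1, 0, 0, 0, 0, 3]))   # x^5 + 3
print(cohn_applies([1, 8, 2, 3]))               # f_1, with f_1(10) = 1823
print(cohn_applies([2, 0, 6, 3]))               # f_2, with f_2(10) = 2063
print(cohn_applies([2, 1, 2, 9]))               # f_3, with f_3(10) = 2129
```

For $x^5 + 3$ the witness found is $p = 3$, and the three polynomials $f_1$, $f_2$, $f_3$ pass Cohn's test because 1823, 2063 and 2129 are primes.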
Over the complex numbers, only linear polynomials are irreducible. Indeed, by the fundamental theorem of algebra an equation $f(x) = 0$ with complex coefficients has at least one solution $\alpha \in \mathbb{C}$, from which it follows by Bézout’s Lemma that $f(x)$ is divisible by the factor $x - \alpha$. Applying, if necessary, the same argument to the quotient, we find that $f(x)$ decomposes into a product of linear factors.

Over the field of real numbers there are already quadratic polynomials which are irreducible; a well-known example is provided by the polynomial $x^2 + 1$. It turns out that all polynomials of higher degree with real coefficients are reducible. The fact that besides linear polynomials only some quadratic polynomials appear here follows from the observation that if $f(\alpha) = 0$, $\alpha \in \mathbb{C}$, $\alpha \notin \mathbb{R}$, then also $f(\bar{\alpha}) = 0$, and that the factor $(x - \alpha)(x - \bar{\alpha})$ has real coefficients and is irreducible over $\mathbb{R}$.

We note the absence of effective criteria for deciding the irreducibility of polynomials over an arbitrary field. The answer to the following interesting question⁷⁰ is not known.

PROBLEM OF PAUL TURÁN. Does there exist a non-negative integer $c$ such that for each polynomial $f(x) = \sum_{i=0}^{n} a_i x^{n-i}$, $a_i \in \mathbb{Z}$, $a_0 \ne 0$, there exists a polynomial $g(x) = \sum_{i=0}^{n} b_i x^{n-i}$, $b_i \in \mathbb{Z}$, which is irreducible over $\mathbb{Z}$, such that $\sum_{i=0}^{n} |a_i - b_i| \le c$? □
⁷⁰ In this connection, see [7].

5.2. Symmetric polynomials

We consider the polynomial $f(x) = x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n$ with coefficients in the field $P$. There always exists an extension $L$ of $P$ in which $f(x)$ decomposes into linear factors, that is, there exist elements $\alpha_1, \dots, \alpha_n \in L$ such that $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$. For example, if $P \subseteq \mathbb{C}$ we may take for $L$ the field of complex numbers $\mathbb{C}$. But if two polynomials are equal, then their coefficients in front of the corresponding powers of the variable $x$ must be equal also. This gives the well-known formulae of Viète
\[
\begin{aligned}
-a_1 &= \alpha_1 + \alpha_2 + \dots + \alpha_n, \\
a_2 &= \alpha_1 \alpha_2 + \alpha_1 \alpha_3 + \dots + \alpha_{n-1} \alpha_n, \\
&\;\;\vdots \\
(-1)^{n-1} a_{n-1} &= \alpha_1 \cdots \alpha_{n-1} + \dots + \alpha_2 \cdots \alpha_n, \\
(-1)^n a_n &= \alpha_1 \alpha_2 \cdots \alpha_n.
\end{aligned}
\]
The right hand sides of these relations do not change under arbitrary permutations $\alpha_1 \to \alpha_{i_1}$, $\alpha_2 \to \alpha_{i_2}$, . . . , $\alpha_n \to \alpha_{i_n}$ of the solutions $\alpha_1, \dots, \alpha_n$; here $(i_1, \dots, i_n)$ is a permutation of the numbers $1, \dots, n$. Therefore these expressions are called symmetric with respect to the solutions $\alpha_1, \dots, \alpha_n$. This property makes it possible to distinguish among polynomials in $n$ variables the so-called symmetric polynomials, that is, polynomials $f(x_1, \dots, x_n)$ which do not change under an arbitrary permutation $x_1 \to x_{i_1}$, $x_2 \to x_{i_2}$, . . . , $x_n \to x_{i_n}$. We already have examples of such polynomials: the so-called elementary symmetric polynomials
\[
\begin{aligned}
\sigma_1 &= x_1 + x_2 + \dots + x_n; \\
\sigma_2 &= x_1 x_2 + x_1 x_3 + \dots + x_{n-1} x_n; \\
&\;\;\vdots \\
\sigma_{n-1} &= x_1 \cdots x_{n-1} + \dots + x_2 \cdots x_n; \\
\sigma_n &= x_1 x_2 \cdots x_n.
\end{aligned}
\]
It is easy to find other examples: $x_1^2 + x_2^2 + \dots + x_n^2$, $x_1^3 x_2^3 \cdots x_n^3$, etc. As the symmetric polynomials form a subring of the ring of polynomials in $n$ variables, it is easy to enlarge the number of these examples. The Reader will notice that many symmetric polynomials can be expressed in terms of the elementary ones, for example:
\[ x_1^2 + x_2^2 + \dots + x_n^2 = \sigma_1^2 - 2\sigma_2, \]
\[ x_1^3 x_2^3 \cdots x_n^3 = \sigma_n^3, \]
\[ x_1^2 x_2 \cdots x_n + \dots + x_1 x_2 \cdots x_n^2 = \sigma_1 \sigma_n. \]
Indeed, every symmetric polynomial can be expressed as a polynomial in the elementary ones: in other words, to each symmetric polynomial $f(x_1, \dots, x_n)$ (with coefficients in the field $P$) there corresponds a polynomial $q(x_1, \dots, x_n)$ (with coefficients in the field $P$) such that
\[ f(x_1, \dots, x_n) = q(\sigma_1(x_1, \dots, x_n), \sigma_2(x_1, \dots, x_n), \dots, \sigma_n(x_1, \dots, x_n)). \]
This is the so-called Fundamental Theorem of Symmetric Polynomials;⁷¹ we shall use it in the proof of the Ruffini-Abel Theorem.
⁷¹ For the proof, see [4, p. 262–264].
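The three sample identities of this subsection can be spot-checked numerically. The sketch below evaluates the elementary symmetric polynomials at one arbitrarily chosen integer point (an assumption made only for illustration); it verifies the identities at that point and is of course no substitute for a proof. The helper `elementary` is a hypothetical name.

```python
from itertools import combinations, permutations
from math import prod


def elementary(k, xs):
    """Value of the k-th elementary symmetric polynomial at the point xs."""
    return sum(prod(c) for c in combinations(xs, k))


xs = (2, -3, 5, 7)          # arbitrary sample point, n = 4
n = len(xs)
s1, s2, sn = elementary(1, xs), elementary(2, xs), elementary(n, xs)

# x_1^2 + ... + x_n^2 = sigma_1^2 - 2 sigma_2
sum_of_squares_ok = sum(x ** 2 for x in xs) == s1 ** 2 - 2 * s2

# x_1^3 ... x_n^3 = sigma_n^3
cube_product_ok = prod(x ** 3 for x in xs) == sn ** 3

# x_1^2 x_2 ... x_n + ... + x_1 x_2 ... x_n^2 = sigma_1 sigma_n
mixed_sum = sum(x * prod(xs) for x in xs)
mixed_ok = mixed_sum == s1 * sn

# The elementary symmetric polynomials do not change when the arguments
# are permuted.
invariant = all(elementary(k, p) == elementary(k, xs)
                for k in range(1, n + 1) for p in permutations(xs))

print(s1, s2, sum_of_squares_ok, cube_product_ok, mixed_ok, invariant)
```

At the chosen point one finds $\sigma_1 = 11$ and $\sigma_2 = 17$, so that $\sigma_1^2 - 2\sigma_2 = 87 = 4 + 9 + 25 + 49$.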
5.3. Embedding of the field of rational functions in an algebraically closed field

Let $P$ be a field of characteristic 0, that is, $\mathbb{Q} \subseteq P$. We consider the field of rational functions $R = P(x_1, \dots, x_n)$ over $P$. Its elements are quotients
\[ \frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)} \]
of two polynomials $f, g \in P[x_1, \dots, x_n]$. Such a field need not be algebraically closed, that is, there may exist non-linear irreducible polynomials over it.

EXAMPLE 5.1. Let $P = \mathbb{Q}$. We show that the equation $x^2 + 1 = 0$ does not have any solutions in the field $R = \mathbb{Q}(x_1, \dots, x_n)$. Indeed, if
\[ \frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)} \in R \]
were a solution of the equation $x^2 + 1 = 0$, we would have
\[ \left( \frac{f(x_1, \dots, x_n)}{g(x_1, \dots, x_n)} \right)^2 \equiv -1, \]
which implies that
\[ \left( \frac{f(c_1, \dots, c_n)}{g(c_1, \dots, c_n)} \right)^2 = -1 \]
for any $(c_1, \dots, c_n) \in \mathbb{Q}^n$ with $g(c_1, \dots, c_n) \ne 0$. But as
\[ \frac{f(c_1, \dots, c_n)}{g(c_1, \dots, c_n)} \in \mathbb{Q}, \]
we have obtained a contradiction, because there exists no rational number $r \in \mathbb{Q}$ such that $r^2 = -1$. □

Nevertheless, it is possible to embed the field $R$ into an algebraically closed field, provided the base field $P$ is algebraically closed. First we embed $R$ in the field of formal series.

What is a formal series? We denote the variable by $x$. A formal series is an infinite formal sum of the form
\[ a_{-m} x^{-m} + a_{-m+1} x^{-m+1} + \dots + a_{-1} x^{-1} + a_0 + a_1 x + a_2 x^2 + \dots, \]
where $a_i \in P$ and $m \in \mathbb{Z}$; in short, $\sum_{i \ge -m} a_i x^i$. Among the formal series we have all polynomials, as $a_0 + a_1 x + \dots + a_k x^k = \sum_{i \ge 0} a_i x^i$, where $a_{k+1} = a_{k+2} = \dots = 0$. The sum and product of any two formal series $f = \sum_{i \ge -m} a_i x^i$ and $g = \sum_{i \ge -n} b_i x^i$ are defined by the formulae:
(1) if, for example, $n \ge m$, then
\[ f + g = \sum_{i \ge -n} (a_i + b_i) x^i, \quad \text{where } a_{-m-1} = \dots = a_{-n} = 0, \]
and
(2)
\[ f \cdot g = \sum_{i \ge -m-n} c_i x^i, \]
where
\[
\begin{cases}
c_{-m-n} = a_{-m} \cdot b_{-n}, \\
c_{-m-n+1} = a_{-m} b_{-n+1} + a_{-m+1} b_{-n}, \\
\dots\dots\dots
\end{cases}
\]
An easy check shows that one has a ring with respect to these operations, the ring of formal series $P\langle x \rangle$. The Reader will notice that here we have to deal with an “extension” of addition and multiplication of polynomials. Thus the ring of polynomials $P[x]$ is a subring of $P\langle x \rangle$. Moreover, the latter is a field, as each formal series $f$ other than zero has an “inverse series”, that is, a series $f^{-1} \in P\langle x \rangle$ such that $f \cdot f^{-1} = 1$. Indeed, each formal series different from zero has the form
\[ f = x^n (a_0 + a_1 x + a_2 x^2 + \dots), \qquad n \in \mathbb{Z},\ a_0 \ne 0. \]
Determining the coefficients $b_i$ of the series $f^{-1} = x^{-n} (b_0 + b_1 x + b_2 x^2 + \dots)$ by the identities
\[
\begin{aligned}
a_0 \cdot b_0 &= 1, \\
a_0 \cdot b_1 + a_1 \cdot b_0 &= 0, \\
a_0 \cdot b_2 + a_1 \cdot b_1 + a_2 \cdot b_0 &= 0, \\
&\dots\dots\dots
\end{aligned}
\]
we see that $f \cdot f^{-1} = 1$. From the relation $P[x] \subseteq P\langle x \rangle$ it follows that the quotient of any two polynomials is contained in the field of formal series, that is, $P(x) \subseteq P\langle x \rangle$.

By induction one defines the notion of the field of formal series of several variables:
\[ P\langle x_1, x_2 \rangle = P\langle x_1 \rangle \langle x_2 \rangle, \quad P\langle x_1, x_2, x_3 \rangle = P\langle x_1, x_2 \rangle \langle x_3 \rangle, \quad \dots, \quad P\langle x_1, \dots, x_n \rangle = P\langle x_1, \dots, x_{n-1} \rangle \langle x_n \rangle. \]
We see that $P(x_1, \dots, x_n) \subset P\langle x_1, \dots, x_n \rangle$.

Next we extend the field of formal series to an algebraically closed field. To this end we consider general formal series, that is, formal sums
\[ f = \sum_{i \ge 0} a_i x^{n_i / n}, \quad \text{where } n, n_0, n_1, \dots \in \mathbb{Z},\ n > 0,\ n_0 < n_1 < n_2 < \dots, \]
among the integers $n_i$ only finitely many being negative. If $n = 1$ we have ordinary formal series. Introducing the variable $y = x^{1/n}$, we can view the general formal series as an ordinary formal series in the variable $y$:
\[ f(x) = \sum_{i \ge 0} a_i x^{n_i / n} = \sum_{i \ge 0} a_i y^{n_i}. \]
This simple remark shows that general formal series may be added and multiplied in the same way as ordinary formal series. This gives again a ring, which we denote $P\{x\}$. Even more, $P\{x\}$ is a field. Indeed, given a general formal series $f(x) \ne 0$ we consider the corresponding formal series $f(y) \in P\langle y \rangle$. As $P\langle y \rangle$ is a field, we can find a series $f^{-1}(y)$ such that $f(y) \cdot f^{-1}(y) = 1$. The change of variable $y = x^{1/n}$ gives the desired relation.

One can prove that $P\{x_1\}$ is an algebraically closed field, that is, each non-linear polynomial with coefficients in this field is reducible (over $P\{x_1\}$). By induction we may define the field of general formal series of several variables: $P\{x_1, \dots, x_{n-1}, x_n\} = P\{x_1, \dots, x_{n-1}\}\{x_n\}$. We have the relations
\[ P[x_1, \dots, x_n] \subset P(x_1, \dots, x_n) \subset P\langle x_1, \dots, x_n \rangle \subset P\{x_1, \dots, x_n\}. \tag{105} \]
As $P\{x_1\}$ is an algebraically closed field, a simple induction shows that $P\{x_1, \dots, x_n\}$ is also algebraically closed (provided that the base field $P$ is so). From the relation (105) it is now clear that the goal we set ourselves is achieved.
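The recursion for the inverse series used in this subsection ($a_0 b_0 = 1$, $a_0 b_1 + a_1 b_0 = 0$, and so on) can be carried out term by term. What follows is a minimal sketch under simplifying assumptions: the base field is taken to be $\mathbb{Q}$, the factor $x^n$ is split off beforehand, the series is given by finitely many coefficients $a_0, a_1, \dots$ with $a_0 \ne 0$, and only the first few coefficients of the inverse are computed. The function names are hypothetical.

```python
from fractions import Fraction


def inverse_series(a, terms):
    """First `terms` coefficients b_0, b_1, ... of the series inverse to
    a[0] + a[1] x + a[2] x^2 + ...; requires a[0] != 0."""
    if not a or a[0] == 0:
        raise ValueError("the constant coefficient a_0 must be non-zero")
    a = [Fraction(c) for c in a]
    b = [1 / a[0]]                                   # a_0 b_0 = 1
    for k in range(1, terms):
        # a_0 b_k + a_1 b_{k-1} + ... + a_k b_0 = 0, solved for b_k
        s = sum(a[i] * b[k - i] for i in range(1, min(k, len(a) - 1) + 1))
        b.append(-s / a[0])
    return b


def multiply_truncated(a, b, terms):
    """First `terms` coefficients of the product of two series."""
    return [sum(a[i] * b[k - i]
                for i in range(k + 1) if i < len(a) and k - i < len(b))
            for k in range(terms)]


geometric = inverse_series([1, -1], 5)               # inverse of 1 - x
print([int(c) for c in geometric])                   # [1, 1, 1, 1, 1]

b = inverse_series([2, 3, 5], 6)                     # inverse of 2 + 3x + 5x^2
print(multiply_truncated([2, 3, 5], b, 6))           # product starts 1, 0, 0, ...
```

The first call recovers the geometric series $1 + x + x^2 + \dots$ as the inverse of $1 - x$; the second confirms that the computed coefficients multiply back to $1$ up to the truncation order.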
5.4. The splitting field of the general equation

Here we illustrate the applications of the notions set forth in the previous two Sections: we introduce the notions of the general equation and of its splitting field; these are necessary for the formulation, in contemporary language, of the Ruffini-Abel Theorem. In what follows we use the word “field” in the meaning “subfield of the field of complex numbers”.⁷²

Let $P$ be any field. Let $a_1, \dots, a_n$ be complex numbers. If there is no non-zero polynomial $h(x_1, \dots, x_n)$ with coefficients in $P$ such that $h(a_1, \dots, a_n) = 0$, then we say that these numbers are algebraically independent (over $P$).⁷³ A series of examples of algebraic independence (over $\mathbb{Q}$) is provided by the following theorem.

THEOREM 5.3 (Lindemann). If the algebraic numbers $\alpha_1, \dots, \alpha_n$ are linearly independent over $\mathbb{Q}$, then the numbers $e^{\alpha_1}, \dots, e^{\alpha_n}$ are algebraically independent over $\mathbb{Q}$.

For example, the numbers $e$ and $e^{\sqrt{2}}$ are algebraically independent over $\mathbb{Q}$, because $1$ and $\sqrt{2}$ are linearly independent over $\mathbb{Q}$.

As in Section 5.3 we can form the field of formal series $K = P\{a_1, \dots, a_n\}$. Let us now take arbitrary power series $\alpha_1, \dots, \alpha_n \in K$ and consider the intersection of those subfields of $K$ which contain the base field $P$ and the power series $\alpha_1, \dots, \alpha_n$. This is the smallest subfield in $K$ with this property; we denote it $P(\alpha_1, \dots, \alpha_n)$. In particular, we put $R = P(a_1, \dots, a_n)$.

DEFINITION 5.4. If the coefficients $a_1, \dots, a_n$ of the equation
\[ f(x) = x^n + a_1 x^{n-1} + \dots + a_{n-1} x + a_n = 0 \tag{106} \]
are algebraically independent (over $P$), we call it the general equation of order $n$ (over $P$).

Assume that the base field $P$ contains $\mathbb{Q}$ and let it be algebraically closed. In this situation the field of power series $K = P\{a_1, \dots, a_n\}$, containing the coefficients of the general equation, is algebraically closed (see Section 5.3), so that the polynomial $f(x)$ splits into linear factors over it. Thus there exist elements $\alpha_1, \dots, \alpha_n \in K$ such that $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$. The field $R(\alpha_1, \dots, \alpha_n) = \Delta$ is called the splitting field of the general equation (106).

PROPOSITION 5.5. $R(\alpha_1, \dots, \alpha_n) = P(\alpha_1, \dots, \alpha_n)$.

PROOF. On the one hand, it follows from $P \subset R$ that $P(\alpha_1, \dots, \alpha_n) \subseteq R(\alpha_1, \dots, \alpha_n)$. On the other hand, as by the formulae of Viète $a_i = (-1)^i \sigma_i(\alpha_1, \dots, \alpha_n)$, we have
\[ \Delta = R(\alpha_1, \dots, \alpha_n) = P(a_1, \dots, a_n)(\alpha_1, \dots, \alpha_n) \subseteq P(\alpha_1, \dots, \alpha_n). \]
This proves the assertion. □

The result obtained shows that each element of the splitting field can be written in the form of a fraction
\[ \frac{f(\alpha_1, \dots, \alpha_n)}{g(\alpha_1, \dots, \alpha_n)}, \]
where $f$ and $g$ are polynomials with coefficients in the base field $P$. One can show that this representation is unique:

PROPOSITION 5.6. There is no element $a \in \Delta$ such that
\[ a = \frac{f_1(\alpha_1, \dots, \alpha_n)}{g_1(\alpha_1, \dots, \alpha_n)} = \frac{f_2(\alpha_1, \dots, \alpha_n)}{g_2(\alpha_1, \dots, \alpha_n)} \tag{107} \]
with $h \overset{\mathrm{def}}{=} f_1 g_2 - f_2 g_1 \not\equiv 0$.

PROOF. We argue by contradiction. Let us assume that an element $a$ with the property (107) nevertheless exists. Then there exists a polynomial $h(x_1, \dots, x_n)$ with coefficients in the base field $P$ which is not the zero polynomial, but for which $h(\alpha_1, \dots, \alpha_n) = 0$. Let $S_n$ be the complete symmetric group. We form for each $\sigma \in S_n$,
\[ \sigma = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix}, \]
the “conjugate polynomial”⁷⁴ $h^{\sigma}(x_1, \dots, x_n) \equiv h(x_{i_1}, \dots, x_{i_n})$. Then
\[ h \not\equiv 0 \;\Rightarrow\; (\forall \sigma \in S_n)\ h^{\sigma} \not\equiv 0 \;\Rightarrow\; \prod_{\sigma \in S_n} h^{\sigma} \not\equiv 0. \]
The product
\[ s(x_1, \dots, x_n) \overset{\mathrm{def}}{=} \prod_{\sigma \in S_n} h^{\sigma}(x_1, \dots, x_n) \]
is a symmetric polynomial, so that we can apply to it the Fundamental Theorem of Symmetric Polynomials:
\[ s(x_1, \dots, x_n) = q(\sigma_1(x_1, \dots, x_n), \dots, \sigma_n(x_1, \dots, x_n)). \]
Using the formulae of Viète $\sigma_i(\alpha_1, \dots, \alpha_n) = (-1)^i a_i$, we obtain
\[ s(\alpha_1, \dots, \alpha_n) = q(\pm a_1, \dots, \pm a_n). \tag{108} \]
Here $q \not\equiv 0$, because $s \not\equiv 0$. On the other hand, we have the identities
\[ s(\alpha_1, \dots, \alpha_n) = \prod_{\sigma \in S_n} h^{\sigma}(\alpha_1, \dots, \alpha_n) = h(\alpha_1, \dots, \alpha_n) \cdot \prod_{\sigma \ne \varepsilon} h^{\sigma}(\alpha_1, \dots, \alpha_n) = 0, \tag{109} \]
as $h(\alpha_1, \dots, \alpha_n) = 0$. Taking account of the relation (108) and of $q \not\equiv 0$, we conclude from the relation (109) that $a_1, \dots, a_n$ are algebraically dependent over $P$. But this is a contradiction, as the $a_1, \dots, a_n$, being the coefficients of the general equation, are algebraically independent over $P$. The assertion is proved. □

From the reasoning given one readily deduces that all solutions of the general equation are simple. Indeed, if we had, say, $\alpha_1 = \alpha_2$, we could take $h(x_1, \dots, x_n) = x_1 - x_2$. Then $h \not\equiv 0$ and $h(\alpha_1, \dots, \alpha_n) = \alpha_1 - \alpha_2 = 0$, which by the argument above leads to a contradiction. □

⁷² In the following considerations it is sufficient that “field” means “a subfield of an algebraically closed field of characteristic 0”.
⁷³ Several criteria for verifying the algebraic independence of complex numbers can be found in [3, p. 118–121].
⁷⁴ For example, if $n = 4$, $h(x_1, x_2, x_3, x_4) = x_1^2 x_3^2 + x_2^5 x_4$ and $\sigma = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 2 & 3 & 4 & 1 \end{pmatrix}$, then $h^{\sigma} \equiv x_2^2 x_4^2 + x_3^5 x_1$.
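The conjugate polynomials $h^{\sigma}$ and the symmetric product $s$ used in the proof of Proposition 5.6 can be made concrete. In the sketch below a polynomial is stored, as an assumption made only for this illustration, as a dictionary mapping exponent tuples to coefficients; `act` and `multiply` are hypothetical helper names. The code reproduces the example of the footnote on conjugate polynomials and checks that, for $h = x_1 - x_2$ and $n = 3$, the product over all of $S_3$ is indeed symmetric.

```python
from itertools import permutations


def act(poly, sigma):
    """Conjugate polynomial h^sigma: the variable x_k is replaced by
    x_{sigma[k]}, where sigma is the bottom row (i_1, ..., i_n)."""
    result = {}
    for exps, coeff in poly.items():
        new = [0] * len(exps)
        for k, exponent in enumerate(exps):
            new[sigma[k] - 1] += exponent
        result[tuple(new)] = result.get(tuple(new), 0) + coeff
    return result


def multiply(p, q):
    """Product of two polynomials in the dictionary representation."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            key = tuple(x + y for x, y in zip(e1, e2))
            out[key] = out.get(key, 0) + c1 * c2
    return {k: v for k, v in out.items() if v != 0}


# Footnote example: h = x1^2 x3^2 + x2^5 x4 and sigma = (2, 3, 4, 1).
h4 = {(2, 0, 2, 0): 1, (0, 5, 0, 1): 1}
h4_sigma = act(h4, (2, 3, 4, 1))
print(h4_sigma)        # x2^2 x4^2 + x3^5 x1

# The product s of all conjugates of h = x1 - x2 over S3.
h3 = {(1, 0, 0): 1, (0, 1, 0): -1}
s = {(0, 0, 0): 1}
for sigma in permutations((1, 2, 3)):
    s = multiply(s, act(h3, sigma))

s_is_nonzero = len(s) > 0
s_is_symmetric = all(act(s, tau) == s for tau in permutations((1, 2, 3)))
print(s_is_nonzero, s_is_symmetric)
```

The dictionary representation was chosen to avoid any dependence on a computer algebra library; with such a library the same check would be shorter.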
References
[1] E. L. Bloch and M. S. Pinsker (eds.). Some questions of coding theory. Mir, Moscow, 1970.
[2] H. Espenberg. Symmetric polynomials. Math. and Our Age 19, 1973, 25–38.
[3] A. O. Gel’fond. Algebraic and transcendental numbers. Gosizdat, Moscow, 1952. English translation: Dover Publ., Inc., New York, 1960.
[4] G. Kangro. Kõrgem algebra (Higher algebra). Eesti Riiklik Kirjastus, Tallinn, 1962. (Estonian.)
[5] J. H. van Lint. Introduction to coding theory. Graduate Texts in Mathematics. Springer, New York, 1999.
[6] M. M. Postnikov. Fundamentals of Galois theory. Fizmatgiz, Moscow, 1963. English translation: Dover Publ., Inc., New York, 2004.
[7] A. Schinzel. Reducibility of polynomials and covering systems of congruences. Acta Arithmetica 13, 1967, 91–101.
6. [K75a] On Galois theory
Comments by U. Persson
It is not feasible in practise to proceed like Swift’s scholar, whom Gulliver visits in Balnibarbi, namely to develop in systematic order, say according to the required number of inferential steps, all consequences and discard the “uninteresting” ones; just as the great works of world literature have not come into being by taking the twenty-six letters of the alphabet, forming all ‘combinations with repetition’ up to the length of $10^{10}$, and selecting and preserving the most meaningful and beautiful among them.
H. Weyl, Philosophy of Mathematics and Natural Science, Princeton, 1949
The path for Galois theory was prepared by the work of Lagrange, Gauss and Abel. The recognition of the main principles of the theory and their application is due to Galois. Galois associates to each algebraic equation a corresponding group and, having found a series of deep connections between the properties of these two objects, he gives an exhaustive answer to the difficult question about the algebraic solvability of algebraic equations by radicals, which occupied mathematicians for several centuries75. Galois’ point of view in the study of an equation via trying to understand the properties of its group is of revolutionary importance. His work gave the impetus to a tendency where the center of gravity in research in the area of algebra began to incline towards structured theories. The clear-cut unfolding of this tendency occurred in the 1920’s and a special credit here goes to D. Hilbert and E. Noether. Today such a point of view is well-known thanks to the influence of N. Bourbaki’s treatise Éléments de mathématique (Elements of mathematics). This paper consists of two parts. In the first of them the Reader will be acquainted with some notions, connections and results in Galois theory. We give also a proof of the Abel-Ruffini theorem and treat one line in Galois theory, the inverse problem of Galois. However, the level of generality considered has no limits: In a Bourbaki seminar in 1959 A. Grothendieck presented his results on the so-called Galois theory of schemes, of which what will be set out below is just a very narrow special case. This opened up for Galois theory a broad path in geometry. 
Of course, even this is not a limit, because the employment of the Galois correspondence is nowadays so frequent that it has almost developed into a philosophical principle.⁷⁶ In the second half of the paper, which carries the title “The duality principle in mathematics”, we consider in which way this development of the ideas of Galois has taken place, and we treat some general questions connected with the application of Galois’ ideas in new disciplines.
⁷⁵ See also the author’s paper in Section 3.
⁷⁶ In the article [K75b] (see Section 7 on automata theory) the Reader can acquaint himself with the realization of the main idea of Galois in Computer Science.
One can also view the lines below as the final chord of the ideas of a series of papers,⁷⁷ where I have tried to prepare the Reader for the understanding of the present material. It will be assumed that the Reader has access to the papers in question, to which he or she can refer in case of need. Special references will not be given, but each time the Reader encounters a little known term or fact, he or she can turn for help to the papers mentioned. Finally, let us also remark that the paper’s second half can be read independently of the first one, so that a Reader who is only interested in the general development of the ideas derived from Galois theory may at once turn to the reading of the second half.
6.1. On the Galois correspondence
1. Let us consider the equation $x^4 + x^3 + x^2 + x + 1 = 0$. Its solutions are 5th roots of unity: $\alpha_1 = e$, $\alpha_2 = e^2$, $\alpha_3 = e^3$, $\alpha_4 = e^4$, where $e = \cos\frac{2\pi}{5} + i\sin\frac{2\pi}{5}$. These solutions satisfy the relations
\[ \alpha_1\alpha_4 = 1, \qquad \alpha_2\alpha_3 = 1, \qquad \alpha_1^2\alpha_3 = 1, \qquad \alpha_1^3\alpha_2 = 1. \tag{110} \]
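Of course, these solutions also satisfy the relations given by the formulae of Viète. The latter remain in force also after applying to the solutions in them an arbitrary substitution of degree 4. In the case of the relations (110) we cannot maintain this. For example, the substitution
\[ \begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_2 & \alpha_1 & \alpha_3 & \alpha_4 \end{pmatrix} \]
carries the relation $\alpha_1\alpha_4 = 1$ into $\alpha_2\alpha_4 = 1$, which does not hold. If we try all 24 substitutions on the roots $\alpha_1, \alpha_2, \alpha_3, \alpha_4$ we see that the relations are not disturbed by the following four among them:
\[ \begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \end{pmatrix}, \quad \begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_2 & \alpha_4 & \alpha_1 & \alpha_3 \end{pmatrix}, \quad \begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_4 & \alpha_3 & \alpha_2 & \alpha_1 \end{pmatrix}, \quad \begin{pmatrix} \alpha_1 & \alpha_2 & \alpha_3 & \alpha_4 \\ \alpha_3 & \alpha_1 & \alpha_4 & \alpha_2 \end{pmatrix}. \]
An immediate check shows that these four substitutions form a subgroup of the group of permutations $S_4$.
Let us consider the general case. We consider the n-th order algebraic equation
\[ a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + x^n = 0. \tag{111} \]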
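For the example of the 5th roots of unity above, the trial of all 24 substitutions can be carried out mechanically. The following Python sketch is only an illustration and not part of the original argument. It assumes that each root $\alpha_k = e^k$ is encoded by its exponent $k$, so that a product of roots equals 1 exactly when the exponents add up to a multiple of 5; the names `RELATIONS` and `preserves_relations` are hypothetical helpers introduced here.

```python
from itertools import permutations

# Each relation in (110) is a list of (root index, power) pairs whose product is 1.
RELATIONS = [
    [(1, 1), (4, 1)],  # alpha_1 * alpha_4 = 1
    [(2, 1), (3, 1)],  # alpha_2 * alpha_3 = 1
    [(1, 2), (3, 1)],  # alpha_1^2 * alpha_3 = 1
    [(1, 3), (2, 1)],  # alpha_1^3 * alpha_2 = 1
]


def preserves_relations(images):
    """images[k-1] is the index of the root that replaces alpha_k."""
    for relation in RELATIONS:
        # alpha_j = e^j with e^5 = 1, so the product is 1 iff the
        # exponent sum is divisible by 5 (an assumption of this encoding).
        exponent = sum(images[k - 1] * power for k, power in relation)
        if exponent % 5 != 0:
            return False
    return True


# Try all 24 substitutions of degree 4 and keep those that respect (110).
galois_group = [p for p in permutations((1, 2, 3, 4)) if preserves_relations(p)]
for p in galois_group:
    print(p)
```

Under these assumptions the search keeps four substitutions, namely those sending each $\alpha_k$ to $\alpha_{mk \bmod 5}$ for $m = 1, 2, 3, 4$.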
⁷⁷ Cf. [K69c, K70, K73a] or Sections 3, 4 and 5.
We denote the left hand side of the equation by $f(x)$ and assume that its coefficients belong to some fixed field $P$. With the aid of the derivative of $f(x)$ we can separate from $f(x)$ the product of all factors which have only simple solutions. This can be done in such a way that the coefficients of these factors likewise belong to $P$. Therefore we may henceforth assume that all solutions $\alpha_1, \dots, \alpha_n$ of $f(x) = 0$ are simple. Let
\[ \Psi_i(\alpha_1, \dots, \alpha_n) = 0, \qquad i \in I, \tag{112} \]
be the system of all possible polynomial relations between the solutions of $f(x) = 0$ (the relations given by the Viète formulae are always present; in general, this system of relations may also be infinite). In the complete symmetric group $S_n$ we distinguish the
subgroup $G(f)$ of those substitutions which either do not change any of the relations of the system (112) or else map each relation in (112) again to a relation in the system; in other words, for $\sigma \in G(f)$ we have the assertion
\[ \forall\, i \in I \ \ \exists\, j \in I: \qquad \Psi_i^{\sigma}(\alpha_1, \dots, \alpha_n) = \Psi_j(\alpha_1, \dots, \alpha_n). \]
The set $G(f) \subset S_n$ is a subgroup. Indeed, if $\sigma, \tau \in G(f)$, then one readily sees that $\sigma \cdot \tau \in G(f)$, and, likewise, that the identity substitution belongs to the set $G(f)$. On the other hand, for each substitution $\sigma$ of degree $n$ there exists an $m > 0$ such that $\sigma^m = \varepsilon$, so that $\sigma^{-1} = \sigma^{m-1}$, from which it is clear that $\sigma^{-1} \in G(f)$.
DEFINITION 6.1. The substitution group $G(f)$, none of whose elements changes the relations between the solutions of the equation $f(x) = 0$, is called the Galois group of the equation.
We will see some examples in the next section.
2. We give now another definition of the notion of Galois group, in which the equation itself is replaced by a new field, the splitting field of the equation. This makes the theory more transparent and increases its generality; at the same time it widens its range of application.
Let $\Delta$ be a field. We look at those bijections $\sigma : \Delta \to \Delta$ which preserve sums and products in the field, i.e. for any $a, b \in \Delta$ one has the relations
\[ (a + b)^{\sigma} = a^{\sigma} + b^{\sigma}; \qquad (a \cdot b)^{\sigma} = a^{\sigma} \cdot b^{\sigma}. \]
Such maps $\sigma : \Delta \to \Delta$ are called automorphisms of $\Delta$. As automorphisms are one-to-one, it follows that for each $b \in \Delta$ one can find an $a \in \Delta$ such that $a^{\sigma} = b$; here the element $a$ is uniquely determined, as $a \neq a' \Longrightarrow a^{\sigma} \neq a'^{\sigma}$. From this it follows that the map defined by the formula $b^{\sigma^{-1}} = a$ is an automorphism. If we define multiplication of automorphisms as composition, then the set of all automorphisms of the field $\Delta$ equipped with this operation (multiplication) is a group which we denote by $G(\Delta)$. The unit element of $G(\Delta)$ is the automorphism $\varepsilon$ for which all elements of $\Delta$ are fixed points. We call $G(\Delta)$ the group of all automorphisms of $\Delta$; it is often denoted $\operatorname{Aut} \Delta$.
Next, let $\Delta$ be the splitting field of equation (111), i.e. the smallest field containing $P$ and all solutions $\alpha_1, \dots, \alpha_n$ of the equation $f(x) = 0$. Thus $\Delta = P(\alpha_1, \dots, \alpha_n)$. We consider the group of those automorphisms $\sigma$ in $G(\Delta)$ which leave invariant all elements of $P$, i.e. from $a \in P$ it follows that $a^{\sigma} = a$. These automorphisms $\sigma$ form a subgroup in $G(\Delta)$ which is called the Galois group of the extension $\Delta/P$ and is denoted by $G(\Delta, P)$.
REMARK 6.2. The definition given indicates a path to far reaching generalizations. Indeed, we can speak of the Galois group, not only of the splitting field of an equation, but of the Galois group $G(L, K)$ of an arbitrary extension $L/K$, where
\[ G(L, K) = \{\sigma \in \operatorname{Aut} L \mid a^{\sigma} = a \text{ for all } a \in K\}. \]
In other words, the elements of $G(L, K)$ are all the automorphisms for which all the elements of the subfield $K$ are fixed points.
LEMMA 6.3. Let $f(x) = 0$ be an algebraic equation whose coefficients are in the field $P$ and whose splitting field is $\Delta$. The Galois group of the extension $\Delta/P$ coincides with the Galois group of the equation $f(x) = 0$.
PROOF. The automorphisms $\sigma \in G(\Delta, P)$ have a remarkable property: they map any solution of $f(x) = 0$ into a solution of the same equation. Indeed, from the equality
\[ 0 = a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} + \alpha^n \]
it follows that
\[ \begin{aligned} 0 = 0^{\sigma} &= (a_0 + a_1\alpha + \cdots + a_{n-1}\alpha^{n-1} + \alpha^n)^{\sigma} \\ &= a_0^{\sigma} + a_1^{\sigma}\alpha^{\sigma} + \cdots + a_{n-1}^{\sigma}(\alpha^{n-1})^{\sigma} + (\alpha^n)^{\sigma} \\ &= a_0 + a_1\alpha^{\sigma} + \cdots + a_{n-1}(\alpha^{\sigma})^{n-1} + (\alpha^{\sigma})^n. \end{aligned} \]
Therefore we have $f(\alpha^{\sigma}) = 0$, which means that together with each solution $\alpha$ of $f(x) = 0$ also $\alpha^{\sigma}$ is a solution. On the basis of this observation it is easy to prove the lemma. Indeed, it follows from it that for each root $\alpha_i$ and an arbitrary $\sigma \in G(\Delta, P)$ there exists an index $s_i$, $1 \le s_i \le n$, such that $\alpha_i^{\sigma} = \alpha_{s_i}$. As the automorphism $\sigma$ is one-to-one and the roots $\alpha_1, \dots, \alpha_n$ are simple, for distinct indices $i$ and $j$ the indices $s_i$ and $s_j$ are distinct too. This means that
\[ \Phi(\sigma) = \begin{pmatrix} 1 & 2 & \dots & n \\ s_1 & s_2 & \dots & s_n \end{pmatrix} \]
is a substitution of degree $n$. As $\Phi(\sigma \cdot \tau) = \Phi(\sigma) \cdot \Phi(\tau)$, the map $\Phi$ is a representation (a homomorphism) of the group $G(\Delta, P)$ in the group $S_n$. The kernel $\operatorname{Ker} \Phi$ of $\Phi$ consists of all those automorphisms $\sigma$ which leave invariant all solutions, and so the whole splitting field $\Delta$. But the only such automorphism is $\varepsilon$. As the kernel $\operatorname{Ker} \Phi$ of the homomorphism $\Phi$ consists only of the identity automorphism, we may view the image $\Phi(G(\Delta, P))$ as a subgroup of $S_n$. In order to complete the proof we must show that $\Phi(G(\Delta, P)) = G(f)$. This is a simple exercise and will be left to the Reader. The Lemma is proven. □
Let us now have a look at two examples of the computation of the Galois group of an equation.
EXAMPLE 6.1. Let $Q$ be the rational field. We seek the Galois group $G(\Delta, Q) = G$ for the splitting field $\Delta/Q$ of the equation $x^4 - 2 = 0$. This equation has the solutions
\[ \alpha_1 = \alpha = \sqrt[4]{2}, \quad \alpha_2 = i\alpha, \quad \alpha_3 = -\alpha, \quad \alpha_4 = -i\alpha. \]
Therefore the rational relation $\alpha_1\alpha_3 + \alpha_2\alpha_4 = 0 \in Q$ must remain in force under the action of the elements of $G$. Thus the group $G$ either coincides with the substitution group
\[ H = \{1, (13), (24), (13)(24), (12)(34), (14)(23), (1234), (1432)\}, \]
or else is a subgroup of it. Here we used the representation of substitutions as cycles or their products. Therefore the order $|G|$ of $G$ is a divisor of 8 (theorem of Lagrange). We record that the splitting field of $f(x)$ is $\Delta = Q(\alpha, i)$. We denote by the symbol $[L : K]$ the dimension of $L$ as a vector space over $K$. It is easy to convince oneself that $[Q(\alpha) \cap Q(i) : Q] \le 2$, from which it follows in view of $i \notin Q(\alpha)$ that $[Q(\alpha) \cap Q(i) : Q] = 1$. But this again means that $Q(\alpha) \cap Q(i) = Q$. As $f(x)$ is irreducible over $Q$ (by Eisenstein’s criterion), we have $[Q(\alpha) : Q] = 4$. The minimal degree of an algebraic equation with coefficients in $Q(\alpha)$ which is satisfied by $i$ must be 2. Therefore we have $[Q(\alpha, i) : Q(\alpha)] = 2$. The equalities $[\Delta : Q(\alpha)] = 2$ and $[Q(\alpha) : Q] = 4$ show that $[\Delta : Q] = 8$.
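The upper bound $H$ for the group $G$ found above can be checked by brute force. The sketch below is an illustration under stated assumptions, not the author's computation: the roots are encoded as $\alpha_k = i^{k-1}\alpha$, so that after division by $\alpha^2$ the relation $\alpha_1\alpha_3 + \alpha_2\alpha_4 = 0$ becomes a statement about powers of $i$ alone. The names `I_EXPONENT`, `I_POWERS` and `keeps_relation` are hypothetical. The check only confirms that exactly eight substitutions preserve this one relation; it does not by itself show that $G = H$.

```python
from itertools import permutations

# Roots of x^4 - 2 written as alpha_k = i^(k-1) * alpha for k = 1..4,
# so each root is encoded by its exponent of i (an integer mod 4).
I_EXPONENT = {1: 0, 2: 1, 3: 2, 4: 3}
I_POWERS = [1, 1j, -1, -1j]  # i^0, i^1, i^2, i^3


def keeps_relation(images):
    """True if alpha_1*alpha_3 + alpha_2*alpha_4 = 0 survives the substitution.

    images[k-1] is the index of the root that replaces alpha_k.
    """
    e1, e2, e3, e4 = (I_EXPONENT[j] for j in images)
    # After dividing by alpha^2 the relation is a sum of two powers of i.
    return I_POWERS[(e1 + e3) % 4] + I_POWERS[(e2 + e4) % 4] == 0


H = [p for p in permutations((1, 2, 3, 4)) if keeps_relation(p)]
print(len(H))
for p in H:
    print(p)
```

In this encoding the tuple `(2, 3, 4, 1)` stands for the cycle $(1234)$ and `(3, 2, 1, 4)` for the transposition $(13)$.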
In view of the Galois correspondence (cf. Subsection 4 [below]) we obtain the equality $|G| = [\Delta : Q] = 8$. Therefore the group sought must coincide with the substitution group $H$.
EXAMPLE 6.2. To find the Galois group $G(\Delta, Q(i)) = G$ for the [same equation] $x^4 - 2 = 0$ but for the splitting field $\Delta/Q(i)$. Here we have the equalities
\[ \frac{\alpha_2}{\alpha_1} = \frac{\alpha_3}{\alpha_2} = \frac{\alpha_4}{\alpha_3} = \frac{\alpha_1}{\alpha_4} = i \in Q(i), \]
so these rational expressions $\alpha_k/\alpha_i$ must be invariants of the group $G$. Hence,
\[ G = \{1, (1234), (13)(24), (1432)\}. \]
3. Let us now determine the Galois group of the general algebraic equation
\[ a_0 + a_1 x + \cdots + a_{n-1}x^{n-1} + x^n = 0. \]
To this end we consider the extension $\Delta/R$, where $R$ stands for the field of rational functions $P(a_0, \dots, a_{n-1})$ and $\Delta$ is the root field of the general equation. In view of Lemma 6.3, established in the previous subsection, the problem is equivalent to finding the Galois group of $\Delta/R$.
LEMMA 6.4. The Galois group $G(\Delta, R)$ of the extension $\Delta/R$ is isomorphic to the complete symmetric group $S_n$.
PROOF. As the solutions $\alpha_1, \dots, \alpha_n$ of the equation are all simple, each $\sigma \in G(\Delta, R)$ determines via the formulae $\alpha_k^{\sigma} = \alpha_{i_k}$, $k = 1, 2, \dots, n$, a substitution
\[ S_{\sigma} = \begin{pmatrix} \alpha_1 & \alpha_2 & \dots & \alpha_n \\ \alpha_{i_1} & \alpha_{i_2} & \dots & \alpha_{i_n} \end{pmatrix}. \]
This can also be read as
\[ S_{\sigma} = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix}. \]
In the proof of Lemma 6.3 we showed that the map $\mu : G(\Delta, R) \to S_n$, given by the formula $\mu(\sigma) = S_{\sigma}$, is a monomorphism. In order to prove the lemma at hand it therefore suffices to prove that $\mu$ is an epimorphism. For the proof of the last statement we associate to each substitution
\[ S_{\sigma} = \begin{pmatrix} 1 & 2 & \dots & n \\ i_1 & i_2 & \dots & i_n \end{pmatrix} \]
a transformation $\sigma$ of the splitting field $\Delta$ given by the formula
\[ \left( \frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)} \right)^{\sigma} = \frac{f(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n})}{g(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n})}. \tag{113} \]
In order to show that $\mu$ is an epimorphism we must convince ourselves that for each transformation $\sigma$ there holds the relation $\sigma \in G(\Delta, R)$. Let us first verify that $\sigma$ is bijective. That $\sigma$ is injective follows at once from (113) if we take account of the fact that each element in $\Delta$ can be represented in a unique way in the form $f/g$. Moreover, $\sigma$ is bijective since the transformation $\sigma^{-1} : \Delta \to \Delta$, given by (113) in the case of the substitution $S^{-1}$, has the property that $\sigma \cdot \sigma^{-1}$ is the identity.
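Formula (113) can be tried out on a small case. The following Python sketch is an illustration only and rests on an ad hoc representation chosen here: a polynomial in $\alpha_1, \dots, \alpha_n$ is a dictionary from exponent tuples to rational coefficients, and a fraction is a pair (numerator, denominator). The functions `act`, `act_on_fraction` and `inverse` are hypothetical names. The sketch checks, for $n = 3$, that the transformation belonging to $S^{-1}$ undoes the one belonging to $S$.

```python
from fractions import Fraction
from itertools import permutations


def act(perm, poly):
    """Apply the substitution alpha_k -> alpha_{perm[k]} to a polynomial.

    poly maps exponent tuples to coefficients; perm is a tuple on 0..n-1.
    """
    result = {}
    for exponents, coeff in poly.items():
        new_exponents = [0] * len(exponents)
        for k, e in enumerate(exponents):
            new_exponents[perm[k]] += e  # alpha_k^e becomes alpha_{perm[k]}^e
        key = tuple(new_exponents)
        result[key] = result.get(key, 0) + coeff
    return result


def act_on_fraction(perm, fraction):
    """Formula (113): act on numerator and denominator separately."""
    f, g = fraction
    return act(perm, f), act(perm, g)


def inverse(perm):
    """Inverse substitution of perm."""
    inv = [0] * len(perm)
    for k, image in enumerate(perm):
        inv[image] = k
    return tuple(inv)


# Example with n = 3: the fraction (alpha_1 + 2*alpha_2^2) / (alpha_1*alpha_3).
f = {(1, 0, 0): Fraction(1), (0, 2, 0): Fraction(2)}
g = {(1, 0, 1): Fraction(1)}
S = (1, 2, 0)  # alpha_1 -> alpha_2 -> alpha_3 -> alpha_1 (0-based indices)

moved = act_on_fraction(S, (f, g))
restored = act_on_fraction(inverse(S), moved)
# The same round trip for every substitution of degree 3.
all_restored = all(
    act_on_fraction(inverse(p), act_on_fraction(p, (f, g))) == (f, g)
    for p in permutations(range(3))
)
print(moved)
print(restored == (f, g), all_restored)
```

Here `moved` represents the fraction $(\alpha_2 + 2\alpha_3^2)/(\alpha_1\alpha_2)$, and applying the inverse substitution recovers the original pair.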
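We show that the transformations $\sigma$ under consideration are automorphisms of $\Delta$. Indeed, using the notation $f(\alpha_{i_1}, \alpha_{i_2}, \dots, \alpha_{i_n}) = f^S(\alpha_1, \alpha_2, \dots, \alpha_n) = f^S$, we see that we have the equations
\[ \begin{aligned} \left( \frac{f_1}{g_1} + \frac{f_2}{g_2} \right)^{\sigma} &= \left( \frac{f_1 g_2 + f_2 g_1}{g_1 g_2} \right)^{\sigma} = \frac{(f_1 g_2 + f_2 g_1)^S}{(g_1 g_2)^S} = \frac{f_1^S g_2^S + f_2^S g_1^S}{g_1^S \cdot g_2^S} \\ &= \frac{f_1^S}{g_1^S} + \frac{f_2^S}{g_2^S} = \left( \frac{f_1}{g_1} \right)^{\sigma} + \left( \frac{f_2}{g_2} \right)^{\sigma} \end{aligned} \]
and
\[ \begin{aligned} \left( \frac{f_1}{g_1} \cdot \frac{f_2}{g_2} \right)^{\sigma} &= \left( \frac{f_1 f_2}{g_1 g_2} \right)^{\sigma} = \frac{(f_1 f_2)^S}{(g_1 g_2)^S} = \frac{f_1^S \cdot f_2^S}{g_1^S \cdot g_2^S} \\ &= \frac{f_1^S}{g_1^S} \cdot \frac{f_2^S}{g_2^S} = \left( \frac{f_1}{g_1} \right)^{\sigma} \cdot \left( \frac{f_2}{g_2} \right)^{\sigma}. \end{aligned} \]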
These equations and the fact that the transformation $\sigma$ is bijective show the relation $\sigma \in \operatorname{Aut} \Delta$. It remains to convince oneself that for each $a \in R$ it holds $a^{\sigma} = a$. In other words, we have to show that every element of the subfield $R \subset \Delta$ is invariant under all the transformations given by (113). More exactly, for any element
\[ a = \frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)} = A(\alpha_1, \alpha_2, \dots, \alpha_n) \in R \]
and each substitution $S \in S_n$ we have to verify the relation
\[ A^S(\alpha_1, \alpha_2, \dots, \alpha_n) \equiv A(\alpha_1, \alpha_2, \dots, \alpha_n). \]
Thus we have to check that the rational function $A(\alpha_1, \alpha_2, \dots, \alpha_n)$ is symmetric. We show that this is in fact so. Let
\[ a = A(\alpha_1, \alpha_2, \dots, \alpha_n) = \frac{f(\alpha_1, \alpha_2, \dots, \alpha_n)}{g(\alpha_1, \alpha_2, \dots, \alpha_n)} \]
be an element of $\Delta$ belonging to the subfield $R$. Then it can be presented in the form
\[ a = \frac{\bar f(a_0, a_1, \dots, a_{n-1})}{\bar g(a_0, a_1, \dots, a_{n-1})}, \]
where $\bar f$ and $\bar g$ are polynomials with coefficients in $P$. Using Viète’s formulae
\[ a_{n-i} = (-1)^i \sigma_i(\alpha_1, \alpha_2, \dots, \alpha_n), \qquad i = 1, \dots, n, \]
where $\sigma_i$ denotes the $i$-th elementary symmetric polynomial, we find that the function $A(\alpha_1, \alpha_2, \dots, \alpha_n)$ is symmetric. So we have proved that $\sigma \in G(\Delta, R)$, which completes the proof. □
4. Let $K$ be a field and $G$ a finite group of automorphisms of this field. Thus $G \subset \operatorname{Aut} K$. We consider all those elements of $K$ which are “conservative” with respect to the group $G$, i.e. we consider the following subset of the field $K$:
\[ K^G = \{a \in K \text{ such that for all } \sigma \in G \text{ holds } a^{\sigma} = a\}. \]
A check shows that $K^G$ is a field. The fundamental theorem of Galois theory can now be formulated in the following manner.
THEOREM 6.5. It is possible to establish a 1:1 correspondence $\tau : L_i \longleftrightarrow H_{\tau(i)}$ between, on the one hand, the extensions $L$ contained in $K$ and containing the subfield $K^G$, and, on the other hand, the subgroups $H$ of $G$, such that it follows from $L_i \supseteq L_j$ that
$H_{\tau(i)} \subseteq H_{\tau(j)}$. Thereby, the degree $[K : L_i]$ of the extension $K/L_i$ equals the number of elements in the subgroup $H_{\tau(i)}$. The correspondence under view is $\tau : H \leftrightarrow K^H$, where $K^H = \{a \in K \text{ such that for each } \sigma \in H \text{ holds } a^{\sigma} = a\}$.
In order to illustrate the above we give the following scheme:
\[ \begin{array}{ccccc} (1) & \subset & H & \subset & G \\ \updownarrow \tau & & \updownarrow \tau & & \updownarrow \tau \\ K & \supset & L = K^H & \supset & K^G \end{array} \]
At the same time it is often not very easy to survey the structure of the extension $K/K^G$ by “direct” means.
Let us give two applications of this correspondence. First, we consider an algebraic equation $f(x) = 0$ without multiple solutions and with coefficients in a field $P$. The splitting field of this equation will be written $\Delta$. By Lemma 6.3 the Galois group of $f(x) = 0$ is isomorphic to the Galois group of the extension $\Delta/P$. Taking $K = \Delta$ and $G = G(\Delta, P)$ we obtain the relation $K^G = P$. We see that in order to study the properties of the extension $\Delta/P$ (and thereby also the equation $f(x) = 0$) one can use the Galois correspondence discussed above. The result of these considerations is
THEOREM 6.6 (Galois’ criterion). For an equation $f(x) = 0$ to be solvable by radicals it is necessary and sufficient that its Galois group is solvable.
An immediate consequence of Galois’ criterion and Lemma 6.4 is the following fundamental result.
THEOREM 6.7 (Abel-Ruffini). The general n-th order algebraic equation is not solvable by radicals for $n \ge 5$.
Indeed, by Lemma 6.4 the general n-th order algebraic equation has as Galois group the complete symmetric group $S_n$. But the groups $S_n$, $n \ge 5$, are not solvable, so the assertion of the theorem follows from Galois’ criterion.
Second, let $K$ be a field of algebraic numbers, i.e. a finite extension of the field of rational numbers $Q$, and let $[K : Q] = n$. According to Kronecker’s theorem there exists a complex number $\theta \in K$ such that each $k \in K$ can uniquely be expressed in the form
\[ k = k_0 + k_1\theta + \cdots + k_{n-1}\theta^{n-1}, \qquad k_i \in Q. \]
Thus $K = Q(\theta)$. As $1, \theta, \theta^2, \dots, \theta^n$ are linearly dependent over $Q$ (because $\dim K = [K : Q] = n$), there exists a polynomial $p(x)$ with rational coefficients such that $p(\theta) = 0$. One can check that, up to a constant factor, there is no other polynomial of degree at most $n$ with the same property. Thus, dividing $p(x)$ by the coefficient of $x^n$ (normalizing $p(x)$), we obtain for $\theta$ the so-called minimal polynomial $\bar p(x)$, which is uniquely determined by $p(x)$. According to the “fundamental theorem of algebra” an algebraic equation of the n-th order has $n$ solutions; let the solutions of $\bar p(x) = 0$ besides $\theta_0 = \theta$ further be $\theta_1, \dots, \theta_{n-1}$. The maps $\kappa_i : K \to K$ given by the equations
\[ k^{\kappa_i} = k_0 + k_1\theta_i + \cdots + k_{n-1}\theta_i^{n-1}, \qquad i = 0, 1, \dots, n - 1, \]
turn out to be automorphisms (provided that all the $\theta_i$ lie in $K$). The set $\{\kappa_0 = \varepsilon, \kappa_1, \dots, \kappa_{n-1}\}$ is a group which we can view as the group $G$. As $K^G = Q$, one can obtain from the structure of $G$ valuable information about the structure of the field of algebraic numbers $K/Q$.
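The group-theoretic fact used in the first application, namely that $S_n$ is not solvable for $n \ge 5$ (Theorem 6.7), can be observed on small cases by computing derived series. The Python sketch below is an illustration, not a proof for general $n$ and not part of the original text. It assumes the usual characterization that a finite group is solvable exactly when its derived series (iterated commutator subgroups) reaches the trivial group; all function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Product of substitutions on 0..n-1: first q, then p."""
    return tuple(p[q[i]] for i in range(len(p)))


def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def generated_subgroup(generators, n):
    """Subgroup of S_n generated by the given substitutions.

    In a finite group, closing under products of generators suffices.
    """
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for g in frontier:
            for s in generators:
                h = compose(g, s)
                if h not in group:
                    group.add(h)
                    next_frontier.append(h)
        frontier = next_frontier
    return group


def derived_subgroup(group, n):
    """Subgroup generated by all commutators a*b*a^-1*b^-1."""
    commutators = {
        compose(compose(a, b), compose(inverse(a), inverse(b)))
        for a in group
        for b in group
    }
    return generated_subgroup(commutators, n)


def derived_series_orders(n):
    """Orders of the groups in the derived series of S_n, until it stabilizes."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        smaller = derived_subgroup(group, n)
        if len(smaller) == len(group):
            return orders  # no further descent is possible
        group = smaller
        orders.append(len(group))


for n in (3, 4, 5):
    orders = derived_series_orders(n)
    print(n, orders, "solvable" if orders[-1] == 1 else "not solvable")
```

For $n = 4$ the orders descend as 24, 12, 4, 1, whereas for $n = 5$ the series stops at order 60 (the alternating group) and never reaches the trivial group.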
5. Today the central problem in field theory is the classification and description of all (algebraic) extensions. The so-called Galois inversion problem belongs here. Given a ground field, the question is to find all extensions which have a given group as Galois group. For example, in the case of finite fields it has been known since the times of Galois that their finite extensions are cyclic, i.e. they possess a cyclic Galois group, and there exists precisely one extension of a given degree. In the general case the Galois inversion problem is still far from completely solved.
The Galois inversion problem in its classical form was already known to Niels Henrik Abel. This question can be stated in several distinct forms.
A. Given a group, find an algebraic equation having the given group as Galois group.
B. Find a method for determining all equations with a given Galois group.
C. Given a group, find the general form of the coefficients of those algebraic equations having the given group as Galois group.
Basic among these questions is C, because from its solution one can derive also the solutions of A and B. Is problem C always solvable? Emmy Noether showed that the answer is affirmative if the following (Lüroth’s) conjecture is true. Let us consider the field of rational functions $P(x_1, \dots, x_n)$ over a ground field $P$. It is easy to find an “elementary series” of subfields in this field. To this end take $m$ ($\le n$) algebraically independent⁷⁸ elements $y_1, \dots, y_m$ in the field $P(x_1, \dots, x_n)$ and consider the subfield $P(y_1, \dots, y_m) \subset P(x_1, \dots, x_n)$. Here $P(y_1, \dots, y_m)$ denotes the smallest extension of $P$ containing $y_1, \dots, y_m$. One can show that the field $P(y_1, \dots, y_m)$ is isomorphic to $P(x_1, \dots, x_m)$. Lüroth’s question [9] is whether the series of subfields $\{P(x_1, \dots, x_m),\ m \le n\}$ obtained in this way exhausts all subfields of $P(x_1, \dots, x_n)$ (up to isomorphism).
⁷⁸ That is, there exists no polynomial $f \neq 0$ in $m$ variables and with coefficients in $P$ such that $f(y_1, \dots, y_m) = 0$.
Using notions and facts about algebraic curves Lüroth proved this assertion for $n = 1$. A simplified, purely algebraic proof was given by E. Netto already in 1895. G. Castelnuovo proved Lüroth’s conjecture for $n = 2$ in 1894, using deep results about algebraic surfaces. In 1908 G. Fano thought that he had found a counter-example to Lüroth’s conjecture for $n = 3$, but later essential gaps and shortcomings were found in his argument. In the following decades many attempts were made to prove Lüroth’s conjecture in general, but it turned out that the problem was exceedingly difficult. Adding a series of original ideas and new technical devices, Yu. Manin and V. Iskovskih succeeded comparatively recently (in 1971) in saving Fano’s main ideas [6]. This allowed them to conclude that the answer to Lüroth’s question was in general negative. Using analytic methods, Ph. Griffiths and Ch. Clemens reached about the same result almost simultaneously [1].
Let us also dwell on a special case of the Galois inversion problem. Let us have a look at the Abelian extensions of the field $Q$, that is, extensions of $Q$ the Galois group of which is Abelian. More exactly, an extension $L/Q$ is called Abelian if the group $G(L, Q)$ is Abelian. The Kronecker-Weber theorem states that each Abelian extension of $Q$ is contained in a suitable field of the type $Q(\sqrt[n]{1})$; such fields are also called cyclotomic fields. This assertion contains useful information about the structure of Abelian extensions of $Q$. The attempts to generalize the Kronecker-Weber theorem have led to a series of different proofs. Hilbert raised the question to study Abelian extensions of the fields $Q(\sqrt{-d})$, where $d$ is a square free integer, and, likewise, to study Abelian extensions of arbitrary algebraic number fields (the 12-th problem of Hilbert). Here we can see an attempt to find, for a given algebraic number field $K$, an “elementary series” of Abelian extensions, containing all other Abelian extensions of $K$. In 1923 Helmut Hasse solved the first part of Hilbert’s problem. A generalization of the Kronecker-Weber theorem to arbitrary algebraic number fields was given, in 1961, by G. Shimura and Y. Taniyama.
Progress has also been made in the study of non-Abelian extensions. So far the main result is the following: for each solvable group $G$ and an arbitrary algebraic number field $K$ there exists an algebraic extension $L/K$ such that the Galois group $G(L, K)$ is isomorphic to $G$. One has found a method for determining all such extensions. However, in the general case, the solution of the Galois inversion problem is still far from complete. Often it is not even clear how one should pose the questions in a correct way and in which terms to seek their solution.
6.2. The duality principle in mathematics
1. The notion of morphism (isomorphism, homomorphism etc.) has become one of the basic concepts of mathematics. This was caused by the development of the notions of similarity and equivalence, both tightly related with the notion of morphism. An exact determination of the notion of similarity was first given by G. W. Leibniz. Two objects are called similar if they cannot be distinguished from each other, each considered by itself, while each possible property belonging to one of the objects also belongs to the other. The best illustration of the importance of the notion of morphism is the isomorphism, discovered by R. Descartes, between usual plane geometry and the Euclidean plane (viewed as the set of pairs of numbers (x, y)), obtained by the introduction of coordinates in the former. The fertility of this idea can be seen from the ease with which one nowadays can give an answer to the question of the trisection of the angle, enormously difficult to the ancients.⁷⁹
Scientists began to realize the importance of the notion of morphism in mathematics only in the 19th century in connection with a new step in the development of geometry. At the time, mathematicians were already used to passing from one theory to another just by changing the terminology. The set of concrete “models” in general mathematical theory began to grow. This was brought out in full relief in the development of projective geometry. According to the habits of the time one presented side by side in two parallel columns “dual” theorems. Let us recall that “duality” on the projective level consists of the fact that in the theorems of this geometry it is possible to interchange the notions “point” and “line”. Let us also note that it was precisely the attempt to establish the “existence” of the geometries of Lobachevskiĭ and Riemann by finding for them geometrical models that fortified the right to live of these new geometries.
Duality appears here as a correspondence between, on the one hand, the assertions of the “abstract” theory and, on the other hand, the properties of the more “concrete” mathematical objects. The duality in projective geometry considered is just one example of numerous duality theorems⁸⁰ in mathematics, which all rely on the common principle of finding an isomorphism between different (mathematical) categories. The duality found then allows one to carry properties of an object automatically over to the dual ones, which have sometimes been investigated directly for centuries in order to find these properties. The most striking example here would be Galois theory.
⁷⁹ For more detail see Yu. I. Manin’s article [10].
⁸⁰ The duality in vector spaces, the duality between open and closed sets in topology, Pontryagin duality for Abelian topological groups, the Poincaré duality between homology and cohomology in algebraic topology.
S. Lie noticed that for every differential equation there is a group of (continuous) transformations of variables that do not change the equation. The knowledge of the structure of this group permits one to draw several conclusions about the solutions of the equation. Such a point of view is of a special importance in differential geometry. The first to understand this clearly was Felix Klein. The main idea of his “Erlanger Programm” (1872) was to classify the properties of geometrical objects according to the mappings with respect to which they were isomorphic. The realization of this idea was a major step forward in the development of the idea of morphism. However, such a development had also a negative side. With indignation, M. Chasles says:
Now everybody may take a known fact and just by applying various general transformation principles to it arrive at new truths, which differ from the original one and generalize it. These in turn may be treated in the same way, and so one can multiply indefinitely the number of new truths, obtained from one and the same original.
Despite its universality Klein’s program did exclude the direction of development in geometry which derives from B. Riemann’s lecture, but this also has deep connections with group theory. These directions were developed in the 20th century and led to the study of Riemann manifolds with the aid of their holonomy groups.⁸¹
2. Klein’s idea found a fruitful application in physics, based on symmetry considerations. Symmetry expresses a certain order, proportionality and coherence between the parts of the whole. Already Pierre Curie pointed to the need of using symmetry in physics [2, p.
393.]:
I believe that in the study of physical processes it would be of interest to bring in considerations of symmetry, which are used with such great success in crystallography.
⁸¹ Cf. [8]. Editors’ note. For the notion of holonomy in general, in the context of fibre bundles, we may refer to the article by G. I. Laptev in Vol. IV, page 443 of the Encyclopaedia of Mathematics, mentioned in Section 1 footnote 9. In the case of Riemannian manifolds the concept is briefly referred to in the preamble to Chap. IV of Helgason’s book [5]. In this case this amounts to the following: Let $M$ be a Riemann manifold. If $o$ is a point of $M$ and $v$ a tangent vector at $o$, then on going around a loop $L$ issuing from $o$ the vector $v$ is under parallel transport replaced by $\tau_L(v)$, where $\tau_L$ is a certain linear operator. All such operators form a Lie group called the holonomy group of $M$ at $o$.
Physicists often use results which derive from symmetry, but usually they do not make precise the notion of symmetry, because very often this appears to be given a priori, almost obviously. The homogeneity and isotropy properties of space have been known since ancient times: they express the symmetry of space with respect to the group of motions of the space. The latter consists of the distance preserving selfmaps of Euclidean space, the algebraic operation in it being composition. The discrete subgroups of this continuous group describe the symmetries of various crystals. One of the problems of theoretical physics has also been the uncovering of a transformation group with sufficiently many (continuous) invariants, allowing one to interpret data found by measurement and experiments in terms of conservation laws. The invariance of the laws of physics with respect to the group of Lorentz transformations is reflected by the principle of relativity, first formulated by H. Poincaré.⁸² Especially rich in symmetries is the micro-cosmos.⁸³ The mathematical apparatus created by Sophus Lie has had a special importance in the discovery of new properties of elementary particles in physics. This is explained by the following circumstance. If some physical theory expresses its experimental results with the help of differential equations, then it may be that the physical content harbored in these equations is much wider and covers a larger range of experiments than that from which the equations were derived. In connection with the discovery of electromagnetic radiation in Maxwell’s equations, Heinrich Hertz says strikingly:
One cannot avoid the feeling that the mathematical equations have an existence independent of us, their own consciousness, that they are more clever than we, because we can obtain from them more than we initially invested in them.
A good illustration of what was said is also the discovery of antimatter. It was observed that if a particle is described by the Dirac equation then it can be in two different states of charge. That the electron was described by the Dirac equation was known. Therefore the existence of the positron was predicted, which later also found an experimental verification. From here it was inferred that every particle has an antiparticle. In elementary particle theory two kinds of symmetry are present. Some of them are connected with a subgroup of the group of space-time transformations (that is, the Lorentz group). Others, so-called inner symmetries, appear in the study of special unitary groups and reflect the “inner” properties of particles.⁸⁴ But nature is not throughout only symmetry! The irreversibility of thermodynamical processes, the violation of the laws of parity, time reversal, and charge conjugation for particles in the case of weak interaction: all this speaks of asymmetry.
Also the organic world abounds in them. Here we may speak of an interchange of strata of symmetry and asymmetry, of their levels. Symmetry manifests itself now in the organization of asymmetrical elements and in doing this reflects their striving for development.
⁸² See the articles “The principle of relativity” and “Professor H. A. Lorentz as a researcher” in the book [4].
⁸³ An exhaustive presentation of the questions to be treated below can be found in [12]. Translator’s note. See also e.g. [3].
⁸⁴ Translator’s note. One example is SU(2)-symmetry, which explains the similarity in the behavior of the proton and the neutron; recall that SU(2) is the group of unitary 2 × 2 matrices of determinant 1.
3. The world surrounding us is characterized by its structure, the granularity of this structure and the relative independence of the structures. This gives one the possibility to distinguish single structures with the purpose of learning to know them more distinctly. The similarity of differing structures is in mathematics reflected by the notion of morphism, that is, similarity on the level of a certain theory. Such an approach makes axiomatic methods expedient. The use of this method in mathematics was initiated in the work of M. Pasch, D. Hilbert and E. Steinitz. The wider acceptance of this method did not go without pain. For example, Felix Klein was from the outset rather sceptical towards this “axiomatic mathematics”, as he saw here an assault on intuition and imagination, that is, the truly productive elements in the process of creation. The undoubtedly most outstanding achievement of this direction is the appearance of the theory of algebraic structures in the 1920s (on the basis of the set theory of Cantor). A further step on this road was the unfolding of the notion of mathematical structure and, closely allied with this, the appearance of the complete notion of morphism, which gradually became a tool for reshaping all of mathematics.
It is without doubt true that mathematics starts with number. In all ages number has been the soul of mathematics, because its level of development is tightly connected with the possibility of applying mathematics in other sciences. But the development and spread of “quantitative” methods has always called for a completion and development of the “qualitative” methods, because the latter enrich the art of calculation with new forms and promote the organizing part in the development of metrical (numerical) mathematics.⁸⁵
4. The interaction of these two trends in mathematics, the applied and the theoretical ones, has been a constant source of stimulation in the work of many mathematicians. For instance, Felix Klein estimates the work of C. F. Gauss in applied mathematics as follows:
Gauss obtained the stimulation for this work outside mathematics. But then, in the posing of the problems and in their solution, there appears a special creational power and experience, which he could develop in himself only by solving problems of “pure” mathematics. It manifests itself also in the principle not to count as done such problems where there still remains something to be done.
The interaction of these tendencies was brought into even greater relief in the work of J.-L. Lagrange. Lagrange’s mathematical style is characterized by an unusual consistency, a desire to solve this or that problem to the very end. However, his most famous book is his Mécanique analytique (Analytical Mechanics), which appeared in 1788.
Here the various principles for the solution of mechanical problems found up to that time are treated from a general point of view; the relations between these principles and their dependence on each other are shown, as well as the limits of their applicability. In this treatise Mechanics has become part of Mathematical Analysis: one does not stop at the narrow special cases of the problems of mechanics, but brings to the foreground the steps that are necessary in the solution of the problems under view. In the course of the following centuries this has been the point of departure, the foundation and the source of many theories in the various branches of applied mechanics. These methods have been applicable in the design of ships’ screws, as well as in the study of the oscillation of ships, in the creation of gyrocompasses, in the computation of the trajectories of shells, in the design of railway bridges, and also in the investigation of the motion of celestial bodies.86

One has compared Mathematics with a big city, in the outskirts of which a lively activity of construction takes place, where new districts and new blocks rise, where the air is cleaner and where the youth crowds, bringing new force and stimulus to the city. The birth of these new quarters is an inevitable necessity in the development of a big city, the exigency of its life. At the same time there is going on a no less intensive and extensive building activity downtown. Streets are reconstructed and widened, new up-to-date houses rise, in order to adapt the life of the big city to the new needs and requirements of life. We have a truly expedient and beautiful city only when these two tendencies in its development are in good harmony.

85 Everybody knows the role played today by functional analysis in the development of numerical methods (and thereby also in widening the possibilities for using the computer).
86 One can read more details about this in the paper [7]. A more contemporary illustration of what we have said is the work of J. von Neumann. The reader can become acquainted with his views on the development of mathematics, and on the balance of empirical and aesthetic considerations in it, in the paper “The Mathematician” contained in the book [11, p. 1–9].
Comments. The basic tenet of Galois theory, namely the correspondence between field extensions and groups, can be succinctly stated and easily proved, and is being offered as fare to undergraduates all around the world. In its basics it is a finished theory and nothing can be added to or subtracted from it. It involved a conceptual leap from the notion of a root of a polynomial to the notions of groups and fields, nowadays forming cornerstones of modern mathematics, which would be inconceivable without them. But Galois theory is not a dead subject, as Kaljulaid points out. Once you start to apply it to specific situations, its subtlety becomes apparent. One fundamental example is the classification of Abelian extensions (Abelian Galois groups) of the rational numbers as subfields of cyclotomic extensions (obtained by adjoining roots of x^n = 1). A surprisingly intractable problem is the inverse Galois problem over the rationals, namely to characterize the finite groups that can occur as Galois groups of number fields (i.e. finite extensions of the rationals). This may seem to be a somewhat artificial problem, although Kaljulaid is obviously fascinated with it and its ramifications; but natural applications of Galois theory abound wherever fields pop up, and thus it is an indispensable tool of the algebraic geometer or number theorist. Fields and groups are very different things, although in standard introductory courses both tend to be treated on an equal footing as examples of algebraic structures. Groups are more fundamental, and they worked on the human imagination long before being recognized and identified as independent entities. The crucial concept is symmetry, the instinctive feeling that two different things are really the same, and that there is no way of intrinsically distinguishing between them. One example, the embryo of Galois theory, is to ask which one is ‘i’ (the square root of −1): ‘i’ or ‘−i’? Is the question meaningful?
The outcome of the question is complex conjugation, constantly used even by people innocent of Galois theory. Kaljulaid speaks somewhat confusingly about duality and isomorphisms. The two things are really different. The classical example of duality is the correspondence between lines and points in the projective plane (a phenomenon so attractive that it literally forced the invention of the projective plane itself). Here there is some kind of symmetry. Isomorphism is something different and more general. In duality there is no natural isomorphism; to achieve one you need to specify and make a choice, and thus destroy the simplicity of the situation. The concept of isomorphism is far more far-reaching, and the concept of auto-isomorphism (automorphism) makes exact the vaguer notion of symmetry and hence allows the introduction of composition and of groups. As Kaljulaid rightly points out, it is Felix Klein’s vision of groups of symmetries being at the basis of geometrical classification (the oft referred to Erlangen program) that elevated the notion. It is of course a great unifying principle, seductive in its elegance, but it does not tell the whole story. Fundamental physics should really be thought of as enhanced geometry, as Einstein’s general relativity so eloquently bears witness to, and it is in theoretical physics that Klein’s Erlangen program has really struck its deepest roots. Nowadays groups of symmetries play fundamental roles in describing different physical phenomena, and from a philosophical point of view it is tempting to see in groups the deeper reality that Plato postulated, and whose various manifestations make up the material world. (One amusing example may be the aptly named Platonic solids as mere aspects of their underlying symmetry groups.) This mathematical view of the world is really nothing but the modern sophisticated version of ancient number mysticism.
Groups and mathematical formulas seem to rule the physical world, and some physicists, notably Dirac, set greater store by an elegant mathematical formula than by an ugly one empirically borne out, only to be eventually vindicated! Why is there a world? Mathematical formulas of the vacuum predict its spontaneous emergence. Thus in a sense, mathematics is God, existing even before nothing. In the most ambitious effort so far to fundamentally understand physics, optimistically referred to as TOE (Theory of Everything), empirical testing is no longer feasible, and the only source of corroboration and inspiration is mathematical beauty. Indeed Kaljulaid takes the ideas of Galois theory to the most exalted end.

Ulf Persson
References
[1] Ch. Clemens and Ph. Griffiths. The intermediate Jacobian of the cubic threefold. Ann. of Math. 95 (2), 1972, 281–356.
[2] P. Curie. Sur la symétrie dans les phénomènes physiques. J. Phys. 3 (3), 1894.
[3] J. P. Elliott and P. G. Dawber. Symmetry in Physics I–II. Vol. 1. MacMillan Press Ltd., Vol. 2. Clarendon Press, Oxford Univ. Press, London, New York, 1979. Russian Translation: “Mir”, Moscow, 1983.
[4] P. Ehrenfest. Relativity. Quanta. Statistics. Nauka, Moscow, 1972.
[5] S. Helgason. Differential geometry and symmetric spaces. Academic Press, New York, London, 1962.
[6] V. A. Iskovskih and Yu. I. Manin. Three-dimensional quartics and counterexamples to the Lüroth problem. Mat. Sbornik (N. S.) 86 (128), 1971, 140–166.
[7] A. N. Krylov. Joseph Louis Lagrange. Uspehi Mat. Nauk 2, 1936, 3–16.
[8] Ü. Lumiste. The notion of space in geometry. Geometry and transformation groups. Math. and Our Age 14, 1968, 3–21.
[9] J. Lüroth. Beweis eines Satzes über rationale Curven. Math. Ann. 9, 1876, 163–165.
[10] Yu. I. Manin. On the solvability of the problem of construction with ruler and compass. In: Encyclopedia of Elementary Mathematics, Vol. 4. FizMatGIZ, Moscow, 1963, 205–227.
[11] J. von Neumann. Collected works. Vol. 1. Pergamon Press, New York, 1961.
[12] H. Õiglane. Chapters from Theoretical Physics, I–II. Tartu University Press, Tartu, 1965, 1967.
7. [K75b] Theory of automata. Coauthor E. Tamme
To the memory of Rein Tammeste87

I was asked to speak on “possible future developments, give conjectures, and speculate about futures advances”. To do so is always hazardous, if not foolhardy; it may be possible that right in this very moment a graduate student is busily at work on a theorem that might change present trends drastically.
I. M. Singer, Future extensions of index theory and elliptic operators, Ann. of Math. Studies, 70 (1971), 171–185
7.1. Some points of view in analytic Cybernetics

1. Since remote times man has tried to put the forces of Nature under his will. The study of the secrets of Nature has led to the discovery of the laws of Nature. Every such step forward has been followed by a shift in the development of Engineering: new machines have been designed which, using the conquered forces of Nature, have enlarged the physical powers of man. At the same time man has sought paths and means for making the action of the human brain more powerful. In the 1940’s this question became especially pressing owing to the necessity of carrying out extraordinarily complicated and bulky computations. Electronic computers were invented.88 The use of computers becomes necessary in ever more numerous new domains of practical and intellectual activity, and existing computers cannot always satisfy these needs, either in quantity or in quality.89 A return from the computer to the abacus is as unthinkable as one from electricity to candle light. As the technology improves, the intervention of man in controlling the work of the machines (automated factories; automatic control and test devices; autopilots; improved military technology etc.) is often abandoned.

87 Rein Tammeste was a gifted young Estonian mathematician. Born on the island of Hiiumaa (Dagö) on January 19, 1939, he graduated from Tartu University in 1960. He was the first one in Estonia to investigate notions connected with random variables in complex Hilbert space. These results were set forth in the book “Probabilities in Hilbert spaces” (in Estonian), and in a thesis, written in Russian, defended in Tartu in 1971. He also published research on the axiomatic generalization of information and entropy. Rein Tammeste died at the age of 34, on August 13, 1973, while descending Mount Elbrus. An obituary of him was published in Math. and Our Age 20 (1975), 132–135, and his thesis is discussed in Math. and Our Age 19 (1973), 122 (both in Estonian). E. Tamme
88 This work began in 1943, when, under the direction of John W. Mauchly, the first computer project ENIAC was started in practice. The computer was completed in 1946. Editors’ Note. In the 1940s, several computer projects were carried on simultaneously in different countries. For example, the computer Colossus (UK) was working already in 1944 and Z3 (Germany, Konrad Zuse) in 1943.
89 Interesting material about the balance between reality and illusion can be found in the book [2].

A need to create
more perfect and more powerful computers and automation devices enters the agenda. To succeed in this it is, undoubtedly, essential to know whether the goal set can be reached by improving or enlarging the existing automata. Otherwise, one must hope that it will be possible to understand more deeply the principles of the functioning of the brain and to develop formal models of it that could be technically implemented (see [1]). The mathematician is here mainly interested in the following question: will it be possible to give, on the basis of existing mathematics, an adequate description of the laws which govern the world of complicated automata?

2. In dealing with distinct (control) systems their structural similarity often becomes apparent, that is, the analogy between many elements and their mutual relations, but also a functional similarity, that is, a similar behavior of the systems in analogous situations. This makes it possible to create a general (axiomatic) theory for classes of control systems, in which the concrete systems are viewed as representatives of a class. In the axiomatization of the behavior of elements one has to assume that the elements can be regarded as “black boxes”, the inner structure of which need not be known, but which react in a certain determined way to an exactly determined exterior excitation. However, the axiomatics of such a theory requires, from time to time, improvements, especially after changes in our knowledge of the physico-chemical nature of the elements and their properties. The principal task is then to find suitable notions and methods for the study of the structure and functional properties of the systems of a given class. A step on this path is the work of W. McCulloch and W. Pitts regarding formal neural networks [6]. A central role in this theory is played by the notion of a formal neuron. This is an (abstract) element, a “black box” with m inputs $x_1, \dots, x_m$ (where m ≥ 1), and a single output d.
A neuron has m + 1 numerical characteristics: the level θ, and the weights $\omega_i$ of the inputs $x_i$. Here $\omega_i > 0$ means that the input $x_i$ is stimulating, while $\omega_i < 0$ means that $x_i$ is inhibitory. Such a formal neuron works on a discrete scale of time t = 0, 1, 2, ..., and gives at time t = n + 1 an impulse to the output d precisely when, at t = n, the sum of the weights of the inputs stimulated at that moment exceeds the level θ of the neuron. A formal neural network is a union of such elements obtained in the following way. The output of a neuron is divided into a suitable number of branches which are connected to the inputs of some other neurons. Here the output of a neuron can be connected with an arbitrary number of inputs of itself or of other neurons, but each input may only be connected with a single output. Some inputs of neurons may remain free and these are either connected to each other (to be considered as identical) or else they are grouped into input lines of the net (each free input is connected to exactly one input line; the number of the latter may, however, be smaller than the number of free inputs). The output lines are identified with those outputs which are not joined to inputs of any other neuron. All neurons work at the same time, and their levels and weights do not change in the course of time. The study of the functioning of such a formal network amounts to clarifying with which signals on the output lines the net will react to various signals on its input lines. Although the formal neural network is a rather primitive analogue of the brain, its study was still a first real step towards the use of mathematical tools in neurophysiology. The study of such networks stimulated to a large extent the genesis of automata theory. Indeed, in 1946, John von Neumann set forth new ideas on the construction of electronic computers (the EDVAC project). Von Neumann took as the basis of his construction the module, a notion in the creation of which an essential role was played by the functional
similarity between the elementary block of a computer and the formal neuron. The adoption of the notion of module in the construction of computers made it possible to separate the logical synthesis of the computer (a problem in the area of mathematical logic) from the technical synthesis of the corresponding electrical network (engineering).90 Thereby one had taken the second step towards the creation of an automata theory, the main object of which became the mathematical realization of the structural and functional analogies between the brain and the computers of the future, and the results of which were supposed to assist, via feedback, the creation of new principles for the construction of computers. If we take these requirements as a basis, then we must admit that today one is still very far from a theory of automata that really deserves the name. Usually it is said that cybernetics became an independent discipline in the year 1948 when Norbert Wiener’s book “Cybernetics” appeared.91 Side by side with Wiener we must also mention the contribution of J. von Neumann. Although both scientists knew each other’s work well and influenced each other, their approaches to the topic were quite different. Von Neumann called his variant “automata theory”, while Wiener spoke of “cybernetics”. The latter is well-known through the translation of the corresponding works into Russian. The same cannot, however, be said about automata theory.

3. The more complicated the construction of the computer, the more complicated become its structure and the mathematical description of the coding and the motion of information in it, while at the same time the logical depth of the computations in it decreases and the working speed grows. The absence of a suitable mathematical theory for the description of complex automata is undoubtedly a serious obstacle to the development of powerful automata mimicking the manifold functions of the human brain.
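The link between the formal neuron of paragraph 2 and logic can be made concrete with a small Python sketch. It is illustrative only: the function names, the particular weights and the levels are assumptions chosen for this example and do not come from the text. The firing rule is the one stated above, read as "the sum of the weights of the inputs that carry a signal at time n exceeds the level θ".

```python
def fires(weights, theta, stimulated):
    """Return True if the neuron sends an impulse at time n + 1.

    weights    -- weights of the m inputs (positive: stimulating, negative: inhibitory)
    theta      -- the level of the neuron
    stimulated -- booleans telling which inputs carry a signal at time n
    """
    return sum(w for w, on in zip(weights, stimulated) if on) > theta


def neuron_and(x, y):
    # Two stimulating inputs, level 1: both signals are needed to exceed it.
    return fires([1, 1], 1, [x, y])


def neuron_or(x, y):
    # Two stimulating inputs, level 0: a single signal already exceeds it.
    return fires([1, 1], 0, [x, y])


def neuron_and_not(x, y):
    # First input stimulating, second inhibitory: fires for "x and not y".
    return fires([1, -1], 0, [x, y])


for x in (False, True):
    for y in (False, True):
        print(x, y, neuron_and(x, y), neuron_or(x, y), neuron_and_not(x, y))
```

With these hypothetical parameters a single neuron reproduces the truth tables of "and", "or" and "and not", which is the sense in which such elements can be described by formal logic.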
Already the research of W. McCulloch and W. Pitts showed that the application of the methods of formal logic can give essential results in the modelling of the brain. Because of the inner relation between automata and logic, a central place in the description of automata ought to be taken by a certain system of logic. It is doubtful that one could make do here with the traditional treatment of logic (see also [7]). This is because, for example, advanced automata must be capable of performing operations consisting of the realization of analogies and generalizations. There is no reason to believe that, in the mathematical treatment of these questions, the known concepts and symbolism of logic would suffice. One would rather find a way out by adopting the structure theories of categories and algebra, as the idea of similarity (and the notion of morphism, mirroring it) is one of their organic components. In other words, from the point of view of automata theory it seems important to include the duality principle92 into logic as an organic component. In the 1930’s the discoveries of Kurt Gödel led to points of contact between logic and arithmetic. Recently an algebraic approach in logic has begun to find a response; it complements the hitherto ruling arithmetical point of view.

90 Already in 1910, the well-known theoretical physicist P. Ehrenfest drew attention to the possibility of acting in such a way. As at the time the practical needs were restricted to the assembling of rather primitive electrical networks, where the use of Boolean algebra seemed ridiculous, this premature idea passed unnoticed.
91 Translator’s Note. The word “Cybernetics” comes from the Greek κυβερνήτης, meaning helmsman.
92 See also the article “On Galois theory”, Section 6 of Chapter VI.

As an example of such an algebraic approach we mention the algebraic treatment of the theory of recursive functions in the works of A. I. Mal’cev
and S. Eilenberg. It may be that in the course of time there will be a synthesis of these two points of view on a new level on the basis of an “arithmetized algebra”, being a generalization of the arithmetical point of view. It is also of importance to note that formal logic, because of its approach (the principle of “all or nothing at all”), has so far been cut off from the possibility of using the most advanced part of mathematics, mathematical analysis, using instead combinatorics, an area where great mathematical difficulties appear. At the same time, in Analytic Number Theory and in Diophantine Geometry one has long since found ways for a fruitful application of the idea of continuity in the solution of problems which by their nature are discrete. One can claim even more: the deepest results in the disciplines mentioned have been obtained precisely in this way (often through the intermediary of algebra and probability theory). In the mathematics of antiquity an essential achievement was the polarity “finite-infinite”, which now (based on the set theory of Georg Cantor) with the appearance of Mathematical Analysis has become a powerful instrument of cognition. But a deeper and more complete use of the polarity “continuous-discrete” still lies ahead. That this has not yet been done to its full extent is perhaps one of the reasons why physicists in their forward-looking research still find too little satisfaction in the mathematics they can use. This idea, due to Hermann Weyl, seems to be worth developing with respect to many parts of applied mathematics. The needs of the applications and the difficulties of the theories have created a situation where one appreciates ever more the value of the ideas which have arisen on the path to the goal in which C. G. J. Jacobi passionately believed: There will come a time when from each theorem in Mathematical Analysis there will follow a theorem in Number Theory, and vice versa each regularity in the domain of natural numbers will give a theorem in analysis. A powerful basis for the rise and development of such ideas was laid in the work of Leonhard Euler. A series of original considerations was given by Yu. Manin in his talk “The physical and mathematical continuum” at the summer school in the history of mathematics in Tartu in 1973. These observations agree with the view of J. von Neumann, according to which the mathematical apparatus for the study of complicated automata ought to start with mathematical logic and proceed in the direction of algebraic, probabilistic and analytic structures and further to optics and thermodynamics (in the form given by L. Boltzmann the latter is in many respects close to the theory of information processing and measurement). The same point of view was also echoed in a talk by V. Glushkov at a meeting devoted to automata theory in Tashkent in May, 1968. Namely, according to Glushkov the main attention of mathematicians until the end of the 20th century will be directed towards the creation of the algebra and topology of a (formal) language that is necessary for the mathematical description of complicated automata (see also [3]).
7.2. On algebraic methods in automata theory

Several points of contact between automata theory and the structural theories of algebra have been known for a long time. Namely, it turns out that each automaton can be interpreted as a certain algebraic object, which makes it possible to study the construction of the automaton by means of structure theory. Proceeding in this way it becomes possible to get an
overview of all possible finite automata. We point out an essential analogy to Galois theory, where an algebraic equation is connected to a group, in terms of the theory of which one can express the solvability of the equation by radicals. Although the algebraic apparatus used is rather modest, it turns out that the detailed realization of the corresponding idea is quite complex. Here the work of Krohn and Rhodes [5] on the algebraic theory of machines, which appeared in 1965, proved to be a turning point. In the following we will set out the main features of this theory.

1. Let us give an exact mathematical definition of a finite automaton.

DEFINITION 7.1. By a finite automaton or a machine we mean a system M = (A, Q, B, λ, δ) consisting of three finite sets A, Q and B, together with fixed functions λ : Q × A → Q and δ : Q × A → B. It is assumed that the set Q contains an element p such that λ(p, x) = p for each x ∈ A.

In order to have an intuitive explanation, we give the following interpretation of the symbols appearing in the definition:
- A is the set of input signals or the input alphabet,
- B is the set of output signals,
- Q is the set of states, whose element p may be viewed as a halt,
- λ : Q × A → Q is a function determining the mapping of states,
- δ : Q × A → B is a function for getting the output signals.

In order to present the functions λ and δ one often uses tables. Then the rows of the matrices (λ(q, x)) and (δ(q, x)) are indexed by the elements of Q, and the columns by the elements of A.

EXAMPLE 7.1. We present the automaton M = (A, Q, B, λ, δ) with the help of the following data. Let A = {a, b}, Q = {q_0, q_1, q_2, p}, B = {0, 1, 2}, and let the functions λ and δ be defined as in Table 1.

    λ   | a    b    Λ            δ   | a   b   Λ
    q_0 | q_1  p    p            q_0 | 1   0   0
    q_1 | q_1  q_2  p            q_1 | 0   1   0
    q_2 | p    q_2  q_0          q_2 | 0   0   1
    p   | p    p    p            p   | 2   2   2

Table 1
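The data of Example 7.1 can be written down directly as lookup tables. The following Python sketch is illustrative only: the names `LAMBDA`, `DELTA` and `step`, and the string `"L"` standing for the symbol Λ, are assumptions of this sketch and do not come from the text. It checks the halting condition λ(p, x) = p of Definition 7.1 and performs single steps of the automaton.

```python
# Automaton M = (A, Q, B, lambda, delta) of Example 7.1, with Table 1 as dictionaries.
# "L" stands for the symbol Λ (an encoding chosen for this sketch).
A = ["a", "b", "L"]
Q = ["q0", "q1", "q2", "p"]
B = [0, 1, 2]

LAMBDA = {
    "q0": {"a": "q1", "b": "p",  "L": "p"},
    "q1": {"a": "q1", "b": "q2", "L": "p"},
    "q2": {"a": "p",  "b": "q2", "L": "q0"},
    "p":  {"a": "p",  "b": "p",  "L": "p"},
}
DELTA = {
    "q0": {"a": 1, "b": 0, "L": 0},
    "q1": {"a": 0, "b": 1, "L": 0},
    "q2": {"a": 0, "b": 0, "L": 1},
    "p":  {"a": 2, "b": 2, "L": 2},
}


def step(q, x):
    """Return (next state, output signal) for state q and input signal x."""
    return LAMBDA[q][x], DELTA[q][x]


# The state p is a halt: every input signal leaves the automaton in p.
halt_ok = all(LAMBDA["p"][x] == "p" for x in A)

print(halt_ok)            # True
print(step("q0", "a"))    # ('q1', 1)
```

Nested dictionaries mirror the layout of Table 1 (rows indexed by states, columns by input signals), so each entry can be compared with the table by eye.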
Let us point out that we have added to the alphabet A a special symbol Λ, an “empty word”. The purpose of such a procedure will be disclosed in the following two sections.

2. The automaton M = (A, Q, B, λ, δ) is usually interpreted as a system working on a discrete time scale T = {0, 1, 2, ...} which, being at the moment of time t in the state q ∈ Q and receiving the input signal x ∈ A, moves at the moment t + 1 into the state λ(q, x) ∈ Q and sends the output signal δ(q, x). The functioning of the automaton may be visualized as follows. Imagine that the incoming information is written on a tape, which is divided into cells. We assume that in each cell there is either a letter of the
alphabet A or else it is empty (in this case we agree that in this cell there is the “empty word” Λ). Let a finite word s be written in successive cells of the tape, all cells to the left and to the right of it being empty (by our agreement containing the symbol Λ). At the moment $t = t_0$ the machine M starts in a situation where its state is $q_0$ (the initial state) and the leftmost symbol $x_1$ of the word s enters the automaton. At the next moment of time $t = t_0 + 1$ the signal $\delta(q_0, x_1)$ leaves the automaton M and the automaton passes on to the situation $(\lambda(q_0, x_1), x_2)$. The tape moves one step to the left. The further activity of the automaton proceeds according to its program, given by the table of commands $\lambda(q, x) = q'$. The left hand side of the equality $\lambda(q, x) = q'$ shows that at time t the automaton is in the state q and receives the input signal x. The right hand side of the command indicates the state of the automaton at time t + 1. If the automaton M, after having “read” the word s, reaches the situation $(q_0, \Lambda)$, then s is called a word accepted by the automaton M. The set of all finite words (in the alphabet A) accepted by the automaton M is called the formal language accepted by the automaton M.93 More generally, a formal language in a certain alphabet A is any set of words formed from this alphabet. A formal language which is the language accepted by some finite automaton is called an automaton language.

EXAMPLE 7.2. Let A = {a, b}. The formal language $\{\, a^m b^n \mid m, n > 0 \,\}$, in which a is repeated m times and b is repeated n times,
is an automaton language, because it is the language accepted by the automaton M in Example 7.1. But not all formal languages are automaton languages.

3. The set S(A) of all finite words in a given alphabet A is a semigroup under the operation of concatenation (“multiplication”), i.e. on S(A) this operation is associative. In what follows we agree that the “empty word” belongs to S(A); we denote it by Λ, and assume that it acts as the unit of S(A). Semigroups with a unit are called monoids. From the theoretical point of view it is expedient to extend the function λ : Q × A → Q to a function $\lambda^* : Q \times S(A) \to Q$. This can be done inductively on the length of the “processed” words. If x ∈ A we set $\lambda^*(q, x) = \lambda(q, x)$. Let $\lambda^*$ be defined for all words u ∈ S(A) of length not exceeding n and let w = ux be any word of length n + 1. We agree that $\lambda^*(q, w) = \lambda(\lambda^*(q, u), x)$. In analogy to the above, the domain of the output function is, likewise, extended to the set Q × S(A), so that the extended function $\delta^*$ satisfies the condition $\delta^*(q, uv) = \delta(\lambda^*(q, u), v)$ for all words u, v ∈ S(A). In what follows it will be suitable to denote the functions $\lambda^*$ and $\delta^*$ again simply by λ and δ. As an illustration of the definition of the functions $\lambda^*$ and $\delta^*$, here are some of their values in the case of the automaton considered in Example 7.1:

$$\lambda^*(q_0, aaa) = q_1, \qquad \lambda^*(q_0, aabbb) = q_2, \qquad \lambda^*(q_2, a) = p,$$
$$\delta^*(q_0, aaa) = 0, \qquad \delta^*(q_0, aabbb) = 0, \qquad \delta^*(q_0, a) = 1.$$

The correctness of these computations can be easily checked using the tables given in Example 7.1.

93 An excellent introduction to mathematical linguistics is the book [4]. In this book the relation between formal languages, automata and algebraic theories is illustrated by extensive and good examples.
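The values listed above, and the claim of Example 7.2, can be checked mechanically. The Python sketch below is a hedged illustration: the names `lam_star`, `delta_star` and `accepts` are hypothetical, `"L"` encodes the blank cell Λ, and the acceptance test rests on an assumed reading of "reaches the situation (q0, Λ)", namely that one blank cell is read after the word and the automaton must then be in the state q0.

```python
# Table 1 of Example 7.1; "L" encodes the blank cell (the symbol Λ on the tape).
LAMBDA = {
    "q0": {"a": "q1", "b": "p",  "L": "p"},
    "q1": {"a": "q1", "b": "q2", "L": "p"},
    "q2": {"a": "p",  "b": "q2", "L": "q0"},
    "p":  {"a": "p",  "b": "p",  "L": "p"},
}
DELTA = {
    "q0": {"a": 1, "b": 0, "L": 0},
    "q1": {"a": 0, "b": 1, "L": 0},
    "q2": {"a": 0, "b": 0, "L": 1},
    "p":  {"a": 2, "b": 2, "L": 2},
}


def lam_star(q, word):
    """Extended transition function: the state reached from q after reading word."""
    for x in word:
        q = LAMBDA[q][x]
    return q


def delta_star(q, word):
    """Extended output function: the output on the last letter of a non-empty word."""
    prefix, last = word[:-1], word[-1]
    return DELTA[lam_star(q, prefix)][last]


def accepts(word):
    """Assumed acceptance test: read the word, then one blank cell; accept in state q0."""
    return LAMBDA[lam_star("q0", word)]["L"] == "q0"


print(lam_star("q0", "aaa"), lam_star("q0", "aabbb"), lam_star("q2", "a"))
print(delta_star("q0", "aaa"), delta_star("q0", "aabbb"), delta_star("q0", "a"))
print([w for w in ["ab", "aabbb", "aaa", "ba", ""] if accepts(w)])
```

Under this reading the only sample words accepted are "ab" and "aabbb", in agreement with the language of Example 7.2; a different convention for the trailing blank cells would need a different `accepts`.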
4. Let there be given an automaton M = (A, Q, B, λ, δ), which at a fixed moment of time t ∈ T is in state q. The future behavior of the automaton M is characterized by the function f : S(A) → B, which for each u ∈ S(A) is given by the formula f(u) = δ(q, u). Of course, there may exist distinct states q and r in M such that δ(q, ∗) ≡ δ(r, ∗). But if such a situation does not occur we say that M is a reduced automaton. It turns out that to each automaton M there corresponds a reduced automaton $M' = (A, Q', B, \lambda', \delta')$ which is equivalent to M in the following sense: for each state q ∈ Q there exists a state $r \in Q'$ such that $\delta(q, *) = \delta'(r, *)$ as functions on S(A).

To each word u ∈ S(A) we associate the left shift $l_u : S(A) \to S(A)$, a function that for each v ∈ S(A) is defined by the formula $l_u(v) = uv$. For arbitrary u, v, w ∈ S(A) one has $l_u l_v(w) = l_{uv}(w)$. On the basis of the automaton M = (A, Q, B, λ, δ) we construct the automaton $M(f) = (A, Q_f, B, \lambda_f, \delta_f)$ whose set of states is

$$Q_f = \{\, g : S(A) \to B \mid g = f l_u \text{ for some } u \in S(A) \,\}.$$

The functions $\lambda_f$ and $\delta_f$ are defined by the formulae

$$\lambda_f(g, v) = g l_v, \qquad \delta_f(g, v) = g(v).$$

Here $f l_u$ and $g l_u$ denote functions S(A) → B whose values at the word w ∈ S(A) are computed according to the formulae $f l_u(w) = f(uw)$ and $g l_u(w) = g(uw)$.

In order to better understand the nature of the automaton M(f) we make the following observations. First, for each u ∈ S(A) the identity $f l_u(*) = \delta(\lambda(q, u), *)$ holds. Indeed, for an arbitrary v ∈ S(A) we have the equalities

$$f l_u(v) = \delta(q, l_u(v)) = \delta(q, uv) = \delta(\lambda(q, u), v),$$

from which the desired equality follows. Second, we have $\delta_f(g, v) = \delta(q, uv)$, because

$$\delta_f(g, v) = g(v) = f l_u(v) = f(uv) = \delta(q, uv).$$

We observe further that

$$\lambda_f(g, v) = g l_v = f l_u l_v = f l_{uv} = \delta(q, l_{uv}(*)).$$

Third, we show that M(f) is a reduced automaton. To this end, it is sufficient to show that $\delta_f(g', *) = \delta_f(g, *)$ implies that $g'(*) = g(*)$. Let $g = f l_u$, $g' = f l_w$, u, w ∈ S(A), and take an arbitrary word v ∈ S(A). Assume that $\delta_f(g', *) = \delta_f(g, *)$. We have the following chain of equalities:

$$g'(v) = f l_w(v) = \delta(q, wv) = \delta_f(g', v) = \delta_f(g, v) = \delta(q, uv) = f l_u(v) = g(v).$$

It follows from them that $g' = g$. The assertion is proved. The reasoning given shows that the set of states $Q_f$ of the automaton M(f) can be regarded as the trajectory emanating from the state $f(*) \in Q_f$, where each state is “accessible” from the state f(∗). The connection of the automaton M(f) with M is reflected by the fact that if we take the state q ∈ Q for the state $f(*) \in Q_f$, we can view $Q_f$ as the subset of Q consisting of those states which are “accessible” from the state q and where the states of the automaton M with the same behavior are considered to be identical.

5. On the monoid S(A) of input words of the automaton M(f) there is given the following (Myhill) equivalence $\equiv_f$:

$$v \equiv_f v' \iff \forall u, w \in S(A), \quad f(uvw) = f(uv'w).$$
In other words, the words $v$ and $v'$ are equivalent if and only if the function f acts on them in equal contexts in the same way. Myhill equivalence in the monoid S(A) is stable under multiplication of words, that is, for any u ∈ S(A) it follows from $v \equiv_f v'$ that $vu \equiv_f v'u$ and $uv \equiv_f uv'$. Therefore the product of two equivalence classes can be defined as the equivalence class containing the product of any representatives of these classes. A semigroup is obtained whose elements are Myhill equivalence classes. This semigroup $S_f$ of classes is called the semigroup of the automaton M(f). Let us compute e.g. the semigroup of a trigger. A trigger is an automaton M = (A, Q, Q, λ, λ), where $A = \{x_0, x_1\}$, $Q = \{q_0, q_1\}$ and the function λ is given by the table

    λ   | x_0  x_1  Λ
    q_0 | q_0  q_1  q_0
    q_1 | q_0  q_1  q_1

Table 2
Thus a trigger is an automaton whose input alphabet and set of states are 2-element sets and for which the functions λ(q, ∗) and δ(q, ∗) coincide, that is, its output signal at time t is identified with its state at that moment. The table defining the function λ shows that the signal x_i brings the trigger into the state q_i independently of its preceding state. The semigroup of the trigger is obtained in the following way. Let f(∗) = λ(q_0, ∗). The definition of the congruence ≡_f shows that the relation v ≡_f v′ holds if and only if for all u, w ∈ S(A) one has λ(q_0, uvw) = λ(q_0, uv′w). As the function λ can take only two different values, the nonempty words fall into two ≡_f-classes: [x_0] and [x_1]. To the first belong all words with last letter x_0, and to the second the words with last letter x_1. To these two classes we add the equivalence class [Λ] of the empty word Λ. As multiplication of classes is defined by multiplication of their representatives, it can be seen from the definition of the trigger that the classes [x_0], [x_1] and [Λ] are multiplied by the rules
t · 1 = 1 · t = t,    t · [x_i] = [x_i].
In these rules t denotes any of the classes [Λ], [x_0] or [x_1], and 1 = [Λ].
6. To an arbitrary monoid S one can associate the automaton M(S) = (S, S, S, λ, δ), where the functions λ and δ are given, using multiplication in S, as follows: λ(q, u) = δ(q, u) = q · u ∈ S for all q, u ∈ S.
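The trigger semigroup of Subsection 5 can also be found mechanically. The following is only a minimal sketch: it groups input words by the transformation they induce on the two states, which for the trigger gives the same classes as ≡_f (both states are accessible from q_0 and the output equals the state). The names `step`, `action`, `classes` and `compose` are illustrative and do not come from the text.

```python
from itertools import product

STATES = ("q0", "q1")
ALPHABET = ("x0", "x1")

def step(state, letter):
    """Transition function of the trigger: the letter x_i forces the state q_i."""
    return "q" + letter[1]

def action(word):
    """Transformation of the state set induced by a word (a tuple of letters)."""
    result = []
    for q in STATES:
        for letter in word:
            q = step(q, letter)
        result.append(q)
    return tuple(result)

def classes(max_len):
    """Group all words of length <= max_len by the transformation they induce."""
    found = {}
    for n in range(max_len + 1):
        for word in product(ALPHABET, repeat=n):
            found.setdefault(action(word), []).append(word)
    return found

def compose(s, t):
    """Class of the concatenation uv, where u induces s and v induces t."""
    return tuple(t[STATES.index(q)] for q in s)

unit = action(())            # the class [Λ]
c0 = action(("x0",))         # the class [x_0]
c1 = action(("x1",))         # the class [x_1]
```

Enumerating words up to any fixed length yields exactly the three classes `unit`, `c0`, `c1`, and `compose` reproduces the rules t · 1 = 1 · t = t and t · [x_i] = [x_i].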
Applying this construction to the monoid S_f, we get the automaton M(S_f). It turns out that this models the behavior of the automaton M(f) only poorly. Therefore we complement the construction of M(S_f). Let the function i_f : S_f → B be given by the formula i_f(s) = f(u), where s denotes the equivalence class [u]. As the value of i_f does not depend on the choice of the representative u ∈ S(A) of the class s, the definition is consistent. An automaton whose behavior is “close” to the behavior of M(f) is the automaton M(S_f, i_f) = (S_f, S_f, B, λ, δ), where λ(s, s′) = s · s′ and δ(s, s′) = i_f(s · s′). The automaton M(S_f, i_f) is close to the automaton M(f) in the following sense: to the automaton M(S_f, i_f) it is possible (if necessary) to add a coder of its input signals and a decoder of its output signals in such a way that for each state of M(f) there exists a state of M(S_f, i_f) such that, when the automaton M(S_f, i_f) starts at this state, it maps the input signals in the same way as M(f) does. In this case one says that M(S_f, i_f) is a model of the automaton M(f). As the automata M(S_f, i_f) and M(f) can be interchanged in the case at hand, these automata are quasi-equivalent.
7. What kind of information about the automaton M(f) does the pair (S_f, i_f) contain, and how can this information be used to study the automaton M(f)? The more extensive the functions fulfilled by automatic devices become, the more the dimensions of their blocks and the complexity of their hierarchical structure grow. This tendency forces one to construct such devices in several steps: first one determines the structure of the blocks of the automaton and afterwards the optimal block structure. In this way one has to deal with the assembling of more complicated automatic devices, which makes a theoretical treatment of these problems necessary; this is the task of the theory of decomposition and synthesis of automata. The decomposition of an automaton into “bricks” can be done at different “scales”. In the case of a computer we can take as such bricks either semiconductors or complete circuits; in studying the brain, either whole parts of it or just individual neurons. It is clear that a prerequisite for a successful theory of decomposition and synthesis is the choice of an optimal “scale”. The classification of the bricks and the determination of their properties always require specific knowledge about the domain to which these objects belong. 
Experimental studies are here always accompanied by mathematical methods, among which logical and algebraic methods have a rather prominent position. In the first part of the paper we spoke of the work of McCulloch and Pitts on neural networks. It follows directly from the corresponding definitions that each formal neural network can be considered as a finite automaton. However, the possibility of realizing the behavior of every finite automaton in some formal neural network is somewhat of a surprise. This result of McCulloch and Pitts solves the problem of decomposition of automata if cycles are allowed in joining its primitive building blocks, the neurons (in joining neurons into a network, rather complicated cycles may occur). However, in practice one imposes several kinds of restrictions on the presentation of the automata (or their blocks) in order to exclude such cycles. Often a serial or parallel connection of automata, or some combination of these, is used. The properties of several connections of this type are reflected in the notion of a cascade of automata. Let us now introduce this important concept. We consider an automaton M = (A, Q, B, λ, δ) whose state at each moment of time determines the output at that moment, i.e., there is a function β : Q → B such that δ(q, x) = β(λ(q, x)). Such an automaton is called a state-output automaton or a Moore automaton. One example of a Moore automaton, the trigger, is already known to us. Another important example (the PR-automaton) will be introduced to the Reader in the following Section. Let there be given two Moore automata M = (A, Q, B, λ, β) and M′ = (A′, Q′, B′, λ′, β′), an alphabet Z, and two arbitrary functions σ : Z × B → A′ and κ : Z → A. The coder κ maps each signal from Z to a signal acceptable by the automaton M. The coder σ maps pairs of signals, of which the first component is a signal in Z and the second an output signal of M, into signals acceptable by the automaton M′. A cascade of the automata M and M′ is the automaton M′ ∘ M = (Z, Q′ × Q, B′ × B, λ*, β*), whose state and output functions are given by the formulae
λ*((q′, q), z) = (λ′(q′, σ(z, β(q))), λ(q, κ(z))),    β*(q′, q) = (β′(q′), β(q)).
The functioning of the automaton M′ ∘ M is illustrated by the scheme in Figure 18.
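[Fig. 18. Scheme of the cascade: the input from Z passes through the coder κ into the first automaton and, together with that automaton's output in B, through the coder σ into the second automaton; the two outputs together form the output of the cascade.]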
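The formulae for the state and output functions of a cascade translate almost literally into code. The sketch below is an illustration only: the class `Moore`, the helpers `cascade` and `run`, and the two-bit counter example are assumptions made here and do not appear in the text.

```python
from dataclasses import dataclass
from typing import Callable, Hashable

@dataclass(frozen=True)
class Moore:
    """Moore automaton given by its next-state function and its state-output function."""
    step: Callable[[Hashable, Hashable], Hashable]   # lambda(q, a)
    out: Callable[[Hashable], Hashable]              # beta(q)

def cascade(m, m_prime, kappa, sigma):
    """Cascade of m and m_prime with states (q', q), following lambda* and beta*."""
    def step(state, z):
        q_prime, q = state
        # m_prime reads sigma(z, beta(q)), computed from the current output of m;
        # m itself reads kappa(z).
        return (m_prime.step(q_prime, sigma(z, m.out(q))), m.step(q, kappa(z)))
    def out(state):
        q_prime, q = state
        return (m_prime.out(q_prime), m.out(q))
    return Moore(step, out)

def run(machine, state, word):
    """Feed the letters of word to the machine and return the final state."""
    for z in word:
        state = machine.step(state, z)
    return state

# Hypothetical example: a counter modulo 4 built from two one-bit counters.
flip = Moore(step=lambda q, a: (q + a) % 2, out=lambda q: q)
counter = cascade(flip, flip, kappa=lambda z: z, sigma=lambda z, y: z & y)
```

In this example the low bit toggles on every input 1, and the coder σ passes a 1 to the high bit only when the low bit currently outputs 1, so the pair (high, low) counts the ones of the input modulo 4.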
In the special case where there exists a function τ : B → A′ such that σ(z, y) = τ(y) for all z ∈ Z and y ∈ B, we are dealing with the serial connection of the automata M and M′. We have a parallel connection if there exists a function τ : Z → A′ such that σ(z, y) = τ(z) for all z ∈ Z and y ∈ B. The notion of cascade of automata is illustrated also by the proof of the theorem in the next Section.
8. The main result in the theory of decomposition of automata is the following.
THEOREM 7.2 (Krohn-Rhodes). Each finite automaton can be modelled by a cascade of triggers and of automata M(G) corresponding to suitable finite simple groups G.
In this case one says that the given automaton cascades into a set of these automata. The most interesting route to this result is due to H. Zeiger [8]. The central role here is played by the notion of a PR-automaton, that is, a Moore automaton in which each input signal either induces a substitution (permutation) on the state space or else brings the automaton into a state fixed by this input signal (that is, one that does not depend on the state of the automaton before the input signal was received). It follows from the definition that to each PR-automaton there is associated a substitution group on the set of its states. It turns out that a set of such automata is sufficient to build an arbitrary finite automaton. More exactly, we have the following result.
THEOREM 7.3 (Zeiger). Each finite automaton can be presented by a cascade of PR-automata.
The proof of this theorem is rather complicated. It is essential to note that the method of covers (or mosaic pictures) used in it can probably be adapted for the mathematical treatment of certain problems in biology. The route from Zeiger’s Theorem to the Krohn-Rhodes Theorem consists of two steps. First one shows that each PR-automaton can be modelled by a cascade of the automaton M(G) corresponding to its substitution group G and a suitable automaton K, where K in turn can be cascaded into triggers. In the second step, one connects M(G) with a set of simple finite groups. In this argument an important role is played by the notion of composition series of a group and the Jordan-Hölder theorem.⁹⁴ As the factors of the composition series of G are simple groups, it suffices to establish the following result.
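[Fig. 19. Scheme of the cascade of M(G/H) and M(H): the input passes through the coder κ into M(G/H) and through the coder σ into M(H); the coder β combines the outputs of the two automata.]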
LEMMA 7.4. Let M = M(G) be the automaton corresponding to a finite group G. Then M cascades into the automata corresponding to the factors of the composition series of G.
PROOF. First we show that, for each finite group G and each normal subgroup H < G, the automaton M(G) cascades into the automata M(H) and M(G/H). For this we use the scheme given in Figure 19. We fix (arbitrary) representatives of the cosets Hg of G. In the following, we denote by $\overline{g}$ the representative chosen for the coset Hg. The scheme works as follows. At time t:
• the signal g_2 ∈ G appears at the input,
• M(G/H) is in the state Hg_1,
• M(H) is in a state h_1 ∈ H such that $h_1 \overline{g_1} = g_1$.
At time t + 1: the coder κ maps the signal g_2 to the signal Hg_2 (here $H\overline{g_2} = Hg_2$), and as a result the automaton M(G/H) produces the signal $Hg_1 \cdot Hg_2 = H(g_1 g_2) = H\overline{g_1 g_2}$, which goes to the coder β. At the same time the signal Hg_1 from the output of M(G/H) goes, together with the signal g_2, to the coder σ. The coder σ works as follows: $\sigma : (g_2, Hg_1) \mapsto \overline{g_1}\, g_2\, [\overline{g_1 g_2}]^{-1} = h \in H$.
⁹⁴ The relevant notions about groups used in the present paper are also set forth in [K70] (Section 4 of this Chapter).
The signal h ∈ H so obtained now goes to M(H), which is in the state h_1. The behavior of the automaton M(H) is described by the equation $h_1 \cdot h = h_1 \overline{g_1}\, g_2\, [\overline{g_1 g_2}]^{-1} = g_1 g_2\, [\overline{g_1 g_2}]^{-1}$. The coder β maps pairs of signals according to the rule $\beta : (H\overline{g_1 g_2},\, h_1 h) \mapsto h_1 h\, \overline{g_1 g_2} = g_1 g_2\, [\overline{g_1 g_2}]^{-1} \cdot \overline{g_1 g_2} = g_1 g_2$. These computations show that the statement made at the beginning of the proof is valid. We now prove the lemma by induction on the length of the composition series of G (in view of the Jordan-Hölder theorem the length of the composition series is an invariant of the group). If G is a simple group, then the validity of the lemma is evident. Otherwise G has a non-trivial composition series G = G_0 > G_1 > · · · > G_{k−1} > G_k = (1). Assume that the assertion has been proved for all groups with a composition series of length ≤ k − 1. The reasoning given at the beginning of the proof shows that M(G) cascades into the automata M(G/G_1) and M(G_1). From the induction hypothesis we see that M(G_1) cascades into the automata that correspond to the factors G_1/G_2, G_2/G_3, . . . , G_{k−1}/G_k = G_{k−1} of the composition series. Thus the required cascade for the automaton M(G) has been found. □
9. Triggers can be viewed as sufficiently simple building blocks for automata. But what can be said about the automata corresponding to simple groups? If we had a list of all simple groups G and their necessary properties, then the Krohn-Rhodes Theorem would give a solution to the problem of decomposition of automata. But so far there is no such list.⁹⁵ Therefore the idea arises to look for even simpler building blocks than the automata M(G) corresponding to simple groups. For this one would have to continue cascading these automata. A closer investigation of the relations between a cascade of automata and its semigroup shows that this wish cannot be put into practice. 
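The coset bookkeeping in the first part of the proof of Lemma 7.4 can be checked on a small example. The sketch below takes G = S_3 and H = A_3 (an assumption made purely for illustration), chooses one representative per coset, and verifies that the coders κ, σ and β as described reproduce the product g_1 g_2. All function names are hypothetical.

```python
from itertools import permutations

def mul(p, q):
    """Product p*q of permutations given as tuples: apply q first, then p."""
    return tuple(p[i] for i in q)

def inv(p):
    """Inverse permutation."""
    r = [0] * len(p)
    for i, j in enumerate(p):
        r[j] = i
    return tuple(r)

def sign(p):
    """+1 for even permutations, -1 for odd ones (parity of the inversion count)."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

G = list(permutations(range(3)))          # the symmetric group S_3
H = [g for g in G if sign(g) == 1]        # its normal subgroup A_3

def coset(g):
    """The coset Hg as a frozenset, so that equal cosets compare equal."""
    return frozenset(mul(h, g) for h in H)

# One representative per coset (an arbitrary choice: the smallest tuple).
REP = {coset(g): min(coset(g)) for g in G}

def encode(g):
    """State pair (Hg, h1) of the cascade with h1 * rep(Hg) = g."""
    c = coset(g)
    return c, mul(g, inv(REP[c]))

def step(state, g2):
    """One step of the cascade on the input signal g2."""
    c1, h1 = state
    c2 = coset(mul(REP[c1], g2))                 # M(G/H): Hg1 * Hg2
    h = mul(mul(REP[c1], g2), inv(REP[c2]))      # coder sigma; h lies in H
    return c2, mul(h1, h)                        # M(H): h1 * h

def decode(state):
    """Coder beta: (H rep, h) -> h * rep."""
    c, h = state
    return mul(h, REP[c])
```

For every pair g_1, g_2 in G, `decode(step(encode(g1), g2))` equals `mul(g1, g2)`, which is the content of the computation with the coder β in the proof.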
We call an automaton M noncascadable if, each time that it is modelled by the cascade of two automata M_1 and M_2 (with the corresponding semigroups S_1 and S_2), it follows that either M(S_1) models M or else M(S_2) models M. An algebraic treatment of the problem of noncascadability of automata is made possible by the following two notions.
DEFINITION 7.5. Let S_1 and S_2 be two semigroups, and let σ : S_1 → End(S_2) be a homomorphism of the monoid S_1 to the semigroup of endomorphisms of the monoid S_2. Then the semi-direct product S_2 Δ_σ S_1 is the set of all pairs in S_2 × S_1 with the composition (multiplication of pairs) given by the rule (s_2, s_1) · (s_2′, s_1′) = (s_2 · σ_{s_1}(s_2′), s_1 · s_1′).
DEFINITION 7.6. A semigroup S is atomary if for all possible homomorphisms σ : S_1 → End(S_2) it follows from the relation S | S_2 Δ_σ S_1 that S | S_1 or S | S_2. (Here the notation P | Q means that the semigroup P divides the semigroup Q in the following sense: there exists a semigroup Q_1 ⊂ Q which can be mapped epimorphically onto P.)
It turns out that an automaton is noncascadable if and only if its semigroup is atomary. This result is important because it transfers the difficulties related to the problem of deciding the noncascadability of an automaton to the algebraic domain. A somewhat expected fact is the atomarity of the semigroup of a trigger. The atomarity of simple groups is established using an algebraic reasoning of considerable logical depth. Therefore a refinement of the building blocks obtained in the Krohn-Rhodes Theorem is not possible, so that getting an overview of all possible finite automata is, within the framework of this theory, tightly connected with the classification of simple finite groups.
⁹⁵ The classification of simple groups has been worked on for about 70 years. First whole series of such groups were found, but later also individual groups of very high order. For example, at a meeting in Ireland (Galway) in 1973 about the application of computers in algebra, M. Hall treated a group of order 460 815 505 920, whose simplicity was to be decided by a computer. Indeed, an algorithm can be given for deciding the simplicity of a group from its Cayley table. A deficiency of such an approach is apparently that we lack a criterion to determine whether the list already composed contains all simple groups or not. The question of the existence of such a criterion is difficult. Translator’s Note. The problem of the classification of simple groups is now settled. See Gunnar Traustason’s comments to Section 4 in this Chapter, and the references indicated there.
10. At an international meeting devoted to universal algebras and their applications, held in Potsdam in 1970, Samuel Eilenberg indicated a new road to the Krohn-Rhodes result. His approach has in automata theory about the same effect as the passage from equations to the study of extensions of fields in Galois theory; in both cases it leads to a widening and a clarification of the theory. The ideas of automata theory have found widespread application in the creation of formalized methods for designing computers, and likewise in the solution of theoretical problems in programming. Let us especially mention that it was the concepts arising from the algebraic decomposition theory of Krohn-Rhodes that made it possible for R. Kalman, in the years 1962-67, to carry out an “algebraic reform” in the theory of linear dynamic systems. This brought that branch of optimal control theory into especially sharp relief and makes possible a rapid improvement and widening of the theory in the future. Undoubtedly, major progress in algebra has always been related to possibilities for an inner development of its theories. Moreover, it is necessary to use opportunities to apply the obtained results outside the traditional borders. Analytical cybernetics offers an excellent opportunity here. One cannot hope that all the branches of algebra that could turn out to be necessary in the course of such research have already been created. Also, it does not seem probable today that the necessary apparatus ought to be purely algebraic, although it is true that algebra always plays a fundamental role in all kinds of structural theories. One should rather treat the existing algebraic theories and facts as the basic material in the creation of a language that will be adequate for a mathematical description of complex automata.
References
[1] N. Basov and O. Krohin. Laser-71. Izvestiya, Feb. 12, 1974.
[2] H. Dreyfus. What computers can’t do: the limits of artificial intelligence. Harper & Row, New York, 1972.
[3] V. Gluškov. Abstract theory of automata. Usp. Mat. Nauk 5, 1961, 3–62.
[4] M. Gross and A. Lantin. Notions sur les grammaires formelles (The theory of formal grammars). Gauthier-Villars, Paris, 1967. Russian translation: “Mir”, Moscow, 1971.
[5] K. Krohn and J. Rhodes. Algebraic theory of machines. I. Prime decomposition theorem for finite semigroups and machines. Trans. Am. Math. Soc. 116, 1965, 450–464.
[6] W. McCulloch and W. Pitts. A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys. 5, 1943, 115–133.
[7] P. Rashevskiĭ. On the dogma of the natural number system. Usp. Mat. Nauk 28 (4), 1973, 243–246.
[8] H. P. Zeiger. Cascade decomposition of automata using covers. In: M. A. Arbib (ed.), Algebraic Theory of Machines, Languages, and Semigroups. Academic Press, Netherlands, 1968, 55–80.
8. [K93c] Mordell’s problem. Comments by G. Almkvist
In this paper we describe two results which have become widely known in the past 15 years, crowning the efforts of numerous mathematicians since the times of Pierre Fermat (1601-1665). At the same time we try to present the notions which made it possible to arrive at this milestone. Regretfully, however, major events in mathematics resemble high mountain peaks: even once they have been climbed, most of them remain inaccessible to all but the very few who possess the special equipment and training needed to reach these heights. Therefore it is natural that, although people often speak romantically about the peaks of the “mathematical mountain range”, they avoid mentioning the techniques and paths leading to the goals. About the latter it is also exceedingly difficult to speak and, what is even worse, the narrative becomes fragmentary and, being so hard to understand, creates only displeasure. The author was, however, encouraged by many people (among them not only mathematicians) who nevertheless want to know more about the results of Pierre Deligne and Gerd Faltings. This paper is based on material presented in the first half of my talk “On old and new problems in Discrete Mathematics” at a meeting of Estonian mathematicians in Saaremaa.⁹⁶ In streamlining it I was very much helped by pertinent remarks made by Docent R. Prank and, in particular, by Professor Ülo Kaasik. It is hardly necessary to add that it is impossible to eliminate all its shortcomings, for which, of course, the author blames no one but himself.
8.1. The algebra of Fermat’s equation. 1. Since time immemorial the main objects of study in mathematics have been numbers and equations. The simplest are the algebraic equations, and much more complicated the Diophantine equations. During the past centuries new objects of study have enriched mathematics: functions have been added, and differential and functional equations. The properties of integers and Diophantine problems have attracted mathematicians since the time of Hammurabi. For instance, already in those days one knew the equation x² + y² = z², which has the solution (3, 4, 5), and also all triples of the form (3n, 4n, 5n), where n ∈ N.⁹⁷ Triples of integers (a, b, c) having only the number one as a common divisor and satisfying the relation a² + b² = c² are called simple Pythagorean triples. Let m ≤ n be natural numbers without a common divisor and of different parity. Putting the relation a² + b² = c² in the form (a/c)² + (b/c)² = 1, we observe that to each triple (a, b, c) there corresponds a solution of the equation x² + y² = 1 in rational numbers (a Q-solution), and, apparently, also more or less conversely.
⁹⁶ Translator’s note. Saaremaa (Swedish or German: Ösel, Latin: Osilia), a big island in the Baltic Sea, belonging to Estonia.
⁹⁷ Here and in the sequel we use the following notation: N = the set of (all) natural numbers, Z = the set of integers, Q = the set of rational numbers, R = the set of real numbers, C = the set of complex numbers.
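[Fig. 20. The unit circle in the (x, y)-plane with the point A(0, −1) and the second intersection point B_λ(x_λ, y_λ) of the circle with a line through A.]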
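The correspondence between lines through A(0, −1) with rational slope and rational points of the unit circle, as depicted in Figure 20, can be tried out numerically. The following sketch assumes the parametrization x_λ = 2λ/(λ² + 1), y_λ = (λ² − 1)/(λ² + 1), obtained by substituting y = λx − 1 into x² + y² = 1; the function names are illustrative only.

```python
from fractions import Fraction
from math import gcd

def point_on_circle(slope):
    """Second intersection of the line y = slope*x - 1 with the circle x^2 + y^2 = 1."""
    lam = Fraction(slope)
    x = 2 * lam / (lam**2 + 1)
    y = (lam**2 - 1) / (lam**2 + 1)
    return x, y

def triple_from_slope(slope):
    """Integer triple (a, b, c) with a/c = x and b/c = y, cleared of common factors."""
    x, y = point_on_circle(slope)
    # least common multiple of the two denominators
    c = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    a, b = int(x * c), int(y * c)
    g = gcd(gcd(a, b), c)
    return a // g, b // g, c // g
```

For instance, the slope λ = 2 leads to the point (4/5, 3/5) and hence to the triple (4, 3, 5), while λ = 3/2 leads to (12, 5, 13).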
Thus it suffices to find all Q-solutions of the equation x² + y² = 1. They can be found by the so-called method of pulverization (see Figure 20): through the point A(0, −1) one draws the straight lines y = λx − 1 with rational slope λ ∈ Q and finds their intersection B_λ(x_λ, y_λ) with the unit circle x² + y² = 1. Note that
x_λ = 2λ/(λ² + 1) and y_λ = (λ² − 1)/(λ² + 1)
are rational numbers. In this way we obtain all Q-solutions of the equation x² + y² = 1 other than A itself, for any rational point B_λ(x_λ, y_λ) determines a line AB_λ with the rational slope (y_λ + 1)/x_λ. 2. The preceding serves us as an example of a Diophantine problem. The majority of these problems have a long history, but all of them reduce to the solution of a Diophantine equation (or, at least, of a system of such equations) or else are closely connected with one. A Diophantine equation can be presented in the form F(x_1, . . . , x_n) = 0, where F is a polynomial with integer coefficients and n ≥ 2; one usually seeks its solutions in integers or rational numbers. These equations get their name from Diophantus, who was one of the greatest mathematicians of antiquity. He lived and flourished in Alexandria in the 3rd century A.D. After the death of Alexander the Great, this city in the delta of the Nile had become the capital of Egypt. In Alexandria the Museon (a university, in contemporary terminology) was founded, and a library. Poets and writers were invited to the city, and in its better times as many as 100 scientists worked there. Among them were Euclid, who wrote his Elementa; Eratosthenes, who excelled in many areas (for example, he is known for his method for finding prime numbers) and was the director of the library; Archimedes, who got his education there, and most of whose ideas became known through letters to scholars in Alexandria; and Apollonius, whose “Conic sections” paved the path for the later work of Kepler and Newton. Such was the ancient center of culture where Diophantus wrote his “Arithmetic”. 
Out of 13 chapters of the latter book [4] only 6 or 7 have survived, but despite this the treatise came to strongly influence the development of Mathematics.
Its true richness in ideas and content was appreciated only at the end of the 16th century (F. Viète, R. Bombelli), but especially in the 17th century. Diophantus’s text was translated into Latin by Claude Bachet (1621), whose own interest in numbers had been aroused by the solution of recreational (mathematical) problems. This translation, in which the number-theoretic comments are especially emphasized, fell into the hands of Pierre Fermat and, in the years 1636-1640, turned the latter’s mathematical interest ever more towards Diophantine problems. In his research Fermat came to a conjecture which afterwards was called “Fermat’s Last Theorem” (FLT): “No cube decomposes into the sum of two cubes, no fourth power decomposes into the sum of two fourth powers.” In other words, the Diophantine equation x^n + y^n = z^n admits, for n ≥ 3, no solutions in natural numbers. Fermat had proved [this in] the special case n = 4 (using the so-called method of descent), and he may have had the case n = 3 in its broad outline (as was carried out by Euler in 1753). Fermat frequently wrote to his colleagues about these special cases, but he never mentioned the general case again. (Around 1640 Fermat wrote a note in the margin of his copy of Diophantus’s book that he was in possession of a proof of FLT, but that it was far too long to be written down there; according to A. Weil’s version, Fermat might have thought so only in his youth.) With his intensive work, up to the year 1660, Fermat laid a solid basis on which Euler, Lagrange, and Gauss could later build the edifice of Number Theory. 3. In the following centuries, FLT was one of the most well-known problems in mathematics. The attempts to solve this problem using new techniques have enriched mathematics with quite fruitful notions and methods. In the case n = 3 Euler used in his proof of FLT numbers of the form a + b√−3, where a, b ∈ Z, and developed in an essential way the arithmetic of the domain Z[√−3]. 
This line of thought was continued by Gauss (1831), who introduced the domain G = {a + b√−1 ; a, b ∈ Z} = Z[√−1] = Z[i], that is, the arithmetic of the so-called Gaussian integers. In 1825 Legendre and Dirichlet established FLT for n = 5, while Lamé and Lebesgue did it for n = 7 in 1840. In fact, it suffices to prove the theorem for n = 4 and for n an odd prime. Indeed, each natural number n ≥ 3 is divisible either by 4 or by an odd prime. In the first case, writing n = 4m, one can rewrite x^n + y^n = z^n as (x^m)⁴ + (y^m)⁴ = (z^m)⁴; in the second case, writing n = pm, as (x^m)^p + (y^m)^p = (z^m)^p, where p is an odd prime. Now it is clear that the truth of FLT for n = 4 and for all primes n > 2 implies its truth for all n ≥ 3. 4. An essential step forward toward the solution of FLT was taken by the German mathematician Ernst Eduard Kummer (1850’s). He managed to find a condition from which the correctness of FLT follows for almost all primes n less than 100. Only in three cases (37, 59 and 67) did the issue remain open, because there Kummer’s condition does not work; the case n = 37 was solved later (1892). In this and the following four Subsections we shall learn about Kummer’s scheme of reasoning. In his proof of FLT for n = 3 Euler used the quadratic field Q(√−3), and in particular its subring Z[√−3], that is, properties of the ring of Eulerian integers. Euler noticed that if, for relatively prime a and b, the number a² + 3b² is a perfect cube, then it follows from the identity a² + 3b² = (a + b√−3)(a − b√−3) that both factors on the right-hand side are also perfect cubes in the ring Z[√−3]. In particular, there ought to exist c, d ∈ Z such that a + b√−3 = (c + d√−3)³. This would be justified if the fundamental theorem of arithmetic (uniqueness of the decomposition into prime factors) were true in Z[√−3] and the factors a + b√−3 and a − b√−3 were without a common factor; this is what our experience in ordinary arithmetic tells us. But the fundamental theorem of arithmetic fails in the ring Z[√−3]. Indeed, we have
4 = 2 · 2 = (1 + √−3) · (1 − √−3),
where a simple argument (by contradiction) shows that the factors 2, 1 + √−3, 1 − √−3 are indecomposable; for example, the number 2 does not have a representation in the form 2 = (e + f√−3)(g + h√−3) in which neither factor is ±1. 5. But there is a way out of this dilemma. Let ζ³ = 1, ζ ≠ 1; thus, ζ is a C-solution of the equation x² + x + 1 = 0, or to be concrete: ζ = (−1 + √−3)/2. We consider the domain of numbers Q[ζ] = {r + sζ | r, s ∈ Q}. As it is possible to carry out the four arithmetical operations within this set without violating the usual rules of calculation, in the guise of Q[ζ] we are dealing with a so-called number field Q(ζ). The subset E = Z + Z · ζ = {a + bζ | a, b ∈ Z} should, of course, be viewed as the “integers” of the field Q(ζ). In view of the identity a + bζ = ((2a − b) + b√−3)/2 these integers can be represented as (p + q√−3)/2, where p and q are ordinary integers of the same parity. It is somewhat more natural to consider the “integers” (p + q√−3)/2 ∈ E, and not to restrict oneself to the use of Euler’s integers. The reason is that the fundamental theorem of arithmetic holds true in E, but not in the ring of Eulerian integers. As an explanation we add that the identity
2 = (1 + √−3) · (1 − √−3)/2
does not contradict the fundamental theorem of arithmetic, as (1 − √−3)/2 is a unit in this ring (its inverse is also an “integer”), and so the identity shown expresses just the fact that the numbers 2 and 1 + √−3 are associated (one number is obtained from the other by multiplication with a unit). 
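The failure of unique factorization in Z[√−3] can be checked with the multiplicative norm N(a + b√−3) = a² + 3b². The sketch below represents a + b√−3 as the integer pair (a, b); this representation and the function names are assumptions made for illustration.

```python
def mul(u, v):
    """Product of a + b*sqrt(-3) and c + d*sqrt(-3), both given as integer pairs."""
    a, b = u
    c, d = v
    return (a * c - 3 * b * d, a * d + b * c)

def norm(u):
    """Multiplicative norm N(a + b*sqrt(-3)) = a^2 + 3*b^2."""
    a, b = u
    return a * a + 3 * b * b

def elements_of_norm(n):
    """All elements of Z[sqrt(-3)] with norm n (a finite search, since a^2 + 3b^2 = n)."""
    bound = int(n ** 0.5) + 1
    return [(a, b) for a in range(-bound, bound + 1)
                   for b in range(-bound, bound + 1) if norm((a, b)) == n]
```

The numbers 2, 1 + √−3 and 1 − √−3 all have norm 4, so a proper factor of any of them would have norm 2. Since `elements_of_norm(2)` is empty and the only elements of norm 1 are ±1, all three are indecomposable, while `mul((2, 0), (2, 0))` and `mul((1, 1), (1, -1))` both give (4, 0).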
It follows from the validity of the fundamental theorem of arithmetic in the domain E, together with the immediate verification that the Eulerian integers a + b√−3 and a − b√−3 have no common factor in the ring E (that is, no common factor there distinct from a unit), that there exists a number c + d√−3 ∈ E such that a + b√−3 = (c + d√−3)³. It may then happen that c + d√−3 is a number of the form (p + q√−3)/2, where p and q are both odd; then c and d are not integers. Writing c + d√−3 = u + vζ with u, v ∈ Z, we get from the identity u + vζ = ((2u − v) + v√−3)/2 that p = 2u − v and q = v. Thus, if v is even, then p and q are also even, and, as a consequence, u + vζ is an Eulerian integer. We observe further that among the three mutually associated numbers u + vζ, (u + vζ) · ζ and (u + vζ) · ζ² at least one has an even coefficient of ζ. It follows that, after multiplying c + d√−3 by ζ or ζ² if necessary (which does not change its cube, since ζ³ = 1), we may take c and d to be integers.
6. Next, let us consider the relation x^p + y^p = z^p, where p ≥ 3 is a prime. Accordingly, let ζ be a p-th root of unity other than unity: ζ^p = 1, ζ ≠ 1. Then 1 + ζ + · · · + ζ^{p−1} = 0, so that we have the relation
x^p + y^p = (x + y)(x + ζy)(x + ζ²y) . . . (x + ζ^{p−1}y).
This suggests using the ring of so-called algebraic integers
E_p = {a + bζ + cζ² + . . . + dζ^{p−1} | a, b, . . . , d ∈ Z}
contained in the number field Q(ζ); this ring is also denoted Z[ζ]. The arithmetic of [ordinary] integers is based on the notion of integers and the fundamental theorem of arithmetic. It follows from the latter that if AB . . . F = L^p and the factors A, B, . . . , F do not have a common factor, then they must likewise be p-th powers. Are such statements true in number rings other than the ordinary integers? Sometimes it is so, for instance in the case of the Gaussian integers. It is also so in the rings Z[ζ], where
ζ = cos(2π/p) + i sin(2π/p);    p ∈ {3, 5, 7, 11, 13, 17, 19},
in which case one has a unique decomposition into prime factors. On the other hand, in the number rings Z[√−3] and Z[√−5] unique factorization fails, as
4 = 2 · 2 = (1 + √−3)(1 − √−3) and 9 = 3 · 3 = (2 + √−5)(2 − √−5).
For example, although the factors 2 ± √−5 in the last product have no common divisor other than units, they cannot be represented as squares of numbers a + b√−5.
7. Hilbert found a simple model explaining the difficulties in what was just said, and indicating also a way to overcome them. Let us consider the domain of numbers H = {4n + 1 | n = 0, 1, 2, . . . } = {1, 5, 9, 13, 17, 21, 25, . . .}. A number p ∈ H is called a “prime number” if it cannot be represented in the form p = a · b, where a, b ∈ H and a ≠ 1, b ≠ 1. For example, 21 turns out to be prime in “H-arithmetic”: although 21 = 3 · 7, one has 3 ∉ H and 7 ∉ H. We note that 693 = 9 · 77 = 21 · 33 are two distinct decompositions of 693 ∈ H into “primes”, and that from the equalities 21² = 9 · 49 and GCD(9, 49) = 1 it does not follow that the numbers 9 and 49 can be presented as squares of H-numbers.
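Hilbert's example is small enough to be verified by direct search. The following sketch (with illustrative function names, not taken from the text) lists all decompositions of a number of H into “primes” of H.

```python
def is_h(n):
    """Membership in H = {1, 5, 9, 13, ...}."""
    return n >= 1 and n % 4 == 1

def is_h_prime(n):
    """True if n > 1 lies in H and is not a product of two H-numbers different from 1."""
    if not is_h(n) or n == 1:
        return False
    return not any(n % a == 0 and is_h(n // a) for a in range(5, n, 4))

def h_factorizations(n, smallest=5):
    """Yield the decompositions of n into H-primes as non-decreasing tuples."""
    if n == 1:
        yield ()
        return
    for p in range(smallest, n + 1, 4):      # candidates 5, 9, 13, ... all lie in H
        if n % p == 0 and is_h_prime(p):
            for rest in h_factorizations(n // p, p):
                yield (p,) + rest
```

Calling `list(h_factorizations(693))` gives the two decompositions (9, 77) and (21, 33), while `list(h_factorizations(441))` gives (9, 49) and (21, 21), in agreement with the equality 21² = 9 · 49.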
A way out is the following. We extend the set H to the “domain of numbers” H̃ = {4n + 1, 4n + 3 | n = 0, 1, 2, . . . } = {1, 3, 5, 7, . . . }, in which all the rules of ordinary Z-arithmetic hold true (one has to take into account that all even numbers have been omitted). For example, one now has 9 = 3², 77 = 7 · 11, 21 = 3 · 7, 33 = 3 · 11, so that the two distinct decompositions of the number 693, to wit 9 · 77 and 21 · 33, reduce to a single one, 3² · 7 · 11; the other difficulties disappear as well.
8. Is it possible to extend the domain Z[√−5] by adding to it new so-called “ideal numbers” in such a way that the fundamental theorem of arithmetic becomes valid in the new domain? Kummer showed that this can really be done; indeed, not only in the case of Z[√−5], but in many other cases as well. In this way mathematicians arrived at the notion of ideal numbers in the middle of the 19th century. Later Richard Dedekind, from a set-theoretic point of view, changed them into the often used “ideals” (for instance, in ring theory). In the work of Kummer, Dedekind, Kronecker, and others there arose a new, general theory of division in domains of numbers: the theory of algebraic numbers (for more details see [13]).
The conditions found by Kummer, which were referred to in our discussion of FLT, have not lost their importance even today (footnote 98). Basing himself on them and using computers, S. Wagstaff showed that FLT is true for all primes $p \le 125\,000$ [20]. Let us add that to write down the number $2^{125000}$ one requires 37 629 digits, so that finding a counterexample to FLT is a rather hopeless task! Even more, based on results by G. Faltings (1983), D. Heath-Brown proved that FLT is true for almost all exponents $n$ [5]. In other words, “bad” exponents, if they exist, must appear very “seldom”. More precisely, if we denote by $N(c)$ the number of bad exponents $n$ not exceeding $c$, i.e.,
\[ N(c) = |\{ n \mid n \le c \text{ and FLT is not true for } n \}|, \]
then $N(c)/c \to 0$ as $c \to \infty$. But so far it is not known if there are infinitely many “good” exponents, or not.
9. In what respect do the two well-known number domains $\mathbb{Z}$ and $\mathbb{Q}$ differ from each other? How to express the common features of the number domains considered above? If we apply addition, subtraction and multiplication to the integers, we again obtain integers. Thereby addition and multiplication are associative and commutative, these two are connected via the distributivity law, while addition is supplemented by subtraction (the opposite of addition); therefore, in the case of $\mathbb{Z}$, we have to deal with a ring. But division is not always possible in $\mathbb{Z}$. In the domain of rational numbers the situation is different: the ring $\mathbb{Q}$ is a field, that is, a commutative ring in which for all $a \neq 0$ the equation $ax = b$ has a unique solution. Also the real numbers, and likewise the complex ones, form fields, but also suitable subsets of these fields $\mathbb{R}$ and $\mathbb{C}$ are fields: for example, $\mathbb{Q}[\sqrt{2}] = \{a + b\sqrt{2} \mid a, b \in \mathbb{Q}\}$ or $\mathbb{Q}[i] = \{a + bi \mid a, b \in \mathbb{Q}\}$; the latter field contains the ring of Gaussian integers $\mathbb{Z}[i]$. A typical example of a finite field is the so-called residue class field $\mathbb{Z}_p$, where $p$ is a fixed prime. The elements of this field are the classes of integers $\bar{0}, \bar{1}, \bar{2}, \dots, \overline{p - 1}$; these are obtained by putting into one and the same class all integers which give the same remainder upon division by $p$. The operations with classes are defined by the formulae
\[ \bar{m} + \bar{n} = \begin{cases} \overline{m + n} & \text{if } m + n < p, \\ \overline{m + n - p} & \text{if } m + n \ge p, \end{cases} \qquad \bar{m} \cdot \bar{n} = \bar{r}, \]
where $r = mn - pt$, $0 \le r < p$.
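The formulae for the operations in $\mathbb{Z}_p$ translate directly into a few lines of Python. This is a minimal sketch in which classes are represented by their remainders $0, 1, \dots, p - 1$; the helper names `add_mod`, `mul_mod` and `solve` are hypothetical and chosen for this illustration only.

```python
def add_mod(m, n, p):
    """Sum of the classes of m and n in Z_p (0 <= m, n < p), by the two-case formula."""
    s = m + n
    return s if s < p else s - p

def mul_mod(m, n, p):
    """Product of the classes: the remainder r = m*n - p*t with 0 <= r < p."""
    return (m * n) % p

def solve(a, b, p):
    """All x in Z_p with a*x = b; for prime p and a != 0 there is exactly one."""
    return [x for x in range(p) if mul_mod(a, x, p) == b]

p = 7
print(add_mod(5, 4, p), mul_mod(5, 4, p))   # 2 6
print(solve(3, 1, p))                        # [5], since 3*5 = 15 = 2*7 + 1
print(solve(2, 1, 6))                        # []: modulo 6 we get a ring, not a field
```

The last line shows why the modulus has to be prime: modulo 6 the equation $2x = 1$ has no solution at all.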
More generally, one can view a finite field as a factor ring $\mathbb{Z}_p[x]/(g(x))$, whose elements are equivalence classes of polynomials with coefficients in the field $\mathbb{Z}_p$; here $g(x)$ is a fixed polynomial of degree $m$, assumed to be irreducible over $\mathbb{Z}_p$. Two polynomials are considered to lie in the same class if their difference is divisible by $g(x)$. The operations on this set of classes are as usual defined with the help of their representatives. It turns out that for each prime $p$ and each natural number $m$ this construction gives (up to isomorphism) a unique field, denoted by $\mathbb{F}_{p^m}$; there are no finite fields beyond the described series $\{\mathbb{F}_{p^m} \mid m \in \mathbb{N},\ p \text{ a prime number}\}$. Finite fields play an important role in Number Theory and in Diophantine Analysis: for example, one can replace a congruence $F(x, \dots) \equiv 0 \bmod p^m$ by the equation $F(x, \dots) = 0$ in the field $\mathbb{F}_{p^m}$, etc.
Let there be given a field $K$. Any field $E$ containing this field $K$ as a subfield is called an extension of $K$ and is denoted $E/K$: thus $\mathbb{C}/\mathbb{R}$ or $\mathbb{R}/\mathbb{Q}$ or again $\mathbb{C}/\mathbb{Q}$ are extensions. An extension $E/K$ is said to be finite if $E$ can be regarded as a finite dimensional vector space over the ground field $K$. Finite extensions of the field of rational numbers $\mathbb{Q}$ are called algebraic number fields. From the previous discussion it may be inferred that algebraic number fields play a particular role in Diophantine Analysis. A deeper, but still sufficiently readable account of the themes considered in this first half of our paper can be found in the book [13].
Footnote 98. Translator’s note. Let us recall that the paper was written around 1988.
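The factor-ring construction of finite fields described above can be tried out on the smallest non-prime case. The sketch below builds the nine classes of $\mathbb{Z}_3[x]/(x^2 + 1)$ and multiplies them via representatives. The choice $g(x) = x^2 + 1$ (irreducible over $\mathbb{Z}_3$, because $-1$ is not a square modulo 3), the tuple representation and the function name `mul` are assumptions made for this illustration.

```python
from itertools import product

def mul(u, v, g, p):
    """Multiply the classes of u and v in Z_p[x]/(g(x)).

    Polynomials are coefficient tuples, lowest degree first; g must be monic
    of degree m, and u, v must have length m.
    """
    m = len(g) - 1
    prod = [0] * (2 * m - 1)
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            prod[i + j] = (prod[i + j] + ui * vj) % p
    # Reduce modulo g: cancel the top coefficient by subtracting c * x^(d-m) * g(x).
    for d in range(2 * m - 2, m - 1, -1):
        c = prod[d]
        for k in range(m + 1):
            prod[d - m + k] = (prod[d - m + k] - c * g[k]) % p
    return tuple(prod[:m])

p, g = 3, (1, 0, 1)                  # g(x) = 1 + x^2 over Z_3
elements = list(product(range(p), repeat=len(g) - 1))
one = (1, 0)
inverses = {u: [v for v in elements if mul(u, v, g, p) == one] for u in elements}

print(len(elements))                 # 9
print(mul((0, 1), (0, 1), g, p))     # (2, 0): the class of x squares to -1 = 2
```

Every non-zero class turns out to have exactly one inverse in `inverses`, which is the field property; replacing `g` by a reducible polynomial such as $x^2 - 1$ would destroy it.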
8.2. The geometry of Diophantine equations.
Greater clarity in the problems of Diophantine Analysis was created during the course of the 19th century by Algebraic Geometry, a dynamically developing subject. This discipline studies algebraic varieties. To each Diophantine equation one can associate a geometric object, a variety, whose points can be interpreted as solutions of the given equation(s). In what sense is such a geometric point of view better than the purely arithmetical methods of Diophantine Analysis? The advantage manifests itself in the fact that with an algebraic variety one has to deal with the interplay of a whole series of algebraic and topological structures: it is a topological space (even in several topologies), an analytical space, a Lie group etc. These structures have been intensively studied in the course of time, and deep results have been obtained which, used together with arithmetical considerations, put many notions of Diophantine Analysis in a new light. This approach allows one to classify Diophantine problems according to the invariants of the variety. An example of such an invariant is the dimension of the variety. The Diophantine problems which interest us most here are mainly connected with one-dimensional algebraic varieties, traditionally called algebraic curves.
10. The geometric interpretation of an equation can lead to surprises. For example, consider the two equations $x^2 + y^2 = 1$ and $x^3 + y^3 = 1$, which superficially differ little from each other, and let us interpret them as curves in the $\mathbb{R}$-plane (that is, a plane with real coordinates). Then they give two quite different pictures (see Figure 21).
[Fig. 21: the real curves $E : x^2 + y^2 = 1$ and $E : x^3 + y^3 = 1$ in the $(x, y)$-plane.]
The equation $x^2 + y^2 + 1 = 0$ surprises even more: there exist “curves” without $\mathbb{R}$-points! In order to get rid of this inconvenience one allows oneself to look for points of the corresponding curve in the $\mathbb{C}^2$-plane (that is, with coordinates in the extension $\mathbb{C}/\mathbb{R}$). For example, if we in the case of the curve under view set $x = x_1 + i x_2$ and $y = y_1 + i y_2$, we get relations connecting the $\mathbb{R}$-quantities $x_1, x_2, y_1, y_2$, which in the case of the equation $x^2 + y^2 = 1$ define a sphere in $\mathbb{R}^4$, but for $x^3 + y^3 = 1$ a torus in the same space (see Figure 22, where these surfaces are depicted in the usual space).
[Fig. 22: a sphere and a torus.]
In Diophantine geometry one often encounters situations when the coefficients of the equation determining the curve are from one domain (the field $K$) but the coordinates of the sought points have to be taken from another domain (an extension $L/K$ of $K$).
11. In the geometric interpretation of the solution of a Diophantine equation one requires the notion of projective space. Let us fix a field $K$. The points of the $n$-dimensional affine space $\mathbb{A}^n(K)$ can then be identified with sequences $(x_1, \dots, x_n)$ in the set $K^n$. The projective space $\mathbb{P}^n(K)$ is now obtained as follows. We denote by $(K^{n+1})^*$ the set of all sequences $(x_0, x_1, \dots, x_n)$, omitting the origin $(0, 0, \dots, 0)$. We partition this set in such a way that we regard the points $(x_0, x_1, \dots, x_n)$ and $(y_0, y_1, \dots, y_n)$ in $(\mathbb{A}^{n+1}(K))^*$ to lie in the same class if there exists $a \in K$ ($a \neq 0$) such that $x_0 = a y_0$, $x_1 = a y_1$, …, $x_n = a y_n$. The set of classes thus obtained is called the $n$-dimensional projective space $\mathbb{P}^n(K)$. These equivalence classes may be viewed as straight lines through the origin in the affine space $\mathbb{A}^{n+1}(K)$. In the special case when $K = \mathbb{R}$ and $n = 2$, we obtain the (ordinary) real projective plane.
Let there be fixed a set $I$ of natural numbers and let there be given for each $i \in I$ a polynomial $F_i(x_1, \dots, x_n)$ with coefficients in $K$. The point set in $\mathbb{A}^n(L)$ defined by the system $F_i(x_1, \dots, x_n) = 0$, where $i \in I$ and $L/K$ is a suitable extension of the ground field $K$, is called an affine variety. A system of equations $F_i(x_0, x_1, \dots, x_n) = 0$, where all polynomials $F_i$ are forms (that is, homogeneous polynomials over $K$), determines a subset $M$ in the projective space $\mathbb{P}^n(L)$, called a projective algebraic variety. As points in the projective space $\mathbb{P}^n(L)$ are rays through the origin in $\mathbb{A}^{n+1}(L)$, we may view $M$ as a cone in $\mathbb{A}^{n+1}(L)$. The field $K$ is the field of definition of $M$. The points $(x_0, x_1, \dots, x_n)$ in $M \subset \mathbb{P}^n(L)$ such that all quotients $x_i/x_j \in K$ are called its rational points; their set will be denoted by $M(K)$. The answer to the question about the structure and the properties of the point set $M(K)$ is of paramount interest in the study
of the corresponding Diophantine system. In the case of algebraic curves these questions were studied very carefully, yielding also decisive progress towards the solution of FLT.
12. Consider an equation $F(x, y) = 0$ such that the left hand side is a polynomial of the form $a_m(x) y^m + a_{m-1}(x) y^{m-1} + \dots + a_1(x) y + a_0(x)$, where all $a_i(x)$ are polynomials in $x$ with real coefficients. Selecting in the plane $\mathbb{A}^2(\mathbb{R})$ all points $(x, y)$ whose coordinates satisfy the equation $F(x, y) = 0$ we obtain a certain curve. Thus the equation $y^2 + x^2 - 1 = 0$ defines a circle with center $(0, 0)$. Next, consider a curve $E$ with equation
\[ \sum_{i,j} a_{ij} x^i y^j = 0, \]
where all $a_{ij}$ are integers. We denote the set of $\mathbb{Z}$-points of this curve by $E(\mathbb{Z})$, and the set of its $\mathbb{Q}$-points by $E(\mathbb{Q})$. Let further $o(E)$ denote the order of the curve, that is, the maximum of $i + j$ for the monomials $a_{ij} x^i y^j$ appearing in the equation. In 1929, Carl Ludwig Siegel proved an important theorem to the effect that, on any curve of degree higher than two, there are at most finitely many $\mathbb{Z}$-points. However, it is not easy to decide if $E(\mathbb{Z}) = \emptyset$ or not; there is no algorithm for this. The question about the existence of an algorithm for deciding whether $E(\mathbb{Q}) = \emptyset$ or not is entirely open. Before the work of L. J. Mordell (in the 1920’s) the following was known about $E(\mathbb{Q})$. First, if $o(E) = 1$, then $|E(\mathbb{Q})| = \infty$. Second, if $o(E) = 2$, then either $E(\mathbb{Q}) = \emptyset$ or $|E(\mathbb{Q})| = \infty$. Third, in the case $o(E) = 3$ already Diophantus knew that a line through two $\mathbb{Q}$-points of $E$ must intersect $E$ in a third $\mathbb{Q}$-point. For these curves Poincaré’s conjecture became known (1903), according to which one can find all $\mathbb{Q}$-points from a certain finite set, using the following geometric procedure: one has to draw all possible chords among the points of a given finite set and the tangents in these points, which by intersecting the curve generate new $\mathbb{Q}$-points (starting with this extended set of $\mathbb{Q}$-points one draws anew all chords and tangents etc.). This conjecture by Poincaré was proved (1922) by the British mathematician Louis Joel Mordell. According to Mordell’s theorem one can present the Abelian group of the rational points on the elliptic curve $E$ as $E(\mathbb{Q}) = \mathbb{Z}^a \oplus V_E$, where $V_E$ is a finite group. The number $a$ is called the rank of the curve. So far it is not known if there exist elliptic curves of arbitrarily large rank, but computer experiments have shown how the rank depends on the coefficients of the cubic equation defining $E$. The group $V_E$, the torsion of $E$, is made up of its points of finite order. It turns out that either $V_E$ is a finite cyclic group or else it has the form $\mathbb{Z}_2 \oplus T$ with $T$ finite cyclic. B. Mazur showed, in 1976, that either $|V_E|$ is one of the numbers $1, 2, \dots, 10$ or 12, or else $V_E = \mathbb{Z}_2 \oplus T$, where $|T|$ equals 2, 4, 6 or 8. Thus there are 15 possibilities for $V_E$. The deeper reasons for the mystery of these numbers are so far unknown! However, one has obtained hopes for proving the analogue of Mazur’s theorem for an arbitrary number field $K/\mathbb{Q}$, because recently it was found that the torsion of the group $E(K)$ of $K$-rational points is finite. A higher dimensional generalization of the Poincaré-Mordell conjecture was proved, in 1927, by the French-American mathematician André Weil. It is amazing that the arithmetic of Diophantine equations is to a large extent governed by the geometry of the point set $E(\mathbb{C})$ for the corresponding curve $E$. Indeed, $E(\mathbb{C})$ is a certain compact 2-dimensional surface in $\mathbb{R}^4$, called the Riemann surface of
the curve and turns out to be topologically equivalent to a “sphere with handles”. Here the number $g$ of “handles” equals $B_1/2$, where $B_1$ is the first Betti number of $E(\mathbb{C})$; the number $g$ is called the genus of the curve $E$.
A curve of genus $g = 0$ is called rational. These are the straight lines (curves of order one) and the second order curves, and in some sense the list of rational curves stops with the ones mentioned. In this case $E(\mathbb{C})$ is the Riemann sphere, which is a surface admitting a Riemannian metric of constant positive curvature.
Curves of genus $g = 1$ are called elliptic. Each such curve is birationally equivalent to a non-singular cubic curve, and the equation of the latter can be put in the form $y^2 = x^3 + ax + b$, where $a, b \in \mathbb{Z}$. The Riemann surface corresponding to such a curve is a torus which allows a flat metric induced by $\mathbb{C}^2$. The set of $\mathbb{Q}$-points (if it is non-empty) allows the structure of an Abelian group and this group has a finite number of generators (the so-called Mordell-Weil theorem).
Curves of genus $g > 1$ are called non-elliptic. For instance, the so-called Klein curve $y^3 + y x^3 + x = 0$ has genus 3. All Fermat curves $x^n + y^n = 1$ with $n \ge 4$ are likewise non-elliptic. The genus of such curves equals $(n - 1)(n - 2)/2$. In this case the curve carries a Riemannian metric of constant negative curvature. The set of $\mathbb{Q}$-points on a non-elliptic curve is finite; more generally, if $K$ is a number field (that is, the extension $K/\mathbb{Q}$ is finite), then $E(K)$ is finite. This statement became known, in 1922, as Mordell’s conjecture, but after 1983 it is called Faltings’ theorem. The story of the origin of this result and some of its later consequences will be treated in the following section.
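The chord and tangent procedure of Poincaré and Mordell, described above for cubic curves, is easy to carry out with exact rational arithmetic. The following Python sketch works on a curve in the form $y^2 = x^3 + ax + b$; the sample curve $y^2 = x^3 + 17$ with the points $(-2, 3)$ and $(-1, 4)$ is our own choice of illustration and does not come from the text, and the function names are hypothetical.

```python
from fractions import Fraction

def on_curve(P, a, b):
    """True if P = (x, y) satisfies y^2 = x^3 + a*x + b."""
    x, y = P
    return y * y == x**3 + a * x + b

def third_point(P, Q, a, b):
    """Third intersection with y^2 = x^3 + a*x + b of the chord through P and Q
    (or of the tangent at P when P == Q); None if that line is vertical."""
    x1, y1 = map(Fraction, P)
    x2, y2 = map(Fraction, Q)
    if (x1, y1) == (x2, y2):
        if y1 == 0:
            return None
        slope = (3 * x1 * x1 + a) / (2 * y1)     # slope of the tangent at P
    else:
        if x1 == x2:
            return None
        slope = (y2 - y1) / (x2 - x1)            # slope of the chord PQ
    # Substituting the line into the cubic shows that the x-coordinates of the
    # three intersection points add up to slope^2.
    x3 = slope * slope - x1 - x2
    y3 = y1 + slope * (x3 - x1)
    return (x3, y3)

a, b = 0, 17
P, Q = (-2, 3), (-1, 4)
R = third_point(P, Q, a, b)    # chord through P and Q
T = third_point(P, P, a, b)    # tangent at P
print(R == (4, 9), T == (8, 23))              # True True
print(on_curve(R, a, b), on_curve(T, a, b))   # True True
```

Since the slope is a quotient of rational numbers, the new point always has rational coordinates; this is the observation attributed to Diophantus in the text. Feeding the new points back into `third_point` produces further $\mathbb{Q}$-points in the manner of the conjecture.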
8.3. About the theorems of Deligne and Faltings
13. The problem of finding the $\mathbb{Z}$-solutions of the Diophantine equation $F(x, y) = 0$ has as its “finite” analogue the solution of the congruence $F(x, y) \equiv 0 \bmod p$. The latter can in turn be viewed as the solving of an equation, taking the components of the solutions in the number domain $\mathbb{Z}_p$. More generally, one can consider the congruences $F(x, y) \equiv 0 \bmod p^m$ and seek the solutions of the corresponding equation with components in an arbitrary (Galois) field $\mathbb{F}_q$ (we agree here and in what follows that $p$ is a fixed prime and write $q = p^m$). This line of thought was well-known already to C. F. Gauss. As the sets $\mathbb{Z}_p$ and $\mathbb{F}_q$ under view are finite, there arises the question of the number of the solutions of the equations. For example, the equation $y^2 + x^3 - 1 = 0$ has two $\mathbb{Z}_2$-solutions, three $\mathbb{Z}_3$-solutions and five $\mathbb{Z}_5$-solutions. In Table 3 below, the columns with a plus sign indicate the $\mathbb{Z}_3$-solutions of this equation.
Table 3
y        0  0  0  1  1  1  2  2  2
x        0  1  2  0  1  2  0  1  2
F = 0    -  +  -  +  -  -  +  -  -
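The counts quoted above, and the plus signs of Table 3, can be reproduced by simply trying all pairs. A minimal Python sketch (the function name `solutions` is ours):

```python
def solutions(p):
    """All pairs (x, y) in Z_p x Z_p with y^2 + x^3 - 1 = 0 in Z_p."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y + x**3 - 1) % p == 0]

for p in (2, 3, 5):
    print(p, len(solutions(p)), solutions(p))
# The three lines report 2, 3 and 5 solutions respectively.
```

For $p = 3$ the list is $(x, y) = (0, 1), (0, 2), (1, 0)$, which are exactly the three columns marked with a plus sign.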
We denote by $N_m$ the number of $\mathbb{F}_q$-solutions, where $q = p^m$, of the equation $F(x, y) = 0$. For simplicity we assume that the number $n$ divides $p - 1$, where $p$ is the prime fixed throughout the discussion. We agree also that the indices $j$ and $k$ vary in the set
$\{1, 2, \dots, n - 1\}$, however in such a way that their sum is not $n$: the number of such pairs of indices $(j, k)$ equals $(n - 1)(n - 2)$. In the case of the Fermat equation $x^n + y^n = 1$ one has the identity
\[ N_m = p^m + 1 - \sum_{j,k} J(j/n, k/n)^m, \tag{114} \]
where the $J(j/n, k/n)$ are the so-called Jacobi sums. In order to clarify their significance, we choose a generator $\varepsilon$ of the multiplicative group $\mathbb{Z}_p^*$ (that is, a primitive root modulo $p$) and consider the maps
\[ \chi_{j/n} : \mathbb{Z}_p^* \to \mathbb{C}, \quad \text{with} \quad \chi_{j/n}(\varepsilon^r) = e^{2\pi i r j / n}. \]
With the help of these so-called multiplicative characters we can now define
\[ J(j/n, k/n) = -\sum_{x \in \mathbb{Z}_p} \chi_{j/n}(x) \cdot \chi_{k/n}(1 - x); \]
here one has to put $\chi(0) = 0$ for each character, thus also for a trivial one. For example, if $p = 7$ we have $J(1/6, 2/6) = -2 - i\sqrt{3}$.
14. In 1924, Emil Artin introduced for the numbers $N_m$ a generating function of the form
\[ Z_p(t) = \exp\Bigl( \sum_{m} \frac{N_m}{m} t^m \Bigr); \tag{115} \]
$Z_p(t)$ contains information about the numbers $N_m$, $m = 1, 2, \dots$, of solutions of the equation $F(x, y) = 0$. The series (115) has two good properties. First, if $N_m$ happens to come in the form $\alpha^m$ (for example, the number of $\mathbb{F}_q$-solutions of the equation $y = f(x)$ is precisely $q$, so one can take $\alpha = p$), then
\[ Z_p(t) = \exp\Bigl( \sum_{m} \frac{N_m}{m} t^m \Bigr) = e^{-\ln(1 - \alpha t)} = \frac{1}{1 - \alpha t}. \tag{116} \]
Second, if $N_m = N_m' + N_m''$ (for example, if $F = G \cdot H$ and $G(x, y) = H(x, y) = 0$ is not possible for any pair of elements $(x, y) \in \mathbb{F}_q \times \mathbb{F}_q$), then
\[ Z_p(t) = \exp\Bigl( \sum_{m} \frac{N_m}{m} t^m \Bigr) = \exp\Bigl( \sum_{m} \frac{N_m'}{m} t^m \Bigr) \cdot \exp\Bigl( \sum_{m} \frac{N_m''}{m} t^m \Bigr). \tag{117} \]
If $N_m = \alpha_1^m + \dots + \alpha_r^m - \beta_1^m - \dots - \beta_s^m$, where the $\alpha_j$ and $\beta_j$ are allowed to depend on the equation, but not on the index $m$, it follows from these properties that
\[ Z_p(t) = \frac{(1 - \beta_1 t) \cdots (1 - \beta_s t)}{(1 - \alpha_1 t) \cdots (1 - \alpha_r t)}, \tag{118} \]
that is, $Z_p(t)$ is in this case a rational function. For example, in the case of the Fermat equation $x^n + y^n = 1$ one has $\alpha_1 = 1$, $\alpha_2 = p$, and in the role of the $\beta$’s one has the Jacobi sums $J(j/n, k/n)$. Therefore
\[ Z_p(t) = \frac{\prod_{j,k} \bigl(1 - J(j/n, k/n)\, t\bigr)}{(1 - t)(1 - pt)}, \tag{119} \]
where in the numerator one has a polynomial of degree $(n - 1)(n - 2)$.
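The Jacobi sums and the identity (114) can be tested numerically in the smallest case mentioned in the text, $p = 7$ and $n = 6$. The sketch below is an illustration under stated assumptions: the generator $\varepsilon$ is taken to be the smallest primitive root modulo $p$ (for $p = 7$ this is 3; another generator may give the complex conjugate value), $N_1$ is read as the number of points of the projective curve $x^n + y^n = z^n$, points at infinity included, and all function names are ours.

```python
import cmath

def primitive_root(p):
    """Smallest generator of the multiplicative group Z_p^* (p an odd prime)."""
    for g in range(2, p):
        if len({pow(g, r, p) for r in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no generator found")

def character(j, n, p, eps):
    """chi_{j/n}: eps^r -> exp(2*pi*i*r*j/n), with chi(0) = 0; requires n | p - 1."""
    log = {pow(eps, r, p): r for r in range(p - 1)}
    def chi(x):
        x %= p
        return 0 if x == 0 else cmath.exp(2 * cmath.pi * 1j * log[x] * j / n)
    return chi

def jacobi(j, k, n, p, eps):
    """J(j/n, k/n) = -sum over x in Z_p of chi_{j/n}(x) * chi_{k/n}(1 - x)."""
    chi_j, chi_k = character(j, n, p, eps), character(k, n, p, eps)
    return -sum(chi_j(x) * chi_k(1 - x) for x in range(p))

p, n = 7, 6
eps = primitive_root(p)
print(jacobi(1, 2, n, p, eps))                 # approximately -2 - 1.732j

# Formula (114) for m = 1, compared with a direct count.
total = sum(jacobi(j, k, n, p, eps)
            for j in range(1, n) for k in range(1, n) if j + k != n)
N1_formula = p + 1 - total
affine = sum((x**n + y**n - 1) % p == 0 for x in range(p) for y in range(p))
at_infinity = sum((1 + y**n) % p == 0 for y in range(p))
print(round(N1_formula.real), affine + at_infinity)   # 12 12
```

The first printed value agrees with $J(1/6, 2/6) = -2 - i\sqrt{3}$; its absolute value is $\sqrt{7}$, as expected for a root of the numerator in (119).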
It is amazing that the arithmetic question about the number of $\mathbb{F}_q$-solutions is tightly connected with the geometry of the associated curve. In 1931, F. K. Schmidt proved that for a curve of genus $g$ one has
\[ Z_p(t) = \frac{\prod_{j=1}^{2g} (1 - \alpha_j t)}{(1 - t)(1 - pt)}; \]
here the numerator contains a polynomial of degree $2g$ with integer coefficients. Taking logarithms of both sides of this equality we obtain
\[ N_m = 1 + p^m - \sum_{j=1}^{2g} \alpha_j^m. \tag{120} \]
Using, further, the relation $|\alpha_j| = \sqrt{p}$ (the so-called Riemann hypothesis in the case of a curve over a finite field), it is seen from here that the Riemann hypothesis is equivalent to the statement
\[ \forall m \quad |N_m - 1 - p^m| \le 2g \sqrt{p^m}. \]
For elliptic curves (the case $g = 1$) this statement was first proved by Helmut Hasse in 1933.
15. André Weil gave (in 1940-41) a sketch for the proof of the Riemann hypothesis for curves of arbitrary genus $g$, reached this goal (1949) and, likewise, generalized the question to the case of varieties in higher dimension. Weil’s conjecture, in a slightly simplified form, amounted to proving that for arbitrary $p$ there exist complex numbers $\alpha_{kj}$ such that
\[ \forall m \in \mathbb{N} \quad N_m = \sum_{j=0}^{2d} (-1)^j \sum_{k=1}^{B_j} \alpha_{kj}^m, \qquad |\alpha_{kj}| = \sqrt{p^j}; \]
here $d$ is the dimension of the variety $X$ corresponding to the system of equations under view, the $B_j$ are the Betti numbers and $N_m$ is the number of $\mathbb{F}_q$-points of $X$. It would be more correct to speak of the Weil conjectures, as the exact original formulation consists of four different assertions (conjectures). One reason why the Weil conjectures are so interesting is that they directly connect the geometric properties of a curve (the variety $X(\mathbb{C})$) with its arithmetical properties. Among other things, it follows from them that the more complicated the geometry of the curve (the variety), the more of the numbers $N_m$ are needed for the determination of the remaining $N_m$. Of special interest is the case when a curve $E$ is given by the equation $y^2 = f(x)$. If the curve is elliptic ($g = 1$), the numerator in $Z_p(t)$ is a second order polynomial and Weil’s conjecture gives that
\[ Z_p(t) = \frac{1 - a_p t + p t^2}{(1 - t)(1 - pt)}; \]
the numerator of this fraction we denote from now on by $e_p(t)$,
\[ e_p(t) \overset{\text{def}}{=} 1 - a_p t + p t^2. \]
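For an elliptic curve, writing $e_p(t) = (1 - \alpha_1 t)(1 - \alpha_2 t)$ and applying (120) with $m = 1$ gives $a_p = \alpha_1 + \alpha_2 = 1 + p - N_1$, so the single count $N_1$ already determines every $N_m$. The Python sketch below illustrates this under our own assumptions: the sample curve $y^2 = x^3 + x + 1$ over $\mathbb{Z}_5$ is not taken from the text, $N_1$ is counted as the affine solutions plus one point at infinity, and the function names are hypothetical.

```python
def count_points(a, b, p):
    """N_1 for y^2 = x^3 + a*x + b over Z_p: affine solutions plus the point at infinity."""
    affine = sum((y * y - x**3 - a * x - b) % p == 0
                 for x in range(p) for y in range(p))
    return affine + 1

def point_counts(a, b, p, m_max):
    """Return a_p and [N_1, ..., N_{m_max}], using N_m = 1 + p^m - (alpha_1^m + alpha_2^m)."""
    N1 = count_points(a, b, p)
    a_p = 1 + p - N1
    s_prev, s = 2, a_p             # power sums s_0 = 2 and s_1 = alpha_1 + alpha_2
    counts = [N1]
    for m in range(2, m_max + 1):
        # alpha_1, alpha_2 are the roots of X^2 - a_p*X + p,
        # hence s_m = a_p * s_{m-1} - p * s_{m-2}.
        s_prev, s = s, a_p * s - p * s_prev
        counts.append(1 + p**m - s)
    return a_p, counts

a_p, counts = point_counts(1, 1, 5, 3)
print(a_p, counts)              # -3 [9, 27, 108]
print(a_p * a_p <= 4 * 5)       # True: the bound |a_p| <= 2*sqrt(p)
```

The recurrence avoids computing the complex numbers $\alpha_1, \alpha_2$ explicitly; only their sum $a_p$ and product $p$ are used.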
It follows from (120) that $a_p = 1 + p - N_1$, where $a_p = \alpha_1 + \alpha_2$. This result shows that, for the curve under view, the function $Z_p(t)$, and thereby all numbers $N_m$, $m > 1$, are determined by $N_1$.
16. If we know the function $Z_p(t)$ for all $p$, then we know all the numbers $N_{m,p}$. All this information yields also the following function of one complex variable $s$ (the Hasse-Weil function of the curve $E$):
\[ Z(E, s) = \prod_{p \text{ prime}} \frac{1}{e_p(p^{-s})} = \prod_{p \text{ prime}} \frac{1}{1 - a_p p^{-s} + p^{1 - 2s}}; \]
the product is convergent in the half-plane $\operatorname{Re} s > \frac{3}{2}$. One believes (the Taniyama-Weil conjecture) that for each elliptic curve $E$ one can continue $Z(E, s)$ to a meromorphic function in the entire complex $s$-plane and that the function obtained in this way satisfies some supplementary conditions (the so-called Weil conditions; see [7, pp. 142-143]). In the case that this conjecture is true, one can speak of the “critical” values $Z(E, s)$. It turns out that the behavior of $Z(E, s)$ at the point $s = 1$ depends on many arithmetical properties of the given elliptic curve over $\mathbb{Q}$. Thus one believes (part of the conjectures of B. Birch and H. Swinnerton-Dyer) that $Z(E, 1) = 0$ precisely when $E$ has infinitely many $\mathbb{Q}$-points. Finally, we remark that for the curve $X : y^2 = x^3 + x^2$ (which is not elliptic!) we have $e_p(t) = 1 - t$, so that
\[ Z(X, s) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}; \]
this is the ordinary Riemann zeta-function in Eulerian form.
Let us further add that it was the question of the truth of the conjecture of Birch and Swinnerton-Dyer on which J. Tunnell, in 1983, based the proof of his criterion for finding congruent numbers. Congruent numbers are those numbers which give the area of a right triangle with rational sides; finding them is likewise a Diophantine problem, known since the 10th century. For example, the number $6 = 3 \cdot 4/2$ is a congruent number. It is of interest to note that proving that the number 1 is non-congruent is equivalent to proving FLT in the case $n = 4$; see [7].
17. In proving his theorem Weil had to use various results in geometry of the Italian mathematicians, but in the general case one could not assert that these results had been proved in a convincing way. The attempts to give an adequate, strictly supported foundation to Weil’s plans led, in the 1950’s and 1960’s, to the creation of new theories in algebraic geometry. The man who paved the path to this was Alexandre Grothendieck, whose thirst for action, in the 1960’s, was almost inexhaustible. His style was to conquer a gorge by filling it. He tried to treat each notion in as general a way as possible; only those restrictions were taken into account whose necessity was forced by the mathematical situation. His work may be viewed as a far reaching generalization of the analytic geometry of Descartes, where the real numbers are replaced by the elements of an arbitrary commutative ring. With the aid of the so-called étale cohomology devised by Grothendieck it became possible to interpret the numbers $\alpha_{kj}$ in a way on which Deligne later based his
proof. Grothendieck’s achievements were recognized by the mathematical community when he was given the Fields Medal at the Moscow ICM in 1966.
In the 1960’s one began to have an inkling that there existed a connection between the Weil conjectures and a problem of Ramanujan. Let $\tau(n)$ be the coefficient in front of $x^n$ in the power series expansion of the function $x \prod_{m=1}^{\infty} (1 - x^m)^{24}$, $|x| < 1$; $\tau(n)$ is always an integer, and apparently always distinct from zero. So far the latter is not proved, but one has checked it for $n \le 10^{15}$. Ramanujan had considered it as very plausible that $|\tau(n)| \le n^{11/2} \cdot d(n)$, where $d(n)$ is the number of divisors of the natural number $n$. It follows from this that $|\tau(p)| \le 2 p^{11/2}$ for each prime $p$. From 1916 on, this statement is known as the Ramanujan conjecture. Deligne had reasons to believe in the truth of this relation, because he proved in 1968 that Ramanujan’s conjecture follows from Weil’s. In 1970, R. Langlands drew attention to a possibility which opens up, for the solution of Ramanujan’s conjecture, from little known work of R. Rankin (1939), where the estimate $\tau(n) = O(n^{29/5})$ was given. While trying to understand Rankin’s discussion, Deligne managed (supported by J.-P. Serre) to “geometrize” Rankin’s method. He connected this method with the topological technique of Solomon Lefschetz for finding fixed points of a mapping, and unified this in an unexpected way with the proof of Weil’s conjecture.
Let us add some information about Pierre Deligne. He was born in Brussels in 1944. At the age of fourteen he began to read the Elements of Bourbaki, which contain the essence of contemporary mathematics. Already this enterprise is astounding, as in these books the treatment goes from the general to the particular, and in them there is no other motivation besides the logical development of the theme. After having studied some time at the University of Brussels, he went to Paris at the suggestion of the group theorist Jacques Tits. There he took part in the activities of the Grothendieck seminar, in particular attending with great interest the lectures of Jean-Pierre Serre, which had a number theoretic outlook. Already in 1966 Grothendieck considered him on a par with himself. The style of Deligne has been described as follows: he likes to surpass the gorge, not by filling it, but by building a bridge. His papers are readable, the ideas are explained in an understandable way, what is told there is necessary and it is told at the right time. Pierre Deligne was given the Fields Medal for his proof of the Weil conjectures at the ICM in Helsinki in 1978.
18. However, the line of thought described above did not lead immediately further on the path of finding the $\mathbb{Q}$-solutions. Thus formula (114) gives the number of $\mathbb{F}_q$-solutions of Fermat’s equation, but it does not tell us anything directly about FLT. Even for the question about the existence of $\mathbb{Z}$-solutions there is no answer in the general case and, as was proved by Yu. Matiyasevich in 1970, there does not exist an answer in terms of a general algorithm.
All the more valuable is any general regularity discovered about the $\mathbb{Q}$-solutions of Diophantine equations; one example of this is the above mentioned Mordell conjecture. Let us now pause at this question, giving it a new, simpler formulation. Which Diophantine equations $F(x, y) = 0$ have infinitely many $\mathbb{Q}$-solutions? As follows from our above discussion, the answer is positive, for example, in the case of $x^2 + y^2 = 1$. Here we have to deal with a first possibility: all solutions are expressible in terms of a parameter (the genus of the corresponding curve is 0). A second possibility is when the solutions of the equation $F(x, y) = 0$ can be obtained by relations $x = \Phi(u, v)$, $y = \Psi(u, v)$, where $\Phi$ and $\Psi$ are both quotients of two polynomials with rational coefficients.
Here the quantities $u$ and $v$ are required to satisfy the relation $u^2 = v^3 + av + b$ (with $a, b \in \mathbb{Z}$), an equation which has infinitely many solutions. In this case the corresponding curve must be of genus 1. Mordell’s conjecture can now be formulated as follows:
CONJECTURE 8.1. Let $F(x, y)$ be a polynomial in two variables with integer coefficients. If the equation $F = 0$ cannot be mapped by a change of variables $(x, y) \to (u, v)$ to an equation such that the curve determined by it has genus 0 or 1, then this equation has only finitely many $\mathbb{Q}$-solutions.
For example, the equation $x^n + y^n = 1$, $n \ge 4$, cannot be transformed into an equation whose genus is 0 or 1. Therefore, Fermat’s equation should according to Mordell’s conjecture have only finitely many $\mathbb{Q}$-solutions. We add that according to FLT this equation ought to have precisely 3 solutions.
Already in the 1920’s, C. L. Siegel and A. Weil tried to prove Mordell’s conjecture. Weil generalized the result of Poincaré-Mordell (that the group of rational points on an elliptic curve is finitely generated) to varieties of higher dimension, in the hope to be able to show by invoking, for a curve of genus $g > 1$, its so-called Jacobian variety that only finitely many rational points of the Jacobian lie on the curve itself. Attempts were made to amend this scheme of reasoning (for example, C. Chabauty in 1938). A generalized and improved form of the Weil scheme was found by Serge Lang in 1962 (see [8]). A first essential step forward for the proof of Mordell’s conjecture was the proof of the same conjecture in the case of function fields (Yu. Manin in 1963). Although A. Weil did not reach his goal, he had set the right direction, and in the course of the next 60 years much new mathematics was created (Tate; Shafarevich; Manin; Parshin; Arakelov; Zarkhin; Deligne etc.).
The further development was in an essential way influenced by the Shafarevich conjecture (1962). Namely, Shafarevich (see [17]) managed to formulate in number theoretic terms the problem of Kodaira for the classification of a given analytically varying (critical) family of Riemann surfaces of genus $g > 1$. The catalyst here was the analogy between number fields and fields of rational functions, observed and studied already in the 19th century by Kronecker and Hilbert. This analogy has made it possible to transfer the correct formulation of the problem from one branch of mathematics to another, but it has not led to any solutions. A. Parshin (1968) and Yu. Zarkhin (1974) found a new approach to Manin’s result. Of special importance here is that Parshin proved that Mordell’s conjecture is a consequence of the Shafarevich conjecture. Gerd Faltings first established the Shafarevich conjecture in a weaker form and then in 1964 derived from it Tate’s conjecture (see [10]). Thereafter, using the Chebotarev density theorem and the Weil-Deligne theorem, he reached his final goal: he found in its broad outline how to prove Mordell’s conjecture (as well as other conjectures mentioned here). In the opinion of several mathematicians (P. Deligne; L. Szpiro; F. Oort etc.) there were, in the original variant, several notions hard to understand and many observations extremely difficult to penetrate (more exactly, possible to put in order only with great effort). But still, after less than a year it became apparent to the specialists that Mordell’s conjecture (and with it also the conjectures of Shafarevich, Tate etc.) now was proved! The way in which Faltings, in his proof, combined (and, if necessary, extended) the existing methods surprised the specialists: he overcame all difficulties in an unexpected and extremely clear way. In the beginning, Faltings had doubted if he possessed the will, and the gift
to deal with such an abstract and complicated thing as Tate’s conjecture. But a great thirst for truth, and an interest in the many mathematical disciplines connected with this theme, made it possible for him to understand and learn more, so that he did not stop and did not fail to test any of the key observations (as many mathematicians interested in this had done previously). From these small victories there grew finally a big one: the proof of Mordell’s conjecture. Much was written about this sensational result; one even expressed the opinion that this was the “theorem of the century” (Math. Intelligencer 5, No. 4 (1983)). Whether this is true will of course be decided by the mathematicians of the following generations. In any case, we have here to deal with a triumph of mathematics (see the interview of Jean-Pierre Serre, Math. Intelligencer 8 (1983)). At the time of the solution of the problem Gerd Faltings was 28 years of age and was, for the second year, teaching mathematics at the University of Wuppertal ([West-]Germany). He had obtained his Ph.D. from Professor Nastold in Münster, under whom Faltings had studied, and who impressed him as a person. For the results described Faltings received the Fields Medal at the ICM in Berkeley in 1986.
19. The words of Academician V. Platonov (Minsk) “with our intellect we are dealing with Mordell’s conjecure, but at hearts we are attached to FLT” seem to express the sentiments of the majority of mathematicians when acquainted with the result of Faltings. Problems and their solution have been the soul of Mathematics – the solution of veritable problems has always led to a new, deeper understanding of many notions, often giving birth to new theories, and in this connection to the formulation of many new problems. We have already spoken above of new things which arose immediately from the theorem of Faltings in the case of FLT. In the years following the proof of Faltings (1983) one made many efforts to find methods for an effective estimation of the number of solutions of Diophantine equations with a finite set of solutions. However, it became clear rather quickly that, moving along the path of Faltings’s proof, it seems to be practically impossible to determine the equations of the geometric objects appearing in the proof (which are Abelian varieties). Still one hopes to obtain such effective estimates (Parshin, 1984; Raynaud and others). In 1984, a new approach to Fermat’s equation was found by the young German mathematician G. Frey. To each (assumed) non-simple solution one associates a certain elliptic curve – a so-called Frey curve, obtained as follows. Let p ≥ 5 be a prime and (A, B, C) a triple of integers such than Ap + B p = C p and GCD(A, B, C) = 1. Setting a = Ap , b = B p , c = (−C)p , we observe that a + b + c = 0 and that GCD(a, b, c) = 1. For the simple Fermat triple (A, B, C) the corresponding Frey curve (over Q) is the elliptic curve Ea,b,c given by the equation y 2 = x(x − a)(x + b). It turns out that Frey curves have special properties. 
Assuming the validity of the Taniyama-Weil conjecture and using these special properties together with the results of Serre and Ribet (1986) about the homomorphisms $\mathrm{Gal}(\bar{Q}/Q) \to GL(2, \bar{F}_p)$, where $\bar{Q}$ and $\bar{F}_p$ are the algebraic closures of Q and $F_p$ respectively and Gal(. . .) is the Galois group of the corresponding extension – so-called modular representations of weight 2 –, Serre and Frey reached the conclusion that Frey curves do not exist! Taking into account how Frey curves were obtained, it follows from this (under the validity of the Taniyama-Weil conjecture) that Fermat’s equation does not have simple solutions.
The non-existence of Frey curves would also follow from the arithmetical analogue of an inequality (the so-called Bogomolov-Miyaoka-Yau inequality) valid for Chern classes of algebraic surfaces (over C), that is, the corresponding inequality for a number field – on the assumption that one succeeds in proving the latter. This was first spoken about during Parshin’s lecture in Paris in October, 1986. The following year the Japanese mathematician Yoichi Miyaoka, a student of K. Kodaira, heard about these results, and already in the early spring of 1988 there spread a sensational rumor that Miyaoka had succeeded in proving the arithmetical analogue of this inequality (and so FLT) . . . But when there was time to analyze the complete text of Miyaoka’s proof, his mistake became apparent. Faltings found an essential error in Miyaoka’s argument, and so the proof lost its credibility. E. Bombieri arrived at the same conclusion, admitting, however, that Miyaoka’s paper contained interesting ideas. More detail about this reduction (and some others connected with FLT) can be found in the survey [12].
20. At least, one can say that the story of the Mordell-Faltings theorem (and the things connected with FLT) has corroborated the rather firm conviction of many mathematicians that FLT is a true touchstone for the generality and depth of our mathematical methods, and at the same time for the extent to which these methods make it possible to transcend (in both directions!) the barrier between the discrete and the continuous. At least it should be clear to everybody today how illusory it is to hope that in especially favorable conditions one would find a solution to FLT by elementary means (see the observations made in Subsection 18). According to A. Parshin such a thing would require important, new knowledge about arithmetical surfaces. At the same time, J.-P.
Serre adds to this line of thought that it would be strange if it were possible to prove FLT geometrically only. In view of this it is hard to say how far one has come, on the route offered by G. Frey, on one’s way to a proof of FLT. Therefore, one could believe that FLT is like the continuum hypothesis, which can be neither proved nor disproved. This is not quite so. Consider the sequences (n, A, B, C) where $A^n + B^n = C^n$, n ≥ 3, and A, B and C are natural numbers, and call them Fermat quadruples. The statement “FLT is true” means that Fermat quadruples do not exist. From the statement “FLT is not true” it follows that Fermat quadruples do exist. If it were possible to find a Fermat quadruple and verify it convincingly, then FLT would be refuted. This argument shows that if there is no proof that FLT can be refuted, then FLT is true. One might believe that the geometric point of view will bring the analytic and arithmetical arguments forward on the way toward the proof of FLT. The theorem will probably be proved one day with the help of all the methods described here used together, as happened with the proof of the conjectures of Weil and Mordell. The proof cannot be simple; already C.F. Gauss said: Hopefully the proof of FLT will one day be found as a side product of some deep result in arithmetic. And still, the problem has been attacked by an uncountable army of “fermatists”, but even 20 years ago mathematicians had not fully adopted this idea of Gauss. Even more, there were numerous mathematicians, among them also those who knew the subject very well, who considered the algebro-geometric method created in the course of the attempts to prove
FLT as water sprouts of Diophantine Analysis, generalizations and analogies detached from the real needs of Number Theory. Maybe this is illustrated most significantly by Mordell’s own reaction on the occasion of the appearance of the book [8]. He said that he felt like Rip van Winkle⁹⁹, adding that if one can understand the proofs of the generalizations only with great difficulty, even in the simplest special cases, it would be better to leave these generalizations where they are. To this Serge Lang counters strikingly: A mathematician working in Algebraic Geometry who fell asleep in 1961 and awoke in 1981 will probably feel like Rip van Winkle; this is the natural effect of the rapid and fundamental changes that have occurred in mathematics. Because of this, in the case of A. Weil, Grothendieck, Serre, Shafarevich and others, who all contributed to the solution of Mordell’s problem, one has to appreciate their contribution, but even more admire their personal fortitude and insight in the application of the methods of that time.
21. In our pragmatic age one can of course consider all that we have described as a fruitless enterprise, for the sole reason that it concerns only so-called pure mathematics. Here one could quote a letter (July 2, 1830) of Carl Gustav Jacobi to Adrien Marie Legendre: . . . I have read with great pleasure the opinion of Mr. Poisson about my work, and I could have been quite pleased, but Poisson should perhaps have omitted the rather tactless phrase of Mr. Fourier, where the latter reproaches Abel and me that we do not prefer to work more on the question of heat conductivity. One knows of course the opinion of Mr. Fourier that the main objects of mathematics are the applications to the clarification of natural phenomena and the yield from this. But such a deep thinker ought also to have known that the ultimate goal is to glorify the human spirit.
And seen from this point of view Natural Numbers are of no lesser importance than the Structure of the Universe.
⁹⁹ Translator’s note. Character in a book by the classic American writer and humorist Washington Irving (1783-1859). He is the man who slept for 20 years and, when he wakes up, finds himself in a world that has been transformed: the American Colonies have become independent.
In a time when there are ever fewer doubts about the usefulness of computers, it will perhaps not make sense to extend this argument in support of the aesthetic origin of Mathematics. But in a changing world one ought to add that Jacobi’s words fully express the opposition, peculiar to each generation, concerning the distinctions in Mathematics. These distinctions express themselves in the choice whether to prefer problems which have arisen from the closest needs of practice, or to think about problems dictated by the inner logic of things, which will yield a benefit only in a remote future. This dilemma may also appear, in one form or another, in the activities of one and the same mathematician. It is significant here, for instance, that such a well-known mathematician as John von Neumann expressed entirely conflicting opinions on the question under consideration. But it is also precisely here that the dilemma finds its resolution: Gauss, Riemann, Hilbert, Hermann Weyl and other leading figures of Mathematics have often found major inspiration for their theoretical work in an applied background. In the course of
a longer period of time (30-100 years, sometimes even longer) this distinction may disappear or express itself in a different form. During this time, selected results in the solution of applied problems are generalized to a theory, and so-called pure mathematics finds its way into the applications. Even the new theories treated in the present paper have found their way into the applications, namely into contemporary physics. We point out here only the discussion of D. Ruelle [15], in particular his observation that a theorem found and proved by the two physicists Lee and Yang (Phys. Rev. 87 (1952), 410-419) and its applications (for example, in the theory of phase transitions) are probably connected to the Weil conjecture. Finally, we add that, in both of these types of motivation and in their interplay, it is always the concrete problems which bring mathematics forward, and also direct its development in an essential way. In this sense the words of the Polish-born U.S. mathematician Mark Kac are remarkable: Even axiomatic systems change in the waves of time, but their applications live for ever.
Epilogue. This survey was probably written in 1988. Since then Fermat’s Last Theorem has been proved by Andrew Wiles (assisted by Richard Taylor) [19]. The proof depends on a special case of the Taniyama-Shimura Conjecture [2], which says that every elliptic curve is modular. The conjecture in general was later proved by C. Breuil, B. Conrad, F. Diamond, and R. Taylor [1]. A very readable (non-technical) description of Wiles’ work is the book [18].
Gert Almkvist
Other references: [16], [3], [6], [9], [11], [14], [21], [22]
References
[1] C. Breuil, B. Conrad, F. Diamond, and R. Taylor. On the modularity of elliptic curves over Q: wild 3-adic exercises. J. Am. Math. Soc. 14, 2001, 843–939.
[2] H. Darmon. A proof of the full Shimura-Taniyama conjecture. Notices Am. Math. Soc. 46, 1998, 1397–1401.
[3] P. Deligne. Preuve des conjectures de Tate et Shafarevich (d’après G. Faltings). Séminaire Bourbaki, Exposé 616, Novembre 1983. Asterisque 1983/84 (121–122), 1985, 25–41.
[4] Diophant of Alexandria. The Arithmetics and the book on polygonal numbers. Nauka, Moscow, 1974.
[5] D. R. Heath-Brown. Fermat’s last theorem for "almost all" exponents. Bull. London Math. Soc. 17 (1), 1985, 15–16.
[6] N. Katz. An overview of Deligne’s proof of the Riemann hypothesis for varieties over finite fields. In: Proc. Pure Appl Math 38, Part 1: Mathematical developments arising from Hilbert problems. Amer. Math. Soc., Providence, R.I., 1976, 275–306.
[7] A. O. Koblitz. Introduction to elliptic curves and modular forms. Graduate Text of Math., 97. Springer-Verlag, New York, 1984.
[8] S. Lang. Diophantine geometry. Interscience Tracts in Pure and Applied Mathematics, 11. Interscience Publ., New York, London, 1962. Russian translation: Mir, Moscow, 1986.
[9] S. Lang. Higher dimensional Diophantine problems. Bull. Am. Math. Soc. 80, 1974, 779–787.
[10] B. Mazur. Higher dimensional Diophantine problems. Bull. Am. Math. Soc. 14, 1986, 207–259.
[11] B. Mazur. On some of the mathematical contributions of Gerd Faltings. In: Proceedings of the Int. Congress of Mathematicians, August 3-11, 1986. Amer. Math. Soc., 1987, 7–11.
[12] J. Oesterlé. Preuve des conjectures de Tate et Shafarevich (d’après G. Faltings). Séminaire Bourbaki, Exposé 694. Asterisque 1987/88 (161–162), 1989, 165–186.
[13] M. M. Postnikov. Introduction to algebraic number theory. Nauka, Moscow, 1982.
[14] S. Raghavan. Impact of Ramanujan’s work on modern mathematics. J. Indian Inst. Sci. Srinivasa Ramanujan centenary 1987, Special Issue, 1987, 45–53.
[15] D. Ruelle. Is our mathematics natural? The case of the equilibrium of statistical mechanics. Bull. Am. Math. Soc. 19 (1), 1988, 259–268.
[16] F. Schinzel. Construction of telephone networks by group representations. Notices Am. Math. Soc. 26 (1), 1989, 5–22.
[17] I. R. Shafarevich. Algebraic number fields. In: Proceedings of the Int. Congress of Mathematicians, August 15-22, 1962. Institute Mittag-Leffler. Almqvist & Wiksells, Uppsala, 1963, 163–176.
[18] S. Singh. Fermat’s enigma. Walker and Co., New York, 1997.
[19] R. Taylor and A. Wiles. Ring theoretic properties of certain Hecke algebras. Ann. of Math. 141, 1995, 553–572.
[20] S. S. Jr. Wagstaff. The irregular primes to 125 000. Math. Comp. 32 (142), 1978, 583–591.
[21] A. Weil. Number of solutions of equations in finite fields. Bull. Am. Math. Soc. 55, 1949, 497–508.
[22] Yu. Zarikhin and A. Parshin. Problems of finiteness in Diophantine Geometry. Supplement to the Russian edition of [14], 1986, 369–438.
9. [K96] On two discrete models in connection with structures of mathematics and language
Translation by J. Peetre
In a mathematical theory there is no a priori need to bring its concepts and language into agreement with the newest needs of the natural sciences. Nevertheless this has often happened, and the good harvest of the cooperation has profited both parties. During the last decades there has been a steadily growing interest in discrete models of ever increasing complexity. As it has not been possible to present such models adequately with the aid of standard functional rules, this interest has increased in proportion to the possibilities that computers offer for theoretical experiments with them. In the following we shall describe the possibilities of two of the simplest such models.
9.1. Binary trees and Strahler numbers
One example is the study of branching phenomena in neurophysiology, botany and geology – in the last discipline in particular in connection with the hydrogeological research of R. E. Horton [2] and A. Strahler concerning the structure of river systems [13]. A common denominator for these phenomena is provided by the notion of a tree, which expressed in mathematical language means a cycle-free connected simple graph. In computer science one employs the notion of a binary tree, which can be defined recursively:
• if such a tree has only one vertex, then this tree is identified with its vertex;
• in all other cases a binary tree is defined as a triple $B = (v; B_L, B_R)$, where v is a distinguished vertex of B (designated the root) and $B_L$ and $B_R$ are binary trees, called the left and the right subtree of the tree B, respectively.
The vertices of a binary tree are classified as inner vertices (such a vertex has two “successors”, its left and its right successor) and as exterior vertices (these are the vertices without successors). The edges of a tree are pairs of vertices (v, w), where w is a successor of v. If we assume that in a river system no islands have been formed and that at each juncture not more than two rivers are united, then the branching picture which arises is a binary tree. To the edges of a tree one can assign an order using the Horton-Strahler rule:
• the order of a river proceeding from a source is 0;
• two order-k rivers join to a river of order k + 1, but two rivers of order i and k (i < k) give, when joined, an order-k river (Fig. 23).
[Fig. 23: The orders of the binary tree]
The maximal order of the edges of a tree under consideration is called its Strahler number and will be denoted st(B). This parameter of a tree can be defined inductively as follows:
• we agree that st(∅) = 0;
• if $st(B_L) = st(B_R)$, then we agree that $st(v; B_L, B_R) = 1 + st(B_L)$;
• if, however, $st(B_L) \ne st(B_R)$, we agree that $st(B) = \max(st(B_L), st(B_R))$.
A maximal path among the paths of a tree consisting of edges of order k is called a k-th order segment of the river system; such a segment starts in a source (in case k = 0) or else arises by the joining of two edges of order k − 1 (in case k ≥ 1), and ends by joining a segment of order k′ (k′ > k). Denoting the total number of segments of order k by $b_k$, we define the bifurcation ratio of the tree B as the quotients $b_k/b_{k+1}$; here k ≤ st(B). For example, for a binary tree with all exterior vertices (leaves) at the same distance k from its root the bifurcation ratio is 2 and the Strahler number of such a tree is k (Fig. 24).
[Fig. 24: A binary tree B: $b_k/b_{k+1} = 2$ for any k, and st(B) = 3]
In accordance with hydro-geological observation the bifurcation ratio does not change within the frame of a given river system, and stays between 3 and 5, giving a good qualitative picture of the shape of the river system. Branching trees are of interest also in botany. A result of these investigations for computer science is the discovery of the so-called Lindenmayer grammars and their use in computer graphics, where using these complementary methods one tries to assemble a synthetic picture of a tree [16]. The inputs of such a program are the number k and a stochastic matrix with (at least) k rows, the so-called ramification matrix, and it yields a binary tree with Strahler number k having the given matrix as ramification matrix.
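The inductive definition of the Strahler number and the count of segments can be tried out on small trees. The following is only a minimal sketch under stated assumptions: a tree is modelled as `None` (a lone exterior vertex, given order 0 as in Fig. 24) or as a pair of subtrees, the outlet below the root is counted as a segment, and the names `strahler`, `segment_counts` and `complete_tree` are illustrative choices, not notation from the text.

```python
from collections import Counter

# Assumed model: a binary tree is None (a single exterior vertex, i.e. a source)
# or a pair (left_subtree, right_subtree) standing for an inner vertex.


def strahler(tree):
    """Strahler number, with a lone exterior vertex given order 0."""
    if tree is None:
        return 0
    left, right = tree
    sl, sr = strahler(left), strahler(right)
    return sl + 1 if sl == sr else max(sl, sr)


def segment_counts(tree):
    """Return {k: b_k}, counting each segment at the vertex where it starts."""
    counts = Counter()

    def visit(t):
        if t is None:              # a source starts a segment of order 0
            counts[0] += 1
            return 0
        sl, sr = visit(t[0]), visit(t[1])
        if sl == sr:               # two order-k rivers start an order-(k+1) segment
            counts[sl + 1] += 1
            return sl + 1
        return max(sl, sr)         # the higher-order segment simply continues

    visit(tree)
    return dict(counts)


def complete_tree(depth):
    """Binary tree with all leaves at distance `depth` from the root."""
    if depth == 0:
        return None
    sub = complete_tree(depth - 1)
    return (sub, sub)


tree = complete_tree(3)
b = segment_counts(tree)
print(strahler(tree))                        # 3
print([b[k] / b[k + 1] for k in range(3)])   # [2.0, 2.0, 2.0]
```

For the tree with all leaves at distance 3 this reproduces the values quoted for Fig. 24; a very unbalanced tree such as `(None, (None, (None, None)))` has Strahler number 1 however long it is.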
Strahler numbers appear in a natural way likewise in other questions of computer science. One of these is the question of the least number of registers needed for the evaluation of a given arithmetic expression. Let us identify the arithmetic expression (consisting of binary operations) with a tree whose vertices are labelled by symbols for these operations and the variables used in the expression. For example, in Fig. 25 we have drawn a labelled binary tree corresponding to the expression (a/b + c/d)(e + f) + g.
[Fig. 25: The syntax tree of the expression (a/b + c/d)(e + f) + g]
In the general case it turns out (theorem of A. Ershov!) that the minimal number of registers required in the evaluation of an arithmetic expression exceeds by one the Strahler number of the corresponding binary tree. The number of registers required for the evaluation of long arithmetic expressions is described by a formula for finding lim st(n), where st(n) stands for the average
$$st(n) = \frac{1}{c_n} \sum_{B_n} st(B_n)$$
over all binary trees $B_n$ with n vertices. Such a formula was found by X. Viennot (1986) (see [16]). As a detail, let us record that the total number of the latter is $c_n = \binom{2n}{n}/(n+1)$ and that the generating function $c(t) = \sum_{n \ge 0} c_n t^n$ of these Catalan numbers $c_n$ satisfies the relation $1 - c(t) + t\,c(t)^2 = 0$.
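Returning to the register question, the statement attributed to Ershov can be checked on the expression of Fig. 25. The sketch below rests on assumptions that are not in the text: every variable must first be loaded into a register, the costlier operand is evaluated first, and an expression is modelled as a variable name or a triple (operator, left, right). The function names are hypothetical.

```python
# Assumed model: an expression is a variable name (str)
# or a triple (operator, left_operand, right_operand).


def strahler(expr):
    """Strahler number of the syntax tree; a variable (leaf) has order 0."""
    if isinstance(expr, str):
        return 0
    _, left, right = expr
    sl, sr = strahler(left), strahler(right)
    return sl + 1 if sl == sr else max(sl, sr)


def registers_used(expr):
    """Peak number of registers when the costlier operand is evaluated first."""
    if isinstance(expr, str):
        return 1                      # load the variable into one register
    _, left, right = expr
    first, second = sorted(
        (registers_used(left), registers_used(right)), reverse=True
    )
    # While the second operand is evaluated, one register holds the first result.
    return max(first, second + 1)


# (a/b + c/d)(e + f) + g, the expression of Fig. 25
expr = ("+",
        ("*", ("+", ("/", "a", "b"), ("/", "c", "d")), ("+", "e", "f")),
        "g")

print(strahler(expr), registers_used(expr))   # 2 3
```

Under these assumptions the simulated evaluation needs three registers, one more than the Strahler number 2 of the syntax tree.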
9.2. Molecular biology and formal languages
The results of molecular biology have sometimes been formulated in terms of formal languages and information theory.
On the one hand, the formal languages. Fixing an alphabet X, let us consider subsets L of the set $X^*$ of all words; such subsets are called formal languages. A language L can be presented as a function $L : X^* \to \{0, 1\}$. Therefore, L can also be interpreted as a formal series $\sum_{w \in L} w$. Here we are interested in context free languages (CF-languages)¹⁰⁰; such a language can be given by a context free grammar, that is, a quadruple G = (X, N; σ, P), where X (the terminals) and N (the non-terminals) are finite alphabets, σ ∈ N is the so-called initial symbol and the finite set P contains the rules of deduction (productions) α → β, that is, pairs (α, β), where α ∈ N and $\beta \in (N \cup X)^*$ (for details see [11]). As an example, we have the Dyck language D, $D \subseteq X^*$, where $X = \{x, \bar{x}\}$ and the rules of deduction are $\sigma \to x\sigma\bar{x}\sigma$ and σ → 1; here the symbol 1 denotes the empty word in $X^*$. In a word of the language D there are always the same number of letters x and $\bar{x}$. Moreover, there are in each left factor (prefix) not fewer letters x than letters $\bar{x}$. Another example is the Fibonacci language F, $F \subseteq X^*$, for which X = {x, a} and N = {σ, τ}, while the productions are σ → aτ, σ → xxσ, τ → aσ, τ → xxτ, σ → 1. The formal series F representing this language is the solution of the system of equations
$$F = 1 + aG + xxF, \qquad G = aF + xxG$$
in the algebra $\mathbb{Z}\langle\langle X \rangle\rangle$ of formal series with integer coefficients. One owes to M. P. Schützenberger the idea, in the enumeration of combinatorial objects forming a graded set $K = \bigcup K_n$, of seeking an algebraic language L whose words of length n are in one-to-one correspondence with the objects of order n, that is, the elements of the set $K_n$. In this situation the desired result is given by the generating function $l(t) = \sum_{n \ge 0} l_n t^n$ of the numbers $l_n = |L(G) \cap X^n|$.
In order to find the number of words of a given length in the language L(G) let us consider the morphism $\Psi : X^* \to \{t\}^*$, which maps all characters of the alphabet X into one and the same (new) variable t. In this situation the formal series corresponding to L(G) is mapped to the generating function of the numbers $l_n$: Ψ(L) = l(t). In the case of the Dyck language we obtain in this way a series d(t), satisfying the equation $1 - d(t) + t^2 \cdot d(t)^2 = 0$.
¹⁰⁰ The author uses the term algebraic language instead.
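The numbers $l_n$ for the two example languages can be obtained by brute force, which gives a quick check on the grammars. This is a sketch only: the letter `y` is an assumed stand-in for the barred letter of the Dyck alphabet, membership in the Dyck language is tested through the prefix property stated above rather than through the grammar, and the function names are illustrative.

```python
from itertools import product
from math import comb


def is_dyck(word):
    """Prefix test for the Dyck language; 'y' stands in for the barred letter."""
    balance = 0
    for letter in word:
        balance += 1 if letter == "x" else -1
        if balance < 0:           # some prefix has more barred letters than x
            return False
    return balance == 0


def in_sigma(word):
    """Productions sigma -> a tau | xx sigma | 1 of the Fibonacci grammar."""
    return (word == ""
            or (word[:1] == "a" and in_tau(word[1:]))
            or (word[:2] == "xx" and in_sigma(word[2:])))


def in_tau(word):
    """Productions tau -> a sigma | xx tau of the Fibonacci grammar."""
    return ((word[:1] == "a" and in_sigma(word[1:]))
            or (word[:2] == "xx" and in_tau(word[2:])))


def count_words(alphabet, length, member):
    """l_n: the number of words of the given length accepted by `member`."""
    return sum(1 for letters in product(alphabet, repeat=length)
               if member("".join(letters)))


dyck_counts = [count_words("xy", n, is_dyck) for n in range(9)]
fibonacci_counts = [count_words("xa", n, in_sigma) for n in range(7)]
catalan = [comb(2 * n, n) // (n + 1) for n in range(5)]

print(dyck_counts)        # [1, 0, 1, 0, 2, 0, 5, 0, 14]
print(fibonacci_counts)   # [1, 0, 2, 0, 5, 0, 13]
print(catalan)            # [1, 1, 2, 5, 14]
```

The Dyck counts at even lengths coincide with the Catalan numbers, and the counts for the Fibonacci grammar at even lengths are 1, 2, 5, 13.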
This equation is solved by the function $d(t) = (1 - \sqrt{1 - 4t^2})/2t^2$, in whose power series the coefficient of $t^{2n}$ is the Catalan number $c_n = \binom{2n}{n}/(n + 1)$. In our second example, we obtain the power series Ψ(F) = f(t) and Ψ(G) = g(t) satisfying the system of equations
$$f(t) = 1 + t \cdot g(t) + t^2 \cdot f(t), \qquad g(t) = t \cdot f(t) + t^2 \cdot g(t).$$
As the solution to this system we obtain the function $f(t) = (1 - t^2)/(1 - 3t^2 + t^4)$. In its Taylor series the coefficient of $t^{2n}$ is the Fibonacci number $F_{2n}$; here $F_0 = F_1 = 1$ and $F_n = F_{n-1} + F_{n-2}$ (n ≥ 2). An auxiliary fact: the Fibonacci language is rational, which (according to Kleene’s theorem!) means that this language is recognizable by a finite automaton; such an automaton is depicted in Fig. 26.
[Fig. 26: A finite automaton for the Fibonacci language]
On the other hand, the genetic code. Interesting macromolecules are nucleic acids (NA) and proteins. One of the forms of NA, deoxyribonucleic acid (DNA), is contained in chromosomes and carries hereditary information. It appears as a double helix twisted up in space and consisting of a dual pair of threads joined with each other through hydrogen bonds. If one separates the two DNA strands and then adds to each of them another DNA chain complementary to it, one gets as a result two identical copies of the original DNA molecule. The coiled form of the double helix optimizes the spatial distribution of the molecule, because in untwisted form the DNA thread would be 50 centimeters long. The proteins are the workhorses of the cell, assuring the stability of its structure, its defence, energy content and life activity. The protein molecule consists of amino acids, of which there are 20 species. The latter may be viewed as the semantic primitives of the genetic language, of which the (long!) finite words formed by concatenation are called polypeptide chains.
The primary structure of nucleic acids may be viewed as a chain of nucleotides (bases) – a thread. The alphabet G, with the aid of which the DNA thread is written as words, consists of four bases: A (adenine), G (guanine), C (cytosine) and T (thymine). In the case of the ribonucleic acid molecule (RNA) the alphabet R uses the first three of the bases mentioned, while T (thymine) is replaced by the base U (uracil). These bases possess several properties which make it possible to regard them as phonemes of the genetic language. But they also have peculiarities:
• the number of phonemes of a natural language is variable (> 10), while the number of nucleotides is 4 in all organisms;
• the phoneme of a natural language is given by a complex of (binary) predicates whose order in the words is not important, whereas, for example, T (thymine) and C (cytosine), although they consist of the same elements, appear as different graphs.
In the alphabet of nucleotides there arise $4^3 = 64$ strings of length 3 – the codons, which in turn form the so-called nucleotide chains – (very
long!) strings in the alphabet of codons. In both cases the bases are joined into a single chain with the aid of sugar components. It is possible to view the genetic code as an exact correspondence between the codons and amino acids of a special type; see the table in [6]. In the decoding of the polynucleotide chain each codon is replaced by the corresponding amino acid. In fact, 61 codons specify amino acids; the remaining 3 codons (UAA, UAG, UGA) are terminators, the role of which is to indicate the end of the phase of decoding. Codons could be compared to morphemes in natural language – each of them is a sequence of genetic phonemes, which within the limits of the given syntax does not dissolve into shorter subsequences. A difference is that all codons have the same length (3), which is not observed in the case of natural languages. Likewise, the meaning of the morphemes in natural language changes from language to language, while genetic morphemes and their meaning remain invariant for all organisms. In the framework of this interpretation one can consider terminators as grammatical morphemes, while the remaining 61 codons play the role of lexical morphemes. As a detail – there also exist contexts where grammatical morphemes may appear as lexical ones. Z. Pawlak made an attempt to present the genetic language with the aid of a grammar based on geometrical intuition [8], the inconveniences of which were removed in a modification of this grammar into a formal grammar by B. Vanquois a few years later¹⁰¹. S. Marcus extended the grammar obtained to a Lindenmayer system in order to present also the “spatial” aspect of the genetic language (the double helicity!) [6].
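The counting in this passage is easy to reproduce. The sketch below enumerates the codons over the RNA alphabet, separates the three terminators named in the text, and cuts a chain into codons up to the first terminator. The sample chain and the names `RNA_BASES`, `TERMINATORS` and `read_codons` are hypothetical, chosen only for illustration; no codon-to-amino-acid table is assumed.

```python
from itertools import product

RNA_BASES = "ACGU"
TERMINATORS = {"UAA", "UAG", "UGA"}     # the three terminators named in the text

# All words of length 3 over the alphabet of bases.
codons = ["".join(triple) for triple in product(RNA_BASES, repeat=3)]
coding = [c for c in codons if c not in TERMINATORS]


def read_codons(chain):
    """Cut a chain into successive codons, stopping at the first terminator."""
    result = []
    for i in range(0, len(chain) - 2, 3):
        codon = chain[i:i + 3]
        if codon in TERMINATORS:
            break
        result.append(codon)
    return result


print(len(codons), len(coding))        # 64 61
print(read_codons("AUGGCUUAAGGC"))     # ['AUG', 'GCU'] (made-up sample chain)
```

Because all codons have length 3, the cut into "morphemes" needs no separators, which is the contrast with natural languages drawn above.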
¹⁰¹ See details in [6].
Let us now consider again the question of how Strahler numbers appear. The fact that the threads of a double helix of DNA are not knotted makes it possible to view the double helix as a planar graph (which is also called the secondary structure of the molecule): the vertices are the bases and the edges are both the base joints in the DNA thread (primary bonds) and their hydrogen joints formed in the helix (secondary bonds). Each secondary structure induces a certain forest – a cycle free graph the vertices of which are the primary bonds and the edges of which are determined by the incidence relationship of these bonds. Such forests were used by Vauchaussade de Chaumon and Viennot [17] with the purpose of studying the homologies of the secondary structures, that is, the molecule’s properties in distinct species. As a result there was an answer to M. Waterman’s question [18]: what is the generating function of all k-th order secondary structures? Here by the order of a secondary structure is meant the order of the forest induced by it.
[Fig. 27: A rooted tree T = (v; T1, T2, T3)]
Let us introduce the necessary notions. The rooted tree T is defined recursively:
• if T has only one vertex, then T is identified with this vertex;
• in the opposite situation one gives the tree as a sequence $T = (v; T_1, \dots, T_p)$, where v is a vertex of T (the root) and each $T_i$ is a subtree of T rooted at v, see Fig. 27.
A forest is a list of all connected components of a graph consisting of rooted trees. A maximal sequence of vertices $(v_1, \dots, v_s)$ such that each $v_i$ (i = 1, 2, . . . , s − 1) has the unique successor $v_{i+1}$ and $v_s$ is a leaf (that is, a vertex without a successor) is called a filament of the forest. The operator δ of removing filaments is defined on the forest M by the rule that δ(M) is the forest which is obtained from M by omitting all vertices of the filaments and all the edges incident to them; the filament containing the root is removed last. The smallest number i such that the vertex x is eliminated by application of the operator $\delta^i$ is called the degree of this vertex. The maximal degree of the vertices of a forest is called the degree of the forest. In the example given in Fig. 28 we have the degree 3. It is clear that the degree of a forest is the least integer k such that $\delta^k(M) = \emptyset$.
[Fig. 28: A forest of the degree 3]
An answer to Waterman’s question above is obtained in the following way. The secondary structures of degree k are coded with the words of a suitable algebraic language and then one finds a system of equations which is satisfied by the generating formal series of the words of this language. Subsequently, the desired answer is found using the procedure described above (in connection with the map Ψ). Indeed, if the number of unlabelled k-th degree secondary structures with n vertices is denoted $s_{n,k}$, then the generating function under discussion is given by the formula
$$\sum_{n \ge 0} s_{n,k} t^n = \frac{t^{p(k)}}{(1 - t) P_1 P_2 \dots P_k},$$
where $p(k) = 5 \cdot 2^{k-1} - 2$ and the polynomials $P_i$ are defined recursively by the rules $P_1 = 1 - 2t - t^3$ and $P_i = P_{i-1}^2 - 2t^{p(k)}$ (in case i ≥ 2). The problem connected with this question regarding the enumeration of rooted forests of degree k and n vertices is simpler and, surprisingly, its answer is the same generating function which enumerated the binary trees with Strahler number k [15].
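The operator δ and the degree of a forest, as defined above, lend themselves to a direct experiment. The following is a sketch under an assumed data model: a rooted tree is represented by the list of its subtrees (so a leaf is the empty list) and a forest by a list of trees; a vertex lies on a filament exactly when the subtree below it never branches. The function names are illustrative only.

```python
# Assumed model: a rooted tree is the list of its subtrees, so a leaf is []
# and a forest is a list of trees.


def is_chain(tree):
    """True if no vertex of the tree has more than one successor."""
    while tree:
        if len(tree) > 1:
            return False
        tree = tree[0]
    return True


def delta(forest):
    """Remove every filament: keep only vertices whose subtree still branches."""
    return [delta(tree) for tree in forest if not is_chain(tree)]


def degree(forest):
    """Least k such that applying delta k times leaves the empty forest."""
    steps = 0
    while forest:
        forest = delta(forest)
        steps += 1
    return steps


def complete_binary(depth):
    """Rooted tree in which every inner vertex has two successors."""
    if depth == 0:
        return []
    return [complete_binary(depth - 1), complete_binary(depth - 1)]


print(degree([[]]))                    # 1: a single vertex is itself a filament
print(degree([[[[]]]]))                # 1: a path is removed in one step
print(degree([complete_binary(2)]))    # 3
```

In this model a tree that is a single path disappears in one step, while each application of δ to a complete binary tree strips one level, so the degree grows with the branching depth in the same way as the Strahler number does.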
9.3. On coding theory
Contemporary technology has lifted to a new level questions about the mechanisms of information processing and their effectiveness. The solutions have required a mathematical formulation, many essential concepts of which originate in coding theory. As many similar questions are of interest also in the study of the genetic code and language, we shall in what follows likewise give a brief survey of these concepts. There are many possibilities for sending information. In some cases (for instance, in satellite communication) information is transferred through medial channels, in other
cases the sender writes it, for instance, on a floppy disk, from which the computer later reads it. The exact mechanism of the transfer is far from always known – it suffices to think of the questions of transfer of information in the human brain. However, many communication channels have a common characteristic – the transfer of information is accompanied by background noise, with the effect that some of the transferred symbols get modified in the process of communication and arrive at the receiver in distorted form. In order to improve the quality of the reception one applies error detecting and error correcting codes. In mathematical formulation, a channel is given by a triple (S, V; P), consisting of an input language S, an output language V and a matrix P = (p(y|x)). The elements of the latter are conditional probabilities: p(y|x) is the probability of receiving the symbol y on the condition that x was sent, and this probability is regarded as independent of the fate of the previous and later signals in the channel under consideration. Here information is interpreted as a sequence of (long!) finite sequences (called words, also strings), for the writing of which the symbols of the given alphabet are used. In the theory the most suitable alphabet is some finite field $F_q$ (here q is a power of a prime p). The Reader may picture the field $F_q$ as a domain of numbers where the arithmetical operations are carried out according to the most common rules, to which new ones have been added that basically introduce periodicity phenomena, emanating from the finiteness of the domain $F_q$. The coding may be viewed as a procedure (an algorithm or a mapping) which maps a natural message, or a part of it, into words in the channel’s input language S, adding so-called code symbols (‘redundancy blocks’).
Expressed more exactly, for the coding of a message broken up into blocks of $k$ letters, $\Psi$ may be presented as an injective map $\mathbb{F}_q^k \to \mathbb{F}_q^n$; words in the image set $C = \Psi(\mathbb{F}_q^k) \subseteq \mathbb{F}_q^n$ are called code words. If $\Psi$ is a linear map, then the set of code words forms a $k$-dimensional subspace of the sequence space $\mathbb{F}_q^n$; therefore the code is termed a linear $(n, k)$-code. All code words of a linear code can be presented in the form $\bar{x}G$, where $\bar{x} \in \mathbb{F}_q^k$ and $G$ is a fixed $k \times n$ matrix, the rows of which form a basis of the subspace $C$; it is called the generating matrix of $C$. There is another important matrix connected with the code, the parity check matrix, which is an $(n - k) \times n$ matrix $H$ such that $\bar{x} \in C$ if and only if $\bar{x}H^t = 0$; here $t$ stands for taking the transpose of the matrix. If we introduce a form
$$\langle \bar{x}, \bar{y} \rangle = \sum_{i=1}^{n} x_i y_i$$
in $\mathbb{F}_q^n$, then we can 'compute' the orthogonality of vectors, that is, interpret this geometrical notion in analytic language: $\bar{x} \perp \bar{y} \Leftrightarrow \langle \bar{x}, \bar{y} \rangle = 0$. Therefore it makes sense to speak of the code $C^\perp$ dual to $C$,
$$C^\perp = \{x \in \mathbb{F}_q^n \mid x \perp c \text{ for all } c \in C\}.$$

The reception of the coded information is followed by decoding, a procedure which maps the sequence received in the channel's output language $V$ into a natural message. Often this is achieved by finding the code word closest to the received word (maximum likelihood decoding). The choice of the correct code word is facilitated by the Hamming distance of two words (vectors) $x = x_1 x_2 \ldots x_n$ and $y = y_1 y_2 \ldots y_n$:
$$d_H(x, y) = \#\{i \mid x_i \neq y_i\}.$$
For example, $d_H((0111), (1001)) = 3$ and $d_H((01100), (11000)) = 2$.
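The definition of the Hamming distance and the idea of decoding to the closest code word can be sketched in a few lines of Python. This is only an illustration: the function names `hamming_distance` and `nearest_codeword`, and the three-letter repetition code mentioned in the note after the block, are assumptions made for the example and do not come from the text.

```python
def hamming_distance(x, y):
    """Number of positions i with x[i] != y[i]; both words must have equal length."""
    if len(x) != len(y):
        raise ValueError("words must have the same length")
    return sum(a != b for a, b in zip(x, y))


def nearest_codeword(received, code):
    """Return a code word at minimal Hamming distance from `received`.

    Ties are broken by the order of `code` (min keeps the first minimum),
    which is an arbitrary choice made for this sketch.
    """
    return min(code, key=lambda c: hamming_distance(received, c))
```

With these definitions, `hamming_distance("0111", "1001")` evaluates to 3 and `hamming_distance("01100", "11000")` to 2, in agreement with the two examples above. For the hypothetical code `["000", "111"]`, the received word `"010"` is decoded to `"000"`.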
If the minimal (Hamming) distance between the words in $C$ is $d$, then such a code can correct up to $\lfloor (d - 1)/2 \rfloor$ errors arising in the channel, and detect up to $d - 1$ errors. This is easy to understand if we surround all code words $x \in C \subseteq \mathbb{F}_q^n$ by the discrete balls $B_e(x) = \{z \mid z \in \mathbb{F}_q^n, \; d_H(x, z) \le e\}$. Here $e$ is the radius of the ball and $e \le \lfloor (d - 1)/2 \rfloor$. As a detail, we add the fact that each ball contains
$$1 + \binom{n}{1}(q - 1) + \binom{n}{2}(q - 1)^2 + \ldots + \binom{n}{e}(q - 1)^e$$
words in the (vector) space $\mathbb{F}_q^n$; here $\binom{n}{i}$ denotes the binomial coefficient. In view of the choice of the radius of the balls $\{B_e(x) \mid x \in C\}$, the inequality $2e + 1 \le d$ is evidently satisfied. Consequently, these balls do not intersect (Fig. 29), and if a received word falls into one of the balls $B_e(x)$ then this word can be uniquely decoded by the code word $x \in C$ which constitutes the center of the ball in question.

[Fig. 29: The Hamming distance. Balls $B_e(x)$ of radius $e$ around code words in $\mathbb{F}_q^n$.]

The best known example of a linear code is the Hamming code. Let us fix an integer $r$ and consider the vector space $\mathbb{F}_q^r$ of all vectors as an affine space, that is, a point space where the vectors in $\mathbb{F}_q^r$ appear in two roles: as points and as displacement vectors. Although such a "point space" $A^r(q)$ consists of only $q^r$ distinct points, it has its own geometry, which may be described as the crypto-morphological analogue of the knowledge offered in university courses in linear algebra and geometry on the real affine space, where the field $\mathbb{F}_q$ is in the role of the real numbers. Taking one of the points $O \in A^r(q)$, let us consider the lines through this point: an arbitrary point $X$ on such a line is given by the equation $X = O + \bar{v}t$, where the parameter $t$ runs through all values in the field $\mathbb{F}_q$, and the non-zero vector $\bar{v} \in \mathbb{F}_q^r$ is the direction vector of the line. Thus, here every line as a set of points $\{X(t) \mid t \in \mathbb{F}_q\}$ consists of $q$ points!
There are $q^r - 1$ choices for the direction vector ($\bar{v} \neq 0$), and the $q - 1$ non-zero multiples of a direction vector all determine the same line, so that the number of lines through the point $O$ equals $n = (q^r - 1)/(q - 1)$; let us denote these (non-collinear) directions by $v_1, v_2, \ldots, v_n$. Let us further form the $r \times n$ matrix $H$, whose columns are the vectors $v_i \in \mathbb{F}_q^r$. Then we may consider the code $C = \{x \mid x \in \mathbb{F}_q^n, \; xH^t = 0\}$. This linear $(n, n - r)$-code is called the Hamming code. The minimal distance between its code words is 3, and this code is perfect in the sense that
$$\mathbb{F}_q^n = \bigsqcup_{x \in C} B_1(x).$$
In other words, for an arbitrary received word with not more than one distorted letter it is possible to decide to which code word it has to be decoded.
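The construction of the Hamming code and its perfectness can be checked by brute force in the smallest interesting binary case. The following Python sketch assumes $q = 2$ and $r = 3$ (so $n = 7$), a choice made only for this illustration; over $\mathbb{F}_2$ every non-zero vector is its own direction, so the check matrix is taken to have all non-zero vectors of $\mathbb{F}_2^3$ as its columns. All names (`COLUMNS`, `syndrome`, `ball_size`, `decode`, `CODE`) are hypothetical and not from the text.

```python
from itertools import product
from math import comb

R = 3              # assumed number of check equations for this example
N = 2**R - 1       # n = (q^r - 1)/(q - 1) with q = 2

# Over F_2 each non-zero vector spans its own line through the origin,
# so the columns of the check matrix are all non-zero vectors of F_2^r.
COLUMNS = [v for v in product((0, 1), repeat=R) if any(v)]


def syndrome(x):
    """Compute x H^t over F_2, where column i of H is COLUMNS[i]."""
    return tuple(sum(x[i] * COLUMNS[i][j] for i in range(N)) % 2 for j in range(R))


def hamming_distance(x, y):
    """Number of positions in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def ball_size(n, q, e):
    """Number of words within Hamming distance e of a fixed word in F_q^n."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(e + 1))


def decode(received):
    """Correct at most one distorted letter by means of the syndrome."""
    s = syndrome(received)
    if not any(s):
        return tuple(received)
    i = COLUMNS.index(s)   # a single error at position i has syndrome COLUMNS[i]
    corrected = list(received)
    corrected[i] ^= 1
    return tuple(corrected)


# The code consists of all words with zero syndrome.
CODE = [x for x in product((0, 1), repeat=N) if not any(syndrome(x))]
MIN_DISTANCE = min(hamming_distance(x, y) for x in CODE for y in CODE if x != y)
```

In this case the code has $2^4 = 16$ words, each ball of radius 1 contains $1 + 7 = 8$ words, and $16 \cdot 8 = 2^7$, so the balls exhaust the whole space. The syndrome-based `decode` is one common way to find the center of the ball containing a received word; the text itself does not prescribe a decoding procedure.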
As another example, let us consider the radar codes. One of the best known medieval mathematicians was Leonardo (from Pisa, with the nickname Fibonacci$^{102}$). His most important work concerned the completion and systematization of arithmetic, which he had learnt from the Arabs. Through his treatise "Liber Abacci" (1201) his results became known in Europe. Fibonacci numbers are widely known; this is the sequence 1, 1, 2, 3, 5, 8, 13, 21, ..., whose members $(F_n \mid n = 0, 1, \ldots)$ may be found from the relation $F_n = F_{n-1} + F_{n-2}$ ($n \ge 2$), assuming that $F_0 = F_1 = 1$. More generally, let us consider sequences $y = (y_n \mid n = 0, 1, \ldots)$ satisfying a homogeneous linear recurrence equation, that is, a relation of type
$$a_0 y_i + a_1 y_{i-1} + \cdots + a_k y_{i-k} = 0, \qquad i = k, k + 1, \ldots,$$
where we agree that $a_0 \neq 0$ and further that, in the interest of the present context, the coefficients $a_i$ and the members of the sequence are taken in the Galois field $\mathbb{F}_q$. Fixing the initial values $y_0 = c_0, \ldots, y_{k-1} = c_{k-1}$, this equation gives us as solution a sequence $(c_n \mid n = 0, 1, 2, \ldots)$, the components of which can be found from the formula
$$c_i = -a_0^{-1} \cdot \sum_{j=1}^{k} a_j c_{i-j}, \qquad i = k, k + 1, \ldots$$
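The recurrence can be turned directly into a small program. The sketch below is an illustration under stated assumptions: it treats only the case where $q$ is a prime, so that arithmetic in $\mathbb{F}_q$ is arithmetic modulo $q$ (a prime power $q$ would need genuine finite field arithmetic), and the names `recurrent_sequence`, `BINARY_COEFFS` and `CODEWORDS` are hypothetical. As a test case it uses the binary recurrence $y_i + y_{i-3} + y_{i-4} = 0$, that is, $(a_0, \ldots, a_4) = (1, 0, 0, 1, 1)$.

```python
from itertools import product


def recurrent_sequence(initial, coeffs, q, length):
    """Extend initial = (c_0, ..., c_{k-1}) to `length` terms over F_q, q prime.

    coeffs = (a_0, a_1, ..., a_k) with a_0 != 0 mod q; each new term is
    c_i = -a_0^{-1} * sum_{j=1..k} a_j * c_{i-j}  (mod q).
    """
    a0, rest = coeffs[0], coeffs[1:]
    k = len(rest)
    if len(initial) != k:
        raise ValueError("need exactly k initial values")
    inv_a0 = pow(a0, -1, q)  # modular inverse (Python 3.8+); fails if a_0 = 0 mod q
    c = list(initial)
    for i in range(k, length):
        s = sum(a * c[i - j] for j, a in enumerate(rest, start=1))
        c.append((-inv_a0 * s) % q)
    return tuple(c)


# Assumed example: q = 2, k = 4, recurrence y_i + y_{i-3} + y_{i-4} = 0.
BINARY_COEFFS = (1, 0, 0, 1, 1)
CODEWORDS = [
    recurrent_sequence(start, BINARY_COEFFS, 2, 15)
    for start in product((0, 1), repeat=4)
]
```

The same function reproduces the Fibonacci numbers when called with `coeffs=(1, -1, -1)`, initial values `(1, 1)` and any prime `q` larger than the terms requested. For the binary example, the initial word `(0, 0, 0, 1)` yields `(0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1)`, after which the sequence repeats.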
A useful detail: if we interpret the solution $y$ as a (formal) series $y = c_0 + c_1 x + c_2 x^2 + \ldots$, then this series arises as the quotient of two polynomials $c(x)/a(x)$, where the degree of $c(x)$ is less than $k$ and $a(x) = a_0 + a_1 x + \cdots + a_k x^k$ is the (left) characteristic polynomial of the equation under consideration. A radar code encodes a $k$-sequence $(c_0, c_1, \ldots, c_{k-1})$ written in the input alphabet $\mathbb{F}_q$ as an infinite recurrent sequence $c = (c_n \mid n = 0, 1, \ldots)$, which is determined as the solution of the above recurrence equation under the initial condition $(c_0, c_1, \ldots, c_{k-1})$. If, in addition, $a_k \neq 0$ in this equation, then the radar code determined by it generates only periodic sequences $(c_n)$, that is, there exist integers $p$ and $t$ such that $c_i = c_{i+p}$ for all $i \ge t$. For instance, taking $q = 2$, $k = 4$ and the equation $y_i + y_{i-3} + y_{i-4} = 0$ we obtain sequences with period $p = 2^4 - 1 = 15$. The error correcting properties of this radar code depend on the fact that $2^4 = 16$ distinct initial sequences (these are words of length 4 in the alphabet $\mathbb{Z}_2$) generate 16 distinct code words of length 15 and that their set is closed under addition as well as (obviously) under multiplication by the scalars 0 and 1. Hence, $C$ turns out to be a 4-dimensional subspace in the 15-dimensional space $\mathbb{Z}_2^{15}$, in which the minimal distance of any two code words is $2^{4-1} = 8$. Therefore the radar code described can recognize $2^{4-2} = 4$ errors, and correct $2^{4-2} - 1 = 3$ errors. The set $C$ may be realized as a simplex, so this code is also known as a simplex code. The code $C^\perp$ dual to it is the widely known binary Hamming code of length 15, which in the notation above is a $(15, 11)$-code.

An example of the effectiveness of radar codes is the fourth test of A. Einstein's theory of gravitation. For a long time three experimental facts validating this theory (1915) have been known: the precession of the perihelion of the orbit of Mercury; the bending of light rays near the Sun; and the gravitational red shift.
The fourth effect (the slowing down of electromagnetic radiation in the gravitational field) was checked only half a century later. To this end one measured the arrival of the echo of a radar signal from Mercury both when Mercury was obscured by the Sun (in this case the energy of the echo is $10^{-27}$ of the emitted energy!), as well as when it was not obscured. With the aid of a suitable radar code one succeeded in determining the time difference in the arrival of the echo.

$^{102}$ Translator's note. The commonly used name form Fibonacci came into use only in the course of the 19th century, presumably through the influence of the Italian mathematician and mathematical historian Guglielmo Libri Carucci della Sommaja (1803–1869). Leonardo himself wrote (in Latin) Leonardo filio Bonnaci.

Interest in mathematical coding theory spread in particular after Shannon's result [12] regarding the possibilities of good transfer of information in noisy binary symmetric channels (BSC). For such a channel $(S, V; P)$ one has $S = V = \mathbb{Z}_2 = \{0, 1\}$, while the elements of the matrix $P$, the conditional probabilities $p(i|j)$, are given by the rules: $p(1|0) = p(0|1) = p$ (probability of error), and $p(0|0) = p(1|1) = 1 - p$ for some $p \in [0, 1]$. The rate of transmission for this channel is defined as the ratio between the number of bits appearing in the original message and the total number of bits input into the channel, where the latter number also includes the bits added in the coding of the message. According to Shannon's theorem, transmission of information through a noisy symmetric channel is possible at a given positive rate of transmission while at the same time guaranteeing the correct reception of an initial message with probability close to one. Supplementary information about codes can be found in [14].
9.4. Conclusion

One may remark that, despite the ancient origin of the problem of information transfer, some of the questions connected with it are still of interest and that often they are only beginning to become accessible to research. This problem has provided the motivation for, and is the real touchstone in, the development of biology as well as of several combinatorial theories within mathematics. In connection with the genetic language one could note two questions which, presumably, offer a continued interest. First, in which way can the genetic code be viewed as an error detecting and error correcting code? Second, how can one explain the continuity properties of the genetic language, that is, in which cases (always?) and why does closeness between some codons generate corresponding polypeptide chains which are similarly close to each other? Attempts to determine the nearness of codons with the aid of the Hamming distance (modified, weighted, etc.) have so far not given any result. It is the author's conjecture that this can be realized via a Grothendieck topology. Related problems are also connected with the model of the Dutch mathematician De Bruijn [1] regarding transfer of information in the (human) brain, as well as with the related notion of Grothendieck continuity within the realm of automata.

References

The references [2], [8], [17], and [18] are supplied by the Editors. The works [3], [4], [5], [7], [9], [10] below are actually not cited in the paper. They are kept here because they appear in the original publication.

[1] N. G. de Bruijn. A model for information processing in human memory and consciousness. Preprint (2.11.1993). Dept. of Math. and Comp. Sci. of Techn. Univ. Eindhoven, 1993.
[2] R. E. Horton. Erosional development of streams and their drainage basins, hydrophysical approach to quantitative morphology. In: Bull. Geol. Soc. of America, Vol. 56, 1945, 275–370.
[3] A. Jaffe and F. Quinn. Theoretical mathematics: towards a cultural synthesis of Mathematics and Theoretical Physics. Bull. Am. Math. Soc. 29(I), 1993, 1–12.
[4] U. Kaljulaid. An invited review of the book "Discrete Mathematics and Algebra Structures", by S. Gerstein. In: Acta Appl. Math., Vol. 22. Freeman & Co., N. Y., 1987, 325–329.
[5] J. Kiho. Algoritmid ja nende struktuurid, Tartu, 1994. (In Estonian).
[6] S. Marcus. Linguistic structures and generative devices in molecular genetics. Cahiers de linguistique théorique et appliquée 11(2), 1974, 77–104.
[7] D. Mumford. Picard groups and moduli problems. In: Arithmetical Algebraic Geometry, N. Y., 1965, 33–81.
[8] Z. Pawlak. Gramatyka i matematyka. Państwowe Zakłady Wydawnictw Szkolnych, Warszawa, 1965. (In Polish).
[9] H.-O. Peitgen, H. Jürgens, and D. Saupe. Chaos and Fractals. Springer-Verlag, 1992.
[10] P. Prusinkiewicz, A. Lindenmayer, and J. Hanan. Developmental models of herbaceous plants for computer imagery purposes. In: ACM SIGGRAPH Computer Graphics, Vol. 22, 1988, 141–150.
[11] A. Salomaa. Formal Languages and Power Series. In: "Formal Models and Semantics", Handbook of Theor. Comp. Sci., Vol. B. Elsevier Science Publ. B.V., 1990, 103–132.
[12] C. E. Shannon. A Mathematical theory of communication. The Bell System Technical Journal 27, 1948, 379–423, 623–656.
[13] A. N. Strahler. Hypsometric (area-altitude) analysis of erosional topography. In: Bull. Geol. Soc. of America, Vol. 63, 1952, 1117–1142.
[14] H. C. A. van Tilborg. Error-correcting codes - a first course. Chartwell Bratt, Studentlitteratur, Lund, 1993.
[15] M. Vauchaussade de Chaumont. Nombre de Strahler des arbres, langages algébriques et dénombrement de structures secondaires en biologie moléculaire. Thèse. Univ. de Bordeaux 1, 1985.
[16] X. Viennot, G. Eyrolles, N. Janey, and D. Arqués. Combinatorial analysis of ramified patterns and computer imagery of trees. In: ACM SIGGRAPH Computer Graphics, Vol. 23, 1989, 31–40.
[17] M. Vauchaussade de Chaumont and X. G. Viennot. Enumeration of RNA's secondary structures by complexity. In: Mathematics in Medicine and Biology, Lecture Notes in Biomath., Vol. 57. Springer, Berlin–New York, 1985, 360–365.
[18] M. S. Waterman. Secondary structure of single stranded nucleic acids. Adv. Math. Suppl. Stud. I, 1978, 167–212.
Index of Names Abel, Niels Henrik, 3, 4, 8, 24, 25, 27, 40, 51, 69, 70, 72, 73, 75, 78, 79, 81, 82, 86–90, 92, 97, 106, 107, 118–121, 123, 124, 128, 129, 134, 259, 272, 345, 348, 351, 368–370, 377, 378, 381, 382, 384, 387, 389, 392, 395, 399, 405–407, 411, 435, 436, 442, 444 Alameddine, Ahmad Fawzi, 247 Aleksandrov, Pavel Sergeevich, 203 Aleksandrov (Alexandroff), Aleksandr Danilovic, 227 Alexander the Great, 428 Alexander I, ix Almkvist, Gert, ix, 243, 286, 427, 445 Ameling, Friedrich, 311 Amitsur, Shimshon Avraham, 282 Anderson, Ian, 208 Ando, Tsuyoshi, 221 Andrunakievich, Vladimir Aleksandrovich, 122 Apollonius of Perga, 428 Arakelov, Suren Yu, 441 Arazy, Jonathan, 233 Archimedes of Syracuse, 428 Argand, Jean Robert, 364 Artin, Emil, 74–77, 83, 84, 106, 107, 437 Bézout E., 358, 391 Bachet, Claude, 429 Backlund, Helge Gotrik, 292, 294 Backlund, Hjalmar, 294 Backlund, Johan Oskar, 291–294 Backlund, Ulrika Catharina, 292, 294 Backlund-Celsing, Elsa Carolina, 292 Bahturin, Yuri A., xxiv Banachewski, Bernhard, 94, 108 Barbilian, Dan (Barbu, Ion), 117 Bashmakova, Isabella Grigoyevna, 311, 327 Beckenbach, Erwin F., 226 Beilinson, Alexander, 353 Bell, Eric Temple, 239 Bellman, Richard, 226 Belousov, V. D., 348 Belski˘ı, A., 353 Beltrami, Eugenio, 259
Bergman, George, 16, 17, 21, 43, 63, 105, 123, 283 Berkovich, Vladimir, 353 Bertrand, Joseph Louis François, 362 Betti, Enrico, 371, 436, 438 Bidder, Georg Friedrich Karl Heinrich, 321 Birch, Bryan John, 349, 439 Birkhoff, Garrett, 15, 21, 26, 44, 45, 103–105, 155, 207, 221 Björner, Anders, 203 Blauert, Marianne, ix Bogomolov, Fedor Alekseevich, 443 Bokut, Leonid Arkadievich, ix, 269 Boltzmann, Ludwig, 416 Bombelli, Rafael, 429 Bombieri, Enrico, 443 Booth, Laura, 299 Booth, Lorentz, 299 Borevich, Zenon Ivanovich, xii Bourbaki, Nicolas, 352, 355, 371, 399, 440 Bovdi, Adalbert, 22, 88 Brandt, Kerstin, ix Brauer, Richard Dagobert, 386 Brenil, C., 445 Brouwer, Luitzen Egbertus Jan, 353 Brualdi, Richard A., 213, 221 Bruck, Richard Hubert, 348 Bruhat, François Georgwe René, 221 Brunner, Georg Bernhard, 320 Buckley, Joseph T., 71, 85 Bulman-Fleming, Sydney, 137 Burnside, William, 254, 257, 258, 261, 377, 385, 386 Cantor, Georg Ferdinand Ludwig Philipp, 332, 353, 410, 416 Capelli, Alfred, 259 Cardano, Geronimo, 357, 358, 363, 364 Carrol, Lewis , 356, 359, 362 Cartan, Élie Joseph, 257, 267, 268 Cartan, Henri, 5, 355, 389 Castelnuovo, Guido, 406 Catalan, Eugène Charles , 449 Catharine I (Martha Skovronska), 292
Cauchy, Augustin Louis, 226, 227, 261, 276, 361, 362, 368 Cayley, Arthur, 254, 257, 259, 261, 267–269, 424 Chabauty, Claude, 441 Chasles, Michel, 408 Chebotarev, Nikolai Grigorievich , 441 Cherednik, Ivan, 353 Chern, Shiing-Shen, 227, 443 Chernikov, Sergei Nikolaevich, 77 Chevalley, Claude, 288 Clebsch, Rudolf Friedrich Alfred, 259 Clemens, Charles Herbert, 406 Cobos, Fernando, xvi Cobos, Luz, xvi Cohen, I. S., 3, 9 Cohn, Paul Moritz, 21, 68, 328, 390, 391 Connell, Ian, 71 Conrad, B., 445 Coxeter, Harold Scott MacDonald, 266 Crelle, August Leopold, 370 Cremona, Antonio Luigi Gaudenzio Giuseppe, 259, 269 Cruse, Allan B., 212, 221 Culik II, Karel, 179, 180 Curie, Pierre, 408 Currie, James D., 250 Cwikel, Michael, ix, 214, 233 d’Alembert, Jean Le Rond, 364 Dade, Everett C., 273, 274 Danilov, Volodymyr Yakovych, 353 Darboux, Jean Gaston, 267 Dassel, Egbert, 304 de Bruijn, Nicolaas Govert, 456 de Saint-Exupéry, Antonie Marie Roger, 373 de Moivre, Abraham de, 362, 381 Dedekind, Julius Wihelm Richard, 254, 267, 431 Dehn, Max Wilhelm, 269 Deligne, Pierre, 427, 436, 439–441 Demidov, E. E., 353 Descartes, René, 357, 358, 407, 439 Deskins, Wilbur Eugene, 120, 121 Diamond, F., 445 Dicks, Warren, 286 Dieudonné, Jean Alexandre Eugéne, 258, 352 Dilworth, Robert Palmer, 243, 249 Dimberg, Sven, ix Diophantus of Alexandria, 427–429, 432–436, 439, 440, 442, 444, 467 Dirac, Paul Adrien Maurice, 409, 411 Dirichlet, Johann Peter Gustav Lejeune, 221, 429 Dolgaev, Sergey Ivanovich, 9 Dolotin, Valeri V., xvi Drensky, Vesselin Stoyanov, 283 Drinfel’d, Vladimir Gershonovich, 353 Duffus, Dwight, 250
Dynkin, Eugene Borisovich, 266 Eagon, John, 273, 274 Eastwood, David, 221 Egorychev, Georgiy Petrovich, 209, 221, 222, 225, 227, 230 Ehrenfest, Paul, 415 Eicheldinger, Martina, ix Eilenberg, Samuel, 5, 21, 43, 416, 425 Einstein, Albert, 411, 455 Eisenstein, Ferdinand Gotthold Max, 259, 363, 390, 402 El Hushi, 353 Encke, Johann Franz, 292, 293 Eneroth, Bertil, xvi Engel, Friedrich, 266, 269 Engliš, Miroslav, ix Eratosthenes of Cyrene, 428 Erik XIV, xvi Ershov Andrei Petrovich, 448 Euclid of Alexandria, 266, 362, 390, 407, 408, 428 Euler, Leonhard, 249, 250, 269, 292, 351, 358, 363, 416, 429, 430, 439 Faisal Ibn Abdul Aziz Al Saud, 353 Falikman, Dmitry I., 209, 225, 228, 230 Faltings, Gerd, 427, 432, 436, 441–443 Fan, Kenneth, 221 Fano, Gino, 406 Farkas, David K., 221 Feit, Walter, 373, 385–387 Feller, Edmund H., 117 Fermat, Pierre, 427, 429, 436, 437, 440–443, 445 Ferro, Scipione, 357 Fibonacci, Leonardo, xxiii, 245, 247, 249, 357, 450, 455 Filep, László, ix, 208 Fiore, Antonio Maria, 357 Formanek, Edward, 203, 271, 283–286, 288 Forsyte, 122 Fossum, Robert M., 286 Fourier, Jean Baptiste Joseph, 444 Fowler, Kenneth Arthur, 386 Fox, Ralph, 61 Frey, Gerhard, 442, 443 Frobenius, Georg Ferdinand, 208, 222, 225, 229, 231, 253, 254, 257, 261, 267–271, 276, 353, 385 Frumkin, M. A., 353 Gödel, Kurt, 415 Gårding, Lars Jakob, 225–228 Gabovitsh, Evgeniˇı, 328, 373
Galois, Évariste, vii, xxii, 269, 270, 355, 363, 370, 371, 373, 387, 389, 399–408, 411, 415, 417, 425, 436, 455 Gauss, Johann Carl Friedrich, 259, 266, 267, 269, 351, 353, 362–364, 367, 370, 399, 410, 429, 431, 432, 436, 443, 444 Geissinger, Ladnor, 221 Gel’fand, Israil Moiseevich, xvi Gel’fond, Aleksandr Osipovich, 332 Geronimus, A. Yu., 353 Girard, Albert, 363, 364 Give’on, Yehoshafat, 170 Glushkov, Victor Mihaylovich, 21, 68, 416 Gluskin, Lazar Matveevich, xiv, 20 Goethe, Johann Wolfgang, 334, 344 Govorov, Valentin Evgenevich, 127, 138 Grassmann, Hermann Günter, 282 Griffiths, Phillip Augustus, 406 Grinberg, A. S., 20 Grossman, Marcel, xvi Grothendieck, Alexander, xii, 3, 5, 6, 9, 144, 183, 203, 399, 439, 440, 444, 456 Gruenberg, Karl W., 22, 72, 76, 77, 88, 96, 102, 107 Gustafson, William H., 257, 267 Gustavsson, Jan, ix Gustavus, Adolphus, viii Gyldén, Hugo, 293, 305, 306 Hölder, Otto , 165, 250, 379, 383, 385, 423, 424 Hörmander, Lars Valter, xvi, 226 Haber, Semyour, 385 Hadamard, Jacques Salomon, 233 Hall, Philip, 72, 96, 208, 221, 386, 424 Halpin, Patrick, 283 Hamilton, William Rowan, 123, 127, 130, 257, 267, 269 Hamming, Richard Wesley, 453, 454 Hankel, Hermann, xvi Hansen, Peter Andreas, 306 Harary, Frank, 247 Hardy, Godfrey Harold, 207, 221, 233 Hartley, Brian, 22, 71, 85, 86, 88, 97, 107 Hartshorne, Robin, 12, 13 Hasse, Helmut, 347, 407, 438, 439 Hawkins, Thomas W., 267, 268 Heath-Brown, D. Roger, 432 Helmling, Peter, 266, 321 Henno, Jaak, 68 Hermite, Charles, 207, 227, 232, 233, 259, 266 Hertz, Heinrich, 409 Hesse, Ludwig Otto, 268 Higman, Graham, 64, 386 Hilbert, David, vii, 5, 203, 235, 258, 259, 272, 297, 332, 345, 399, 406, 407, 409, 413, 431, 441, 444
Hion, Jaak, xiii–xv Hochschild, Gerhard Paul, 16, 35, 104 Hochster, Melvin, 273, 274 Hoffman, Allan J., 212 Horton, Robert Elmer, 447 Hotz, Günter, 183 Hudde, Jan, 357 Hughes, Ian, 221 Hurwitz, Adolf, 269, 298, 345, 349 Höhn, Gerald Helmut, 353 Irwing, Washington, 444 Iskovskih, Vasili Alexeevich, 353, 406 Jaakson, Hermann, xi Jacobi, Carl Gustav Jacob, 259, 416, 437, 441, 444 Jacobson, Nathan, 17, 64 Janson, Svante, xvi Jansson-Peetre, Eila Ritva, ix, xvi, xvii Johnson, Kenneth W., 271 Johnsson, Margreth, ix Jordan, Camille , 165, 250, 371, 383, 385, 423, 424 Kämtz, Ludwig Friedrich, 321 König, Denes, 208, 222, 225, 231 Kaarli, Kalle, ix, 19, 111 Kaasik,Ülo, xxiii, 427 Kac, Mark, 445 Kadikis, Peteris, 268 Kalin, 283 Kaljulaid, Elmar, xi Kaljulaid, Uno, vii, xi–xix, xxi, 13, 17, 143, 145, 207, 214, 243, 284, 291, 311, 366, 411 Kalman, Rudolf Emil, 170, 425 Kaluzhnin (Kalujnin), Lev Arkad’evich, 22, 32, 68, 70, 71, 82, 84, 89, 97, 108, 162 Kanevski˘ı, D., 353 Kangro, Gunnar, xi, 340 Kanunov, Nikolai Feodorovich, 265, 269, 289, 311 Kaplansky, Irving, 94–96, 102 Kapranov, Mihail M., 353 Katsov, Yefim, 127, 137, 138 Katz, Matthew J., 211 Katzman, Simha Idelevich, 111 Kaufmann, Ralph M., 353 Kelly, Annela, xv, 207 Kemer, Aleksandr Robertovich, 269 Kennel, Julius Thomas, 319 Kepler, Johannes, 428 Kharchenko, Vladislav Kirillovich, 288 Khoai, Kha Huy, 353 Kii, K., 353 Killing, Wilhelm Karl Joseph, 253, 266, 267
Kilp, Mati, xiii Kingissepp, Viktor, 317 Kiselman, Christer, ix Kiselman, Dan, ix Kivinukk, Andi, ix Kleene, Stephen Cole, 145, 450 Klein, Felix Christian, 253, 254, 258, 266, 269, 270, 355, 364, 366, 385, 408–411 Kneser, Adolph Hermann, 298 Kneser, Friederike Wilhelmine Filippe Augusta, 298 Kneser, Helmuth, 297 Kneser, Julius Carl Christian Adolf, 291, 297–301 Kneser, Lorents Friedrich, 297 Kneser, Martin, 12, 297 Knuth, Donald Ervin, 183 Koch, 117 Koch, Richard, ix Kodaira, Kunihiko, 441, 443 Kolmykov, Vladislav Alekseevich, 353 Kolyvagin, Victor A., 353 Kostrikin, Aleksei Ivanovich, xxiv Koval’skiˇı Nikolai Pavlovich, 268 Krakowski, Don, 283 Krasner, Marc, 32, 68, 162 Krohn Kenneth, 68, 165, 417, 422–425 Kronecker, Leopold, 130, 267, 269, 297, 298, 331, 332, 405–407, 431, 441 Krull, Wolfgang, 26, 286 Kruus, R., xxi Krylov, Petr Andreevich, ix, 312 Kummer, Ernst Eduard, 267, 429, 431, 432 Kurchanov, Pavel Fedorovich, 353 Kurosh, Aleksandr Gennadievich, xiii, xxiv, 203, 328, 340 Kurter, 117, 118 Kuzmin, Evgeniˇi N., 68 Künneth, Hermann, 6
Lee, Tsung-Dao, 445 Lefschetz, Solomon, 440 Legendre, Adrien-Marie, 429, 444 Leibniz, Gottfried Wilhelm, 358, 407 Leites, Dimitry Alexander, 353 Lembra, Jaak, xxiii Lenin (Ulyanov), Vladimir Ilych, xxi, 351 Levin, Andrey, 353 Levitzki, Jacob, 282 Lewin, Jacques, 16, 17, 21, 43, 63, 105, 123, 283 Lexell, Anders Johan, 292 Li, Winnie, 283 Libri, Guglielmo, 357, 455 Lie, Marius Sophus, 101, 254, 257, 260, 266–269, 327, 345, 352, 355, 387, 408, 409 Lindemann, Carl Louis Ferdinand, 395 Lindenmayer, Aristid, 448, 451 Lindhagen, Georg, 292 Lindstedt, Anders, 266, 291, 303–310 Lindstedt, Ewa, 308 Lindstedt, Folke, 308 Lindstedt, Hilda, 308 Lindstedt, Samuel, 308 Liouville, Joseph, 332, 370 Lipyanskiˇı, Ruvim, ix, 15, 17 Littlewood, John Edensor, 207, 221, 233 Liu, Bo Lian, 213 Lobachevsky, Nikolai Ivanovich, 269, 407 Loewner, Charles, 234 London, David, 230 Lorentz, Hendrik Antoon, 225, 409 Lovász, László, 221 Lucas, François Edouard Anatole, 245 Luh, Jiang, 122 Luigi Ferrari, 357, 358 Lumiste, Ülo, ix, xiii, 338 Lusztig, George, 203 Lyapin, Evgeniˇı Sergejevich, xiv Lyapunov, Aleksandr Mihailovich, 268
Lüroth, Jacob, 353, 406 Lagrange, Joseph-Louis, 349, 351, 355, 358, 359, 361, 362, 364, 368, 374, 382–384, 399, 402, 410, 429 Laguerre, Edmond Nicolas, 241 Lah, Ivo, xxiii, 239, 241 Lajos, Sándor, 122 Lamé, Gabriel, 429 Landau, Edmund, 68 Lang, Serge, 328, 352, 441, 444 Langlands, Robert Phelan, 440 Laptev, German Fedorovich, 408 Laud, Peeter, xv, xviii Lazard, Daniel, 127, 138 Lebedev, D. R., 353 Lebesque, Henri Léon, 429
Mädler, Johann Heinrich, 321 Müürsepp, Peeter, 301 Mac Lane, Saunders, 35 Macauly, F. S., 3, 9 MacKoy, 122 Magnus, Wilhelm, 96 Mal’cev, Anatoly Ivanovich, 20, 22, 86, 91, 96, 101, 102, 108, 130, 253, 269, 415 Mal’cev, Yuriˇi N., 67, 68, 105 Manin, Yuri Ivanovich, vii, xii, xiv, 348, 351–353, 406, 407, 416, 441 Marcus, Marvin, 228, 231, 232 Marcus, Solomon, 451 Markov, Andrei Andreyevich, 145 Marshall, Albert W., 221 Martinson, Indrek, ix
Martynov, B., 353 Maschke, Heinrich, 80 Mathieu, Emile Léonard, 387 Matiyasevich, Yuri Vladimirovich, 440 Mauchly, John William, 413 Maxwell, James Clerk, 409 Mayer, Christian Gustav Adolph, 299 Mazur, Barry Charles, 435 Mc Culloch, Warren Sturgis, 414, 415, 421 McDowell, Kenneth, 137 McMullen, P., 221, 222 Mealy, George, 45, 145, 147, 152, 154, 168 Melin, Anders, xvii Menal, Pere, 117, 120, 123 Menger, Karl, 68 Menskiˇı, Michail Borisovich, 43 Meriste, Merik, xxiii, xxiv Merkulov, Segei A., 353 Mihalev , Alexander Vasilyevich, 20, 22 Mihovski, Stoyl Vassilev, 117, 123 Miljan, Riina, xv, 111 Miller, George Abram, 375 Milne, Alan Alexander, 327 Minc, Henryk, 225, 228, 232 Minding, Ernst Ferdinand Adolf, 266 Minh, Hoang Le, 353 Minkowski, Hermann, 269, 347 Mirsky, Leon, 212, 221 Miyaoka, Yoichi, 443 Molien, Andrei [Andrew], 265 Molien, Benedikt, 312 Molien, Eduard, 265 Molien, Elise, 312 Molien, Johan, 265 Molien, Theodor (Molin, Fedor Eduardovich), vii, xv, xxiii, 222, 253–255, 257–262, 265–272, 274–277, 281, 286, 287, 291, 311–315, 385 Molotov, Vyacheslav Mihailovich, viii Moore, Edward F., 68, 156–158, 168, 169, 172, 173, 175, 179, 421 Mordell, Louis Joel, xxiv, 328, 339, 345, 346, 349, 352, 353, 427, 435, 436, 440–444 Muir, Thomas, 227 Mumford, David, 183 Munn, Walter Douglas, 221 Mustafin, G. A., 353 Myhill, John, 171, 419, 420 Myrberg, Caroline, ix Néron, André, 352 Nagata, Masayoshi, 3, 64 Nagell, Trygve, 349 Nano, Villem, xxi Nemmers, Frederic Esser, 353 Nerode, Anil, 171
Netto, Eugen Otto Erwin, 406 Neumann, Bernhard Hermann, 20, 21, 58 Neumann, Hanna, 20, 21, 58, 101 Neumann, Peter M., 20, 21, 58 Newman, Morris, 228, 231 Newton, Isaac, 266, 364, 365, 369, 428 Nikolskii, Aleksandr Vadimovich, ix, 312 Noether, Emmy, xv, 5, 9, 73–76, 83–85, 106, 107, 155, 257, 258, 267, 269, 273, 281, 399, 406 Nuut, Jüri, xxi Oettingen, Arthur Joachim, 297, 305, 307 Oettinger, Arthur Joachim, 266 Ol’shanskiˇı Alexander Yu., xxiv Olkin, Ingram, 221 Oort, Frans, 441 Ore, Oystein, 127, 131, 132 Ostrowsky, Alexander Markowich, 207, 221, 233 Pólya, George, 207, 239, 249, 254, 258, 261, 262, 275 Palowsky, Karl Rudolph, 306 Panchishkin, Alexei, 353 Parshin, Aleksey Nikolaevich, 441–443 Pasch, Moritz, 409 Passman, Donald, 22 Pawlak, Zdzisław, 451 Pearson, Kenneth Robert, 221 Peetre, Inga-Britt, ix Peetre, Jaak, vii, xvi, xvii, xix, 15, 19, 101, 143, 145, 203, 207, 221, 222, 225, 233, 243, 245, 253, 257, 265, 291, 447 Peetre, Jakob-Sebastian, ix Peirce, Benjamin, 314 Peirce, Charles Sanders, 314 Penjam, Jaan, xv, xxiii, xxiv, 143, 183, 203 Penkov, Ivan, 353 Perkmann, Monika, ix Perron, Oskar, 229, 231 Persson, Ann-Christin, ix Persson, Ulf, ix, 399, 411 Peter the Great (Romanov, Pjotr Alexeiovich), 292 Petri, Carl Adam, vii, 203 Picard, Emile, 183, 267 Pick, Georg, 234 Pierce, Richard S, 257 Pikkmaa, Tiit, xv, xxiii Piltz, Anders, ix Pitts, Walter H., 414, 415, 421 Plato, 411 Platonov, Vladimir Petrovich, 442 Plotkin, Boris Isakovich, vii, ix, xii–xiv, 15, 17, 20, 22, 24, 30, 42, 68, 71, 75, 86, 88, 97, 101, 106, 108, 127
Poincaré, Jules Henri, 254, 267, 272, 349, 407, 409, 435, 441 Poisson, Siméon Denis, 444 Pontryagin, Lev Semenovich, 407 Popov, Vladimir Leonidovich, 282, 283 Postnikov, Mihail Mihailovich, 389 Prank, Rein, 427 Procesi, Claudio, 283 Prodinger, Helmut, 245 Proskurowski, Andrzej, 247 Pythagores, 427 Quillen, Daniel Grey, 13 Rägo, Gerhard, xi Rödl, Vojtech, 250 Rado, Richard, 219, 221 Ramanujan, Srinivasa Aiyangar, 440 Rankin, Robert Alexander, 440 Raynaud, Michel, 442 Razmyslov, Yuriˇi P., 68 Redfield, J. Howard, 261, 275 Rees, Mina, Spiegel, 113 Regev, Amitai, 283 Remak, Robert, 26, 64, 72, 76, 84 Renner, Johann, xvi Rhodes, John, 68, 165, 417, 422–425 Ribbentrop, Joachim, viii Riemann, Bernhard, 338, 340, 344, 345, 373, 407, 408, 435, 436, 438, 439, 441, 444 Roitman, A. M., 353 Rolle, Michel, 226 Roos, Jan-Erik, xii, xvi, 3, 13 Roseblade, James Edward, 22, 96, 107 Rosenfeld, A, 385 Rosengren, Hjalmar, xix Rota, Gian-Carlo, 221, 239 Rothe, Peter, 363 Ruelle, David, 445 Ruffini, Paolo, 361, 368, 369, 384, 389, 392, 395, 399, 405 Ryser, Herbert John, 221 Saburov, Andrei, 293, 294, 304, 319 Sandling, Robert, 22, 97, 107 Sands, Bill, 250 Sarv, Jaan, xi Schützenberger, Marcel-Paul, 43, 449 Scheffers, Georg, 267 Schlömilch, Oscar Xavier, 317 Schmidt, Erhard, 26, 286 Schmidt, Friedrich Karl, 347, 438 Schock, Rolf, 353 Schroeter (Schröter), Heinrich Eduard, 299 Schur, Friedrich Heinrich, 266, 269, 385 Schur, Issai, 207, 221, 233–235, 269, 280, 281
Schwarz, Peter Carl Ludwig, 266 Selberg, Atle, 353 Serganova, Vera V., 353 Serre, Jean-Pierre, xii, 5, 7, 8, 12, 13, 440, 442–444 Serret, Joseph Alfred, 371 Shabat, George, 353 Shafarevich, Igor Rostislavovich, 351, 352, 356, 441, 444 Shain, Aleksandr, 122 Shannon, Claude Elwood, 456 Shannon, R. T., 137 Shenkman, 89 Shephard, G. C., 275, 288 Shermenev, Alexander Mihailovich, 353 Shevrin, Lev Naumovich, xiv, xxiii, 111–113, 127 Shimura, Goro, 407, 445 Shmel’kin, Alfred Lvovich, 20, 21, 58, 101 Shokurov, Vyacheslav V., 353 Sibley, David, 271 Siderov, Plamen N., 284 Siegel, Carl Ludwig, 328, 435, 441 Singer, Isadore M., 413 Skornyakov, Lev Anatolyevich, 127, 221 Skorobogatov, Alexei Nikolayevich, 353 Sloane, Neil James Alexander, 275 Smith, Patrick F., 75, 77, 106 Sokratova, Olga, ix, xiv, xxiv, 127, 138 Spanne, Sven, ix Sparr, Gunnar, ix Spivak, Michael David, ix Stanley, Richard, 203, 249, 250, 261, 272, 275 Staude, Ernst Otto, 297, 298 Stein, Elias M., 235 Steinitz, Ernst, 409 Steklov, Vladimir Andreevich, 268, 297, 351 Stenström, Bo, 127, 137 Sternberg, Shlomo, 221 Stirling, James, xxii, xxiii, 239 Strahler, Arthur Newell, 447, 448, 451, 452 Struve, Friedrich Georg Wilhelm, 266, 268, 269, 292 Study, Eduard, 260, 266, 267, 269 Suprunenko, Dmitriˇi Alekseevich, 20, 284 Suslin, Andrei Aleksandrovich, 13 Suzuki, Michio, 386 Swinnerton-Dyer, H. Peter F., 349, 439 Sylow, Peter Ludwig Mejdell, 89, 374, 386 Sylvester,James Joseph, 259, 267 Szász, Ferenc A., 122 Szpiro, Lucien, 441 Tacitus, Publius Cornelius, viii Tallinn, Annika, ix, xviii, xix Tambour, Torbjörn, 243, 275, 276
Tamm, Hellis, ix Tamm, Marje, ix Tamme, Enn, ix, xxi, xxii, 144, 413 Tammeste, Rein, 413 Tammiksaar, Erki, ix Taniyama, Yutaka, 407, 439, 442, 445 Tannery, Paul, 368 Tartaglia, Niccolo, 357 Tate, John Torrence, 441, 442 Taylor, Brook, 450 Taylor, Richard, 445 Thompson, John Griggs, 373, 385–387 Tichy, Robert F., 245 Tits, Jacques, 440 Todd, J. A., 275, 288 Tolstoˇı, Dmitriˇı, 294 Traustason, Gunnar, ix, 373, 387, 424 Tschinkel, Yuri, 353 Tschirnhaus, Ehrenfried Walter, 357 Tsfasman, Michael A., ix, 351, 353 Tsygan, Boris L., 353 Tunnel, Jerrold Bates, 439 Turan, Paul, 391 Turing, Alan Mathison, 145 Tyshkevich, Regina Iosifovna, 284 Ufanrovsky, Victor, ix, 291 Vagner, V. V., xiv Vainberg, Yu., 353 Vainikko, Gennadi, xvi, xvii Vaintrob, Arkady Yu., 353 van der Waerden, Bartel Leendert, 207, 209, 222, 225, 328, 339 van Lint, Jacobus Hendricus, 225, 227 Van Tilborg, Henk, 456 Vandermonde, Alexandre-Théophile, 362, 370 Vanquois, Bernard, 451 Vauchaussade de Chaumont, Mireille, 451 Vene, Varmo, xv Verevkin, A. B., 353 Vershik, Anatoly Moyseevich, 221 Viéte, François, 347, 357, 359, 391, 396, 400, 404, 429 Viennot, Xavier, 449, 451 Vilyatser, V. G., 71 Visentin, Terry I., 250 Vishik, Mihail M., 353 Vladuts, Serge, 353 Volterra, Vito, 234 von Below, Joachim, 213 von Dyck, Walther , 449 von Neumann, John, 135, 207, 221, 410, 414–416, 444 Voronov, Alexander A., 353
Wagstaff, Samuel S., 432 Wallis, John, 364 Waterman, Michael, 451 Weber, Heinrich, 298, 364, 406 Wedderburn, Joseph Henry Maclagan, 257 Weierstrass, Karl Theodor Wilhelm, 254, 267, 297, 298, 308, 347, 349 Weihrauch, Anna Elisabeth, 321 Weihrauch, Filipp Alexander Robert, 320 Weihrauch, Karl, 291, 306, 309, 317–322 Weihrauch, Karl Ernest, 320 Weihrauch, Karolina Eliza Johanna, 320 Weihrauch, Matilde, 320 Weihrauch, Philipp, 321 Weil, André, 328, 339, 345, 349, 429, 435, 438–441, 443–445 Weiss, Guido, 235 Wessel, Caspar, 364 Weyl, Hermann Klaus Hugo, 257, 262, 266, 399, 416, 444 Wielandt, Helmut, 386 Wiener, Norbert, 415 Wiles, Andrew, 445 Wodzicki, Mariusz, 353 Woodrow, Robert, 250 Yang, Chen Ning, 445 Yaroslav the Wise, 266 Yau, Shing-Tung, 443 Young, Alfred, 277, 279 Zaharevich, Ilya, 353 Zalcstein, Yechezkel, 170 Zaleskiĭ, Alexander E., xii, 22 Zarhin, Yuri G., 353 Zariski, Oscar, 3, 12, 339 Zarkhin, Yuri Gennadievich, 441 Zeiger, H. Paul, 422, 423 Zelmanov, Efim Isaakovich, 269 Zhang, Genkai, xvi Zingel, Tiina, xv Zorn, Max August, 75 Zubkov, Aleksandr Nikolaevich, ix, 265, 268 Zuse, Konrad, 413
Subject Index
(X, Y)-automaton, 147 G-average, 217 G-co-expressions, 360 G-doubly stochastic matrix, 209 K-algebraic point, 340 K-rational point, 339, 340 RB⁻¹-act of fractions AB⁻¹, 132 T-ideal, 281 Λ-linear transition system, 171 Λ-monoid, 170, 171 Λ-monoid of inputs, 171 Ω-field, 129 Ω-ring, 128 Ω-ring of fractions, 132 L-fixed point, 210 N (2) -groups, 107 R-semigroup, 94 G-scheme, 275 k-characters, 286 m-linear form, 226 n-dimensional projective space, 334 n-focal, 92 n-stable representation, 108 n-th order general equation, 395 q-extension, 203 r-fold point, 342 x-sequence, 113 Diophantine equation, 427 Abelian extension, 406 Abelian group, 377, 381 Abelian sheaf, 3, 4 Abelian variety, 345 acceptable equivalence, 249 acceptable subset, 245 act of characters, 134 action, 156 adenine, 450 affine automaton, 170 affine space, 434 affine variety, 335, 434 Aleksandrov topology, 203 algebraic curve, 327 algebraic integer, 430
algebraic number, 331 algebraic number field, 331, 433 algebraic variety, 433 algebraically closed field, 331 algebraically independent numbers, 395 alternating group, 376 amino acid, 450 Amitsur-Levitzki theorem, 282 approximated Ω-ring, 132 atomary semigroup, 425 attributed automaton, 199 augmentation ideal, 101, 106 automaton, 145 automaton language, 418 automorphism, 379, 401 average-preserving function, 180 average-preserving WFA, 180 Bézout’s Lemma, 391 Bell number, 239 Betti number, 436, 438 bifurcation ratio, 448 bilinear map, 133 binary (3, 7)-Hamming code, 455 binary tree, 447 birational equivalence of curves, 340 birational geometry, 340 birational invariant, 340 birationally equivalent, 341 Birkhoff class, 15 bistochastic matrix, 207 Björner topology, 203 branching theorem, 278 Burnside’s Theorem, 387 cancellative semigroup, 94 cascade, 165 cascade of automata, 166, 421, 422 cascading, 422 Catalan numbers, 449 category of changes, 38 category of pairs, 25 category of primitives, 197 Cauchy-Frobenius lemma, 276
center of group, 377 centralizer, 23 CF-language, 449 channel, 453 character map, 280 character series, 281 class of nilpotent semigroup, 111 code words, 453 coding, 453 cogenerator, 134 Cohen-Macaulay ring, 9 cohomological dimension, 3, 5 cohomology, 3 colored category, 184 commutative Ω-algebra, 128 commutative Ω-ring, 129 commutator subgroup, 382 commutators, 382 compatible pair of subsets, 46 complete polarization, 226 complete system of representatives, 374 complex character, 257 component of curve, 337 composition series, 383 congruence of automata, 45 congruence on an automaton, 154 congruent numbers, 439 conjecture of Birch and Swinnerton-Dyer, 439 context free grammar, 449 context free language, 449 contravariant coordinate, 230 convolution, 269 coset, 373 cover, 184 cover of automata, 147 critical semigroup, 111 cryptomorphism, 104 cyclic action, 157 cyclic automaton, 157 cyclic group, 381 cyclotomic field, 406 cytosine, 450 decoding, 453 decomposition, 21 degree of the forest, 452 deoxyribonucleic acid, 450 deterministic finite state machine, 145 dimension congruence, 91 Diophantine geometry, 339 discrete time system, 171 division Ω-ring, 129 DNA, 450 doubly stochastic matrix, 207, 225 duo semigroup, 111 Dyck language, 449
edge, 447 elementary symmetric polynomial, 392 elliptic curve, 328, 345, 436 epimorphism, 379 epimorphism of (X, Y)-automata, 148 equivalent automata, 149 Eulerian ring of integers, 429 even doubly stochastic matrix, 207 even substitution, 376 exact diagram, 185 extension, 330 extension of ring, 432 exterior algebra, 282 exterior vertices, 447 factor automaton, 45 factor group, 377 factor-automaton, 155 faithful action, 156, 159 faithful pair, 26 Faltings’ theorem, 436 Fermat equation, 437 Fermat’s Last Theorem, 429 fiber product, 184, 188 Fibonacci number of a graph, 245 Fibonacci numbers, 450, 455 field, 328 field of characteristic 0, 330 field of characteristic p, 330 field of definition, 345 field of rational functions on the curve, 341 fields of remainder classes, 330 filament, 452 final state, 145 finitary T-ideal, 66 finite automaton, 21, 417 finite extension, 330 finite group, 165 finitely presented R-act, 136 finitely stable action, 70 First Theorem of Sylow, 374 flat R-act, 133 focal, 92 forest, 451, 452 form of order i, 336 formal language, 418, 449 formal Lie group, 352 formal neuron, 414 formal series, 393 formula of Viète, 391 Fox calculus, 61 frame, 333 free m-generated nilpotent semigroup, 113 Frey curve, 442 Frobenius’ theorem, 267 Frobenius-König theorem, 225
fundamental ideal, 91, 101 Galois group of the equation, 401 Galois group of the extension Δ/P, 401 Galois inversion problem, 406 Galois theory of schemes, 399 Gaussian numbers, 429 general algebraic curve, 337 general equation, 395 general formal series, 394 general linear system, 171 generalized dimensional subgroup, 106 generalized Mordell conjecture, 346, 352 generating matrix, 453 generator, 381 genetic code, 450 genetic language, 450 genus of birational invariant, 340 genus of curve, 344, 351, 436 genus of the Riemann surface, 345 good polynomial bases, 273 Grassmann algebra, 282 Grothendieck (pre)topology, 203 Grothendieck ring, 279 Grothendieck pretopology, 184, 185 Grothendieck topology, 456 group algebra, 269 group determinant, 270 group of all automorphisms of Δ, 401 group pair, 23 guanine, 450 Hamiltonian algebra, 129 Hamiltonian group, 118 Hamming code, 454 Hamming distance, 453 hereditary condition, 135 Hermitian matrix, 227 heterogeneous algebra, 155 Hilbert series, 281 Hilbert-Poincaré series, 272 holonomy, 408 holonomy group, 408 homogeneous form of order i, 336 homogeneous recurrent equation, 455 homomorphism, 378 homomorphism of automata, 148 Horton-Strahler rule, 447 hyperbolic polynomial, 226 hyperbolic quadratic form, 227 ideal pair, 46 indecomposable module, 26 indecomposable variety, 57 indecomposable representations, 16 index of stabilization, 69
index of subgroup, 374 indicator, 44 infinite cyclic group, 381 infinite ordinal, 69 initial state, 145 initial symbol, 449 inner automorphism, 375, 379 inner vertices, 447 input alphabet, 145 input signal, 417 input-output map, 171 integrity basis, 273 interpretation, 197 invariant element, 117 invariant subautomaton, 59 invariant subgroup, 375 inversion, 376 irreducible T-ideal, 284 irreducible algebraic curve of rank m, 337 irreducible form, 336 irreducible polynomial, 331, 390 isomorphism, 379 Jacobi sum, 437 joining map, 166 Jordan-Hölder Theorem, 165, 383 Künneth formula, 6 Kaplansky semigroup, 94 kernel of a pair, 26 kernel of homomorphism, 378 Klein 4-group, 385 Klein curve, 436 Krohn-Rhodes Theorem, 165 L-flat condition, 138 Lah numbers, 241 language accepted by an automaton, 147 language accepted by the automaton, 418 left R-transferable duoring, 117 left R-transferable element, 117 left coset, 374 left distributivity, 92 left duo semigroup, 111 left homomorphism, 26 left ideal, 129 left subcommutative ring, 117 left subduo semigroup, 111 left unitary R-act, 128 length of partition, 277 length of series, 106 light-like vector, 227 limit, 69, 106 limit dimensional subgroup, 69, 106 line of behavior, 149 linear (n, k)-code, 453
linear automaton, 45, 105, 168 linear cyclic automaton, 169 linguistic category, 197 local cohomology, 3, 4 Loewner function, 234 Lorentz form, 227 lower central series, 91 lower stable series of a pair, 70 lower stable series of pair, 106 Lusztig Conjecture, 203 machine, 417 majorization, 233 many-sorted algebra, 155 maximum likelihood decoding, 453 Mealy coding machine, 152 Mealy machine, 145 metanilpotent group, 83 model of the automaton, 421 Molien series, 272 Molien’s formula, 271 monoid, 418 monomial, 335 monomial group, 275 monomorphism, 379 Moore automaton, 156, 158, 421 Mordell-Weil theorem, 349 morphism of automata, 45 Muir’s formula, 227 multiplication of varieties, 15 multiplicity of component, 337 multiresolution function, 179 multiresolution vector, 181 mutual commutator, 70 NA, 450 near-ring, 92 nil, 111 nilpotency index, 111 nilpotent coradical, 85 nilpotent ideal, 267 nilpotent of class, 91 nilpotent semigroup, 111 Noetherian module, 5 Noetherian pre-scheme, 9 Noetherian ring, 5 non-commutative analogue of algebra, 280 non-elliptic curve, 346, 436 non-homogeneous polynomial, 226 non-terminal, 449 noncascadable automaton, 424 normal divisor, 375 nucleic acid, 450 nucleotide, 450 nucleotide chain, 450 number field Q(ζ), 430
o-automaton, 183 odd substitution, 376 orbit, 275, 373 order, 447 order of a substitution, 375 order of group, 374 order of monomial, 336 order-polynomial, 250 Ore set, 131 outerplanar graph, 247 output alphabet, 145 output signal, 417 parallel composition, 166 parity check matrix, 453 partial feedback operation, 203 particular ring, 59 partition, 277 permanent, 225 Petri net, 203 Pick function, 234 Poincaré series, 281 Poincaré’s conjecture, 435 Poincaré-Mordell conjecture, 435 polynomial Ω-ring, 130 polynomial algebra, 130 polynomial basis, 273 polypeptide chain, 450 presentation of an R-act, 136 presheaf, 185 primitive, 197 primitive derivation, 197 production, 449 projective algebraic variety, 336, 434 projective space, 7, 434 proper subgroups, 383 pseudo-reflection, 281 pullback, 184 pure homomorphism, 136 quasi-endomorphism, 92 quasi-equivalent automata, 421 quasi-ring, 91 quaternion, 269 equivalent automata, 419 radar code, 455 radical, 267 ramification matrix, 448 rank of curve, 435 rational curve, 345, 346, 436 rational function on the curve Γ, 341 rational point, 339, 434 Redfield-Pólya theory, 275 reduced automaton, 149, 419 reduced linear automaton, 169
reducible polynomial, 390 Rees factor semigroup, 113 regular Ω-ring, 135 regular at zero Ω-ring, 136 regular language, 147 relatively free algebra, 281, 284 Remak’s theorem, 26 representation, 257, 378 residually biprimary groups, 86 restriction, 278 rewriting system, 197 ribonucleic molecule, 450 Riemann hypothesis, 438 Riemann surface, 435, 436 right R-transferable duoring, 117 right R-transferable element, 117 right coset, 374 right distributivity, 92 right homomorphism, 26 right subcommutative ring, 117 right A-set, 191 ring, 328, 432 ring of formal series, 394 ring of invariants, 273 RNA, 450 root, 447 rooted tree, 451 saturated Birkhoff class, 26, 105 saturated class, 15, 44, 45 Schur function, 280 Schur-convexity, 233 secant, 342 secondary structure, 451 semantic pair, 197 semi-automaton, 145, 158 semi-direct product, 23, 425 semi-Thue system, 197 semidirect product, 35 semigroup of the automaton, 156 semigroup Ω-ring, 130 semigroup R-act, 130 semigroup action, 183 semigroup automaton, 45, 155, 156, 183 semigroup of ideal pairs, 60 semisimple algebra, 267 semisimplicity, 17 sequential composition, 166 set of states, 145 sheaf, 185 sign representation, 278 simple algebra, 267 simple group, 375 simple Lie group, 387 simple Pythagorean triples, 427 simplex code, 455
singular endomorphism, 123 singular point, 342 size of partition, 277 solvable group, 382, 383 space-like vector, 227 special basis, 63 special ideal, 44 special involution semigroup, 211 special property, 44 species, 184 spectrum of subgroup, 213 splitting field, 395, 396, 401 stabilizer, 162 stabilizing index, 106 stable pair, 92 start state, 145 state, 417 state-output automaton, 421 Stirling numbers of the first kind, 240 Stirling numbers of the second kind, 239 Strahler number, 448 strictly constant function, 229 strictly decreasing function, 229 strictly regular ring, 122 strongly flat R-act, 136 subalgebra of G-invariant, 271 subalgebra of G-invariants, 280 subnilpotent semigroup, 111 substitution, 375 substitution group, 376 symmetric G-mean of a, 217 symmetric group, 376 symmetric matrix, 227 symmetric polynomial, 392 symmetries, 210 syntactic category, 197 syzygy, 273 tangent, 342 Taniyama-Shimura Conjecture, 445 Taniyama-Weil conjecture, 439 tensor product, 133 terminal, 449 terminal of group, 106 terminal of a group, 69 terminal of a ring, 69 terminators, 451 theorem of Krull-Remak-Schmidt, 26 thymine, 450 time-invariant system, 171 time-like vector, 227 torsion, 435 transcendental number, 332 transferable element, 117 transition, 180 transition category, 198
transition system, 198 Travelling Salesman Problem, 212 tree, 447 triangular product, 15, 16, 24, 36, 103 triangular product of automata, 172, 173 trigger, 420 triple product of semigroups, 27, 30 trivial normal divisor, 375 trivial representation, 277 type, 275 universal cone, 184 uridine, 450 value, 197 variety, 26 verbal function, 26 verbatim, 38 Weierstrass addition theorem, 349 weight of partition, 277 weighted finite automaton, 180 Weil’s conjecture, 438 word accepted by the automaton, 418 wreath product, 15, 23, 184 wreath product construction, 203 wreath product of actions, 159, 196 wreath product of automata, 194 wreath product of semigroup automata, 168, 196 wreath product of the algebras, 66 Young diagram, 277 Zariski dimension, 3 Zariski space, 3, 4 Zariski topology, 12 zeta-polynomial, 250 zyzygy, 273